How can DS listen/loop on a directory for coming files?
Hi all,
Is it possible for DS to loop or listen for new files arriving through FTP in a given directory and pick up only the successfully FTPed files (i.e. it should not pick up files that are not fully transferred)?
Personally, I used to use C++ code that listens and triggers the DS job for every file.
The major issue for me is how to use a timeout or something similar to determine whether a file has landed successfully. If it has not, I'll ignore it and continue looking for other files.
Any comments, please!
What you need is a control file for each file you are looking for. Once a file is FTPed (file1.csv) another control file is FTPed (file1.ctl) to indicate the transfer has completed. In the control file you could even have a row count to check if the correct number of records have been FTPed across.
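A minimal sketch of that check, written in Python purely for illustration (inside DataStage the same logic would live in a routine or job control code). It assumes the control file holds only the expected row count as its first token and that the data file has no header row. The function name `is_complete` and that control-file layout are hypothetical, not something agreed in this thread.

```python
from pathlib import Path


def is_complete(data_file: Path) -> bool:
    """Return True if the .ctl companion exists and its row count matches."""
    ctl_file = data_file.with_suffix(".ctl")  # file1.csv -> file1.ctl
    if not ctl_file.exists() or not data_file.exists():
        return False  # control file not sent yet, so treat the transfer as unfinished
    try:
        # Assumed layout: the first token of the control file is the row count.
        expected_rows = int(ctl_file.read_text().split()[0])
    except (IndexError, ValueError):
        return False  # control file is empty or does not start with a number
    with data_file.open() as handle:
        actual_rows = sum(1 for _ in handle)
    return actual_rows == expected_rows
```

A caller would run `is_complete(Path("/landing/file1.csv"))` (the path is made up) and start the job only when it returns True.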
Cheers,
Dave Nemirovsky
adamski wrote: We have used both the control file method and a delay that analyses the timestamp. Read the file's timestamp, wait a predetermined amount of time, then wake up and read it again. If it has not changed, assume the file has landed, and then compare the row count with the control file.
Yes, that is almost what I did with the C++ code I mentioned, but can I use DS to do it? Would it be done in the BASIC language?
Many thanks.
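For reference, the "read the timestamp, wait, read it again" idea can be sketched as below. This is Python for illustration only, not DataStage BASIC. The name `has_landed` and the 30-second default are hypothetical, and comparing the size as well as the timestamp is an extra assumption on top of what was described.

```python
import os
import time


def has_landed(path: str, wait_seconds: float = 30.0) -> bool:
    """Return True if size and modification time are unchanged after a wait."""
    before = os.stat(path)
    time.sleep(wait_seconds)  # give the sender time to write more data
    after = os.stat(path)
    # Any change in size or timestamp means the file is still being written.
    return (before.st_size, before.st_mtime) == (after.st_size, after.st_mtime)
```

Note that a transfer that has stalled for longer than the wait looks exactly like a finished one, so the result is only a guess and the row-count comparison is still needed afterwards.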
alraaayeq wrote: yes, "almost" I did what you said by using C++ code as I mentioned, but can I use DS to do it? Is it by using BASIC language?
I asked this on the course, and was rather astonished that the tutor didn't think it was possible to process unknown file names. It isn't unusual to have to process files that are named with some kind of incrementing sequence, maybe with a date and time included in the name, but I can't see how DataStage could do this. Any suggestions?
In general, I would recommend the control-file approach over polling for timestamp changes.
Phil Hibbs | Capgemini
Technical Consultant
Sure, it's possible. It's a simple matter to issue a 'dir' or 'ls', whether you match a regular expression as part of it or look for all files, and capture the output. A call to DSExecute will do that for you. Then you can loop thru the output and do what is needed - check the size, run a job with that filename, whatever.
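The capture, filter, and loop logic looks roughly like this. The sketch is in Python only to show the shape of it. Inside DataStage the listing would come from the DSExecute call mentioned above rather than from `os.scandir`, and `matching_files` is a hypothetical name.

```python
import fnmatch
import os


def matching_files(directory: str, pattern: str = "*") -> list:
    """Return the sorted names of regular files in directory that match pattern."""
    names = []
    for entry in os.scandir(directory):
        # Skip subdirectories; keep only names that match the shell-style wildcard.
        if entry.is_file() and fnmatch.fnmatchcase(entry.name, pattern):
            names.append(entry.name)
    return sorted(names)
```

Looping over `matching_files("/landing", "sales_*.csv")` (a made-up path and pattern) then gives one file name per pass, which can be size-checked or handed to a job as a parameter.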
Agreed on the control file. We call it a 'semaphore' file and I've used it for years at multiple sites. Just make sure the people sending the files understand they need to send the control file last.
The 'polling for changes' approach can be... problematical.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Here's one (it only takes one pass, but is executed by an external job running the loop - you could wrap the loop around this code if you preferred).
From the General tab:
Returns -1 if the directory does not exist or could not be opened.
Note that no check is made to verify that the directory pathname is, indeed, that of a directory, so this routine could work just as effectively with hashed files or B-tree files.
To check whether the file is a directory, check the value returned by STATUS() within the THEN clause of the OPENPATH statement. This will be 19 or 1 if the pathname is that of a directory.
Code to handle this check has been included, but is disabled. It may be enabled by defining the token called CheckingForDirectoryOnly.
Code:
FUNCTION FilesInDirectory(DirectoryPath, Wildcard, OutputFile)
* History (most recent first)
*   Date        Programmer       Version  Details of Modification
*   ----------  ---------------  -------  -------------------------------------
*   27/09/2003  Ray Wurlod       2.0.0    Initial coding
*
$INCLUDE UNIVERSE.INCLUDE FILEINFO.H
* OpenTextFile is a separate user-written routine (DSU.OpenTextFile), not shown here.
      DEFFUN OpenTextFile(FileName, OpenMode, AppendMode, Logging) Calling "DSU.OpenTextFile"

* The following token can be defined to restrict the code to handling directories only.
* See comments on General tab.
$UNDEFINE CheckingForDirectoryOnly

* Take copy of argument so as to avoid side effects if changing value.
      argWildcard = Wildcard

* Open output file for writing, overwriting if it exists.
* Reporting stays false when no output file name is supplied.
      Reporting = @FALSE
      If Len(OutputFile)
      Then
         Output.fvar = OpenTextFile((OutputFile), "W", "O", "Y")
         Reporting = FileInfo(Output.fvar, FINFO$IS.FILEVAR)
      End

* Substitute generic wildcard if none provided. Handle multiple and asterisk wildcards.
      If argWildcard = "" Then argWildcard = "..."
      Convert "~" To @VM In argWildcard
      argWildcard = Ereplace(argWildcard, "*", "...", -1, 0)

* Open the directory as if it were a table.
      OpenPath DirectoryPath To Directory.fvar
      On Error
         Ans = -Abs(Status())
      End
      Then

$IFDEF CheckingForDirectoryOnly
         FileType = Status()
         If FileType = 19 Or FileType = 1
         Then
$ENDIF

* Establish Select List #9 as a sorted list of file names in the directory.
            ClearSelect 9
            SSelect Directory.fvar To 9     ; * SSelect generates sorted list

* Initialize count of file names in directory.
            Ans = 0

* For each file name increment answer if file name matches desired pattern.
            Loop
            While ReadNext FileName From 9
               If FileName Matches argWildcard
               Then
                  Ans += 1
                  If Reporting
                  Then
                     WriteSeq FileName To Output.fvar Else NULL
                  End
               End
            Repeat

            If Reporting
            Then
               CloseSeq Output.fvar
            End

$IFDEF CheckingForDirectoryOnly
         End
         Else
            Ans = -99     ; * pathname is not that of a directory
         End
$ENDIF

* Close the directory to free resources, as file unit no longer needed.
         Close Directory.fvar

      End
      Else
         Ans = -Abs(Status())
      End

RETURN(Ans)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
In reply to Phil Hibbs' post:
"I asked this on the course, and was rather astonished that the tutor didn't think it was possible to process unknown file names. It isn't unusual to have to process files that are named with some kind of incrementing sequence, maybe with a date and time included in the name, but I can't see how DataStage could do this. Any suggestions?"
I'm not sure where you did your course or who took the course, but obviously it wasn't Ray!
Cheers,
Dave Nemirovsky
chulett wrote: Agreed on the control file. We call it a 'semaphore' file and I've used it for years at multiple sites. Just make sure the people sending the files understand they need to send the control file last.
Yes, it works when the other people (where the files come from) are ready to help or participate with you.
chulett wrote: The 'polling for changes' approach can be... problematical.
Ah, I hate using it, but sometimes you do not have any other choice, especially when many other departments and legacy systems are involved.
I found it better to use two loops:
1- an outer loop that is infinite and contains
2- an inner loop that gets the list of incoming files on every pass
Please see how to get the list of files in a directory here:
viewtopic.php?t=90528&highlight=
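The two-loop structure could be sketched like this, in Python for illustration only. The names `watch` and `handle_file` are hypothetical, and `max_passes` exists only so the otherwise infinite outer loop can be stopped when trying it out.

```python
import os
import time


def watch(directory, handle_file, poll_seconds=60.0, max_passes=None):
    """Poll directory repeatedly, handing each new file to handle_file once."""
    seen = set()  # names already handed over, so no file is processed twice
    passes = 0
    # Outer loop: runs forever unless max_passes is given.
    while max_passes is None or passes < max_passes:
        # Inner loop: walk the current directory listing.
        for name in sorted(os.listdir(directory)):
            if name not in seen:
                seen.add(name)
                handle_file(os.path.join(directory, name))
        passes += 1
        time.sleep(poll_seconds)
    return seen
```

In practice `handle_file` would be the place to decide whether the file has fully landed and, if so, to trigger the job for it. A file that is not ready yet would need to be left out of `seen` so that a later pass retries it.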