Processing missing files in a directory.

Shailendra
Participant
Posts: 40
Joined: Tue Nov 15, 2005 11:53 pm

Post by Shailendra »

Hi,

I have a job that processes the sysdate-1 file every day, and that part is taken care of. Sometimes the file is missing and only arrives the next day. I need to check daily whether the file is missing and, if it is, process two files the following day: the sysdate-2 and sysdate-1 files.

Eg:

Files
-------
20060301
20060302
20060303
20060304


On 20060302, I process the 20060301 file. Suppose that on 20060303 the
20060302 file is missing; then on 20060304 I have to check for and process
both the 20060302 and 20060303 files.

Apologies if this is confusing :)

Any help is appreciated.

Thanks,
Shailendra
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The only way you're going to know if a 'missing' file shows up out of sequence is to track the filenames that you have processed. Any kind of 'day - x' sweeping is going to be prone to error.

I would suggest you capture the filenames as you process them. Check 'new' files against the list and only process them if you haven't seen them before.
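To illustrate the tracking idea outside DataStage, here is a minimal Python sketch. It assumes the processed filenames are kept one per line in a plain text file; the function names `find_unprocessed` and `mark_processed` and the log file itself are hypothetical, not anything DataStage provides. In a real job the list could just as well live in a hashed file or a database table.

```python
from pathlib import Path


def find_unprocessed(data_dir, log_path):
    """Return the names of files in data_dir that are not yet listed in log_path."""
    log = Path(log_path)
    # Assumption: the log holds one processed filename per line.
    seen = set(log.read_text().splitlines()) if log.exists() else set()
    return sorted(
        p.name for p in Path(data_dir).iterdir()
        if p.is_file() and p.name not in seen
    )


def mark_processed(log_path, filename):
    """Append filename to the log once its processing has succeeded."""
    with open(log_path, "a") as fh:
        fh.write(filename + "\n")
```

With this approach a late-arriving 20060302 file is picked up on whichever day it appears, because the check is "have I seen this name before?" rather than "is this yesterday's file?". Keep the log outside the directory being scanned so it is not listed as a data file.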

There are several options for processing multiple files. Concatenation is one. Version 7.5.x Sequence jobs have Loop stages, and between them and the User Variables stage it's pretty straightforward to build a loop that reads a directory and runs your processing job once for each file found. If you are comfortable writing your own 'job control', you could roll something up to accomplish this as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Shailendra
Participant
Posts: 40
Joined: Tue Nov 15, 2005 11:53 pm

Post by Shailendra »

Thanks chulett, will try it out.

Shailendra
Shailendra
Participant
Posts: 40
Joined: Tue Nov 15, 2005 11:53 pm

Post by Shailendra »

Chulett,

You said to use the Loop stages, but I am not sure where to start: how do I search for the files, and how do I use the Loop stages?

Once we keep track of the processed files, what is the next step?


Can you explain it?

Thanks a bunch,
Shailendra
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Shailendra

The Loop stages can be used inside Sequence jobs. What you need to do is get the list of file names and supply it to the Start Loop stage, with a delimiter between the file names. You can then have your job execute inside the loop. Read the manual for more documentation on how to use the Loop stages.

Thanks
Regards
Siva

Listening to the Learned

"The most precious wealth is the wealth acquired by the ear. Indeed, of all wealth that wealth is the crown." - Thirukural By Thiruvalluvar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Specifically, page 6-31 of the Designer's Guide (version 7.5.1A) is where you want to start reading. There is a section on the Start Loop Activity, End Loop Activity and User Variables Activity stages with examples of how they can be used in a job.

Don't be afraid to experiment. Not everything is explained in the manual. Come back here with specific questions. As noted, the User Variables stage can be used to (amongst other things) gather up a list of filenames in a directory. The Loop Start stage can then take that delimited list and pull things off one by one, passing a filename (for example) downstream to your processing job inside the loop. The Loop End stage checks to see if there is more to do and passes control back or onwards as appropriate.
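For readers who find it easier to see the control flow as code, here is a rough Python analogy of what the loop does. It is not DataStage syntax (the stages themselves are configured in the Designer), and `run_for_each`, its `process` callback and the comma delimiter are assumptions made up for this sketch.

```python
def run_for_each(delimited_list, process, delimiter=","):
    """Call process once per non-empty item of a delimited list, in order."""
    results = []
    # Splitting the list plays the role of the loop start pulling items off one by one.
    for name in delimited_list.split(delimiter):
        name = name.strip()
        if name:  # skip empty items, e.g. from an empty list or a trailing delimiter
            results.append(process(name))
    # Running out of items corresponds to the loop end passing control onwards.
    return results
```

Calling `run_for_each("20060302,20060303", my_job)` would invoke the hypothetical `my_job` first with 20060302 and then with 20060303, which is the "run the processing job once for each file" behaviour described above.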

Good luck! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers