
checking for file availability in a folder

Posted: Tue Aug 21, 2007 8:51 am
by bgs_vb
Hi Gurus,

I have a case: our team has designed jobs that fetch data from a folder, put it in a staging table, and from there load it into the DW. All of these jobs are in a sequence.

To start with, a file (e.g. xxxxx.D.MI5.010101120345) is dropped into our folder by another team. We pick up the file and do the rest of the processing. The sequence fails when more than one file is available in the folder. One of the proposed solutions was to write a shell script that checks for file availability in the folder; once a file is available, it should trigger the sequence.

Note: I know the design is not good, but we need to handle this case.

Please help me solve this. I don't have exposure to shell scripting. If any script is available, please share it.

Thanks in advance,
bgs_vb

Re: checking for file availability in a folder

Posted: Tue Aug 21, 2007 2:38 pm
by ds_is_fun
Did you try this?
In the properties of the Sequential File stage, click on Read Method under Source and select "File Pattern" from the drop-down menu.
I strongly feel that this should work in your case.
Good luck!

Sequence has both parallel and server jobs

Posted: Tue Aug 21, 2007 4:05 pm
by bgs_vb
In my case I have almost 8 jobs, both parallel and sequence, combined and put in the SEQUENCER. The first job is a sequence job which fetches the file using a Folder stage and processes it.

As I said earlier, a file is dropped in the folder by another program at any time of the day, and I should first check whether any file has arrived in that folder; if one is available, execute the whole sequence.

One more option I tried was the START_LOOP and END_LOOP stages. I had an issue there, as I could not fetch the file name, again because it is dropped in a folder.


start_loop 1---> start_loop 2---> wait_for_file_stage ---> sequencer ---> end_loop 1 |
|
end_loop 2 (goes to start_loop2)

in the wait_for_file stage my file is returned as 'sssss.p.qs9.123456789' and not as ssssss.p.qs9.123456789

Posted: Tue Aug 21, 2007 7:23 pm
by JoshGeorge
When the 'wait_for_file_stage' detects a file in the waiting folder, move the file into a temporary folder if you want. Then call the sequence that contains the job with the Folder stage, where you can 'fetch the file name'. Put that sequence in a loop so it goes back and waits for another file.
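
The "move the file into a temporary folder" step could also be done from a small shell function. This is only a sketch under assumptions: the function name claim_one_file and the folder names in the usage line are made up for illustration. It moves exactly one matching file into a work folder, so the Folder stage only ever sees one file even when several have been dropped.

```shell
#!/bin/sh
# Sketch only: claim_one_file is a hypothetical helper, not a DataStage feature.

# Move the first regular file in inbox $1 matching glob $2 into work folder $3.
# Prints the new path of the moved file; returns 1 if nothing matched.
claim_one_file() {
    inbox=$1
    pattern=$2
    work=$3
    for f in "$inbox"/$pattern; do     # $pattern is unquoted so the glob expands
        [ -f "$f" ] || continue        # skips the literal pattern when no file matches
        mv "$f" "$work"/ || return 1
        printf '%s\n' "$work/$(basename "$f")"
        return 0
    done
    return 1
}
```

Usage might look like `claim_one_file /data/inbound '*.D.MI5.*' /data/work`, where both paths are placeholders. Files are taken in the shell's glob sort order, and any remaining files stay in the inbox for the next pass of the loop.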

Posted: Tue Aug 21, 2007 7:54 pm
by dsrules
bgs_vb,

Write a small shell script that continuously sweeps the folder for a file matching the given pattern; if a file is present, you can trigger the sequence. In Korn shell, loop over the matches of *pattern and test each one with [[ -f "$f" ]]. A bare [[ -f *pattern ]] will not work, because file name expansion is not performed inside [[ ]]. You don't need Start Loop and End Loop stages; a while loop that runs for a specific time with a 30-second sleep would be fine.

dsrules
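
A minimal sketch of the polling loop described above, written for a POSIX shell (it should also run under ksh). The function names first_match and wait_for_file are made up for this example, and the directory, pattern, timeout, and interval are all passed in as arguments rather than taken from the original job.

```shell
#!/bin/sh
# Sketch only: function names and argument order are illustrative assumptions.

# Print the first regular file in directory $1 whose name matches glob $2.
# Returns 1 if nothing matches.
first_match() {
    for f in "$1"/$2; do               # $2 is unquoted so the glob expands
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Poll directory $1 for glob $2, sleeping $4 seconds between checks,
# and give up after roughly $3 seconds. Prints the file found, if any.
wait_for_file() {
    waited=0
    while [ "$waited" -lt "$3" ]; do
        if first_match "$1" "$2"; then
            return 0
        fi
        sleep "$4"
        waited=$((waited + $4))
    done
    return 1
}
```

For example, `wait_for_file /data/inbound '*.D.MI5.*' 3600 30` would check every 30 seconds for up to an hour and print the first matching file; the path is a placeholder. The exit status (0 when a file was found, 1 on timeout) can then decide whether to trigger the sequence.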

Re: Sequence has both parallel and server jobs

Posted: Tue Aug 21, 2007 9:18 pm
by Yuan_Edward
I would run a shell script as a background process to check for the existence of the file and then trigger DataStage to run if one is detected.
bgs_vb wrote: as i said earlier, a file is dropped in the folder by another program anytime of the day and i should 1st check if any file has come into the respective folder, if so available then execute the whole sequence.
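
A background watcher of this kind might be sketched as follows. Everything named here is an assumption: the inbox path, the project name MyProject, and the sequence name MySequence are placeholders, and the script assumes the DataStage dsjob command is on the PATH with its environment already set up. It also assumes the sequence itself moves or archives the file it processed; otherwise the same file would trigger another run on the next check.

```shell
#!/bin/sh
# Sketch only: all default values below are placeholders, not real settings.
INBOX=${INBOX:-/data/inbound}
PATTERN=${PATTERN:-"*.D.MI5.*"}
DS_PROJECT=${DS_PROJECT:-MyProject}
DS_SEQUENCE=${DS_SEQUENCE:-MySequence}
INTERVAL=${INTERVAL:-30}

# Start the sequence and wait until it has finished.
run_sequence() {
    dsjob -run -wait "$DS_PROJECT" "$DS_SEQUENCE"
}

# Return 0 if at least one matching regular file is in the inbox.
file_present() {
    for f in "$INBOX"/$PATTERN; do     # $PATTERN is unquoted so the glob expands
        [ -f "$f" ] && return 0
    done
    return 1
}

# Check the inbox every $INTERVAL seconds, forever.
watch_folder() {
    while :; do
        if file_present; then
            run_sequence
        fi
        sleep "$INTERVAL"
    done
}

# Only start polling when invoked as: sh watcher.sh start
if [ "${1:-}" = "start" ]; then
    watch_folder
fi
```

Saved as, say, watcher.sh (a made-up name), it could be left running with `nohup sh watcher.sh start &`. Because run_sequence waits for the sequence to finish, a second file that arrives during a run is only picked up on the next check.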

as i said earlier i have very little exposure to unix script

Posted: Wed Aug 22, 2007 8:59 am
by bgs_vb
If anyone has that script, please share it.

Thanks
bgs_vb