Running a DataStage job continuously

Posted: Mon May 18, 2009 11:17 pm
by sjordery
Hi All,

A set of files lands in a defined directory on the DataStage server.
The names of the files are:

EM01MMDD.M
EM11MMDD.M
EM13MMDD.M

The requirement is that the DataStage process should run continuously, checking for any of the three files above, and process a file as soon as it finds one.

How can I run a DataStage job continuously?
I know it can be scheduled through a UNIX script, but a schedule only starts it at a particular time. Is it possible to make it run continuously?
And how would the three file names be passed to a single multi-instance process?

Any help on this will be appreciated.

Regards,
Sjordery

Posted: Mon May 18, 2009 11:54 pm
by mahadev.v
Use a sequence with an infinite loop: check whether the file exists, then trigger the job; otherwise check again. You would also need to handle the abort condition for the job.
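
If the job is started from a script instead, handling the abort condition could look roughly like the sketch below. It assumes the job is started with dsjob -run -jobstatus and that the exit status follows the usual job status codes (1 = finished OK, 2 = finished with warnings, 3 = aborted); please verify that against your DataStage version. The function name run_and_recover, the FileName job parameter and the DSJOB override variable are made up for illustration.

```shell
#!/bin/sh
# DSJOB lets you point at a specific dsjob binary (hypothetical override).
DSJOB=${DSJOB:-dsjob}

# Run one job for one file and reset the job if it aborted.
# Usage: run_and_recover PROJECT JOB FILE
# Returns 0 if the job finished (with or without warnings), 1 otherwise.
run_and_recover() {
    rr_project=$1
    rr_job=$2
    rr_file=$3

    # FileName is an assumed job parameter name.
    "$DSJOB" -run -jobstatus -param "FileName=$rr_file" "$rr_project" "$rr_job"
    rr_status=$?

    case $rr_status in
        1|2)
            return 0 ;;
        3)
            # Assumed: status 3 means aborted. Reset so the job can run again.
            "$DSJOB" -run -mode RESET -wait "$rr_project" "$rr_job"
            return 1 ;;
        *)
            return 1 ;;
    esac
}
```

The caller can then decide whether to retry the file, set it aside, or stop the loop when the function returns 1.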

Posted: Tue May 19, 2009 12:12 am
by ray.wurlod
It would probably be a sequence, rather than an individual job, that runs continuously. What happens if another of the files appears while one is being processed? You have to be able to handle rapid arrival of files. Don't forget to be able to trigger the job to shut down gracefully, perhaps by checking for existence of another file (perhaps called .shutdown).
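
For a script-driven variant, both points could be covered along the lines of the sketch below. Each pass handles every matching file it finds, so a file that arrives while another is being processed is picked up on the next pass, and the loop ends once a .shutdown file appears in the directory. The function names and the handler argument are placeholders, and the handler is assumed to move or rename each file it finishes; otherwise the same file would be picked up again.

```shell
#!/bin/sh
# Succeeds when a file named .shutdown exists in directory $1.
shutdown_requested() {
    [ -e "$1/.shutdown" ]
}

# Poll directory $1 every $2 seconds, calling handler $3 with each matching
# file, until the shutdown sentinel appears.
poll_until_shutdown() {
    pu_dir=$1
    pu_interval=$2
    pu_handler=$3

    while ! shutdown_requested "$pu_dir"; do
        # ???? stands for the MMDD part of the name, so no date is hard-coded.
        for pu_file in "$pu_dir"/EM01????.M "$pu_dir"/EM11????.M "$pu_dir"/EM13????.M; do
            [ -f "$pu_file" ] || continue   # this pattern matched nothing
            "$pu_handler" "$pu_file"
        done
        sleep "$pu_interval"
    done

    # Remove the sentinel so the next start is not stopped immediately.
    rm -f "$pu_dir/.shutdown"
}
```

The sentinel is only checked between passes, so a pass that is already running finishes its files before the loop stops.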

Posted: Tue May 19, 2009 4:20 am
by nagarjuna
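You can use the Wait For File activity within a sequence to check for the existence of the file and trigger the DataStage jobs depending on that, or you can write a UNIX script that checks for the file and calls the dsjob command (the project, job, parameter and directory names below are placeholders):
keep_running=yes
while [ "$keep_running" = yes ]
do
    # first matching file, if any
    file=$(ls "$LANDING_DIR"/EM01????.M "$LANDING_DIR"/EM11????.M "$LANDING_DIR"/EM13????.M 2>/dev/null | head -n 1)
    if [ -n "$file" ]
    then
        dsjob -run -jobstatus -param FileName="$file" "$PROJECT" "$JOB"
        mv "$file" "$LANDING_DIR/processed/"   # so it is not picked up again
    else
        sleep 2   # nothing yet, wait some time before checking again
    fi
done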

Posted: Tue May 19, 2009 6:17 am
by sjordery
Mahadev,

If I am not wrong, you are talking about the Start Loop and End Loop activity stages in a sequence job.
But how do I make this run continuously? And since the date in the file names changes every day, we can't hard-code the names, so how do I make this generic?

Ray,

No, that is not the case; all three files land at different times.

Any suggestions?

Posted: Tue May 19, 2009 6:56 am
by priyadarshikunal
If you want to process those files one by one, you can use the flow below:

start loop-------ls -lt EM01*|wc -l ----0-----Sleep 60-----End loop
                                  |
                                  |
                                  |
                               take the first file (if more than 1)
                                  |
                                  |
                             Process the job
                                  |
                                  |
                              Rename/Move the file
                                  |
                                  |
                          End loop Activity
Otherwise you can tweak it a bit to work the way you want. This can also be derived from the earlier posts.
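
The same flow can be approximated in a shell script. The sketch below picks the oldest matching file, hands it to a handler, and then moves it to a separate directory so it is not processed twice. The names oldest_match and process_one, the handler argument and the done directory are assumptions for illustration, and the -ot test is an extension found in bash, ksh and dash rather than strict POSIX.

```shell
#!/bin/sh
# Print the oldest regular file in directory $1 matching glob $2.
# Returns non-zero when nothing matches.
oldest_match() {
    om_oldest=
    for om_file in "$1"/$2; do
        [ -f "$om_file" ] || continue
        if [ -z "$om_oldest" ] || [ "$om_file" -ot "$om_oldest" ]; then
            om_oldest=$om_file
        fi
    done
    [ -n "$om_oldest" ] && printf '%s\n' "$om_oldest"
}

# Process one file. Usage: process_one DIR PATTERN DONE_DIR HANDLER
# Returns 1 when no file is waiting, 2 when the handler fails
# (the file is then left in place for a retry).
process_one() {
    po_file=$(oldest_match "$1" "$2") || return 1
    "$4" "$po_file" || return 2
    mv "$po_file" "$3/"
}
```

Inside a loop this could be called as process_one /data/landing 'EM01????.M' /data/landing/done run_job || sleep 60, where the paths and run_job are again only examples.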

Posted: Tue May 19, 2009 8:32 am
by sjordery
Thanks Priyadarshi.
I will update on this.

Regards,
Sjordery.

Posted: Tue May 19, 2009 8:35 am
by ShaneMuir
Not sure if that will make it run continuously as I am pretty sure there is a limit to the number of iterations that a sequence can make.

You can, however, have a multiple-instance sequence and have the sequence call itself with a different instance. You only need 2 instances for this to work, i.e. after a specified number of iterations, instance 0 can start instance 1 and vice versa. That way the sequence is continuously running.

Posted: Tue May 19, 2009 9:45 am
by chulett
The only thing that is safe to run 'continuously' is an RTI enabled / SOA / WISD job. I wouldn't even consider anything else. At the very least, give it a breather, a chance to stop and restart again - for example, run it over 'business hours' and then let it sit idle for some period at night... or vice versa... even if that 'rest period' is only a few minutes.
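
If the polling is done from a script, a rest period can be built in by letting the loop run only inside a time window and exit otherwise, so that the scheduler starts a fresh run the next day. A minimal sketch, where the in_window helper and the 07:00 to 19:00 window are only examples:

```shell
#!/bin/sh
# Succeeds when START <= HOUR < END (hours 0-23, window must not cross midnight).
# Usage: in_window HOUR START END
in_window() {
    iw_hour=${1#0}            # "08" -> "8", so it is never read as octal
    iw_hour=${iw_hour:-0}     # a bare "0" would otherwise become empty
    [ "$iw_hour" -ge "$2" ] && [ "$iw_hour" -lt "$3" ]
}

# Example loop (left commented out so that sourcing this file does nothing):
# while in_window "$(date +%H)" 7 19; do
#     check_for_files    # hypothetical polling step
#     sleep 60
# done
```

Once the window closes the loop simply ends, which gives the process the stop and restart described above.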