Running a DataStage job continuously
Moderators: chulett, rschirm, roy
Hi All,
A set of files lands in a defined directory on the DataStage server.
The file names are:
EM01MMDD.M
EM11MMDD.M
EM13MMDD.M
The requirement is that the DataStage process should run continuously, checking for any of the three files above, and process a file as soon as it finds one.
How can I run a DataStage job continuously?
I know it can be scheduled through a UNIX script, but that only runs it at a particular time. Is it possible to make it run continuously?
And how would the three file names be passed to a single multi-instance process?
Any help on this would be appreciated.
Regards,
Sjordery
It would probably be a sequence, rather than an individual job, that runs continuously. What happens if another of the files appears while one is being processed? You have to be able to handle rapid arrival of files. Don't forget to provide a way to shut the job down gracefully, perhaps by checking for the existence of another file (perhaps called .shutdown).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
You can use the Wait For File activity within a sequence to check for the existence of the file and trigger the DataStage jobs accordingly, or you can write a UNIX script that checks for the file and issues the dsjob command.
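# variable, dir, project and job are placeholders to set for your environment
while [ -n "$variable" ]
do
    if ls "$dir" | grep -Eq '^EM(01|11|13)[0-9]{4}\.M$'   # any of the three files present?
    then
        dsjob -run -jobstatus "$project" "$job"   # the job should move/rename the file when done
    else
        sleep 2   # wait a while before checking again
    fi
done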
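A slightly fuller sketch of the same polling idea, for illustration only. It matches the three file-name patterns with shell globs, so the date part does not need to be hard-coded, and it stops when the `.shutdown` marker suggested earlier in the thread appears. `LANDING_DIR`, `POLL_SECS` and `RUN_CMD` are hypothetical names, and `RUN_CMD` is only a stand-in for your own dsjob invocation. It does not check whether a file has finished landing.

```shell
#!/bin/sh
# Sketch: poll a landing directory until a shutdown marker appears.
# LANDING_DIR, POLL_SECS and RUN_CMD are hypothetical names, not DataStage settings.
LANDING_DIR="${LANDING_DIR:-/tmp/landing}"
POLL_SECS="${POLL_SECS:-2}"
RUN_CMD="${RUN_CMD:-echo would-run-job-for}"   # placeholder for a real dsjob call

# Print every landed file in directory $1 that matches one of the
# three name patterns (EM01MMDD.M, EM11MMDD.M, EM13MMDD.M), one per line.
list_landed_files() {
    for f in "$1"/EM01????.M "$1"/EM11????.M "$1"/EM13????.M
    do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

# Keep polling until $LANDING_DIR/.shutdown exists.
poll_until_shutdown() {
    while [ ! -e "$LANDING_DIR/.shutdown" ]
    do
        list_landed_files "$LANDING_DIR" | while IFS= read -r f
        do
            # Pass the file name to the job; on success rename the file
            # so that it is not picked up again on the next pass.
            $RUN_CMD "$f" && mv "$f" "$f.done"
        done
        sleep "$POLL_SECS"
    done
}
```

To use it, set `LANDING_DIR` and `RUN_CMD`, call `poll_until_shutdown`, and create a `.shutdown` file in the landing directory when the loop should end.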
Nag
Mahadev,
If I am not wrong, you are talking about the Start Loop and End Loop activity stages in the sequence job.
But how do I make this run continuously? And since the date in the file name changes every day, we can't hard-code the file names, so how do I make this generic?
Ray,
No, that is not the case; the three files land at different times.
Any suggestions?
If you want to process those files one by one, you can use the flow in the diagram below.
Otherwise, you can tweak it to work the way you want; it can also be derived from the earlier posts.
Start Loop ---- ls -lt EM01* | wc -l ---- 0 ---- Sleep 60 ---- End Loop
                        |
                        |
          take the first file (if more than one)
                        |
                        |
                 Process the job
                        |
                        |
               Rename/Move the file
                        |
                        |
                End Loop Activity
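One pass of the flow in the diagram could be sketched in shell roughly as follows. This is an illustration under assumptions, not the poster's implementation: the function names, the `processed` subdirectory and the `RUN_CMD` hook (standing in for a real dsjob call) are all hypothetical. It also takes the oldest file first so that files are handled in arrival order, which is a choice made here rather than something the diagram specifies.

```shell
#!/bin/sh
# One pass of the diagram: pick a file, process it, move it away.
# RUN_CMD is a hypothetical hook; replace it with your own dsjob invocation.
RUN_CMD="${RUN_CMD:-echo processing}"

# Print the oldest EM01* file in directory $1, or nothing if none exist.
# ls -t sorts newest first; -r reverses that, so the oldest comes first.
oldest_em01_file() {
    ls -tr "$1"/EM01* 2>/dev/null | head -n 1
}

# Process the oldest file, then move it to $1/processed.
# Returns 1 when there is nothing to do, so the caller can sleep,
# and 2 when the processing command itself fails.
process_next_file() {
    next=$(oldest_em01_file "$1")
    [ -n "$next" ] || return 1
    $RUN_CMD "$next" || return 2
    mkdir -p "$1/processed" && mv "$next" "$1/processed/"
}
```

A caller could loop with something like `while :; do process_next_file "$dir" || sleep 60; done`, which mirrors the "Sleep 60" branch when no file is waiting.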
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Not sure if that will make it run continuously as I am pretty sure there is a limit to the number of iterations that a sequence can make.
You can, however, have a multiple-instance sequence and have the sequence call itself with a different instance. You only need two instances for this to work: after a specified number of iterations, instance 0 starts instance 1, and vice versa. That way the sequence is continuously running.
The only thing that is safe to run 'continuously' is an RTI enabled / SOA / WISD job. I wouldn't even consider anything else. At the very least, give it a breather, a chance to stop and restart again - for example, run it over 'business hours' and then let it sit idle for some period at night... or vice versa... even if that 'rest period' is only a few minutes.
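If the polling is driven from a shell script, the "rest period" idea could be sketched as a simple time-window check. This is only an illustration: `in_run_window` is a hypothetical helper, and the window boundaries in the usage note are arbitrary examples.

```shell
#!/bin/sh
# Return 0 if the HHMM time $1 lies within the window [$2, $3).
# Windows that wrap past midnight (for example 2200 to 0600) are handled too.
in_run_window() {
    now=$1
    start=$2
    end=$3
    if [ "$start" -le "$end" ]; then
        # Same-day window: now must be between start and end.
        [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
    else
        # Overnight window: now is after start or before end.
        [ "$now" -ge "$start" ] || [ "$now" -lt "$end" ]
    fi
}
```

A polling loop could then test `in_run_window "$(date +%H%M)" 0800 1800` on each pass and exit when it fails, leaving a scheduler to start the script again at the beginning of the next window.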
-craig
"You can never have too many knives" -- Logan Nine Fingers