trigger condition

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
prasannak
Premium Member
Posts: 56
Joined: Thu Mar 20, 2008 9:45 pm
Contact:

trigger condition

Post by prasannak »

I have a question regarding polling within DataStage.

The specific requirement is that we trigger our DataStage jobs (a master sequence, rather) based on the successful completion of certain jobs
from an existing application built on an Oracle database.
Our master sequence is triggered from an external scheduler tool called Autosys.
The external jobs are PL/SQL based and also run under Autosys.
We are looking at options like polling a status table for these external jobs; when the status indicates "success", the corresponding DataStage jobs would need to get triggered.
Is it possible to do database table polling within DataStage? E.g. keep querying the status table from a certain point every 5 minutes or so, and then trigger our jobs when the desired value is present in the table...

One option that comes to mind involves building a dependency within the Autosys scheduler so that when the Autosys job for the
external job completes, we can trigger the master sequence DataStage job from Autosys.
But doing this would mean a very tight dependency with the external application at the scheduler level and, if possible, we would prefer to avoid it...

Any other options/inputs would be highly welcome...
prasannak
Premium Member
Posts: 56
Joined: Thu Mar 20, 2008 9:45 pm
Contact:

Post by prasannak »

Nobody has any ideas? :roll: :roll: :?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: trigger condition

Post by chulett »

prasannak wrote:One option that comes to mind involves building a dependency within the Autosys scheduler so that when the Autosys job for the external job completes, we can trigger the master sequence DataStage job from Autosys. But doing this would mean a very tight dependency with the external application at the scheduler level and, if possible, we would prefer to avoid it...
:? Why in the world would you want to 'avoid' this? That is exactly what an Enterprise Scheduler is all about. Make it so, Number One.
-craig

"You can never have too many knives" -- Logan Nine Fingers
shamshad
Premium Member
Posts: 147
Joined: Wed Aug 25, 2004 1:39 pm
Location: Detroit,MI

Post by shamshad »

(1)
When your so-called "External Process" completes successfully, have it create a zero-byte file like externaljobcompleted.txt.

(2)
The Autosys job that in turn executes the MASTER SEQUENCER can check for the existence of the above file and, if present, execute the master sequencer. Meaning, in Autosys, instruct it to check for the above text file every 10-15 minutes (or as you desire), and if Autosys finds the file, execute the MASTER SEQ.

(3)
Design a job that DELETES the above externaljobcompleted.txt file once the MASTER SEQUENCER completes SUCCESSFULLY.
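The three steps above can be sketched in shell; the flag-file path is illustrative, not from the original post, and step (1) would really be done by the external process rather than the same script:

```shell
# Illustrative sketch of the flag-file handshake.
# FLAG_FILE is a hypothetical path -- adjust for your environment.
FLAG_FILE=/tmp/externaljobcompleted.txt

# (1) The external process creates a zero-byte flag on success.
touch "$FLAG_FILE"

# (2) The scheduler-side check: run the master sequence only when the
# flag exists (Autosys would re-run this check every 10-15 minutes).
if [ -f "$FLAG_FILE" ]; then
    echo "flag found - master sequence would run here"
    # (3) Delete the flag after a successful run so the next cycle
    # starts clean.
    rm -f "$FLAG_FILE"
fi
```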
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Why go through all those gyrations? When Autosys completes running the 'external job' successfully, have it trigger the DataStage job next. Done.
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasannak
Premium Member
Posts: 56
Joined: Thu Mar 20, 2008 9:45 pm
Contact:

Post by prasannak »

Thanks for all the responses...
I have thought about doing what Craig mentioned...
But it's a requirement that we not have a cross-application dependency at the scheduler level directly...
The text file option has also been thought about... I know we can use the Wait For File activity... but it seems like a crude way of implementing this kind of polling...
My question is: is it possible to poll for a table value from DataStage?


I searched through an old post from Craig where he talks about this:

viewtopic.php?t=83354&highlight=polling
Sure... and you can make it as simple or as complicated as you like. I have a job that polls for unprocessed rows in a control table, something like you indicate above. If found, I process some information to hash that the following jobs use. This initial job is launched from a 'Batch' in a loop that checks for processed rows using DSGetLinkInfo. If none are processed, it sleeps and loops back around for another go until it either finds what it expects or exceeds its polling window. Sleep time and number of times to poll are Job Parameters, and I get paged if nothing is found or the process is running 'late'. Once the flag(s) are found, the batch passes control on to the next series of jobs. Somewhere along the line (where appropriate) there is a job that resets the flags found.
But, I did not understand the inner details on this...!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

prasannak wrote:I have thought about doing what Craig had mentioned... But it's a requirement that we not have a cross-application dependency at the scheduler level directly...
Sorry, but this kind of stuff bothers me. :evil:

Who came up with this 'requirement'? Push back - this is exactly why your company bought an Enterprise Scheduler; they are built to handle scheduling dependencies across the Enterprise regardless of system, platform, application, etc. That's what they do.
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasannak
Premium Member
Posts: 56
Joined: Thu Mar 20, 2008 9:45 pm
Contact:

Post by prasannak »

Thanks Craig,

I guess you are bent on not answering my question! :roll: :wink:

I do not, unfortunately, participate in the decision-making process for requirements...
I just try my best to abide by the decisions... and if they are silly, then I question them... but this requirement is not something ridiculous in my opinion... alright, to some extent, maybe...
but not outright...!
Sometimes applications are better served if cross-application dependencies are kept to a minimum...
and if there is a way to reduce this, then why not...

My question again was: is there a way to poll for a table value from DataStage... and keep polling until a particular column value is found... then trigger the sequencer from then on...
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

prasannak wrote:My question again was: is there a way to poll for a table value from DataStage... and keep polling until a particular column value is found... then trigger the sequencer from then on...
This seems like a simple thing (If I am not missing any major details).

Pre-requisite: A row in table 'X' gets updated with a value of 'Success'

Create a sequence which has:

1) A job which does your polling. You can simply query your table with a database stage (maybe using job parameters in the WHERE clause):
SELECT 'SUCCESS' FROM TABLE_X WHERE COL_VALUE='SUCCESS'
You can write the result to a file, say "foo.txt".

2) A script which checks for rows in your file "foo.txt". If you find a row, you trigger the next job (your sequence); else you "sleep" for a specified interval and then loop back to the first job.

Code: Select all

Start_loop
 |
 |
V
Job which dumps data
 |
 |
V
Execute Command activity
 |
 |
V
Nested Condition---------> Trigger the Master sequence
 |
 |
V
End_Loop
 |
 |
V
Sleep
 |
 |
V
Go to the Beginning of the loop
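The loop above might look like this as a shell sketch. The file name and limits are assumptions for illustration; the real "job which dumps data" would be the DataStage job writing foo.txt, simulated here with a plain echo:

```shell
# Illustrative shell version of the polling loop; foo.txt, POLL_MAX
# and SLEEP_SECS are assumed names, not from the actual jobs.
FOO=/tmp/foo.txt
POLL_MAX=3
SLEEP_SECS=1

# Simulate the "job which dumps data": the real DataStage job would
# write a row here only when the status table shows SUCCESS.
echo "SUCCESS" > "$FOO"

i=0
found=0
while [ "$i" -lt "$POLL_MAX" ]; do
    # Execute Command activity: count rows in the dump file.
    rows=$(grep -c . "$FOO" 2>/dev/null)
    rows=${rows:-0}
    if [ "$rows" -gt 0 ]; then
        found=1        # Nested Condition: trigger the master sequence
        break
    fi
    sleep "$SLEEP_SECS"   # Sleep, then go back to the start of the loop
    i=$((i + 1))
done

[ "$found" -eq 1 ] && echo "would trigger the master sequence here"
```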
Minhajuddin

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Of course there are all kinds of ways to do what you want, some more (unnecessarily) complicated than others. I guess you are bent on not accepting the One True Answer. :wink:

One 'polls' in a looping structure. My old post you linked to was all about a hand coded solution. Nowadays, Sequence jobs support loops with the Start and End Loop stages. Leverage those.

Build a small job to select this polled value from your control table. Write the results out somewhere, flat file for instance, only when found. Run this job inside a Loop in a Sequence.

Build a generic routine to get link row counts from a job using DSGetLinkInfo with the DSJ.LINKROWCOUNT InfoType. Run it inside the loop after the polling job to see if it found anything - i.e. when the link row count to the flat file is > 0. In that case, exit the loop and continue on to your processing job, otherwise loop back around and poll again.

You should also consider putting a 'sleep' into the loop so it doesn't constantly run but instead polls only every x minutes. Also consider building a way to interrupt it, and a way for it to stop after a certain amount of time or number of loops has gone by.

This could also be done in a shell script if you were more comfortable playing out in the korn.
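A rough sketch of that shell-script variant, with the sleep, the polling limit, and the failure case all in one loop. Everything here is illustrative: query_status is a stub standing in for the real database call (e.g. a sqlplus query against the control table), and the limits are placeholders:

```shell
# Hypothetical polling loop; all names here are illustrative.
POLL_LIMIT=5
SLEEP_SECS=1

query_status() {
    # Stub for the real check, which might be something like:
    #   sqlplus -s "$DB_USER/$DB_PASS" @check_status.sql
    # Stubbed to succeed immediately so this sketch terminates.
    echo "SUCCESS"
}

n=0
status=""
while [ "$n" -lt "$POLL_LIMIT" ]; do
    status=$(query_status)
    [ "$status" = "SUCCESS" ] && break
    sleep "$SLEEP_SECS"
    n=$((n + 1))
done

if [ "$status" = "SUCCESS" ]; then
    echo "launching master sequence"   # e.g. dsjob -run <project> <sequence>
else
    echo "polling window exceeded" >&2   # page someone / fail the Autosys job
fi
```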
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasannak
Premium Member
Posts: 56
Joined: Thu Mar 20, 2008 9:45 pm
Contact:

Post by prasannak »

Thanks to Minhajuddin and Craig...

Now, I have more than one idea to play with...
The reason why I detest cross-application dependencies at the scheduler level is the very tight coupling it creates across more than one application... but sometimes, to keep things simple, this is the way to go... It becomes highly cumbersome when there are many jobs with many interdependencies among them, as well as across applications... and that is what we are endeavoring to minimize... to some extent...

anyway, thanks again for all your inputs...much appreciated... :)
Post Reply