Job Running Logic

pravin1581
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata

Post by pravin1581 »

Hi All,

We need to decide whether a job should run based on a condition. The job reads data from a database table and then passes that data through further processing. The requirement is to check first whether the table contains any data at all, and only then go ahead with the further processing.
Basically, the intention is to save the run time of the job when there is no data in the table.
We have thought of implementing this logic in UNIX: we will create a file holding the count of records in the table, and if the count is 0 we don't execute the dsjob command for the subsequent job.
But I am looking for a way to do this in DataStage itself.
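
For reference, the UNIX approach described above might look roughly like the sketch below. It assumes the count file holds a single non-negative integer; the function name run_if_rows is made up for illustration and is not part of DataStage.

```shell
#!/bin/sh
# run_if_rows COUNT_FILE CMD [ARGS...]
# Hypothetical helper: runs CMD only when COUNT_FILE holds an integer > 0.
run_if_rows() {
    count_file=$1
    shift
    # Strip whitespace so " 0" and "0" are treated alike.
    count=$(tr -d '[:space:]' < "$count_file")
    case $count in
        ''|*[!0-9]*)
            # Empty or non-numeric content: refuse to guess.
            echo "unexpected count: '$count'" >&2
            return 2
            ;;
    esac
    if [ "$count" -gt 0 ]; then
        "$@"
    else
        echo "no rows, skipping: $*"
    fi
}
```

A call could then look like `run_if_rows /tmp/src_count.txt dsjob -run -jobstatus MyProject ProcessJob`, where the file path, project name and job name are placeholders.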

Please help me out.

Thanks.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

You could use a Sequence to organize the job stream and run parts of it only on certain conditions. One option is to have the Sequence run a query to determine whether it should run the transformation job. Another is to have the Sequence separate the job design into two halves, one to run the source data extraction and the other to process the results, on condition that the extracted data is more than 0 rows, determined either by checking the link statistics or by row counting the dataset.

If your job design is a highly complicated organization of multiple independent streams all running simultaneously for early portions of processing, you're going to have to delay those portions until your main source data extraction is completed. I imagine you might have a bunch of reference datasets being extracted and prepared for merges/joins/etc and don't need those running if the main source data ends up being 0 rows.

I think you need to clarify your job design if you want more specific answers.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

I would either do an external "select count(*)" from a sequence and parse the results in order to conditionally call a job; or do that count in a DataStage job and set the user return code and use that as the condition in the sequence.
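
Parsing the result of an external "select count(*)" could be sketched as below. This is only an illustration: it assumes the database client prints the count on a line by itself, possibly surrounded by header and footer lines, and parse_count is a made-up name. Real client output varies, so the pattern would need adjusting.

```shell
#!/bin/sh
# parse_count
# Hypothetical helper: reads query output on stdin and prints the first
# line that consists solely of digits (ignoring surrounding whitespace).
parse_count() {
    awk '/^[[:space:]]*[0-9]+[[:space:]]*$/ {
        gsub(/[[:space:]]/, "")   # drop padding around the number
        print
        exit                      # only the first match is wanted
    }'
}
```

The printed number can then be compared against 0 to decide whether the processing job is called.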
trokosz
Premium Member
Posts: 188
Joined: Thu Sep 16, 2004 6:38 pm

Post by trokosz »

Another idea would be to place a Transformer after the database read with a Constraint of @INROWNUM=0; then you can abort or stop processing of the Job or even the Sequencer.

Post by pravin1581 »

trokosz wrote: Another idea would be to place a Transformer after the database read with a Constraint of @INROWNUM=0; then you can abort or stop processing of the Job or even the Sequencer.
How can we stop the processing of the job based on @INROWNUM?

Post by pravin1581 »

kcbland wrote: I think you need to clarify your job design if you want more specific answers.

The job design is that we extract data from the table and that data then goes through further processing such as Join and Lookup. But there are cases when there is no data in the table, and then the time taken for the whole process to run is completely wasted. Hence we need to first take the count of records extracted from the table; if that count is 0, the process should stop there.
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai

Post by Minhajuddin »

I don't think "@INROWNUM" would help you here. That gives you just the "current" count of the row in a transformer (And since you are doing a database read, it will always return one row even if the count is "0"). And even if it did, any logic which involves "aborting a job" is not advisable.

As it has already been mentioned, you can come up with a sequence in which you decide upon running a job conditionally.

FirstJob (Reads data from your table Select count(*).. and dumps it into a file)
|
|
V
Nested Condition Activity
(This can read data from the file which you create in the first job using a small routine, and based on the data present in the file, continue with the execution or stop it or branch it to a different job)
Minhajuddin


Post by pravin1581 »

Minhajuddin wrote: As it has already been mentioned, you can come up with a sequence in which you decide upon running a job conditionally.
What we are doing is writing the count to a file, reading that file in UNIX to get the value, and triggering the next job based on that value. Basically the design is something like this:

Code:

                                 !------ Terminator Activity
  Job 1 ------ Exec Command -----!
                                 !------ Job 2
Job 1 extracts the count of records and writes that value to the file.
The Exec Command stage executes a UNIX command that passes the value of the count into a variable that has been declared as a job parameter.

Code:

     value=`cat file`; echo $value
The value is then used as the return value in the triggers of the two outgoing links: if the return value = 0, control should go to the Terminator, and if the return value > 0, control should go to Job 2.
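
One thing worth checking here: in a shell, what a command prints and what it returns are two separate things. The sketch below wraps the same cat/echo logic in a made-up function, emit_count, and captures both. Which of the two the trigger expression in the sequence actually sees is something to verify against the documentation for the activity; nothing below is DataStage-specific.

```shell
#!/bin/sh
# emit_count FILE
# Hypothetical wrapper around the command from the post:
# prints the count stored in FILE on standard output.
emit_count() {
    value=$(cat "$1") || return 1   # fail only if the file cannot be read
    echo "$value"
}

# Demonstration: capture the printed output and the exit status separately.
demo_file=$(mktemp)
echo 0 > "$demo_file"
output=$(emit_count "$demo_file")
status=$?
echo "output=$output status=$status"
rm -f "$demo_file"
```

The exit status is 0 whenever the file could be read, whatever the count is, so a test on the count has to look at the printed output rather than at the exit status.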

But the problem is that when control goes down the Terminator link, the job aborts.

Please suggest a way out, or some other solution if you have one.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Remove the Terminator.

If for some reason you still need the link but need it to do 'nothing', send it to a Sequencer.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by pravin1581 »

chulett wrote: Remove the Terminator.

If for some reason you still need the link but need it to do 'nothing', send it to a Sequencer.
But what was the problem with the Terminator? I thought the Terminator was designed for this kind of situation. Anyway, I have incorporated a Sequencer, but as I understand it a Sequencer is used to run the outgoing link based on the success of the incoming links (all or any).

Post by chulett »

Sequencers are also used to terminate links if need be. And you only need a Terminator when you need something terminated - hence the name. Did you read the help for the stage? It aborts the current process and optionally can abort all other running jobs in the Sequence as well.

Post by pravin1581 »

chulett wrote: Sequencers are also used to terminate links if need be. And you only need a Terminator when you need something terminated - hence the name. Did you read the help for the stage? It aborts the current process and optionally can abort all other running jobs in the Sequence as well.
But it says that it sends a stop signal to the jobs. Anyway, thanks a lot for your suggestion.

Post by chulett »

It can optionally send 'stop' requests to other jobs that are still running in the Sequence when you hit the Terminator stage. Then it aborts the Sequence.

Post by pravin1581 »

chulett wrote: It can optionally send 'stop' requests to other jobs that are still running in the Sequence when you hit the Terminator stage. Then it aborts the Sequence.
Thanks a lot for all your time. I was just wondering whether my design is the right one or whether there is a better way of achieving what I wanted. Please give your input on that as well.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Do you seek something like a review of existing work?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.