
Job Running Logic

Posted: Wed Dec 05, 2007 5:11 pm
by pravin1581
Hi All,

We have a situation where we need to decide whether a job should run or not based on a certain condition. The job design is such that it reads data from the database and then that data goes through further processing. The requirement is to first check whether there is any data at all in the table and only then go ahead with the further processing of that data.
Basically the intention is to save the time of running the job when there is no data in the table.
We have thought of using UNIX to implement the above logic. We will create a file which will hold the count of the records in the table, and if the count is 0 we won't execute the dsjob command for the subsequent job.
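A minimal sketch of that shell-side check, assuming the count has already been written to a flat file by an earlier step and using placeholder file, project and job names, could be:

Code: Select all

    # Hypothetical wrapper: run the next job only when the table has rows.
    # COUNT_FILE is assumed to hold the output of "select count(*)".
    COUNT_FILE=/tmp/source_row_count.txt
    ROW_COUNT=`cat $COUNT_FILE`
    if [ "$ROW_COUNT" -gt 0 ]
    then
        # Placeholder project and job names.
        dsjob -run -wait MyProject ProcessSourceData
    else
        echo "Source table is empty - skipping ProcessSourceData"
    fi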
But I am looking for a way in DataStage itself.

Please help me out.

Thanks.

Posted: Wed Dec 05, 2007 5:19 pm
by kcbland
You could use a Sequence to organize the job stream and run jobs only on certain conditions. You could either have the Sequence run a query to determine whether it should run the transformation job, or separate the job design into two halves: one to run the source data extraction and the other to process the results on condition that the extracted data is more than 0 rows, either by checking the link statistics or by row counting the dataset.

If your job design is a highly complicated organization of multiple independent streams all running simultaneously for early portions of processing, you're going to have to delay those portions until your main source data extraction is completed. I imagine you might have a bunch of reference datasets being extracted and prepared for merges/joins/etc and don't need those running if the main source data ends up being 0 rows.

I think you need to clarify your job design if you want more specific answers.
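As a rough illustration of the link-statistics option, a shell check along these lines could sit between the two halves. The project, job, stage and link names are placeholders, and the dsjob -linkinfo output layout varies by release, so the parsing is only indicative:

Code: Select all

    # Hypothetical check of the extraction job's output link row count.
    # The grep/awk parsing is only indicative - inspect the -linkinfo
    # output on your release before relying on it.
    ROWS=`dsjob -linkinfo MyProject ExtractJob SourceStage OutputLink | grep -i "row" | awk '{print $NF}' | tail -1`
    if [ "${ROWS:-0}" -gt 0 ]
    then
        dsjob -run -wait MyProject TransformJob
    fi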

Posted: Wed Dec 05, 2007 5:40 pm
by ArndW
I would either do an external "select count(*)" from a sequence and parse the results in order to conditionally call a job; or do that count in a DataStage job and set the user return code and use that as the condition in the sequence.
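For the first option, the Execute Command activity in the sequence could run something like the sketch below; sqlplus and the connection details are assumptions, so substitute your own database client. The command's output can then be tested in the activity's link triggers.

Code: Select all

    # Hypothetical count query for an Execute Command activity.
    # sqlplus and the connect string are assumptions - use your own client.
    COUNT=`echo "set heading off feedback off pagesize 0
    select count(*) from source_table;
    exit;" | sqlplus -s user/password@mydb`
    echo $COUNT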

Posted: Wed Dec 05, 2007 8:10 pm
by trokosz
Another idea would be to place a Transformer after the database read with a constraint that checks @INROWNUM = 0; then you can abort or stop processing of the Job or even the Sequencer.

Posted: Wed Dec 05, 2007 11:21 pm
by pravin1581
trokosz wrote:Another idea would be to place a Transformer after the database read with a constraint that checks @INROWNUM = 0; then you can abort or stop processing of the Job or even the Sequencer.
How can we stop the processing of the job based on @INROWNUM?

Posted: Wed Dec 05, 2007 11:27 pm
by pravin1581
kcbland wrote:You could use a Sequence to organize the job stream and run jobs only on certain conditions. You could either have the Sequence run a query to determine whether it should run the transformation job, or separate the job design into two halves: one to run the source data extraction and the other to process the results on condition that the extracted data is more than 0 rows, either by checking the link statistics or by row counting the dataset.

If your job design is a highly complicated organization of multiple independent streams all running simultaneously for early portions of processing, you're going to have to delay those portions until your main source data extraction is completed. I imagine you might have a bunch of reference datasets being extracted and prepared for merges/joins/etc and don't need those running if the main source data ends up being 0 rows.

I think you need to clarify your job design if you want more specific answers.

The job design is: we are extracting data from the table and then that data goes through further processing such as Join and Lookup. But there are cases when there is no data in the table, and hence the time taken for the whole process to run is completely wasted. Hence we need to first take the count of records extracted from the table; if that count is 0 then the process should stop there.

Posted: Wed Dec 05, 2007 11:32 pm
by Minhajuddin
I don't think "@INROWNUM" would help you here. That gives you just the "current" count of the row in a transformer (And since you are doing a database read, it will always return one row even if the count is "0"). And even if it did, any logic which involves "aborting a job" is not advisable.

As it has already been mentioned, you can come up with a sequence in which you decide upon running a job conditionally.

FirstJob (Reads data from your table Select count(*).. and dumps it into a file)
|
|
V
Nested Condition Activity
(This can read data from the file which you create in the first job using a small routine, and based on the data present in the file, continue with the execution or stop it or branch it to a different job)
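If you would rather not write a BASIC routine, an Execute Command activity can do the read instead; a small sketch, with the file name as a placeholder, would be:

Code: Select all

    # Hypothetical Execute Command alternative to the routine:
    # read the count written by FirstJob and strip the trailing newline
    # so the Nested Condition triggers can compare it numerically.
    cat /tmp/source_row_count.txt | tr -d '\n'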

Posted: Thu Dec 06, 2007 11:15 am
by pravin1581
Minhajuddin wrote:I don't think "@INROWNUM" would help you here. That gives you just the "current" count of the row in a transformer (And since you are doing a database read, it will always return one row even if the count is "0"). And even if it did, any logic which involves "aborting a job" is not advisable.

As it has already been mentioned, you can come up with a sequence in which you decide upon running a job conditionally.

FirstJob (Reads data from your table Select count(*).. and dumps it into a file)
|
|
V
Nested Condition Activity
(This can read data from the file which you create in the first job using a small routine, and based on the data present in the file, continue with the execution or stop it or branch it to a different job)
What we are doing is taking the count in a file, reading that file in UNIX to get the value, and based on the value the next job is triggered. Basically the design is something like this:

Code: Select all

                                 !------ Termination Activity
  Job 1 ------ Exec Comd Stg ----!
                                 !------ Job 2
Job 1 extracts the count of records and writes that value to the file.
The Exec Comd stage executes the UNIX command that passes the value of the count into a variable that has been declared as a job parameter.

Code: Select all


     value=`cat file`;echo $value
The value is then used in the triggers of the two outgoing links as the return value: if the return value = 0 then control should go to the Terminator, and if the return value > 0 then control should go to Job 2.

But the problem is that when control goes down the Terminator link, the job aborts.

Please suggest a way out, or some other solution if you have one.

Posted: Thu Dec 06, 2007 11:22 am
by chulett
Remove the Terminator. :?

If for some reason you still need the link but need it to do 'nothing', send it to a Sequencer.

Posted: Thu Dec 06, 2007 11:27 am
by pravin1581
chulett wrote:Remove the Terminator. :?

If for some reason you still need the link but need it to do 'nothing', send it to a Sequencer.
But what was the problem with the Terminator? I guess the Terminator is designed for such activities. Anyway, I have incorporated a Sequencer, but a Sequencer is used for running the outgoing link based on the success of the incoming links (All or Any).

Posted: Thu Dec 06, 2007 11:38 am
by chulett
Sequencers are also used to terminate links if need be. And you only need a Terminator when you need something terminated - hence the name. Did you read the help for the stage? It aborts the current process and optionally can abort all other running jobs in the Sequence as well.

Posted: Thu Dec 06, 2007 11:48 am
by pravin1581
chulett wrote:Sequencers are also used to terminate links if need be. And you only need a Terminator when you need something terminated - hence the name. Did you read the help for the stage? It aborts the current process and optionally can abort all other running jobs in the Sequence as well.
But it says that it sends a stop signal to the jobs. Anyway, thanks a lot for your suggestion.

Posted: Thu Dec 06, 2007 12:11 pm
by chulett
It can optionally send 'stop' requests to other jobs that are still running in the Sequence when you hit the Terminator stage. Then it aborts the Sequence.

Posted: Thu Dec 06, 2007 1:18 pm
by pravin1581
chulett wrote:It can optionally send 'stop' requests to other jobs that are still running in the Sequence when you hit the Terminator stage. Then it aborts the Sequence.
Thanks a lot for all the time. I was just wondering whether the design I have done is the correct one or whether there is a better solution for achieving what I wanted. Please give your inputs on that as well.


Posted: Thu Dec 06, 2007 3:45 pm
by ray.wurlod
Do you seek something like a review of existing work?