Running Multiple Jobs in Parallel in DataStage Sequencer

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

Hi,

I created a job sequence that runs a job and, upon completion, triggers two other jobs in parallel. The problem is that the jobs do not trigger in parallel but in sequence. What am I doing wrong?

This is how the design looks. Job 2 and Job 3 should trigger at the same time, since the trigger condition in Job 1 is the same for both downstream jobs.

Job 1 -------> Job 2
 |
 |
 v
Job 3
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Are you using the job activity stage to run your job 2 and job 3?

How much time elapses before the second job gets started?

The two jobs will never start at exactly the same instant. A few seconds apart would be normal, but on a heavily loaded system that could take a little longer.

A sequence job is a single-threaded BASIC program that only executes statements sequentially. You can see this if you look at the generated code.

A job activity stage simply starts a job, which will typically take a second or two.

So in your example, a job activity will start job 2, then once job 2 has started, a job activity will start job 3.

But if your job 2 is something other than a job activity stage (e.g. routine activity or execute command activity), then job 3 won't start until job 2 is completed.

In summary, even though things appear to be parallel on the design canvas, everything is really sequential in the generated code. Jobs only appear to start in parallel because the generated code does not wait for a job to finish before starting the next job.
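To make that concrete, here is a rough Python sketch (not DataStage code, and not the actual generated BASIC) that models a single-threaded controller along the lines described above. The `Activity` class, the "job"/"command" labels and the timings are all invented for illustration: a "job" activity only costs the controller its startup time, while a "command" activity holds the controller until the command has finished.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    kind: str       # "job" or "command" (labels invented for this sketch)
    startup: float  # seconds the controller spends starting the activity
    runtime: float  # seconds the work itself takes


def start_times(activities):
    """Return {name: second at which the controller starts each activity}."""
    clock = 0.0
    starts = {}
    for act in activities:
        starts[act.name] = clock
        clock += act.startup          # starting anything takes a moment
        if act.kind == "command":
            clock += act.runtime      # a command blocks until it is done
    return starts


# Job 2 and Job 3 both modelled as job activities: Job 3 starts 2 seconds later.
as_jobs = start_times([
    Activity("job2", "job", 2, 600),
    Activity("job3", "job", 2, 600),
])
print(as_jobs)     # {'job2': 0.0, 'job3': 2.0}

# Job 2 modelled as a blocking command: Job 3 waits for the whole run.
as_command = start_times([
    Activity("job2", "command", 2, 600),
    Activity("job3", "job", 2, 600),
])
print(as_command)  # {'job2': 0.0, 'job3': 602.0}
```

The model only captures the ordering argument; real startup times depend on the system load.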

Mike
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Been asked before, here for example.
Mike wrote: A job activity stage simply starts a job, which will typically take a second or two.

So in your example, a job activity will start job 2, then once job 2 has started, a job activity will start job 3.
Now it's been a while, but that's not how I remember it, nor what I thought the OP reported (assuming they are using Job Activity stages, and depending on what 'trigger in sequence' means). As I recall, Job Activity stages start the associated job and wait for it to complete before moving on, so Job 2 completes before Job 3 starts. As Ray noted in the linked post, you would need to use a different stage (Execute Command, Routine Activity) to get a job started in the background so that the Sequence can move on and start other processes without waiting.

I've got no way to test any of this and am happy to be proven wrong. Obviously, with answers that are polar opposites, one of us must be. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
jerome_rajan

Post by jerome_rajan »

Mike wrote:.....
But if your job 2 is something other than a job activity stage (e.g. routine activity or execute command activity), then job 3 won't start until job 2 is completed.
...
Mike
Thank you for the elaborate response, Mike. Job 2, in my case, is an 'Execute Command' activity, which means that my design will never really achieve complete parallelism.
How can I make Job 2 and Job 3 run in parallel? Job 2 runs a script that does some validation on the tables loaded in Job 1. Job 3 reads from the table loaded in Job 1. But the validation process and Job 3 are independent and exclusive of each other.

Craig,
I haven't renewed my membership yet, so I can't see your reply. Thanks for the reply, though.
Jerome
chulett

Post by chulett »

Could you run the script in the background so the EC stage completes immediately? You'd lose the ability to monitor its execution, unfortunately, so it would have to do its own notifications.
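As a minimal sketch of that idea in Python, assuming a POSIX host: the launcher starts the validation as a detached process, sends its output to a log file, and returns immediately, so the caller never learns the outcome unless the script reports it itself. The `start_in_background` helper, the log file name and the stand-in command are hypothetical; this is not what an Execute Command activity runs internally.

```python
import subprocess
import sys


def start_in_background(command, log_path):
    """Start `command` without waiting for it; append its output to `log_path`."""
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log,                  # the child keeps its own copy of the handle
            stderr=subprocess.STDOUT,
            start_new_session=True,      # detach from the caller's session (POSIX)
        )
    return proc                          # returned right away; nobody waits on it


if __name__ == "__main__":
    # Stand-in for the validation script: sleeps, then reports on its own.
    validation = [sys.executable, "-c",
                  "import time; time.sleep(2); print('validation finished')"]
    start_in_background(validation, "validation.log")
    print("launcher returned; validation is still running")
```

Because the launcher returns before the work is done, any success or failure notification has to come from the script, for example through the log file it writes.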
-craig
Mike

Post by Mike »

I would go with Craig's suggestion to run your table validation in the background as that will be visible to someone supporting your application.

I'm going to propose a "trick" that might work for you, but if you go that route and find that it works, please be sure to clearly document it for your future application support.

You want to influence the job sequence's generated code so that the job 3 job activity comes before the job 2 execute command activity. I don't completely recall whether code generation follows a link creation sequence or a stage creation sequence, so you'll have to experiment with both.

For link creation order:
1) Delete both links
2) Add the link to job 3
3) Add the link to job 2

For stage creation order:
1) Delete both stages and their links
2) Add the job 3 job activity stage and link to it
3) Add the job 2 execute command stage and link to it

Compile the job sequence and investigate the generated code to see if job 3 comes before job 2.
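Why the order matters can be tried outside DataStage with a small Python experiment. This is only an analogy under stated assumptions: `run_in_order` and the step names are made up, a non-waiting step stands in for a job activity as described in this thread, and a waiting step stands in for an execute command activity.

```python
import subprocess
import sys
import time

# Stand-in workload: a child process that sleeps for one second.
WORK = [sys.executable, "-c", "import time; time.sleep(1)"]


def run_in_order(steps):
    """Run steps one after another; return {name: start offset in seconds}.

    Each step is (name, waits). waits=True blocks until the process ends;
    waits=False starts the process and moves straight on.
    """
    t0 = time.monotonic()
    starts, running = {}, []
    for name, waits in steps:
        starts[name] = time.monotonic() - t0
        if waits:
            subprocess.run(WORK, check=True)
        else:
            running.append(subprocess.Popen(WORK))
    for proc in running:
        proc.wait()                      # tidy up the non-waiting steps
    return starts


if __name__ == "__main__":
    # Blocking step first: job3 cannot start until job2 has finished.
    print(run_in_order([("job2_command", True), ("job3_job", False)]))
    # Non-blocking step first: job2 starts almost immediately after job3.
    print(run_in_order([("job3_job", False), ("job2_command", True)]))
```

With the blocking step first, the second start offset is at least the full run time of the first step; with the order swapped, the two steps overlap.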

Mike
jerome_rajan

Post by jerome_rajan »

Awesome, awesome, awesome! The background execution did the trick for me.
Mike's first reply definitely made me a tad wiser than before. Thank you! :D
Jerome