Trigger Parallel Jobs and define Invocation ID in a Sequence

Daddy Doma
Premium Member
Posts: 62
Joined: Tue Jun 14, 2005 7:17 pm
Location: Australia

Post by Daddy Doma »

Hi Guys,

(My question is similar to an existing thread on this forum (viewtopic.php?t=106299) but I didn't want to hijack that discussion.)

I am trying to get jobs to run in parallel in a sequence:

                     |--> Job_Activity_2 --> Job_Activity_5a ---|
                     |                                          |
Job_Activity_1 ---------> Job_Activity_3 --> Job_Activity_5b -----> Sequencer (All) --> Report_Status
                     |                                          |
                     |--> Job_Activity_4 --> Job_Activity_5c ---|

When Job_Activity_1 is finished, jobs 2, 3 and 4 need to run. There are no dependencies between them, so I would like them to run at the same time instead of sequentially. How?

I tried specifying multiple triggers from Job_Activity_1 that all use the same condition. In theory, the next three jobs should all fire at the same time. But according to the log in Director, they are occurring one after the other.

I also tried inserting a Branch stage with one output link from Job_Activity_1 splitting into output links for jobs 2, 3 and 4, to no avail.

Invocation ID

The other issue is that as Job_Activity_2, 3 and 4 finish, they all use the multi-instance Job_Activity_5 (designed using Runtime Column Propagation). I'd like to use an Invocation ID to distinguish between the different runs of Job_Activity_5. How can I pass the Invocation ID to Job_Activity_5 within a sequence?

Regards,

Zac.
When you know that you are destined for greatness by virtue of your mutant heritage it is difficult to apply yourself to normal life. Why waste the effort when you know that your potential is so tremendous?
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

If the Invocation ID is not determined at runtime, you can predetermine the values and pass them as hard-coded values or as a parameter.
And what makes you assert that Job_Activity_2, 3 and 4 run one after the other? You may find an entry with JobControl (DSWaitForJob) with Job1+2+3. The other logs will show you the detailed information for each job. After all, the log entries are stored sequentially. You may find that the execution times of the jobs overlap.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Daddy Doma wrote:
"I tried specifying multiple triggers from Job_Activity_1 that all use the same condition. In theory, the next three jobs should all fire at the same time. But according to the log in Director, they are occurring one after the other.

I also tried inserting a Branch stage with one output link from Job_Activity_1 splitting into output links for jobs 2, 3 and 4, to no avail."

Either design should work fine... and I'm assuming a 'Branch' stage means a Sequencer? Or something else?

It cannot launch all three jobs 'at the same time' but it should have started them one after the other as quickly as it could. It's not like it is waiting for one to finish before it starts the next, so that's the best degree of 'parallelism' you can get from a Sequence job.
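
To make the distinction concrete, here is a rough Python analogy (not DataStage code). `run_job` is a hypothetical stand-in that just sleeps; the only point is the difference between waiting for each job before starting the next one and starting all of them back to back before waiting on any.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(name, seconds=0.2):
    """Hypothetical stand-in: 'runs' a job and returns only when it has finished."""
    time.sleep(seconds)
    return name


def run_blocking(jobs):
    """Run each job and wait for it to finish before starting the next one."""
    start = time.monotonic()
    finished = [run_job(job) for job in jobs]
    return finished, time.monotonic() - start


def run_overlapped(jobs):
    """Start every job back to back, then wait for all of them to finish."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # submit() returns immediately, so the jobs are started one after
        # the other without waiting for the previous one to complete.
        futures = [pool.submit(run_job, job) for job in jobs]
        # Collecting the results plays the role of a Sequencer set to All.
        finished = [future.result() for future in futures]
    return finished, time.monotonic() - start


if __name__ == "__main__":
    jobs = ["Job_Activity_2", "Job_Activity_3", "Job_Activity_4"]
    print("blocking  :", run_blocking(jobs))
    print("overlapped:", run_overlapped(jobs))
```

In the blocking version the elapsed time is roughly the sum of the three run times; in the overlapped version it is roughly the longest single run time.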

On the Invocation ID issue, there is an area in the Job Activity stage where the Invocation ID is entered for multi-instance jobs. As noted, unless these values need to be derived at runtime you can simply 'hard code' three distinct values across the three invocations. Otherwise, you need to explain where the values you want to use need to come from.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Daddy Doma

Post by Daddy Doma »

kumar_s wrote: "...determine the values and pass it as hardcoded or as parameter."
I know the Invocation ID and would like to do this. But I am not sure how.

I should explain further - we are using the UtilityRunJob Routine Activity to call our parallel jobs. I'm sure there is a simple way to get the Invocation ID into this, but I don't know how.
kumar_s wrote: "You may find an entry with JobControl (DSWaitForJob) with Job1+2+3."
Nope. There is a separate entry for each job, i.e.
  • JobControl (DSWaitForJob): Waiting for job Job_Activity_2 to finish
  • JobControl (DSWaitForJob): Waiting for job Job_Activity_3 to finish
  • JobControl (DSWaitForJob): Waiting for job Job_Activity_4 to finish
kumar_s wrote: "The other logs will show you the detail information of each job...You may find the time of execution of each job with overlapping limit."
Again, no. Each of the logs has an initial Control entry for when the job was started. Job 3 starts only after Job 2 has finished, and Job 4 does not begin until Job 3 is complete.

It is actually more complicated than that. If you look at my original diagram, you will see that each job is followed by an instance of Job 5.
  • Once Job 2 finishes, it kicks off Job 5.
  • When Job 5 is complete, Job 3 starts, followed by Job 5 again.
  • Finally, Job 4 runs and is followed by Job 5.
At the end of this, the All sequencer lets the job status know that all have run okay.
chulett wrote: "I'm assuming a 'Branch' stage means a Sequencer? Or something else?"
No, by Branch I meant a Nested Condition stage - the help documentation mentions branching so I used that as an analogy. Sorry for the confusion.
chulett

Post by chulett »

Daddy Doma wrote:
"...determine the values and pass it as hardcoded or as parameter.
I know the Invocation ID and would like to do this. But I am not sure how.

I should explain further - we are using the UtilityRunJob Routine Activity to call our parallel jobs. I'm sure there is a simple way to get the Invocation Id into this, but I don't know how."

Yes, you should. So your design as posted is... misleading... as you are not using Job Activity stages as your naming implied. Why in the world would you use the UtilityRunJob routine in a Routine Activity stage when you've got a stage specifically built to run jobs? That fact alone explains your problem.

If you had used the Job Activity stage, it would have worked as you would have expected. And it would have been obvious how to handle the Invocation ID. To answer your question, using the routine you've chosen you pass the Invocation ID after a 'dot' as part of the job name. For example:

Job_Activity_5.InvocationA
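
The "job name, dot, Invocation ID" convention can be sketched in Python as follows. This is an illustration only, not DataStage code: both helper names are made up for the example, and the three Invocation IDs are arbitrary placeholders for the three branches in the diagram.

```python
def qualified_job_name(job_name, invocation_id=None):
    """Build 'JobName.InvocationID'; return the plain job name if no ID is given."""
    if not invocation_id:
        return job_name
    return f"{job_name}.{invocation_id}"


def split_job_name(qualified):
    """Split at the first dot into (job_name, invocation_id or None)."""
    job_name, _, invocation_id = qualified.partition(".")
    return job_name, (invocation_id or None)


if __name__ == "__main__":
    # One distinct (placeholder) Invocation ID per branch of the sequence.
    for invocation_id in ("InvocationA", "InvocationB", "InvocationC"):
        print(qualified_job_name("Job_Activity_5", invocation_id))
```

Each branch would then pass its own qualified name as the job name argument, so the three runs of the multi-instance job stay distinguishable.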
chulett

Post by chulett »

For the record, that routine is meant to be used in a job - a Server or PX job, not a Sequence job - where you need to run a job using data from each incoming row from your source. For example, a sequential file containing filenames - read each record and run a job per record that loads that file.

The functionality was pretty much replaced by the Looping constructs available now in Sequence jobs.
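
The per-record pattern described above looks roughly like this as a Python analogy (again, not DataStage code). `run_load_job` is a hypothetical stand-in for running one job per record, and the filenames are made up for the example.

```python
import io


def run_load_job(filename):
    """Hypothetical stand-in for running one load job for one file."""
    return f"loaded {filename}"


def run_job_per_record(source):
    """Read one filename per line and run the load job once per non-empty record."""
    results = []
    for line in source:
        filename = line.strip()
        if filename:  # skip blank records
            results.append(run_load_job(filename))
    return results


if __name__ == "__main__":
    # Stand-in for a sequential file containing one filename per record.
    filenames = io.StringIO("customers.txt\norders.txt\n")
    print(run_job_per_record(filenames))
```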
Daddy Doma

Post by Daddy Doma »

Point conceded about the incorrect diagram. I adapted it from the other topic and didn't think to replace the names with JobRoutine.

We use the Routine Activity in sequence jobs because we are heavily reliant on RCP. We have many, many jobs that use the same parameters and follow the same design. But if we use a Job Activity stage then we need to specify the ETL job in the sequence, thereby requiring many different sequences. By using a Routine Activity, we only need to change the values of the arguments to pick up different jobs and one single sequence design can be reused for many jobs.

Do you believe that using the Job Activity stage would eliminate the issues about parallelism I am describing?
chulett

Post by chulett »

Daddy Doma wrote: "Do you believe that using the Job Activity stage would eliminate the issues about parallelism I am describing?"
Yes.