Parallel jobs in sequencer

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

MAT
Participant
Posts: 65
Joined: Wed Mar 05, 2003 8:44 am
Location: Montréal, Canada

Parallel jobs in sequencer

Post by MAT »

Hi,

I am running my jobs through a routine that controls them, so I can set parameters and get some info about links and other things. My problem is that when I create my sequences, I use the same routine to call all my jobs, but I would like them to run in parallel.

When I run my jobs without a routine, I can easily make them parallel, but with the routine controlling them, it seems only one job can run at a given time. The server waits for one to finish before starting another. Is it because DataStage cannot start multiple instances of a routine at the same time, or am I missing something else? And if I can't run my jobs in parallel using a routine, is there another way of setting parameters in a job and getting stats from the execution than using a controlling job or routine?

I am using DataStage 6.0 on a UNIX server.

Thanks

MAT
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

DS 6.0 Server or PX? If it's PX, I think the limitation you mention is a known bug that will be fixed in future releases. In the meantime, I think you can make a copy of the job and run it simultaneously with the original.

- BP
MAT
Participant
Posts: 65
Joined: Wed Mar 05, 2003 8:44 am
Location: Montréal, Canada

Post by MAT »

Thanks for the quick reply!

I am using Server. The problem is not with running jobs in parallel as such; it only occurs when running them from a routine. In that case, they always run one after another.

Also, my company is moving to PX for the next project. Are you saying there is a bug with running parallel jobs in Parallel Extender? That would be strange!


MAT
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

No, you can run PX jobs in parallel, but I'm pretty sure you can't use the same PX job twice in one sequence. This 'feature' might be fixed in the version that you will be getting.

- BP
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Could you clarify, are you trying to run all your jobs from a single routine or are you running them from parallel routine stages?

If you are running all your jobs from a single routine, then you need to work out where your DSWaitForJob calls are being placed. These are the calls that put your routine into wait mode and prevent further processing.

It is usually easier to build parallel processing with multiple routine stages passing a job name to the run job routine. That way you can see the parallel paths on the Sequence palette.

Vincent McBurney
Data Integration Services
www.intramatix.com
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DSAttachJob() - and, presumably, UtilityRunJob() - can use a job name that includes an invocation ID, for example MyJob.Instance2

To run in parallel using DSRunJob(), simply issue all the DSRunJob() calls before waiting for any of the jobs to finish. This is not possible when making separate calls to UtilityRunJob().
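A minimal sketch of that pattern, assuming two hypothetical job names; the error handling is elided:

```basic
* Attach and start both jobs before waiting for either
hJob1 = DSAttachJob("LoadCustomers", DSJ.ERRFATAL)
hJob2 = DSAttachJob("LoadOrders", DSJ.ERRFATAL)

ErrCode = DSRunJob(hJob1, DSJ.RUNNORMAL)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)

* Only now wait; both jobs are already executing in parallel
ErrCode = DSWaitForJob(hJob1)
ErrCode = DSWaitForJob(hJob2)
```

The key point is the ordering: both DSRunJob() calls happen before the first DSWaitForJob(), so the second job is not held up by the first.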


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
MAT
Participant
Posts: 65
Joined: Wed Mar 05, 2003 8:44 am
Location: Montréal, Canada

Post by MAT »

Thank you all,

Here is some more detail:
I am using a single routine similar to the UtilityRunJob supplied by the SDK, called from Routine Activity stages in a job sequence on DS Server. I call the same routine every time with the name of the job I want to run; I never run the same job twice in one sequence. I think the problem almost certainly comes from DSWaitForJob(). Are you saying that this call prevents DataStage from running anything else before the job ends?
What I was trying to do with my routine was to develop a "container" in which the developers could place their jobs, one that would take care of setting parameters from a file and getting stats after the job runs, without them having to change anything about their development methods.
One way around this, while keeping the ability to set params and get stats, would be to write a bigger routine containing every job in my sequence and call DSRunJob at the same time for the jobs I want to run in parallel. What do you think?

Regards

MAT
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

What you are doing is correct; using a single routine to run any job is good practice, as it centralises your job preparation and post-processing. You *should* be able to run multiple copies of this routine at the same time, as long as they are not linked within the sequence job.

I don't know why this is not working for you; perhaps there is some code within your routine that is locking a table or file. If you open any files within your routine, make sure they are closed before DSWaitForJob is executed.

I suggest you open a call with Ascential tech support and send them the routine code, since the method you are trying to use is a good approach and should work.

Vincent McBurney
Data Integration Services
www.intramatix.com
MAT
Participant
Posts: 65
Joined: Wed Mar 05, 2003 8:44 am
Location: Montréal, Canada

Post by MAT »

Thanks Vincent,
I am now fairly certain that my problem came from the calls to DSWaitForJob() within my routine. With that structure, I could not ask DataStage to run several jobs at the same time; the server was always waiting for a job to finish. The problem is that I had to use DSWaitForJob() to get the stats I wanted from my jobs after a run.

I thought about it all day and came up with an alternate solution. It may not be of much use to the pros out there, but a couple of newbies like me might find it interesting.

My main goal was: run all my jobs while setting all job parameters at run time, with a sequential file as the source for the parameter values. I also need to gather stats about each job: start time, link row counts, etc.

Running my jobs from a routine was my first solution, but you lose the ability to run them in parallel (because of DSWaitForJob). Instead, I decided to run my sequence from a routine, since my job sequences can be exclusive. This routine simply assigns all the parameters needed by every job inside the sequence to the sequence's own parameters (read from the file). I can then tell my jobs to fetch the value of each param from the job sequence's parameter list. To get stats about a particular run, I found you can attach to a job after it has executed, and what you get is the info about the last run. I needed only an after-job subroutine to do that: it attaches to my job, gathers all the info, writes to my stats files and detaches. It works a lot faster than before, since all my jobs are started as soon as they are ready.

Some of you may find it of some use...maybe

Regards

MAT
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The usual alternative to DSWaitForJob is to write a "busy loop" that can do other things (run other jobs even) while waiting.

ErrorCode = DSRunJob(hJob, DSJ.RUNNORMAL)

* Replace DSWaitForJob with the following loop
Loop
   * do other stuff
   JobStatus = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
While JobStatus = DSJS.RUNNING
   * maybe do other stuff
   Sleep 30  ;* seconds
Repeat

* At this point, the job is no longer running.

You can extend the algorithm to kick off N jobs, and keep looping until none of them is in a running state.
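One way that extension might look, assuming a hypothetical JobList dynamic array of N job names:

```basic
* Kick off all N jobs first (none of them is waited on yet)
Dim hJob(N)
For I = 1 To N
   hJob(I) = DSAttachJob(JobList<I>, DSJ.ERRFATAL)
   ErrCode = DSRunJob(hJob(I), DSJ.RUNNORMAL)
Next I

* Busy loop: poll until no job is still in a running state
Loop
   Running = 0
   For I = 1 To N
      If DSGetJobInfo(hJob(I), DSJ.JOBSTATUS) = DSJS.RUNNING Then
         Running = Running + 1
      End
   Next I
While Running > 0
   Sleep 30  ;* seconds
Repeat

* All N jobs have finished; collect statuses and detach here
```

The polling interval is a trade-off: shorter sleeps notice finished jobs sooner but burn more cycles in the controlling routine.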


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518