Transferring data from one job to another in a sequencer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

I am trying to transfer some rows between two jobs in a job sequence. Say JobA has modified some rows; I would like JobB to see those changes. I want to do this in memory, so landing the data to a Data Set file is not an option. I could pass the values as job sequence parameters, but that would be cumbersome, and the number of rows will vary.
Can anyone suggest a solution?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not unless you're prepared to write a pair of Custom stages to manage the memory. DataStage has nothing out of the box.

Using a Sequential File stage to write to a named pipe in one job and another to read from that named pipe in the other job is a possibility, but co-ordination of the jobs is crucial. Unlike the server Sequential File stage (which can handle named pipes automatically), in a parallel job you will need to create and manage the named pipe yourself. Use mknod or mkfifo to create it, for example.
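
To illustrate the named-pipe idea outside DataStage, here is a minimal Python sketch for a Unix-like system. It is only an analogy: the two functions stand in for the writing and reading jobs, and the pipe name `joba_to_jobb.fifo` and the sample rows are made up for the example. It shows why co-ordination matters: opening the pipe for writing blocks until a reader has opened the other end, so both sides must be running at the same time.

```python
import os
import tempfile
import threading


def make_fifo(path):
    """Create the named pipe (same effect as `mkfifo path`) if it is missing."""
    if not os.path.exists(path):
        os.mkfifo(path, 0o600)


def writer(path, rows):
    """Stand-in for the writing job: stream rows into the pipe."""
    # open() blocks here until some process opens the pipe for reading.
    with open(path, "w") as pipe:
        for row in rows:
            pipe.write(row + "\n")


def reader(path):
    """Stand-in for the reading job: consume rows until the writer closes."""
    with open(path) as pipe:
        return [line.rstrip("\n") for line in pipe]


def demo():
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "joba_to_jobb.fifo")  # hypothetical name
    make_fifo(fifo)

    rows = ["1,alpha", "2,beta", "3,gamma"]  # made-up sample data
    # Run the writer concurrently; a sequential run would block forever.
    t = threading.Thread(target=writer, args=(fifo, rows))
    t.start()
    received = reader(fifo)
    t.join()

    # The pipe is an ordinary filesystem entry and must be cleaned up by hand.
    os.remove(fifo)
    os.rmdir(workdir)
    return received


if __name__ == "__main__":
    print(demo())
```

The rows pass through a kernel buffer rather than being landed to a regular file, but the pipe itself still has to be created before either side starts and removed afterwards.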
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

What is the purpose of the job sequence in this case?

Both jobs will have to be running simultaneously for the data to be held 'between them' in memory.

Why not just put it all in one job? If that makes for a complicated-looking job, try using containers.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Job sequences manage flow of control, not flow of data.
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

Why do rows acted on by JobA need to be accessed by JobB without first landing them to disk?

I think the fact that a job sequence is used to control these two jobs is diverting attention from the real issue of how to get data from one job to another. Does it make any difference if the jobs are started from a job sequence, from Director or from the command line?

There are a variety of ways that data can pass from one job to another, most of which land to disk. As Ray mentioned, named pipes are the obvious exception, and you could perhaps argue that MQ should be included as well.

I think you need to reconsider what problem you are trying to solve and why keeping data in memory is so important in this case.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Named pipes are usually used for simple interprocess communication. DataStage supports named pipes.
Klaus Schaefer
Participant
Posts: 94
Joined: Wed May 08, 2002 8:44 am
Location: Germany

Post by Klaus Schaefer »

If you store the records in an EE Data Set (quick and VERY performant), they will be in memory if you act on them with a Lookup in JobB.

Klaus