Can I get one stream to wait for another stream to finish?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

HSBCdev
Premium Member
Posts: 141
Joined: Tue Mar 16, 2004 8:22 am
Location: HSBC - UK and India

Post by HSBCdev »

I'm trying to:
1. Read my file.
2. Send one field to be processed down one stream.
3. Send all the fields to be processed down another stream.
4. Wait for the row to have been processed down both streams, then continue.

Is this possible? It's the knowing that both streams have finished processing the row that I'm finding tricky.

I'd originally been thinking of something like the Sequencer in a job sequence, which waits for two jobs to finish before calling the next job. Is there an equivalent for streams within a job?


The reason I'm doing this is I would send the natural key down one stream which would go and check if a surrogate key already exists and generate one if required. I would want my other stream to wait until it knew that the surrogate key had been found or generated before it tried to look it up.

What is the best way of doing this?


I had thought of these alternatives, either

1 - putting the key generation in a separate job. This would mean that I'd have to process my whole file doing the key generation before I started processing any records in the key lookup part.

or

2 - just have one stream, going first to the key generation and then to the key lookup. The problem with this is that the key generation is in a shared container, so I can't have all my fields defined in its links, just the natural key and surrogate key. I was thinking about creating another field on the shared container for 'mergedcolumns' and putting my whole record into one field which could be passed in and out of the shared container (putting a Merge and a RowSplitter stage on either side of the call to the shared container).
This would have the advantage that I wouldn't have to do the extra lookup after I know the surrogate key has been created: I could have all the fields I need coming out of the shared container.
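
To make alternative 2 concrete, here is a minimal sketch of the pack/unpack idea, written in Python purely for illustration (it is not DataStage code). The delimiter, the function names, and the way `key_container` generates keys are all assumptions; `key_container` only stands in for the shared container, which sees three columns regardless of the record layout.

```python
# Assumed delimiter: any character guaranteed never to occur in the data.
DELIM = "\x1f"

def pack_record(fields):
    """Merge all columns into one string (the job of the merge stage)."""
    return DELIM.join(fields)

def unpack_record(merged):
    """Split the single column back into the original columns."""
    return merged.split(DELIM)

def key_container(natural_key, merged_columns, key_table):
    """Hypothetical stand-in for the shared container.

    It only knows about natural key, surrogate key and 'mergedcolumns',
    and passes the merged column through untouched.
    """
    if natural_key not in key_table:
        # Illustrative key generation only; a real job would use a sequence.
        key_table[natural_key] = len(key_table) + 1
    return natural_key, key_table[natural_key], merged_columns

key_table = {"CUST-001": 1}
row = ["CUST-002", "Jane Smith", "London"]

nk, sk, merged = key_container(row[0], pack_record(row), key_table)
result = [str(sk)] + unpack_record(merged)
```

The point of the sketch is that the container's interface stays fixed while the full record still arrives on the output side together with its surrogate key, so no second lookup is needed.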


Which is the best way to go about this?

Thanks for putting the effort into reading through this long question!
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

I would prefer solution 1... but then again I like it KISS

Ogmios
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa

Post by denzilsyb »

And what a question indeed. To tell you the truth, I would probably go for two separate streams; it just sounds (and probably looks) more manageable.

I don't want to send you beating around the bush, and I might add that I have not tested the following, but you are welcome to test what I propose...

By coding job control in BASIC (or by calling a routine from a sequence), you could get the stage information (DSGetStageInfo) or link information (DSGetLinkInfo), which might give you some sort of control within the job before continuing. I am speculating, but if you polled in a loop, then maybe when the current value of

DSGetLinkInfo(DSJ.ME, DSJ.ME, DSJ.ME, DSJ.LINKROWCOUNT)

equals the value returned by the previous call, you would know that the link has finished processing (current row count = previous row count) and you may continue.
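
The polling idea can be sketched as follows. This is Python for illustration only, not DataStage BASIC: `get_row_count` is a hypothetical callable standing in for the DSGetLinkInfo call above, and the poll interval and retry limit are assumed values. Note that "count unchanged between two polls" is only a heuristic, since a link that has stalled looks the same as one that has finished.

```python
import time

def wait_until_stable(get_row_count, poll_seconds=0.0, max_polls=100):
    """Poll a row counter until two consecutive readings are equal.

    get_row_count is a hypothetical stand-in for reading the link row count.
    """
    previous = get_row_count()
    for _ in range(max_polls):
        time.sleep(poll_seconds)
        current = get_row_count()
        if current == previous:
            # No new rows since the last poll: treat the link as finished.
            return current
        previous = current
    raise TimeoutError("row count never stabilised")

# Simulated readings: the link is still moving rows for the first polls.
counts = iter([10, 25, 40, 40])
final_count = wait_until_stable(lambda: next(counts))
```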

Grief. There must be an easier way!
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
chucksmith
Premium Member
Posts: 385
Joined: Wed Jun 16, 2004 12:43 pm
Location: Virginia, USA

Post by chucksmith »

Regardless of the relative processing of each stream, at the end you will still do a reference lookup to associate the generated or existing key with the input row. With this in mind, I suggest you do not worry about synchronizing at a row level, but instead develop an easy way to identify rows that need surrogate key resolution, and use that information in a final job.
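
One possible reading of this suggestion, sketched in Python for illustration (not DataStage code): the main pass only separates rows whose key is already known from rows that still need one, and a final step resolves the pending rows. The function names, the dictionary-based key table, and the "highest existing key plus one" numbering are all assumptions.

```python
def split_by_resolution(rows, key_table):
    """Main pass: attach known keys, set aside rows that need resolution."""
    resolved, pending = [], []
    for row in rows:
        sk = key_table.get(row["natural_key"])
        if sk is None:
            pending.append(row)
        else:
            resolved.append({**row, "surrogate_key": sk})
    return resolved, pending

def resolve_pending(pending, key_table):
    """Final step: generate missing keys, then do the reference lookup."""
    next_key = max(key_table.values(), default=0) + 1
    out = []
    for row in pending:
        nk = row["natural_key"]
        if nk not in key_table:
            key_table[nk] = next_key
            next_key += 1
        out.append({**row, "surrogate_key": key_table[nk]})
    return out

key_table = {"A": 1}
rows = [
    {"natural_key": "A", "amount": 10},
    {"natural_key": "Z", "amount": 5},
    {"natural_key": "Z", "amount": 7},
]
resolved, pending = split_by_resolution(rows, key_table)
finished = resolved + resolve_pending(pending, key_table)
```

No row-level synchronization is needed here, because every pending row is resolved only after the main pass has completed.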
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

If you are using routines from the SDK, or have your own routines to assign surrogates, then it is a simple matter of streaming the data, doing the lookup, checking the result, and assigning a surrogate if necessary based on that result. This is pretty straightforward and something we all do in most implementations.
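
The per-row logic described here might look like the sketch below. It is Python for illustration only, not an SDK routine: `resolve_surrogate`, the in-memory `key_table`, and the integer counter are assumptions standing in for the reference lookup and the key-assignment routine.

```python
def resolve_surrogate(natural_key, key_table, next_key):
    """Look up the natural key; assign a new surrogate only on a miss.

    Returns (surrogate_key, updated_next_key).
    """
    found = key_table.get(natural_key)
    if found is not None:
        return found, next_key
    # Lookup failed: assign the next value and remember it, so a later
    # row with the same natural key finds it in the same stream.
    key_table[natural_key] = next_key
    return next_key, next_key + 1

key_table = {"A": 1, "B": 2}
next_key = 3
out = []
for nk in ["A", "C", "C", "B", "D"]:
    sk, next_key = resolve_surrogate(nk, key_table, next_key)
    out.append((nk, sk))
```

Because the lookup and the assignment happen for the same row in a single stream, there is nothing to wait for: by the time a row leaves this step, its surrogate key exists.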

Regards,