Can I get one stream to wait for another stream to finish?
Posted: Wed Sep 08, 2004 3:51 am
I'm trying to:
1. Read my file
2. Send one field to be processed down one stream
3. Send all the fields to be processed down another stream
4. Wait for the row to have been processed down both streams, then continue.
Is this possible? It's knowing that both streams have finished processing the row that I'm finding tricky.
I'd originally been thinking of something like the Sequencer in a job sequence, which waits for two jobs to finish before calling the next job - is there an equivalent within a job?
The reason I'm doing this is that I would send the natural key down one stream, which would check whether a surrogate key already exists and generate one if required. I would want my other stream to wait until it knew the surrogate key had been found or generated before it tried to look it up.
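The find-or-generate step described above can be sketched in pseudocode terms (Python here, purely for illustration - `key_map` and `counter` are hypothetical stand-ins for whatever lookup table and key source the job actually uses, not DataStage objects):

```python
# Minimal sketch of "check if a surrogate key exists, generate one if required".
# In the real job this would be a hashed-file lookup plus a key-source routine.

def find_or_generate(natural_key, key_map, counter):
    """Return the surrogate key for natural_key, creating one if needed."""
    if natural_key not in key_map:
        counter[0] += 1                    # next available surrogate key
        key_map[natural_key] = counter[0]  # remember it for later rows
    return key_map[natural_key]

# Two distinct natural keys get distinct surrogates; a repeat gets the same one.
key_map, counter = {}, [0]
first = find_or_generate("CUST-001", key_map, counter)
second = find_or_generate("CUST-002", key_map, counter)
repeat = find_or_generate("CUST-001", key_map, counter)
```

The ordering problem in the question is exactly that the lookup stream must not run for a row until this step has committed the key for that row.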
What is the best way of doing this?
I had thought of these alternatives, either
1 - putting the key generation in a separate job. This would mean I'd have to process my whole file doing the key generation before I started processing any records in the key-lookup part.
or
2 - just have one stream, going first to the key generation and then to the key lookup. The problem with this is that the key generation is in a shared container, so I can't have all my fields defined on its links - just the natural key and surrogate key. I was thinking of creating another field on the shared container for 'mergedcolumns' and putting my whole record into one field that could be passed in and out of the shared container (with a Merge and a Row Splitter stage on either side of the call to the shared container).
This would have the advantage that I wouldn't have to do the extra lookup after the surrogate key has been created - I could have all the fields I need coming out of the shared container.
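The 'mergedcolumns' idea in option 2 amounts to packing the whole record into one field before the shared container and unpacking it afterwards. A rough sketch (not DataStage syntax; the delimiter and field names are assumptions) of what the Merge and Row Splitter stages would be doing:

```python
# Sketch of the merge/split workaround: collapse all columns into one
# delimited field so the shared container only needs one extra column.

DELIM = "|"  # assumed delimiter, must not occur in the data itself

def merge_columns(row):
    """Collapse a row (dict of column -> value) into one string field."""
    return DELIM.join(f"{k}={v}" for k, v in row.items())

def split_columns(merged):
    """Rebuild the row from the single merged field."""
    return dict(pair.split("=", 1) for pair in merged.split(DELIM))

row = {"cust_id": "CUST-001", "name": "Smith", "amount": "42.50"}
merged = merge_columns(row)      # goes through the shared container untouched
restored = split_columns(merged)  # recovered on the other side
```

The round trip only works if the delimiter never appears in the data, which is the main fragility of this approach.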
Which is the best way to go about this?
Thanks for putting the effort into reading through this long question!