Hash file output to input

Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Hi,
Apologies if this has been raised and answered before; I have tried searching but am at a loss for a good search query...

I have a job with two input files. One writes to a hash file and the other uses this hash file as a reference, something like the below:

Input -> TX -> HF
. . . . . . . . . . . . . |
. . . . . . . . . . . . .\/ (apologies about the '.', only way I could get the formatting)

Input . . . -> . . TX -> DB

My understanding is that the two input streams will get kicked off at the same time. Please correct me if that is wrong.

On this assumption, will the second stream wait for the hash file to be loaded before using it as a reference? Are there any settings you can make to the hash file to ensure the second stream waits?

My view is that the second stream will run a check against the first and it is effectively pot-luck as to whether the reference is there in time.

Can someone please confirm or enlighten me?
Thanks
Ryan
ranga1970
Participant
Posts: 141
Joined: Thu Nov 04, 2004 3:29 pm
Location: Hyderabad

Post by ranga1970 »

I am totally confused; could you be clearer about what you want?
RRCHINTALA
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Input1 ---->  T1  ---->  HF
                          |
                          V
Input2  ----------------> T2  ------> Target

A passive stage (in this case the Hashed File stage) can not open its outputs until its inputs are closed.

Therefore the lower Transformer stage (T2) cannot process the first row from Input2 until Input1 is completely processed (and the hashed file fully populated).

Therefore, in turn, your assumption is not correct, and the reference will be there in time. Yay!

(You can use the Code tags to get the format right -- use Preview until it is right, then Submit. See above.)
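
The ordering rule described in this post can be illustrated with a small Python sketch. This is only a toy model and says nothing about how the DS engine is actually implemented: the function `run_job`, the dict standing in for the hashed file, and the use of threads and an event are all assumptions made for illustration. Both streams are started together, but the reference stream blocks until the load stream has closed its input to the hashed file.

```python
import threading


def run_job(input1, input2):
    """Toy model: both streams start together, but lookups wait for the load."""
    hashed_file = {}                    # stands in for the hashed file (hypothetical)
    input_closed = threading.Event()    # set once the write link has been closed
    results = []

    def load_stream():
        # Input1 -> T1 -> HF: write every row, then close the input link.
        for key, value in input1:
            hashed_file[key] = value
        input_closed.set()

    def reference_stream():
        # Input2 -> T2 -> Target: block until the hashed file is fully populated.
        input_closed.wait()
        for key in input2:
            results.append((key, hashed_file.get(key)))

    # Start the reference stream first on purpose; it still cannot read early.
    threads = [threading.Thread(target=reference_stream),
               threading.Thread(target=load_stream)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_job([("a", 1), ("b", 2)], ["b", "a", "z"]))
```

Even though the reference stream is started first here, no lookup runs until the load has finished, so a key is only reported as missing when it was never loaded at all.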
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Thanks for that!

I will learn how to format eventually... :lol:
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

ray.wurlod wrote: A passive stage (in this case the Hashed File stage) can not open its outputs until its inputs are closed.
Right, being a bright new day and with my tendency to make simple situations complex, I have a further query...

With the example above, would the hash file (or at least the DS Engine) be clever enough to know it has an input and an output and therefore refuse to open the output until the input has given it relevant instructions?

Or does it only open its input (or output) when requested to do so (and as such, could it receive an open request on the output first)?

If this is in the manual, please feel free to send me away to find it; I have just never seen it addressed and would like to ensure I understand the process.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Kryt0n wrote: With the example above, would the hash file (or at least the DS Engine) be clever enough to know it has an input and an output and therefore refuse to open the output until the input has given it relevant instructions?
Basically, yes. It is 'clever enough' to understand the dependencies between the different segments of your jobs and knows it needs to complete the hash file build (the 'Input') before it can make the 'Output' available as a lookup. So, the writes would complete, the stage would close the hash and then turn around and open it for reading, caching it into memory if requested.

BTW, you can see all of this happening in the job's log. The stage start and finish operations are logged, so you can see the order they happen in and the row counts for each 'finished' stage right there.
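
The write, close, reopen, optionally cache sequence described above can be sketched in Python as follows. This is a hypothetical model only: `build_then_open`, the `preload` flag, and the event strings are invented for illustration and are not real DataStage log messages or settings.

```python
def build_then_open(rows, preload=False):
    """Toy model of the sequence: write all rows, close, then reopen for reading."""
    events = []      # records the order of operations (not a real job log)
    store = {}       # stands in for the hashed file on disk
    written = 0

    events.append("input opened")
    for key, value in rows:
        store[key] = value      # in this dict-based model a repeated key overwrites
        written += 1
    events.append("input closed, rows written: %d" % written)

    # Only after the input is closed is the output made available for lookups.
    events.append("output opened")
    if preload:
        source = dict(store)    # copy into memory, modelling the optional caching
        events.append("cached in memory")
    else:
        source = store

    return source.get, events


if __name__ == "__main__":
    lookup, events = build_then_open([("a", 1), ("b", 2)], preload=True)
    for line in events:
        print(line)
    print(lookup("a"), lookup("z"))
```

Reading the recorded events from top to bottom plays the same role as reading the stage start and finish entries in the job's log: the build always completes before the first lookup is served.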
-craig

"You can never have too many knives" -- Logan Nine Fingers