Difference between Lookups in Server jobs and Parallel jobs.

Post by Minhajuddin »

Hi,


I am confused as to how lookups in Server jobs are different from lookups in a Parallel job.

To be specific, let's say we have the following Server job:

                                        OracleStage
                                             |
                                             |
                                             |
                                             |
                                             V
SequentialFile------------->Transformer--------------->SequentialFile




And we have a Parallel job running with a single-node configuration file:

                                        OracleStage
                                             |
                                             |
                                             |
                                             |
                                             V
SequentialFile--------------->Lookup----------------->SequentialFile
                                        (Sparse)



How does the performance of these two jobs differ, assuming they process the same data? Is the Server job better, or the Parallel job?

The documentation says that with a Sparse lookup a SELECT query is fired for every input record. Does the Transformer in a Server job use the same logic?
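
For readers following along, the per-record behaviour described above can be sketched outside DataStage. This is only an illustration under stated assumptions: it uses Python's sqlite3 as a stand-in for the Oracle reference table, and the table name `ref`, its columns, and the sample rows are hypothetical. It contrasts one SELECT per input record (the Sparse style) with reading the reference data once and probing it in memory; it is not how either job type is implemented internally.

```python
import sqlite3


def build_reference_db():
    """Create an in-memory table standing in for the Oracle reference table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (cust_id INTEGER PRIMARY KEY, region TEXT)")
    conn.executemany(
        "INSERT INTO ref VALUES (?, ?)",
        [(1, "North"), (2, "South"), (3, "East")],
    )
    conn.commit()
    return conn


def sparse_style_lookup(conn, input_keys):
    """Fire one SELECT per input record and count the queries issued."""
    results = []
    queries = 0
    for key in input_keys:
        row = conn.execute(
            "SELECT region FROM ref WHERE cust_id = ?", (key,)
        ).fetchone()
        queries += 1
        results.append((key, row[0] if row else None))
    return results, queries


def in_memory_style_lookup(conn, input_keys):
    """Read the reference table once, then probe a dictionary for each record."""
    reference = dict(conn.execute("SELECT cust_id, region FROM ref"))
    results = [(key, reference.get(key)) for key in input_keys]
    return results, 1  # a single query loaded the whole reference table


if __name__ == "__main__":
    conn = build_reference_db()
    keys = [1, 2, 2, 9]  # 9 has no match in the reference table
    print(sparse_style_lookup(conn, keys))
    print(in_memory_style_lookup(conn, keys))
```

Both functions return the same matches; what differs is the number of round trips to the database, which grows with the input row count in the per-record style.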


Thank you very much!
Minhajuddin


Post by ray.wurlod »

Yes, at a conceptual level.

In practice, the Server job may be able to use host array processing if row buffering is enabled. Parallel jobs will need to buffer the stream input to the Lookup stage (that is, a buffer operator is inserted) to manage the difference in flow rates into the stage. Dump and inspect the score to see what I mean by that.
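
As a rough illustration of why a buffer helps when two sides move rows at different rates, here is a toy tick-based model. It is a sketch under assumptions, not a description of the DataStage buffer operator: the function name `simulate_buffer`, the rates, and the capacities are all hypothetical. Each tick, the upstream side writes as many rows as fit in the buffer and the downstream side drains what it can.

```python
def simulate_buffer(total_rows, produce_per_tick, consume_per_tick, capacity):
    """Return (ticks, stalled_ticks, peak_buffered) for a bounded buffer between two stages."""
    if min(total_rows, produce_per_tick, consume_per_tick, capacity) <= 0:
        raise ValueError("all arguments must be positive")

    produced = consumed = buffered = 0
    ticks = stalled_ticks = peak_buffered = 0

    while consumed < total_rows:
        ticks += 1

        # Upstream writes as many rows as it wants, limited by free buffer space.
        wanted = min(produce_per_tick, total_rows - produced)
        written = min(wanted, capacity - buffered)
        if written < wanted:
            stalled_ticks += 1  # upstream had to wait for space this tick
        produced += written
        buffered += written
        peak_buffered = max(peak_buffered, buffered)

        # Downstream drains up to its own rate.
        taken = min(consume_per_tick, buffered)
        consumed += taken
        buffered -= taken

    return ticks, stalled_ticks, peak_buffered


if __name__ == "__main__":
    # Same workload, small buffer versus large buffer.
    print(simulate_buffer(12, 3, 2, capacity=4))
    print(simulate_buffer(12, 3, 2, capacity=100))
```

In this model a larger buffer removes the upstream stalls at the cost of holding more rows at once, which is the kind of trade-off buffer tuning is about.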
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by Minhajuddin »

Thanks for the information, Ray.

Can we do buffering in Parallel jobs with Sparse lookups too?

I heard this from someone at IBM: "Don't mess with buffering; it has a drastic impact on your performance."
I just wanted to know whether we can tune our jobs by allocating the proper amount of buffer memory, and if so, how we do it (where do we set these things up?).

Thank you.
Minhajuddin


Post by ray.wurlod »

Of course you can tweak your jobs by tuning buffering. That includes tweaking in both directions!

Ask "this guy from IBM" to amplify the comment. Ask why buffering is tunable if it's not supposed to be tuned.

Information about how to tune buffers and transport blocks is given in IBM's Advanced DataStage class; you might mention that also.