Difference between Lookups in Server jobs and Parallel jobs.

Minhajuddin · Post by **Minhajuddin** » Sat Jul 14, 2007 10:10 am

Hi,

I am confused as to how lookups in Server jobs are different from lookups in a Parallel job.

To be specific, let's say we have the following Server job:


                                        OracleStage
                                             |
                                             |
                                             |
                                             |
                                             V
SequentialFile------------->Transformer--------------->SequentialFile

And we have a Parallel job with a single-node configuration file:


                                        OracleStage
                                             |
                                             |
                                             |
                                             |
                                             V
SequentialFile--------------->Lookup----------------->SequentialFile
                                        (Sparse)

How does the performance of these jobs differ, assuming that they work on the same data? Is the Server job better, or the Parallel job?

The documentation says that when we do a Sparse lookup, a SELECT query is fired for every input record. Does the Transformer in Server jobs use the same logic?

Thank you very much

ray.wurlod · Post by **ray.wurlod** » Sat Jul 14, 2007 2:08 pm

Yes, at a conceptual level.
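
To make "one query per input row" concrete, here is a minimal Python sketch of the idea. It is not DataStage code: sqlite3 stands in for the Oracle stage, and the table `ref`, its columns, and the function names are all hypothetical. It contrasts a sparse-style lookup (one SELECT per input row) with a normal-style lookup (reference data loaded into memory once, then probed).

```python
import sqlite3


def build_reference(conn):
    """Create a tiny reference table standing in for the Oracle stage (hypothetical data)."""
    conn.execute("CREATE TABLE ref (cust_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO ref VALUES (?, ?)",
        [(1, "Alice"), (2, "Bob"), (3, "Carol")],
    )


def sparse_lookup(conn, rows):
    """Sparse style: fire one SELECT per input row and let the database do the matching."""
    out = []
    for cust_id in rows:
        hit = conn.execute(
            "SELECT name FROM ref WHERE cust_id = ?", (cust_id,)
        ).fetchone()
        out.append((cust_id, hit[0] if hit else None))
    return out


def normal_lookup(conn, rows):
    """Normal style: one SELECT in total, then probe an in-memory table per row."""
    table = dict(conn.execute("SELECT cust_id, name FROM ref"))
    return [(cust_id, table.get(cust_id)) for cust_id in rows]


conn = sqlite3.connect(":memory:")
build_reference(conn)

input_rows = [2, 3, 9, 1]  # 9 has no match in the reference table
sparse_result = sparse_lookup(conn, input_rows)
normal_result = normal_lookup(conn, input_rows)
```

Both functions return the same rows; what differs is the number of round trips to the database, which is why the sparse approach tends to suit a small input stream against a large reference table.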

In practice the server job may be able to use host array processing if row buffering is enabled. Parallel jobs will need to buffer the stream input to the Lookup stage (that is, to insert a buffer operator) to manage the difference in flow rates into the Lookup stage. Dump and inspect the score to see what I mean by that.
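
As a rough analogy for what a buffer between two stages does, the Python sketch below puts a bounded queue between a fast producer (the stream input) and a slower consumer (a lookup that pays a round trip per row). This is only an illustration under assumed names and sizes; it says nothing about how the DataStage buffer operator is actually implemented.

```python
import queue
import threading
import time


def producer(buf, rows):
    """Fast upstream stage: pushes rows as quickly as the buffer allows."""
    for row in rows:
        buf.put(row)   # blocks while the buffer is full
    buf.put(None)      # sentinel marking end of data


def consumer(buf, out):
    """Slow downstream stage: each row costs a simulated database round trip."""
    while True:
        row = buf.get()
        if row is None:
            break
        time.sleep(0.001)  # pretend this is the per-row SELECT
        out.append(row)


def run(rows, buffer_size):
    """Connect producer and consumer through a bounded buffer of the given size."""
    buf = queue.Queue(maxsize=buffer_size)
    out = []
    worker = threading.Thread(target=producer, args=(buf, rows))
    worker.start()
    consumer(buf, out)
    worker.join()
    return out


result = run(list(range(50)), buffer_size=8)
```

A larger `buffer_size` lets the producer run further ahead before it blocks; a smaller one throttles it sooner. Every row still arrives, in order, whatever the size.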

Minhajuddin · Post by **Minhajuddin** » Mon Jul 16, 2007 4:31 pm

Thanks for the information, Ray.

Can we do buffering in Parallel jobs with Sparse lookups too?

I heard this from a guy from IBM:

"Don't mess with buffering; it has a drastic impact on your performance."

I just wanted to know whether we could tune our jobs by allocating the proper amount of buffer memory, and how to do this (where do we set these things up?).

Thank you.

ray.wurlod · Post by **ray.wurlod** » Mon Jul 16, 2007 6:34 pm

Of course you can tweak your jobs by tuning buffering. This includes tweaking in both directions!

Ask "this guy from IBM" to amplify the comment. Ask why buffering is tunable if it's not supposed to be tuned.

Information about how to tune buffers and transport blocks is given in IBM's Advanced DataStage class; you might mention that also.