Diff Between Lookup, Join And Merge

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
reddysrc
Participant
Posts: 39
Joined: Mon May 30, 2005 3:19 pm

Diff Between Lookup, Join And Merge

Post by reddysrc »

Hi Experts,

I am still confusing the difference between Lookup, Join and Merge Stage .

in which scenario which Stage do we need to use.


thx
Reddysrc
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Lookup stage is for small reference data sets where exact or range lookup capability is required. Duplicates of the lookup key are permitted in the stream input - each will be looked up separately. Lookup stage is also indicated if you need to access huge data sets directly from the database tables in which they are stored (sparse lookup). Failed lookups can cause the stream input row to be directed to a reject link.

Join stage is when you want to perform a regular SQL-style join. There is no concept of a reject link; output is the result of the join. For outer joins, some of the columns need to be declared to be nullable.

Merge stage implements a master/update model. Rows from the update links are consumed, so duplicates from the master input will only be matched once. Separate reject links can be created for each separate update link. Properties of the stage include how you want to handle unmatched masters.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply