Difference between joiner and lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

arasan
Participant
Posts: 44
Joined: Wed Nov 30, 2005 3:54 am

Difference between joiner and lookup

Post by arasan »

Can anyone explain the difference between joiner and lookup,
and when to use a lookup and when to use a joiner?
sanjeev_sithara
Participant
Posts: 15
Joined: Wed May 26, 2004 6:30 am

Post by sanjeev_sithara »

The main difference between joiner and lookup is in the way they handle the data and the reject links. In a joiner, no reject links are possible, so we cannot get the rejected records directly. Lookup provides a reject link. Also, lookup is used if the data being looked up can fit in the available temporary memory. If the volume of data is very large, it is safer to go for the joiner.
-Sanjeev
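
As an illustration of the point above, here is a minimal Python sketch of lookup-style processing: the reference data is loaded into memory in full, and stream rows without a match are collected separately, much like a reject link. This is not DataStage code; the function name `lookup_with_reject` and the sample columns are hypothetical.

```python
def lookup_with_reject(stream, reference, key):
    """Hash-lookup sketch: the whole reference set is held in memory."""
    # Build an in-memory table keyed on the lookup column.
    # Assumption of this sketch: if a key repeats, the first reference row wins.
    table = {}
    for ref_row in reference:
        table.setdefault(ref_row[key], ref_row)

    matched, rejected = [], []
    for row in stream:
        ref_row = table.get(row[key])
        if ref_row is None:
            rejected.append(row)  # analogous to a row sent down a reject link
        else:
            matched.append({**ref_row, **row})  # stream columns win on name clashes
    return matched, rejected


# Hypothetical sample data
orders = [{"cust_id": 1, "amount": 10}, {"cust_id": 9, "amount": 5}]
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
matched, rejected = lookup_with_reject(orders, customers, "cust_id")
```

The memory cost is driven by the size of `table`, which is why this approach suits reference data that fits comfortably in memory.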
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Hi

Join requires the input datasets to be key partitioned and sorted. Lookup does not have this requirement.

As mentioned before, Lookup allows reject links and Join does not. If the volume of data is too large to fit into memory, go for Join and avoid Lookup, because paging can occur when Lookup is used.

--Balaji S.R
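
To show why sorted inputs matter for a join, here is a rough Python sketch of a merge-style inner join. It walks both inputs once and only ever looks at the current key group, so neither input has to be held in a lookup table. It is a generic sort-merge join written for illustration, not the actual Join stage implementation; `sorted_inner_join` and the sample rows are hypothetical.

```python
def sorted_inner_join(left, right, key):
    """Inner join of two lists that are already sorted on `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1  # left row has no partner; this sketch simply drops it
        elif lk > rk:
            j += 1  # right row has no partner; also dropped
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row in the key group with every right row in the run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    out.append({**r, **left[i]})
                i += 1
            j = j_end
    return out


# Hypothetical sample data, already sorted on "id"
left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 4, "a": "z"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 4, "b": "r"}]
joined = sorted_inner_join(left, right, "id")
```

Note that unmatched rows are silently skipped in both branches, which mirrors the absence of a reject output.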
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Don't ignore the Merge stage, which does allow you to capture failed lookups from each reference input separately. It also requires identically sorted and partitioned inputs and, if more than one reference input, de-duplicated reference inputs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
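
A small Python sketch of the merge idea described above: one master input, several update inputs, and a separate list of non-matching rows per update input. It assumes that the rows captured per input are the update rows that found no master row, and that master keys are unique after de-duplication. A dictionary is used for brevity where the real stage relies on sorted inputs. `merge_with_rejects` and the sample columns are hypothetical.

```python
def merge_with_rejects(master, updates, key):
    """Merge a master list with several update lists, keeping unmatched update rows per input."""
    # Assumption: master keys are unique (duplicates were removed beforehand).
    master_by_key = {row[key]: dict(row) for row in master}
    rejects = []
    for update in updates:
        unmatched = []
        for row in update:
            target = master_by_key.get(row[key])
            if target is None:
                unmatched.append(row)  # no master row for this update row
            else:
                for col, val in row.items():
                    target.setdefault(col, val)  # master values are kept on name clashes
        rejects.append(unmatched)  # one reject list per update input
    return list(master_by_key.values()), rejects


# Hypothetical sample data
master = [{"id": 1, "m": "a"}, {"id": 2, "m": "b"}]
update_1 = [{"id": 1, "u1": "x"}, {"id": 3, "u1": "y"}]
update_2 = [{"id": 2, "u2": "z"}]
merged, rejects = merge_with_rejects(master, [update_1, update_2], "id")
```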
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
It would be more useful if the discussion covered the handling of duplicates as well.

-Kumar
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Hi

For the Merge stage, duplicates should be removed from the master dataset as a preprocessing step. If there is more than one update dataset, duplicates should be removed from the update datasets as well.

This step is not required for the Join and Lookup stages.

--Balaji S.R
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

balajisr wrote:Hi

In case of merge stage as part of pre processing step duplicates should be removed from master dataset. If there are more than one update dataset then duplicates should be removed from update datasets as well.

The above mentioned step is not required for join and lookup stages.

--Balaji S.R
Yes, the statements are correct.
Even Ascential advises removing duplicates.
But how do all of these stages behave if the input data has duplicates?
a) Only the master has duplicates
b) Only the update has duplicates
c) Both the master and the update have duplicates


-Kumar
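
The thread does not answer this for the individual stages, but the three cases can at least be tried against plain relational inner-join semantics. The Python sketch below does that with a nested-loop join. It shows generic join behaviour only and is not a statement about how the Join, Lookup, or Merge stage treats duplicates; all names and data are hypothetical.

```python
def inner_join(left, right, key):
    """Nested-loop inner join: every left row pairs with every right row on equal keys."""
    return [{**r, **l} for l in left for r in right if l[key] == r[key]]


# Hypothetical sample data: one key value, with and without duplicates
master_one = [{"k": 1, "m": "a"}]
master_dup = [{"k": 1, "m": "a"}, {"k": 1, "m": "b"}]
update_one = [{"k": 1, "u": "x"}]
update_dup = [{"k": 1, "u": "x"}, {"k": 1, "u": "y"}]

case_a = inner_join(master_dup, update_one, "k")  # a) master alone has duplicates
case_b = inner_join(master_one, update_dup, "k")  # b) update alone has duplicates
case_c = inner_join(master_dup, update_dup, "k")  # c) both sides have duplicates

print(len(case_a), len(case_b), len(case_c))  # prints "2 2 4"
```

Under these semantics, duplicates on both sides multiply within a key group, which is one reason de-duplicating beforehand is commonly advised.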
kwwilliams
Participant
Posts: 437
Joined: Fri Oct 21, 2005 10:00 pm

Post by kwwilliams »

I find the Join stage most useful when I want to do a full outer join. With a lookup you can do a left outer join, with the input 0 data set always passing through the Lookup stage. But what if you want both sets of data regardless of whether there is a match on either side? Using the Join stage, I can get both record sets to pass through.
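
For readers unfamiliar with the term, here is a minimal Python sketch of what a full outer join produces: matched pairs, plus unmatched rows from both sides. It is a generic illustration rather than the Join stage's implementation. In this sketch, columns missing from the other side are simply absent instead of being filled with nulls. `full_outer_join` and the sample rows are hypothetical.

```python
from collections import defaultdict


def full_outer_join(left, right, key):
    """Return matched pairs plus the unmatched rows from both inputs."""
    right_by_key = defaultdict(list)
    for r in right:
        right_by_key[r[key]].append(r)

    out = []
    matched_keys = set()
    for l in left:
        partners = right_by_key.get(l[key])
        if partners:
            matched_keys.add(l[key])
            for r in partners:
                out.append({**r, **l})
        else:
            out.append(dict(l))  # left-only row passes through
    for r in right:
        if r[key] not in matched_keys:
            out.append(dict(r))  # right-only row passes through as well
    return out


# Hypothetical sample data
left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}]
result = full_outer_join(left, right, "id")
```

Dropping the final loop over `right` turns this into a left outer join, which is the shape a lookup with a pass-through primary input gives you.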