Difference between joiner and lookup

arasan · Post by **arasan** » Tue Jan 03, 2006 10:51 pm

Can anyone explain the difference between joiner and lookup,
and when to use lookup and when to use joiner?

sanjeev_sithara · Post by **sanjeev_sithara** » Tue Jan 03, 2006 11:12 pm

The main difference between joiner and lookup is in the way they handle the data and the reject links. In joiner, no reject links are possible, so we cannot get the rejected records directly. Lookup provides a reject link. Also, lookup is used if the data being looked up can fit in the available temporary memory. If the volume of data is very large, it is safer to go for joiner.
-Sanjeev
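
To make the memory and reject-link points concrete, here is a minimal Python sketch of a hash-style lookup. It is only an illustration of the general technique, not how DataStage implements the Lookup stage; the function name `lookup_with_reject` and the sample columns are hypothetical.

```python
def lookup_with_reject(stream_rows, reference_rows, key):
    """Probe an in-memory table built from the reference data.

    Returns (matched, rejected). Illustrative only: the names and the
    row-as-dict layout are assumptions, not DataStage APIs.
    """
    # The whole reference set is held in memory at once, which is why this
    # approach suits reference data that fits in RAM.
    # If the reference data repeats a key, the last row for that key wins here.
    table = {ref[key]: ref for ref in reference_rows}

    matched, rejected = [], []
    for row in stream_rows:
        ref = table.get(row[key])
        if ref is None:
            # No reference row for this key: the row goes to the reject output.
            rejected.append(row)
        else:
            # Combine reference and stream columns; stream values win on clashes.
            matched.append({**ref, **row})
    return matched, rejected


orders = [{"cust_id": 1, "amount": 10}, {"cust_id": 3, "amount": 5}]
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
matched, rejected = lookup_with_reject(orders, customers, "cust_id")
```

In this sketch the order for `cust_id` 3 has no matching customer, so it lands in `rejected` instead of being dropped silently.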

balajisr · Post by **balajisr** » Tue Jan 03, 2006 11:22 pm

Hi

Join requires the input datasets to be key partitioned and sorted. Lookup does not have this requirement.

As mentioned before, Lookup allows reject links; Join does not. If the volume of data is too large to fit into memory, go for Join and avoid Lookup, as paging can occur when Lookup is used.

--Balaji S.R
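
The sorting requirement is what lets a join stream through large inputs without holding a whole dataset in memory. A rough Python sketch of a sort-merge inner join follows, assuming both inputs are already sorted on the key. The function name `sorted_inner_join` and the sample rows are hypothetical, and this is not a description of the Join stage's internals.

```python
from itertools import groupby
from operator import itemgetter


def sorted_inner_join(left_rows, right_rows, key):
    """Inner join of two inputs that are ALREADY sorted on `key`.

    Only the rows for the current key are held in memory at any time.
    """
    get_key = itemgetter(key)
    left_groups = groupby(left_rows, key=get_key)
    right_groups = groupby(right_rows, key=get_key)
    left = next(left_groups, None)
    right = next(right_groups, None)

    while left is not None and right is not None:
        if left[0] < right[0]:
            left = next(left_groups, None)      # left key has no partner
        elif left[0] > right[0]:
            right = next(right_groups, None)    # right key has no partner
        else:
            # Same key on both sides: emit every left/right combination.
            right_block = list(right[1])
            for l_row in left[1]:
                for r_row in right_block:
                    yield {**l_row, **r_row}
            left = next(left_groups, None)
            right = next(right_groups, None)


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"},
        {"id": 2, "a": "z"}, {"id": 4, "a": "w"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 4, "b": "r"}]
joined = list(sorted_inner_join(left, right, "id"))
```

If either input were unsorted, the single forward pass would miss matches, which is why the sort has to happen first.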

ray.wurlod · Post by **ray.wurlod** » Tue Jan 03, 2006 11:49 pm

Don't ignore the Merge stage, which does allow you to capture failed lookups from each reference input separately. It also requires identically sorted and partitioned inputs and, if more than one reference input, de-duplicated reference inputs.

kumar_s · Post by **kumar_s** » Wed Jan 04, 2006 1:35 am

Hi,
It would be more useful if the discussion also covered the handling of duplicates.

-Kumar

balajisr · Post by **balajisr** » Wed Jan 04, 2006 2:17 am

Hi

In the case of the Merge stage, duplicates should be removed from the master dataset as a preprocessing step. If there is more than one update dataset, duplicates should be removed from the update datasets as well.

This step is not required for the Join and Lookup stages.

--Balaji S.R

kumar_s · Post by **kumar_s** » Wed Jan 04, 2006 3:08 am

balajisr wrote: Hi

In the case of the Merge stage, duplicates should be removed from the master dataset as a preprocessing step. If there is more than one update dataset, duplicates should be removed from the update datasets as well.

This step is not required for the Join and Lookup stages.

--Balaji S.R

Yes, the statements are correct.
Even Ascential advises removing duplicates.
But how do all of these stages behave if the input data has duplicates?
a) Duplicates in the master only
b) Duplicates in the update only
c) Duplicates in both the master and the update


-Kumar

kwwilliams · Post by **kwwilliams** » Wed Jan 04, 2006 9:21 am

I find the Join stage most useful when I want to do a full outer join. With a lookup you can do an outer join in which the input 0 data set always passes through the Lookup stage. But what if you want both sets of data, regardless of whether there is a match on either side? Using the Join stage, I can get both record sets to pass through.
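
For readers unfamiliar with the term, here is a small Python sketch of what a full outer join returns. It uses an in-memory dictionary purely for brevity and says nothing about how the Join stage works internally. The function name `full_outer_join` and the sample rows are hypothetical, and unmatched rows simply lack the other side's columns rather than carrying nulls.

```python
from collections import defaultdict


def full_outer_join(left_rows, right_rows, key):
    """Return matched rows plus the unmatched rows from BOTH inputs."""
    right_by_key = defaultdict(list)
    for r_row in right_rows:
        right_by_key[r_row[key]].append(r_row)

    matched_keys = set()
    out = []
    for l_row in left_rows:
        matches = right_by_key.get(l_row[key])
        if matches:
            matched_keys.add(l_row[key])
            out.extend({**l_row, **r_row} for r_row in matches)
        else:
            out.append(dict(l_row))   # left row with no partner still passes

    # Right rows whose key never appeared on the left also pass through.
    for k, rows in right_by_key.items():
        if k not in matched_keys:
            out.extend(dict(r_row) for r_row in rows)
    return out


left = [{"id": 1, "l": "a"}, {"id": 2, "l": "b"}]
right = [{"id": 2, "r": "x"}, {"id": 3, "r": "y"}]
result = full_outer_join(left, right, "id")
```

Here `id` 1 exists only on the left and `id` 3 only on the right, yet both appear in `result`. A lookup that only passes the primary input through would lose the `id` 3 row.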