how to decide using lookup or joiner
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 5
- Joined: Tue Aug 26, 2008 9:00 am
How can we decide whether to use a lookup or a joiner when we are loading a large amount of data? Thanks.
Hi,
There are quite a few posts out there on this subject.
But I think it comes down to the size of the two input sources you are trying to combine and what you want to do with the output.
For instance, joins can only be across two inputs and there is no reject link, while lookups are really designed to read a reference source into memory, so the reference input should be smaller than the amount of memory available on the box. So basically, if you have two large inputs then a join would probably be better!
There is also the Merge stage; again, it all depends on what your expected output looks like!
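To make the memory trade-off concrete, here is a rough sketch in plain Python (not DataStage code; the function names `lookup` and `merge_join` and the sample rows are made up for illustration). The lookup holds the whole reference input in memory and can route unmatched rows to a reject list, while the join walks two key-sorted inputs side by side. It assumes keys are unique on the right-hand input, and it sorts in memory only to keep the example short.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (key, value): a hypothetical two-column row


def lookup(stream: List[Row], reference: List[Row]):
    """Lookup-style: load the entire reference into memory, then probe per row."""
    ref_table = dict(reference)  # the whole reference input lives in memory
    matched, rejected = [], []
    for key, value in stream:
        if key in ref_table:
            matched.append((key, value, ref_table[key]))
        else:
            rejected.append((key, value))  # unmatched rows go to a reject list
    return matched, rejected


def merge_join(left: List[Row], right: List[Row]):
    """Join-style: inner join over key-sorted inputs (right keys assumed unique)."""
    left_sorted = sorted(left)    # sorted here for brevity only
    right_sorted = sorted(right)
    out = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lk, lv = left_sorted[i]
        rk, rv = right_sorted[j]
        if lk == rk:
            out.append((lk, lv, rv))
            i += 1  # keep the right row: the next left row may share the key
        elif lk < rk:
            i += 1  # left row has no partner; an inner join drops it
        else:
            j += 1
    return out


stream = [(1, "a"), (2, "b"), (4, "d")]
reference = [(1, "X"), (2, "Y"), (3, "Z")]
matched, rejected = lookup(stream, reference)
joined = merge_join(stream, reference)
```

The point of the sketch is only that the lookup's memory use grows with the reference input, whereas the sorted join needs to look at just the current row from each side.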
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
A joiner is someone who assembles wooden furniture, particularly cabinetry. Your choice is therefore clear.
Unless, of course, you are loading large amounts of wooden objects...
When's the interview?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
cdp wrote: For instance joins can only be across two inputs and there is no reject link, ...
This is not the case. A Join stage can have more than two inputs. In this case pairwise joins are created as intermediate results, the same way that databases do it. The "other" inputs are referred to as Intermediate. I prefer to use cascaded two-input joins to make it clearer what's happening to the next developer.
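As a rough illustration of the pairwise idea (plain Python, not DataStage; `inner_join`, `multi_join` and the sample inputs are hypothetical names for this sketch), a join over several inputs can be expressed as a chain of two-input joins, each producing an intermediate result that feeds the next:

```python
from functools import reduce
from typing import Dict, List, Tuple

Table = Dict[int, Tuple[str, ...]]  # join key -> tuple of column values


def inner_join(left: Table, right: Table) -> Table:
    """Two-input inner join: keep keys present on both sides, concatenate columns."""
    return {k: left[k] + right[k] for k in left if k in right}


def multi_join(inputs: List[Table]) -> Table:
    """N-input join expressed as cascaded pairwise joins, left to right."""
    return reduce(inner_join, inputs)


a = {1: ("a",), 2: ("b",)}
b = {1: ("x",), 2: ("y",), 3: ("z",)}
c = {2: ("q",)}

all_at_once = multi_join([a, b, c])
cascaded = inner_join(inner_join(a, b), c)  # the explicit two-input version
```

Both forms give the same rows here; the cascaded version just makes each intermediate step visible.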
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Charter Member
- Posts: 193
- Joined: Tue Sep 05, 2006 8:01 pm
- Location: Australia
ray.wurlod wrote: This is not the case. A Join stage can have more than two inputs. In this case pairwise joins are created as intermediate results, the same way that databases do it. The "other" inputs are referred to as Intermediate. I prefer to use cascaded two-input joins to make it clearer what's happening to the next developer.
Do you know what, you are absolutely correct. Maybe I was confusing "should" with "could", but I was always told not to. Sorry for the incorrect advice.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia