join,merge, lookup
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 97
- Joined: Tue Feb 21, 2006 6:45 am
join,merge, lookup
how to decide which to use and when. The document says use join when the physical memory is less or else use lookup. merge is used when all the inputs are datasets. My doubt is
a) how to assess practically that the physical memory is less to take join stage.
b) when to use merge? shouldnt we use join or lookup when the inputs are datasets?
-Beginner to DS
a) how to assess practically that the physical memory is less to take join stage.
b) when to use merge? shouldnt we use join or lookup when the inputs are datasets?
-Beginner to DS
-
- Participant
- Posts: 97
- Joined: Tue Feb 21, 2006 6:45 am
Merge can be used ideally for anything .. A seq file, a Dataset, etc...
Anything with data can be used to merge. Although I have not tried everything. But the manual would say which ones are not compatible with merge stage.
Join and Lookup stage are classically used to make joins as it says by name.
Anything with data can be used to merge. Although I have not tried everything. But the manual would say which ones are not compatible with merge stage.
Join and Lookup stage are classically used to make joins as it says by name.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
-
- Participant
- Posts: 97
- Joined: Tue Feb 21, 2006 6:45 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There are various other criteria. The decision should really depend on what functionality you need - the memory savings of the Join and Merge stages are countered by the need to sort upstream (which eats most of the saved memory!).
Lookup stage gives you conditional lookup. It allows condition failures and lookup failures to be directed to a rejects link, but there is only one rejects link.
Join stage performs a join. Like SQL join, there is no concept of a reject and therefore the Join stage does not support a rejects link.
Do you need to preserve the key columns from the secondary inputs, or is it OK that they be consumed? That, too, would affect your choice of stage type.
Merge stage lets you separately capture the failed "lookups" from multiple update Data Sets. This might best suit your requirements.
Lookup stage gives you conditional lookup. It allows condition failures and lookup failures to be directed to a rejects link, but there is only one rejects link.
Join stage performs a join. Like SQL join, there is no concept of a reject and therefore the Join stage does not support a rejects link.
Do you need to preserve the key columns from the secondary inputs, or is it OK that they be consumed? That, too, would affect your choice of stage type.
Merge stage lets you separately capture the failed "lookups" from multiple update Data Sets. This might best suit your requirements.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Some generic difference among the three.
Code: Select all
Join Lookup Merge
Description RDBMS-style Source and lookup Master table and one
Relational tables table in RAM or more update tables
Number and names 2 or more inputs 1 source and N lookup 1 master tables and N
Of inputs tables update tables
Memory usage Light Heavy Light
Handling of Ok, produces a OK Warning given
duplicates in Cross-product Duplicates will be an
primary input unmatched primary.
Handling of Ok, produces a Warning given. The Ok only when N =1
duplicates in Cross-product second lookup table
secondary input entry is ignored.
Options on
Unmatched None None Capture in reject sets.
secondary
On match, secondary reusable reusable reusable
entriesare
Number of outputs 1 1 output and 1 output and 1 reject for
optionaly 1 reject each update table
Captured in reject Does not apply Unmatched primary Unmatched Secondary entries
Set entries
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'