join vs lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am

join vs lookup

Post by vamsi.4a6 »

When I discussed this with a teammate, he told me that the Lookup stage should be used when the reference data is small. I'm not sure what the reason is, and he doesn't know either. Any thoughts on this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There is a discussion of Lookup versus Join in the documentation. Suggest you start there. You can also find discussions here if you search, for example this one.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Lookup loads the reference data set into memory, so your lookups are performed in memory (at memory speed) and there is no need to sort the data (a hash table index is also created in memory). That is why the reference data needs to be small enough to fit in memory. Join, by contrast, requires that both of its inputs be sorted.
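To make the difference concrete, here is a rough Python sketch of the two approaches. It is not DataStage code and not how the stages are actually implemented. The function names `hash_lookup` and `merge_join` and the `(key, value)` row shape are made up for illustration, and the sketch assumes unique keys in the reference data and drops unmatched rows in both cases.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]           # hypothetical (key, value) row
Joined = Tuple[int, str, str]   # (key, stream value, reference value)


def hash_lookup(stream: List[Row], reference: List[Row]) -> List[Joined]:
    """Lookup-style: hold the whole reference set in an in-memory hash table."""
    # The entire reference set is held in memory, so it has to be small.
    table: Dict[int, str] = {key: value for key, value in reference}
    # The stream can arrive in any order; each probe is a hash access.
    return [(key, value, table[key]) for key, value in stream if key in table]


def merge_join(left_sorted: List[Row], right_sorted: List[Row]) -> List[Joined]:
    """Join-style: walk two inputs that are both already sorted on the key."""
    out: List[Joined] = []
    j = 0
    for key, value in left_sorted:
        # Advance the right-hand cursor past keys smaller than the current one.
        while j < len(right_sorted) and right_sorted[j][0] < key:
            j += 1
        if j < len(right_sorted) and right_sorted[j][0] == key:
            out.append((key, value, right_sorted[j][1]))
    return out
```

The lookup version never sorts anything but keeps every reference row in memory at once. The join version only needs a cursor into each input, but produces correct results only if both inputs were sorted on the key beforehand.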
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Then there is also sparse lookup, also documented.
Choose a job you love, and you will never have to work a day in your life. - Confucius