Lookup file set best place to use it

myukassign
Premium Member
Posts: 238
Joined: Fri Jul 25, 2008 8:55 am

Lookup file set best place to use it

Post by myukassign »

In one of Ray's posts I read that a Data Set is the best stage to use when I want to store data for a reference link and the data volume is huge.

So where, in practice, is the Lookup File Set designed to be used? Can anyone give me a scenario where this stage is the best fit?

I am doing design work these days, and I am sorry for asking, but this stage leaves me confused.
madhukar
Participant
Posts: 86
Joined: Fri May 20, 2005 4:05 pm

Re: Lookup file set best place to use it

Post by madhukar »

Use a Lookup File Set if the reference data fits into memory and is used in multiple jobs.
srinivas.g
Participant
Posts: 251
Joined: Mon Jun 09, 2008 5:52 am

Post by srinivas.g »

For a sparse lookup it is better to use a Lookup File Set.
Srinu Gadipudi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I think you must have misinterpreted what I said.

The only place you can use a Lookup File Set is to service the reference input to a Lookup stage. The benefit you get is that the index on the lookup key is pre-built; it does not have to be built "on the fly" by the LUT_CreateOp operator.

And, yes, the total volume of reference data must still fit into memory.

Data Sets are really more about staging data between parallel jobs, rather than servicing lookups. However, it would be an interesting exercise to determine whether the cost of the import operator associated with a Lookup File Set would be greater or less than the cost of the LUT_CreateOp operator building the index if a Data Set were used.
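
To make the trade-off above concrete, here is a minimal Python sketch of the general idea only. It is an analogy, not DataStage code: the function names, the pickle file and the sample rows are all hypothetical. A keyed index is built once and saved, and each consuming job then loads it instead of rebuilding it from the raw reference rows. Either way the whole index sits in memory while the lookup runs.

```python
import os
import pickle
import tempfile


def build_index(rows, key):
    """Build an in-memory hash index on `key` (analogy for building the index on the fly)."""
    return {row[key]: row for row in rows}


def save_index(index, path):
    """Persist a pre-built index so that later jobs can reuse it."""
    with open(path, "wb") as fh:
        pickle.dump(index, fh)


def load_index(path):
    """Load a previously saved index (it still has to fit into memory)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def lookup(stream, index, key):
    """Pair every stream row with its reference row, or None when there is no match."""
    return [(row, index.get(row[key])) for row in stream]


# Hypothetical sample data, for illustration only.
reference = [{"cust_id": i, "name": f"customer-{i}"} for i in range(5)]
stream = [{"cust_id": 3, "amount": 10}, {"cust_id": 99, "amount": 5}]

# Done once, by the job that creates the reusable lookup file.
path = os.path.join(tempfile.mkdtemp(), "customers.idx")
save_index(build_index(reference, "cust_id"), path)

# Done by every job that consumes it: load the index rather than rebuild it.
prebuilt = load_index(path)

# The alternative: rebuild the index from the raw reference rows in each job.
on_the_fly = build_index(reference, "cust_id")

matches = lookup(stream, prebuilt, "cust_id")
```

Both routes produce the same index; which one is cheaper depends on whether loading the saved file costs less than rebuilding from the raw rows, which is the measurement suggested above.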
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
myukassign
Premium Member
Posts: 238
Joined: Fri Jul 25, 2008 8:55 am

Post by myukassign »

In my experience, the Lookup File Set did a better job than the Data Set.

I favour the Lookup File Set over the Data Set.

Thanks for your reply. I am closing this thread.