Lookup on which stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
algfr
Participant
Posts: 106
Joined: Fri Sep 09, 2005 7:42 am

Lookup on which stage

Post by algfr »

Hello guys,

Just a very quick question ?

I need to do a lookup on an Oracle database, is it worth loading the table into a dataset or not ?

If I'm not wrong, it is no more possible to produce an output link out of an OCI stage that has an input link (except reject) which forces me to create an extra job if I need to load a dataset.

Thank you very much
ShaneMuir
Premium Member
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

Have you actually tried?

It is very possible to use an oracle stage directly as a referential input to a lookup.
algfr
Participant
Posts: 106
Joined: Fri Sep 09, 2005 7:42 am

Post by algfr »

ShaneMuir wrote:Have you actually tried?

It is very possible to use an oracle stage directly as a referential input to a lookup.
Hi ShaneMuir,

Yes I have

Just want to know if it takes down performance or not. In other words, is it compulsory to create the datasets to avoid performance drop ?

I heard the wew version creates virtual datasets when lookups on oracle stages occur.
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

It depends on amount of data that you are handling ...How much data you will have in reference link ??
Nag
algfr
Participant
Posts: 106
Joined: Fri Sep 09, 2005 7:42 am

Post by algfr »

nagarjuna wrote:It depends on amount of data that you are handling ...How much data you will have in reference link ??
My biggest dim is about 150,00 records but the average is below 10,000.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Unless you are doing a sparse lookup, the data are always loaded into a (virtual) Data Set. You don't have to do anything. This just happens.

Loading into a (physical) Data Set won't change performance at all in the big picture, but may time-shift the cost of loading the data set to an earlier phase of processing. The gain will be slight. A Lookup File Set may be preferable, because that also time-shifts the building of the hash table index on the lookup table (LUT) in memory.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
algfr
Participant
Posts: 106
Joined: Fri Sep 09, 2005 7:42 am

Post by algfr »

Thank you, this answer my question.

Then I don't have to bother to create a lookup directly from Oracle reference sources.

I suppose this is the main difference with previous versions where it was strongly recommended to insert intermediate datasets.

Good day all
Post Reply