Warnings while using lookup file set

bala_135
Premium Member
Posts: 156
Joined: Fri Oct 28, 2005 1:00 am
Location: Melbourne,Australia

Post by bala_135 »

Hi,

I am using a lookup file set as the reference input and a sequential file as the primary input. I get the following warning when I run the job. I have also tried changing the partitioning of the primary input to Entire, but I still get the warning.

Lkp: Input Dataset 0 has partitioning other than Entire specified; disabling memory sharing.

Regards,
Bala.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Bala,

You didn't really ask a question, just posted a statement. I will assume that you would like to know why PX is giving you this warning. It has to do with the partitioning, or lack thereof, on Sequential files as well as your lookup file set partitioning options. The message is perfectly normal (and correct) for your configuration.

Partitioning is a fundamental concept for PX jobs; you should take a quick look at the "Parallel Job Developer's Guide", pages 2-7 onwards, to refresh your knowledge of how a PX job performs its partitioning, plus chapter 7 for the details of using a lookup file set (in particular page 7-7).
Benouche
Participant
Posts: 15
Joined: Tue Apr 22, 2003 8:54 am
Location: France

Post by Benouche »

Bala,

This warning is issued by DataStage because you partitioned your reference data set (the file set) in a mode other than Entire.

The standard use of Lookup is to use Entire mode for the reference data set, so that the data will be shared in memory to perform the lookup operation for all input partitions.

-> Use Entire for the reference link and any partitioning for the input data set, and the warning will disappear :D

Benouche
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

To clarify Benouche's post, if you are going to use a partitioned Lookup File Set, it must be partitioned in exactly the same way as the stream input. This guarantees that all lookup keys will be in the correct partition.

Since a single sequential file can not readily be partitioned on a column, you must guarantee that any key can be found on any processing node. The Entire partitioning algorithm puts every key on every processing node - a little wasteful, perhaps, but the only way to guarantee that every key will be found.

By specifying anything other than Entire you generate the warning.
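
The point about keys landing in the correct partition can be illustrated with a small Python simulation. This is only a sketch of the idea, not DataStage code: the function names (round_robin, hash_partition, entire, lookup) and the sample rows are invented for illustration, and the modulo "hash" merely stands in for whatever hashing the engine really uses.

```python
def round_robin(rows, nodes):
    """Deal rows out to partitions in turn, ignoring key values."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def hash_partition(rows, nodes):
    """Send each row to the partition chosen by its key (first field)."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[0] % nodes].append(row)
    return parts


def entire(rows, nodes):
    """Give every partition a full copy of the rows."""
    return [list(rows) for _ in range(nodes)]


def lookup(stream_parts, ref_parts):
    """Per partition, match stream keys against that partition's reference rows only."""
    hits, misses = [], []
    for stream, ref in zip(stream_parts, ref_parts):
        ref_keys = {row[0] for row in ref}
        for row in stream:
            (hits if row[0] in ref_keys else misses).append(row[0])
    return hits, misses


# Hypothetical sample data: (key, value) pairs.
reference = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
stream = [(2, 10), (4, 20), (1, 30), (3, 40)]

# Reference hashed on the key, stream dealt round robin: some keys are not found.
mismatched = lookup(round_robin(stream, 2), hash_partition(reference, 2))

# Reference partitioned Entire: every key is found whatever the stream partitioning.
with_entire = lookup(round_robin(stream, 2), entire(reference, 2))

print("mismatched misses:", mismatched[1])
print("entire misses:", with_entire[1])
```

In this toy run the mismatched case loses keys that do exist in the reference data, simply because they sit in a different partition from the stream row that needs them.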
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bala_135
Premium Member
Posts: 156
Joined: Fri Oct 28, 2005 1:00 am
Location: Melbourne,Australia

Post by bala_135 »

Hi Ray/Benouche,

Thank you very much. I am now using a Dataset as the source. I have changed the partition type to Entire for all links in the job which generates the lookup file set. I have also changed the partition type to Entire for the primary link in the job which uses the lookup file set as the reference, but I am still getting the warning. My config file has only one node.

Regards,
Bala.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi Bala,
It is not necessary to use Entire partitioning in the job which produces the lookup file set. It is enough to specify Entire partitioning on the reference link of the Lookup stage.
Since you are running on a single node, Entire partitioning is not a must, because hash partitioning (on the key on which the lookup is done) and Entire partitioning serve the same purpose.
Indeed, Entire partitioning for the Lookup stage is more meaningful on MPP systems than on SMP systems.
On MPP the reference data should be available to every partition of the stream, whereas on SMP, since memory is shared, it is OK to hash partition on the key on which the lookup is done. But you need to be careful that the main stream is also partitioned on the same key.
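
As a rough illustration of the two claims above (on one node hash and Entire give the same layout, and hashing both links on the same key keeps matching rows together), here is a toy Python model. It does not use any DataStage API; the helper names and sample rows are made up, and the modulo operation is an assumed stand-in for the real hash function.

```python
def hash_partition(rows, nodes):
    """Place each row in the partition selected by its key (first field)."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[row[0] % nodes].append(row)
    return parts


def entire(rows, nodes):
    """Copy the complete row set to every partition."""
    return [list(rows) for _ in range(nodes)]


def unmatched_keys(stream_parts, ref_parts):
    """Return stream keys that have no reference row in their own partition."""
    missing = []
    for stream, ref in zip(stream_parts, ref_parts):
        ref_keys = {row[0] for row in ref}
        missing.extend(row[0] for row in stream if row[0] not in ref_keys)
    return missing


# Hypothetical sample data: (key, value) pairs.
reference = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
stream = [(2, 10), (4, 20), (1, 30), (3, 40)]

# With a single node there is only one partition, so both methods agree.
single_node_same = hash_partition(reference, 1) == entire(reference, 1)

# With several nodes, hashing BOTH links on the same key still finds every row.
co_partitioned_missing = unmatched_keys(
    hash_partition(stream, 3), hash_partition(reference, 3)
)

print("single node layouts identical:", single_node_same)
print("unmatched keys when co-partitioned:", co_partitioned_missing)
```

The model says nothing about memory sharing or about when the engine emits the warning; it only shows where the rows end up.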

-Kumar