Need help with a sparse lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Need help with a sparse lookup

Post by vmcburney »

Having problems with the lookup stage. Let's say I have 10 input rows and I am looking up a surrogate key in a DB2 table with 1 million rows. When I use the lookup stage in Normal mode it loads all 1 million rows down my reference link (even though I only need 10 of them). When I use sparse lookup mode it only brings down 10 rows (hooray) but drops or rejects all 10 primary rows (boo). If I set the lookup type to Continue it lets all 10 rows through with all 10 surrogate key fields populated. In other words the lookup finds 10 matching rows but classifies them all as lookup failures even though they are clearly lookup successes. What is going on?

Is there any way to perform a lookup or join and only retrieve the 10 rows required, other than a sparse lookup?

I am disappointed that a sparse lookup does not support lookup conditions like a normal lookup does.
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Vincent,

The sparse lookup is different from the normal lookup. A sparse lookup sends an individual SQL statement to the database for every incoming row. It is useful when you want to get the next sequence number from an Oracle or DB2 sequence. Since it sends a SQL statement for every incoming row, it should not drop or reject the rows.
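To make the contrast concrete, here is a minimal Python sketch of the two behaviors (using sqlite3 purely as a stand-in for DB2; the table and column names are hypothetical, and this is an illustration of the idea, not DataStage's implementation). A normal lookup pulls the whole reference table down the reference link once and probes it in memory; a sparse lookup fires one query per incoming row, so only the matching rows ever come back.

```python
import sqlite3

# Hypothetical stand-in for the DB2 reference table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (natural_key TEXT, surrogate_key INTEGER)")
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [("A", 1), ("B", 2), ("C", 3)],
)

input_rows = ["A", "C"]

# Normal lookup: the entire reference table comes down the reference
# link once, is held in memory, and each input row probes it.
ref = dict(conn.execute("SELECT natural_key, surrogate_key FROM dim_customer"))
normal_result = [(k, ref.get(k)) for k in input_rows]

# Sparse lookup: one SQL statement is sent to the database per
# incoming row, so only the rows actually needed are retrieved.
sparse_result = []
for k in input_rows:
    row = conn.execute(
        "SELECT surrogate_key FROM dim_customer WHERE natural_key = ?", (k,)
    ).fetchone()
    sparse_result.append((k, row[0] if row else None))

print(normal_result)  # [('A', 1), ('C', 3)]
print(sparse_result)  # [('A', 1), ('C', 3)]
```

Both paths produce the same matches; the difference is that the normal path moves the full million-row table across the link, while the sparse path moves only the rows requested.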

Are you running the job on a single node or on multiple nodes? If you are running on multiple nodes, try setting the partitioning to Entire on the lookup stage and rerunning the job.

HTH
--Rich
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I didn't think partitioning of the DB2 source would be required, since a sparse lookup goes directly against the table. We are willing to pay the price for a sparse lookup because the lookup tables are typically over a million rows, the primary data has fewer than a thousand rows, and the DB2 database is local. The problem we are having is that we cannot always get the sparse lookup to find a match. It seems to be a bit temperamental.

Is same partitioning really necessary when you are doing a sparse lookup to a DB2 table?
dennis77
Participant
Posts: 4
Joined: Tue Dec 16, 2003 9:13 pm

Post by dennis77 »

Hi,

Try setting the 'If Not Found' option of the Lookup stage to 'Drop'. If there is no match in your lookup table, you will not get any warnings and your job will not abort.

In any case, the sparse lookup reference link will still show the exact number of rows it looked up.

Only the output link returns the matched row value (or a null/zero value).
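For anyone unsure how the 'If Not Found' conditions differ, here is a toy Python model of the four behaviors (Continue, Drop, Reject, Fail). This is a hedged sketch of the documented semantics, not DataStage's actual code; the function and variable names are invented for illustration.

```python
def lookup(rows, ref, if_not_found="continue"):
    """Toy model of the Lookup stage's 'If Not Found' conditions."""
    output, rejects = [], []
    for key in rows:
        match = ref.get(key)
        if match is not None:
            output.append((key, match))          # lookup success
        elif if_not_found == "continue":
            output.append((key, None))           # row passes, lookup fields null
        elif if_not_found == "drop":
            pass                                 # row silently discarded
        elif if_not_found == "reject":
            rejects.append(key)                  # row sent down the reject link
        elif if_not_found == "fail":
            raise RuntimeError(f"lookup failed for {key}")  # job aborts
    return output, rejects

ref = {"A": 1}
print(lookup(["A", "B"], ref, "continue"))  # ([('A', 1), ('B', None)], [])
print(lookup(["A", "B"], ref, "drop"))      # ([('A', 1)], [])
print(lookup(["A", "B"], ref, "reject"))    # ([('A', 1)], ['B'])
```

Vincent's symptom above, where Drop discards even matched rows, would correspond to every row taking the not-found branch, which is why it looks like the stage is misclassifying successes as failures.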
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

We have found sparse lookups to be very flaky. In one instance, setting the condition to Drop drops all rows regardless of whether the lookup was successful, while setting it to Continue outputs the rows with the lookup fields fully populated. In another instance the lookup fields come back null even though the lookup link count shows rows were returned. We are at the stage where we don't think we can use sparse lookups at all.