Range lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Range lookup

Post by vskr72 »

I am trying to build a range lookup. This is done in the stream. I have 3 different range key expressions specified. 2 of them are based on dates and another is based on an integer number.

Having a weird issues. When I try to perform the lookup with just few records (like less than 50), the lookup performs well and I get proper results. But, if I do not have a filter and I let all the records load up the lookup (approx 500K), then the results are weird and wrong. Is there anything that I need to set or specify in the job? Thanks,

Satish
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

We'd need much more in the way of specifics before anyone could help, there's not much to work with when you just say the "results are weird and wrong". :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

As you're providing the greater detail that Craig requested, please also advised how the data are partitioned on every input link to the Lookup stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

I should have provided this earlier:

Code: Select all

 Stream Input: A
Case, Emp, Action_DT
-------------------------
A100, 100, 1/1/2010
A100, 101, 2/1/2010

Reference Input: B
SK, EMP,   E_ST_DT, E_END_DT, O_ST_DT, O_END_DT  
-------------------------------------------------------------
1,  100,  6/1/2009, 4/1/2010, 4/1/2008, 6/1/2010
2,  100,  6/1/2009, 4/1/2010, 4/1/2006, 4/1/2008

For the above datasets, the following are the conditions that I apply:

- A.EMP=B.EMP
- ACTION_DT>= E_ST_DT AND ACTION_DT<= E_END_DT
- ACTION_DT>= O_ST_DT AND ACTION_DT<= O_END_DT

When I just filter the lookup data to EMP=100 (for my testing), I get the SK value as 1 which is correct.

Code: Select all

Reference Input: B
SK, EMP,   E_ST_DT, E_END_DT, O_ST_DT, O_END_DT  
-------------------------------------------------------------
1,  100,  6/1/2009, 4/1/2010, 4/1/2008, 6/1/2010
2,  100,  6/1/2009, 4/1/2010, 4/1/2006, 4/1/2008
3,  200,  6/1/2009, 4/1/2010, 4/1/2008, 6/1/2010
4,  200,  6/1/2009, 4/1/2010, 4/1/2006, 4/1/2008

But, when I include, 2 more records that belong to a differnt EMP like 200, then for the same stream input, I am getting the SK value as 3. I am completely lost in this range lookup deal.

I am using the default 1 node config file and all other partitioning settings are standard. Thanks.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

That is a much better explanation, but you are missing the values for employee 200 in the reference stream - you can omit the employee 100 values since you stated that those work correctly. At least by running with 1 node you eliminated partitioning as the source of you problem.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't know, I got the impression that the input stream only provides the correct information for EMP 100 when there were only records for EMP 100 in the lookup data. Once more rows are introduced into the lookup source for EMP 200, the results for EMP 100 are now wrong. :?

Is that what I'm reading?
-craig

"You can never have too many knives" -- Logan Nine Fingers
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

Yes Craig. Thats exactly what is hapenning.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

I think you haven't got any boundry condition for emp = 100 for which you are getting the correct results. and when ever it encounters any such condition its giving you incorrect results. Test it for the emp for which you are getting incorrect values.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
vskr72
Premium Member
Premium Member
Posts: 128
Joined: Wed Apr 28, 2004 9:36 pm

Post by vskr72 »

Kunal - Can you pls clarify what you said? I did not understand u.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

what if you get a value for emp 100 like


A100, 100, 4/1/2008
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
agpt
Participant
Posts: 151
Joined: Sun May 16, 2010 12:53 am

Post by agpt »

Hi,

you have marked this topic resolved. Could you please update us how exactly you solved this?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

vskr72 wrote:Kunal - Can you pls clarify what you said? I did not understand u.
Please strive for a professional standard of written English on DSXchange. English is difficult enough for people whose first language is something else. SMS-style abbreviations are completely unnecessary - DSXchange is not a mobile telephone and there is no character limit on the length of post.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply