Complex Lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
daniel0623
Charter Member
Charter Member
Posts: 34
Joined: Tue May 31, 2005 8:17 pm
Location: ShangHai,China

Complex Lookup

Post by daniel0623 »

Dear all,

How to implement complex lookup in DSEE? Here is the description of problem:

I have 2 files: A and B. And I want to get file C through following arithmetic:

File A:
A1 A2
1 X
3 Y
5 Z

File B:
B1 B2 B3
0 2 U
2 4 V
4 6 W

Lookup condition: (A1 >= B1) and (A1 < B2)

Result file C:
C1(from A.A1) C2(from A.A2) C3(from B.B3)
1 X U
3 Y V
5 Z W


Pls help me implement it.Thanks a lot.

Daniel
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Range lookup in PX :evil:
What is the volume of record you are dealing with.
If it is considerablly small, you can try to extapolate the records in an order.
Say
New File B:
BB1 B2 B3
0 U
1 U
2 V
3 V
4 W
..
..
And do a direct lookup.
Else you can build you own custom routine to acheive this.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
daniel0623
Charter Member
Charter Member
Posts: 34
Joined: Tue May 31, 2005 8:17 pm
Location: ShangHai,China

Post by daniel0623 »

If File B is very small, I can list all possibility of the Key.But the problem is Range is random and File B has about 20 million rows.I can't list. :(
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

PX isn't really geared to non equijoin lookups. None of the Lookup, Join or Merge stages supports any technique other than exact key match. The string_lookup_from_int32 function for the Modify stage also does an exact key match.

How are the source data stored? Possibly your best solution would be to load them into two database tables and perform the join with SQL.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I feel the same. Load it into two temp tables and run a sql query on it. That would be the best, IMHO, solution.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

You can find this function 'Range lookup' build in forth comming HAWK release.
No, no... I dont ask you upgrade just for this reason.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Post Reply