Grouping of Related Data

Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Post by Bilwakunj »

Hi,
I have the following situation.
I need to do a lookup between one dataset (the primary input) and an Oracle table (the reference link), based on 5 columns:
1. A
2. B
3. Date1
4. Date2
5. C

The criterion is that a match on columns A, B and C satisfies the initial condition. From the output of this, I need to pick the earliest date (Date1) within each related group (records that match on columns A, B and C are related). In short, there will be one earliest Date1 per group, and there can be several groups.
Can this be done using a Lookup stage, or should I use some other stages?

Thanks in advance.

Bilwakunj
gh_amitava
Participant
Posts: 75
Joined: Tue May 13, 2003 4:14 am
Location: California

Post by gh_amitava »

Hi,

You have to do a Lookup/Join to get the values from the database, but after that you can use the "Remove Duplicates" stage to pick the smallest Date1. For that, on the input of the "Remove Duplicates" stage, use Hash partitioning on A, B, C and do a sort on Date1, so that the record with the earliest Date1 comes first in each A, B, C group.
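
To make the logic concrete, here is a rough sketch in plain Python of what the two steps do; it is not DataStage code. The helper names (key_of, join_on_key, earliest_date1_per_group) and the sample rows are made up for illustration, and it assumes that Date1 and Date2 arrive from the reference table, which the original post does not actually state.

```python
from datetime import date
from itertools import groupby

KEY = ("A", "B", "C")  # columns that define a related group


def key_of(row):
    """Return the (A, B, C) grouping key of a row."""
    return tuple(row[c] for c in KEY)


def join_on_key(primary, reference):
    """Inner join: emit one merged row per primary/reference match on A, B, C."""
    ref_by_key = {}
    for ref in reference:
        ref_by_key.setdefault(key_of(ref), []).append(ref)
    joined = []
    for row in primary:
        for ref in ref_by_key.get(key_of(row), []):
            joined.append({**row, **ref})
    return joined


def earliest_date1_per_group(rows):
    """Sort by key then Date1, and keep the first row of each key group."""
    ordered = sorted(rows, key=lambda r: (key_of(r), r["Date1"]))
    return [next(group) for _, group in groupby(ordered, key=key_of)]


# Hypothetical sample data.
primary = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 2, "B": "y", "C": "q"},
    {"A": 3, "B": "z", "C": "r"},  # no match in the reference table
]
reference = [
    {"A": 1, "B": "x", "C": "p", "Date1": date(2005, 3, 1), "Date2": date(2005, 4, 1)},
    {"A": 1, "B": "x", "C": "p", "Date1": date(2005, 1, 15), "Date2": date(2005, 2, 1)},
    {"A": 2, "B": "y", "C": "q", "Date1": date(2004, 12, 31), "Date2": date(2005, 1, 31)},
]

result = earliest_date1_per_group(join_on_key(primary, reference))
for row in result:
    print(row["A"], row["B"], row["C"], row["Date1"])
```

The sort on (A, B, C, Date1) followed by "keep the first row per key" plays the role of the hash partition, the sort, and the Remove Duplicates stage; in a parallel job the hash partitioning is what guarantees that all rows of one group land in the same partition.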

Regards
~Amitava