Join Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Join Stage

Post by Bilwakunj »

Hi,
I have two datasets, driver and reference. Both of them have same metadata and same column names. The values for the primary key columns may be different or same also for the other columns values may be same or different.
I want to capture all the cases so I'm going for full outer join (as volume of data is large,can't go for lookup). I need to record the minimum values from the non-key columns. (by comparing value from driver & reference). How can I do this? Do I need to implement this in 2 steps: 1. just join all the dat2 2. using transform compare them and find minimum out of it.
Thanks in advance.
Sunshine2323
Charter Member
Charter Member
Posts: 130
Joined: Mon Sep 06, 2004 3:05 am
Location: Dubai,UAE

Join Stage

Post by Sunshine2323 »

Hi Bilwakunj, :)
Both of them have same metadata and same column names.
Since the column names in both the driver and reference are the same you will not get only one non-key column that is from the driver in the output. So if u need to make a comparison in between the non-key columns your non key column names for driver and reference have to be different.
and after the join stage you can use a transformer and compare the values of the non-key columns and select the lesser one
Eg.
Discount is a non key field in both the datasets.
You can use the Modify stage to change the column names to say Discount_Driver and Discount_Reference then after they are joined you will get both the columns in the output and using a transformer after the join stage a comparison can easily be made

IF Discount_Driver< Discount_Reference THEN Discount_Driver ELSE Discount_Reference

Hope this helps :)
Warm Regards,
Amruta Bandekar

<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
If my memory serves me this discution was already covered, search for it.

by the way what is the scale of this large volume?
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

This can be done using a full outer join stage however with a large number of columns you may find it much easier to use the Change Capture stage.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Re: Join Stage

Post by T42 »

Bilwakunj wrote:The values for the primary key columns may be different or same also for the other columns values may be same or different.
Change Capture Stage.
Post Reply