Hi,
I have two datasets, driver and reference. Both of them have same metadata and same column names. The values for the primary key columns may be different or same also for the other columns values may be same or different.
I want to capture all the cases so I'm going for full outer join (as volume of data is large,can't go for lookup). I need to record the minimum values from the non-key columns. (by comparing value from driver & reference). How can I do this? Do I need to implement this in 2 steps: 1. just join all the dat2 2. using transform compare them and find minimum out of it.
Thanks in advance.
Join Stage
Moderators: chulett, rschirm, roy
-
- Charter Member
- Posts: 130
- Joined: Mon Sep 06, 2004 3:05 am
- Location: Dubai,UAE
Join Stage
Hi Bilwakunj, ![Smile :)](./images/smilies/icon_smile.gif)
and after the join stage you can use a transformer and compare the values of the non-key columns and select the lesser one
Eg.
Discount is a non key field in both the datasets.
You can use the Modify stage to change the column names to say Discount_Driver and Discount_Reference then after they are joined you will get both the columns in the output and using a transformer after the join stage a comparison can easily be made
IF Discount_Driver< Discount_Reference THEN Discount_Driver ELSE Discount_Reference
Hope this helps![Smile :)](./images/smilies/icon_smile.gif)
![Smile :)](./images/smilies/icon_smile.gif)
Since the column names in both the driver and reference are the same you will not get only one non-key column that is from the driver in the output. So if u need to make a comparison in between the non-key columns your non key column names for driver and reference have to be different.Both of them have same metadata and same column names.
and after the join stage you can use a transformer and compare the values of the non-key columns and select the lesser one
Eg.
Discount is a non key field in both the datasets.
You can use the Modify stage to change the column names to say Discount_Driver and Discount_Reference then after they are joined you will get both the columns in the output and using a transformer after the join stage a comparison can easily be made
IF Discount_Driver< Discount_Reference THEN Discount_Driver ELSE Discount_Reference
Hope this helps
![Smile :)](./images/smilies/icon_smile.gif)
Warm Regards,
Amruta Bandekar
<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
Amruta Bandekar
<b>If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut. </b>
--Albert Einstein
Hi,
If my memory serves me this discution was already covered, search for it.
by the way what is the scale of this large volume?
If my memory serves me this discution was already covered, search for it.
by the way what is the scale of this large volume?
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
This can be done using a full outer join stage however with a large number of columns you may find it much easier to use the Change Capture stage.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Re: Join Stage
Change Capture Stage.Bilwakunj wrote:The values for the primary key columns may be different or same also for the other columns values may be same or different.