Merge Duplicate Records
Posted: Thu Apr 09, 2009 3:08 am
by Ashish
I have two files from the source.
The structure of File1 is:
Col1 Col2 Col3
1 RS A
1 RD B
1 GD C
The structure of File2 is:
Col1 Col5
1 TB
1 TB
Using the two files above, I have to create an output file like this.
Output file structure:
Col1 Col2 Col3 Col5
1 RS A TB
1 RD B TB
1 GD C
Can anyone help me create an output file with the structure above?
Cheers,
A
Posted: Thu Apr 09, 2009 3:18 am
by BugFree
Hi,
This is left outer join logic.
Keep File1 as the left link and File2 as the right link for the Join/Lookup stage.
Map Col1, Col2, Col3 and Col5 to the target and you will get the result. :)
Posted: Thu Apr 09, 2009 3:25 am
by mahadev.v
BugFree, you would get a result, but it would be 6 records instead of 3, because each of the three File1 rows matches both File2 rows on Col1. ;)
Ashish, if you are sure that the records are already in the required order, you can generate a surrogate key for each of the links using a Row Generator stage and then join the data on that field.
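To make the row counts concrete, here is a minimal Python sketch of both ideas (not DataStage code). It assumes the sample data from the first post and that both files are already in the required order. The helper names `left_join`, `add_sequence` and `positional_left_join` are hypothetical, and the generated key is a per-Col1 running number, which is one possible reading of "surrogate key" here.

```python
from collections import defaultdict

# Sample data from the first post: File1 is (Col1, Col2, Col3), File2 is (Col1, Col5).
file1 = [(1, "RS", "A"), (1, "RD", "B"), (1, "GD", "C")]
file2 = [(1, "TB"), (1, "TB")]


def left_join(left, right):
    """Plain left outer join on Col1: each left row pairs with every matching right row."""
    out = []
    for col1, col2, col3 in left:
        matches = [col5 for key, col5 in right if key == col1]
        for col5 in matches or [None]:
            out.append((col1, col2, col3, col5))
    return out


def add_sequence(rows):
    """Attach a generated key (Col1, n) to each row, where n counts rows per Col1 value."""
    seen = defaultdict(int)
    keyed = []
    for row in rows:
        seen[row[0]] += 1
        keyed.append(((row[0], seen[row[0]]), row))
    return keyed


def positional_left_join(left, right):
    """Left outer join on the generated key, so rows pair up one-to-one by position."""
    col5_by_key = {key: row[1] for key, row in add_sequence(right)}
    return [row + (col5_by_key.get(key),) for key, row in add_sequence(left)]


plain = left_join(file1, file2)              # 3 left rows x 2 matches = 6 rows
paired = positional_left_join(file1, file2)  # 3 rows, the third with no Col5

for row in paired:
    print(row)
```

The third File1 row has no partner in File2, so its Col5 comes out empty (`None` in this sketch), which is the shape of the output requested in the first post.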
Posted: Thu Apr 09, 2009 3:30 am
by BugFree
Yes Mahadev, you are right. :D We need a unique value for each row in both files.
Posted: Thu Apr 09, 2009 7:25 am
by ray.wurlod
Can you not add a Remove Duplicates stage on the Right input?
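As a rough illustration of what that would produce, here is a small Python sketch (not DataStage code) that removes duplicates from the right input by Col1 and then does a left outer join. It assumes the sample data from the first post; `remove_duplicates` and `left_join_unique` are hypothetical helper names.

```python
# Sample data from the first post: File1 is (Col1, Col2, Col3), File2 is (Col1, Col5).
file1 = [(1, "RS", "A"), (1, "RD", "B"), (1, "GD", "C")]
file2 = [(1, "TB"), (1, "TB")]


def remove_duplicates(rows):
    """Keep only the first row seen for each Col1 value."""
    seen = set()
    unique = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            unique.append(row)
    return unique


def left_join_unique(left, right):
    """Left outer join on Col1, assuming the right side has at most one row per key."""
    col5_by_key = {key: col5 for key, col5 in right}
    return [row + (col5_by_key.get(row[0]),) for row in left]


result = left_join_unique(file1, remove_duplicates(file2))

for row in result:
    print(row)
```

With the right side deduplicated, all three File1 rows pick up TB. That gives 3 records, but it is not the output requested in the first post, where the third row has no Col5 value.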
Posted: Thu Apr 09, 2009 9:35 pm
by Ashish
No Ray, we can't add an RDUP (Remove Duplicates) stage on the right side.
Posted: Fri Apr 10, 2009 9:57 am
by ray.wurlod
Why not?
Can be done without Remove Duplicate
Posted: Sat Sep 04, 2010 10:46 pm
by laconic
This can be done using the Merge stage, without removing duplicates:
File1
Col1 Col2 Col3
1 RS A
1 RD B
1 GD C
File2
Col1 Col5
1 TB
1 TB
Use a Merge stage with File2 on the Master link and File1 on the Update link. Set "Unmatched master mode" to "Drop". Key column: Col1.
Output
Col1 Col2 Col3 Col5
1 RS A TB
1 RD B TB
1 GD C TB