Merge Duplicate Records

Posted: Thu Apr 09, 2009 3:08 am
by Ashish
I have two files from the source.

The structure of File1 is:
Col1 Col2 Col3
1 RS A
1 RD B
1 GD C
The structure of File2 is:
Col1 Col5
1 TB
1 TB

Using these two files, I have to create an output file with the following structure:
Col1 Col2 Col3 Col5
1 RS A TB
1 RD B TB
1 GD C

Can anyone help me create an output file with the above structure?

Cheers,
A

Posted: Thu Apr 09, 2009 3:18 am
by BugFree
Hi,

This is left outer join logic.

Keep File1 as the left link and File2 as the right link for the Join/Lookup stage.
Map Col1, Col2, Col3 and Col5 to the target and you will get the result :) .

Posted: Thu Apr 09, 2009 3:25 am
by mahadev.v
BugFree, you would get the result, but it would be 6 records instead of 3 :wink: . Ashish, if you are sure that the records are already in the required order, then you can generate a surrogate key for each of the links using a Row Generator stage and then join the data on this field.
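
To illustrate the point outside DataStage, here is a minimal Python sketch using the sample rows from the first post. It compares a plain left outer join on Col1 with a join on Col1 plus a generated sequence number. The function names are hypothetical, and numbering rows within each Col1 value (rather than across the whole file) is an assumption about how the surrogate key would be generated; it only works if both inputs are already in the required order.

```python
# Sample rows from the original post.
file1 = [(1, "RS", "A"), (1, "RD", "B"), (1, "GD", "C")]
file2 = [(1, "TB"), (1, "TB")]


def left_join_on_key(left, right):
    """Plain left outer join on Col1: every left row pairs with every matching right row."""
    out = []
    for col1, col2, col3 in left:
        matches = [col5 for key, col5 in right if key == col1]
        if not matches:
            out.append((col1, col2, col3, None))
        for col5 in matches:
            out.append((col1, col2, col3, col5))
    return out


def add_sequence(rows):
    """Tag each row with its 1-based position among rows sharing the same Col1 (input order)."""
    counts = {}
    tagged = []
    for row in rows:
        counts[row[0]] = counts.get(row[0], 0) + 1
        tagged.append((row[0], counts[row[0]]) + tuple(row[1:]))
    return tagged


def left_join_on_key_and_sequence(left, right):
    """Left outer join on (Col1, sequence), so each right row is used at most once."""
    lookup = {(key, seq): col5 for key, seq, col5 in add_sequence(right)}
    return [
        (key, col2, col3, lookup.get((key, seq)))
        for key, seq, col2, col3 in add_sequence(left)
    ]


naive = left_join_on_key(file1, file2)
positional = left_join_on_key_and_sequence(file1, file2)

print(len(naive))  # 6 rows: 3 left rows x 2 matching right rows
for row in positional:
    print(row)  # 3 rows; the third has no Col5 value (None)
```

With the sample data, the join on Col1 alone returns 6 rows, while the join on Col1 plus sequence returns 3 rows with the third row's Col5 left empty, which matches the requested output.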

Posted: Thu Apr 09, 2009 3:30 am
by BugFree
Yes Mahadev, you are right :D . We need a unique value for each row in both files.

Posted: Thu Apr 09, 2009 7:25 am
by ray.wurlod
Can you not add a Remove Duplicates stage on the Right input?
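
For reference, a rough Python sketch of what de-duplicating the right input before a left outer join would produce with the sample data. The names are hypothetical, and de-duplicating on the whole row (rather than on chosen key columns, as the stage would be configured) is an assumption made for simplicity.

```python
# Sample rows from the original post.
left_rows = [(1, "RS", "A"), (1, "RD", "B"), (1, "GD", "C")]
right_rows = [(1, "TB"), (1, "TB")]


def remove_duplicates(rows):
    """Keep the first occurrence of each distinct row, preserving input order."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


# After de-duplication the right side has one row per Col1 value.
right_lookup = dict(remove_duplicates(right_rows))

# Left outer join on Col1: unmatched left rows would get None for Col5.
joined = [
    (col1, col2, col3, right_lookup.get(col1))
    for col1, col2, col3 in left_rows
]

for row in joined:
    print(row)
```

With the sample data this yields 3 rows, but every row carries TB in Col5, whereas the requested output leaves Col5 empty on the third row.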

Posted: Thu Apr 09, 2009 9:35 pm
by Ashish
No Ray, we can't add a Remove Duplicates (RDUP) stage on the right side.

Posted: Fri Apr 10, 2009 9:57 am
by ray.wurlod
Why not?

Can be done without Remove Duplicate

Posted: Sat Sep 04, 2010 10:46 pm
by laconic
This can be done using the Merge stage, without removing duplicates:
File1
Col1 Col2 Col3
1 RS A
1 RD B
1 GD C

File2
Col1 Col5
1 TB
1 TB

Use a Merge stage with File2 on the Master link and File1 on the Update link. Set "Unmatched master mode" to "Drop". Key column: Col1.

Output
Col1 Col2 Col3 Col5
1 RS A TB
1 RD B TB
1 GD C TB