Upsert data from multiple files to single file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Upsert data from multiple files to single file

Post by ag_ram »

Hello folks

I have 2 files , each row is defined by 3 Key Columns . Apart from that i have 2 Non Key columns as well . Some keys are common on both the files , preference to the output then goes to data from first file .

If the records are unique on either files, they qualify for output automatically .

One way for doing this is a FullOuter followed bu filter for Nulls and Commons .This requires a couple of transformers .

Was wondering if there was a better way of going round it ?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Split file1 using a Copy stage into two streams. The first stream, all rows from file 1, go to output. The second stream is compared with data from file 2 using the stage of your choice (Change Capture? Compare? Lookup?) to determine whether to output the row from file 2. If required, the two streams can be re-united with a Funnel stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Or simply run file2 followed by file1....if loading both does not cause any harm !!!!
ag_ram
Premium Member
Premium Member
Posts: 524
Joined: Wed Feb 28, 2007 3:51 am

Post by ag_ram »

ray.wurlod wrote:Split file1 using a Copy stage into two streams. The first stream, all rows from file 1, go to output. The second stream is compared with data from file 2 using the stage of your choice (Change Capt ...

I tried doing the same , not on actual job but as a POC .

1|AA|100 1|AA|25
2|BB|101 2|BB|26
3|DD|103 3|AA|27
4|BB|104 4|BB|28
5|CC|102 5|AA|29
7|FF|123 6|AA|29



1|AA|25
2|BB|26
3|DD|103
3|AA|27
4|BB|28
5|CC|102
5|AA|29
7|FF|123
6|AA|29

This is what i ended up getting

1|AA|25
2|BB|26
3|AA|27
3|DD|103
4|BB|28
5|AA|29
5|CC|102
6|AA|29
7|FF|123

Looks the same , but all i did was do change capture (After-link File1Data , Before-link File2Data) ...
Forgive my ignorance .why do we need a copy stage ..to the output ..have i got it wrong

:?
Post Reply