Merge Stage functionality


GowthamSen
Participant
Posts: 21
Joined: Tue Nov 02, 2010 2:36 pm
Location: India

Post by GowthamSen »

Hi,


I created a job that merges data from one Master file and one Update file using the Merge stage.


Here is the data:

Master file content:

mKey,Col1

1,A

2,C

3,D


Update1 file content:

1,P2

3,t2

3,u2

2,Q2

2,R2


The output file contains:

1,A,P2

2,C,Q2

2,C,R2

3,D,t2

3,D,u2


This works fine.


But when I add another update file as follows:

Update2 file content:

1,P

2,Q

4,S


The output file now contains:

1,A,P2,P

2,C,Q2,Q

3,D,    (this record appears because I set Unmatched records: KEEP)


But here I am confused: why does the Merge stage behave differently with one update file than with two?


I assumed that, even with two update files, the output would contain all the duplicate records from the update files.


Please let me know if I am missing anything.
Thank you,
Regards

Gowtham
(Learning DS)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It is a documented requirement that, when there is more than one Update input, all Update inputs must be de-duplicated. Your example does not meet this requirement: the Update1 file contains duplicate keys (2 and 3).
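To illustrate the requirement, here is a minimal Python sketch (not DataStage code, and not a model of the Merge stage's internals) that de-duplicates each update input on the key before combining it with the master rows. The function names `dedupe_by_key` and `merge` are hypothetical, and keeping the first row seen per key is an assumption; in a real job you would choose which duplicate survives.

```python
def dedupe_by_key(rows):
    """Return {key: value}, keeping the first row seen for each key.

    Which duplicate to keep is an assumption made for this sketch.
    """
    first_seen = {}
    for key, value in rows:
        first_seen.setdefault(key, value)
    return first_seen


def merge(master, *updates):
    """Attach at most one value per update input to each master row.

    Master rows are always kept; a missing match is represented as None.
    Update rows whose key has no master row are ignored here.
    """
    lookups = [dedupe_by_key(update) for update in updates]
    merged = []
    for key, col in master:
        merged.append((key, col, *(lookup.get(key) for lookup in lookups)))
    return merged


# Data taken from the original post.
master = [(1, "A"), (2, "C"), (3, "D")]
update1 = [(1, "P2"), (3, "t2"), (3, "u2"), (2, "Q2"), (2, "R2")]
update2 = [(1, "P"), (2, "Q"), (4, "S")]

for row in merge(master, update1, update2):
    print(row)
```

With the update inputs de-duplicated this way, every master key matches at most one row per update input, so the result no longer depends on how duplicates happen to be paired across inputs.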
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
GowthamSen
Participant
Posts: 21
Joined: Tue Nov 02, 2010 2:36 pm
Location: India

Post by GowthamSen »

Thank you Ray.

Previously I was worried that I was missing something in the stage properties.

So now it's clear to me that, with multiple update links, the update inputs must not contain duplicates.
Thank you,
Regards

Gowtham
(Learning DS)