Merge Stage - Duplicates in the master link.

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
sreelalpp
Participant
Posts: 22
Joined: Sun Jan 01, 2006 6:51 am

Merge Stage - Duplicates in the master link.

Post by sreelalpp »

Hi,

I need to merge two sequential files. The records from the master link have duplicates in the Merging Key field. In the records in the update link has no duplicate. In the output, I am getting merged data only for the first among the duplicate entries in the master link. In the other fields, I am getting zero. -

Ie, it finds a match and gets the data from update link only for the first among duplicate( in the merge key field) in the master link.

I was in an impression that Merge stage can be used exactly to perform lookup. Am I wrong? Is it like, the merge key field must be a primary key field in the master link ?

Below is the sample input data, output data I got and what I expected to get.

Code: Select all

From Master link

ID        Merge_Key
-----------------------
1                 A
2                 A
3                 B
4                 C
5                 C
6                 C
7                 E
8                 F

Input from Update Link:

Merge_Key              AMT
------------------------------
A                            1000
B                            2000
C                            3000
D                            4000


Output I got.

ID         Merge_Key      AMT
----------------------------------
1             A                  1000             
2             A                     0
3             B                  2000  
4             C                  3000
5             C                     0
6             C                     0
7             E                      0
8             F                      0 

What I expect  :

ID        Merge_Key        AMT
-------------------------------------
1             A                  1000             
2             A                  1000
3             B                  2000  
4             C                  3000
5             C                  3000
6             C                  3000
7             E                      0
8             F                      0 

I am keeping the unmatched rows from the master as I need them in out put with zeros.

Please be advised..

Thanks,
Sree.
lal
balajisr
Charter Member
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Refer the documentation. You need to remove duplicates from the master dataset.
madhukar
Participant
Posts: 86
Joined: Fri May 20, 2005 4:05 pm

Re: Merge Stage - Duplicates in the master link.

Post by madhukar »

[quote="sreelalpp"]Hi,

I need to merge two sequential files. The records from the master link have duplicates in the Merging Key field. In the records in the update link has no duplicate. In the output, I am getting merged data only for the first among the duplicate entries in the master link. In the other fields, I am getting zero. -

Ie, it finds a match and gets the data from update link only for the first among duplicate( in the merge key field) in the master link.

I was in an impression that Merge stage can be used exactly to perform lookup. Am I wrong? Is it like, the merge key field must be a primary key field in the master link ?

Below is the sample input data, output data I got and what I expected to get.

[code]From Master link

ID Merge_Key
-----------------------
1 A
2 A
3 B
4 C
5 C
6 C
7 E
8 F

Input from Update Link:

Merge_Key AMT
------------------------------
A 1000
B 2000
C 3000
D 4000


Output I got.

ID Merge_Key AMT
----------------------------------
1 A 1000
2 A 0
3 B 2000
4 C 3000
5 C 0
6 C 0
7 E 0
8 F 0

What I expect :

ID Merge_Key AMT
-------------------------------------
1 A 1000
2 A 1000
3 B 2000
4 C 3000
5 C 3000
6 C 3000
7 E 0
8 F 0 [/code]


I am keeping the unmatched rows from the master as I need them in out put with zeros.

Please be advised..

Thanks,
Sree.[code][/code][/quote]

Do a left outer join
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Please dont quote the entire message, especially if its 9 yards long :?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
sreelalpp
Participant
Posts: 22
Joined: Sun Jan 01, 2006 6:51 am

Re: Merge Stage - Duplicates in the master link.

Post by sreelalpp »

Thanks for your responses.

balajisr,
If I remove duplicates based on the merge key from master, I will lose records. (Please see the records from the master and the merge key i am using in the sample data). Do you think, I can not use Merge to achieve this requirement ?

madhukar, This works fine if I use Left outer join. It will carry all records from left, even if duplicates are there in the join key.

Can I conclude that Merge can NOT be used to do the same functionality of Lookup if it contains duplicates in the merge key from the master ?

Thank you,
Sree
lal
balajisr
Charter Member
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Yes, results are there for you to see.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Re: Merge Stage - Duplicates in the master link.

Post by ray.wurlod »

sreelalpp wrote:Can I conclude that Merge can NOT be used to do the same functionality of Lookup if it contains duplicates in the merge key from the master ?
Yes
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sreelalpp
Participant
Posts: 22
Joined: Sun Jan 01, 2006 6:51 am

Re: Merge Stage - Duplicates in the master link.

Post by sreelalpp »

Thanks !
lal
Post Reply