Merge stage issue

Posted: Tue Jun 16, 2009 7:20 am
by nani0907
Hi,

We are using the Merge stage to identify inserts and updates. It works fine with a small number of records, but with a larger number of records it does not identify the inserts and updates correctly. We have not specified any partitioning method and the job runs on two nodes. Please help.

Posted: Tue Jun 16, 2009 7:47 am
by singhald
For the Merge stage, it is always better to provide sorted data sets. Use a Sort stage and sort both inputs on the key fields.
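
To illustrate why sorted input matters, here is a minimal Python sketch of a single-pass merge on a key. It is not how DataStage implements the Merge stage internally. The function name `merge_classify` and the sample keys are made up for illustration, and it assumes that a matched key means "update" and an unmatched key means "insert".

```python
def merge_classify(incoming_keys, existing_keys):
    """Classify each incoming key in one forward pass.

    Assumes BOTH lists are sorted ascending on the key; the cursor into
    existing_keys only ever moves forward, as in a sort-merge pass.
    """
    result = {}
    j = 0  # cursor into existing_keys, never moves backwards
    for k in incoming_keys:
        # Skip existing keys that are smaller than the current incoming key.
        while j < len(existing_keys) and existing_keys[j] < k:
            j += 1
        if j < len(existing_keys) and existing_keys[j] == k:
            result[k] = "update"
        else:
            result[k] = "insert"
    return result


incoming = [3, 1, 2]  # hypothetical incoming keys, deliberately unsorted
existing = [2, 3]     # hypothetical keys already present in the target

unsorted_result = merge_classify(incoming, existing)
sorted_result = merge_classify(sorted(incoming), sorted(existing))

print("unsorted input:", unsorted_result)
print("sorted input:  ", sorted_result)
```

With the unsorted input, key 2 is reported as an insert even though it exists in the target, because the cursor has already moved past it. Sorting both inputs on the key gives the expected classification.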

Posted: Tue Jun 16, 2009 7:51 am
by chulett
And if that doesn't help, give us a better idea of what "it does not identify the inserts and updates correctly" means.

Posted: Tue Jun 16, 2009 7:52 am
by throbinson
You are using two nodes. This means your data is being partitioned across two nodes. In general, half the data goes to one partition and half goes to the other. If you have not consciously planned for this partitioning, then DataStage has done it for you. Erroneously, it would seem. So you must examine the partitioning scheme used and take a look at the keys you are using to merge the two data sets. Make sure that all rows with the same key value, from both data sets, end up in the same partition.
A record in one partition will not be merged with a record in the other partition. Keys determine both partitioning and merging, so the partitioning keys and the merge keys must be consistent. This may be your problem. To verify that partitioning is your problem, run the job with a single-node configuration file. If the data is then correct, your problem is partitioning.
Or sorting...
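
The partitioning point can be sketched in Python. This is a toy simulation, not DataStage itself: the helpers `round_robin`, `by_key` and `classify` and the sample rows are hypothetical, the keyless split stands in for whatever partitioning the job ends up with when none is chosen on the merge key, and a matched key is assumed to mean "update".

```python
def round_robin(rows, nodes):
    """Deal rows out to partitions in turn, ignoring the key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def by_key(rows, nodes):
    """Send every row with the same key to the same partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        key = row[0]
        parts[key % nodes].append(row)  # toy hash: integer key modulo node count
    return parts


def classify(incoming_parts, existing_parts):
    """Each partition only sees its own slice of both inputs."""
    result = {}
    for inc, exi in zip(incoming_parts, existing_parts):
        existing_keys = {row[0] for row in exi}
        for row in inc:
            result[row[0]] = "update" if row[0] in existing_keys else "insert"
    return result


incoming = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]  # hypothetical new rows
existing = [(2, "x"), (3, "y"), (5, "z")]            # hypothetical target rows
expected = {1: "insert", 2: "update", 3: "update", 4: "insert"}

keyless = classify(round_robin(incoming, 2), round_robin(existing, 2))
keyed = classify(by_key(incoming, 2), by_key(existing, 2))
single = classify(round_robin(incoming, 1), round_robin(existing, 1))

print("two nodes, keyless partitioning:", keyless)
print("two nodes, partitioned on key:  ", keyed)
print("single node:                    ", single)
```

In this sketch the keyless two-node run misses the matches for keys 2 and 3 because the matching rows land in different partitions, while the key-partitioned run and the single-node run both agree with `expected`. That is the same comparison as the single-node test suggested above.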

Posted: Fri Jun 19, 2009 6:14 am
by Sreenivasulu
Hi All,

We are also facing a similar issue with the merge stage.

Regards
Sreeni

Posted: Fri Jun 19, 2009 6:24 am
by nagarjuna
In addition to partitioning and sorting, take care of duplicates in the master link of the Merge stage.
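
One quick way to check an extract of the master data for duplicate keys outside the job is a small Python script. This is only a sketch: the helper name `duplicate_keys`, the sample rows and the assumption that the key is the first column are all made up for illustration.

```python
from collections import Counter


def duplicate_keys(rows, key_index=0):
    """Return the key values that occur more than once, in sorted order."""
    counts = Counter(row[key_index] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical master rows: key 1 appears twice.
master_rows = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]

dupes = duplicate_keys(master_rows)
print("duplicate master keys:", dupes)
```

If the list is not empty, de-duplicate the master input on the merge key before it reaches the Merge stage.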