Sort stage with keychange

Posted: Mon Jan 07, 2008 9:45 pm
by johnthomas
I need to remove duplicate records and used the Sort stage to do that. I used the key change option to flag the duplicate records, and a Filter stage to route the duplicates to exception logging (since the Remove Duplicates stage does not have a reject link). Our requirement is that the same record must be rejected every time: if repeating keys belong to different customer records, we need to reject the same customer across different job runs. The Sort stage does not seem to work for this, so it looks like I need to use a Transformer stage instead.

Any idea how to make this work with the Sort stage?
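
For readers unfamiliar with the key change column, here is a minimal Python sketch of the idea, not DataStage code. It assumes the usual behaviour where the first row of each key group is flagged 1 and the following rows 0. The function names and the `acct`/`cust` fields are hypothetical, chosen only for illustration.

```python
from itertools import groupby
from operator import itemgetter

def add_key_change(records, key_fields):
    """Sort by key_fields and flag the first row of each key group with keyChange=1."""
    key = itemgetter(*key_fields)
    flagged = []
    # sorted() is stable, so rows with equal keys keep their input order.
    for _, group in groupby(sorted(records, key=key), key=key):
        for i, rec in enumerate(group):
            flagged.append({**rec, "keyChange": 1 if i == 0 else 0})
    return flagged

def split_duplicates(flagged):
    """Mimic the Filter stage: keyChange=1 rows pass, keyChange=0 rows are rejected."""
    keep = [r for r in flagged if r["keyChange"] == 1]
    rejects = [r for r in flagged if r["keyChange"] == 0]
    return keep, rejects

# Hypothetical input: two customers share the same key value.
records = [
    {"acct": 1, "cust": "B"},
    {"acct": 1, "cust": "A"},
    {"acct": 2, "cust": "C"},
]
flagged = add_key_change(records, ["acct"])
keep, rejects = split_duplicates(flagged)
```

Because only the key fields are sorted, which of the two `acct` 1 rows survives depends on the order in which they arrive, and that is the inconsistency between runs described above.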

Posted: Mon Jan 07, 2008 11:50 pm
by ray.wurlod
I fail to see how even a Transformer stage could recognize duplicates "across different runs". Can you please elaborate on that?

Meanwhile, the Sort stage itself (or input link sort) allows you to perform a unique sort. Would that do it for you?

Posted: Tue Jan 08, 2008 5:13 am
by johnthomas
Hi Ray,
What I meant by "duplicates across different runs" using the Transformer stage is that we need to retain the same record (chosen by fields that are not part of the sort keys) and reject the other records, to maintain consistency.
I was able to do that by sorting on the key fields in the partitioning options, and then specifying the keys again in the Sort stage. While specifying the keys, I set the "do not sort" option for the key fields, so that they act as the cluster key, and added an additional attribute as a key with the sort option. It finally works.
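
A rough Python sketch of why this works, under the assumption that the extra attribute is enough to order the rows within each key group. It is not DataStage code, partitioning is not modelled, and the function and field names are hypothetical.

```python
from itertools import groupby

def dedupe_deterministic(records, key_fields, tiebreak_fields):
    """Keep the first row per key group after ordering by key, then by tie-break fields."""
    def group_key(rec):
        return tuple(rec[f] for f in key_fields)

    def sort_key(rec):
        # The key fields define the groups; the extra fields fix the order inside each group.
        return group_key(rec) + tuple(rec[f] for f in tiebreak_fields)

    kept, rejected = [], []
    for _, group in groupby(sorted(records, key=sort_key), key=group_key):
        first, *rest = group
        kept.append(first)
        rejected.extend(rest)
    return kept, rejected

# Hypothetical data: the same rows arriving in a different order on two runs.
run1 = [
    {"acct": 1, "cust": "B"},
    {"acct": 1, "cust": "A"},
    {"acct": 2, "cust": "C"},
]
run2 = list(reversed(run1))

kept1, rejected1 = dedupe_deterministic(run1, ["acct"], ["cust"])
kept2, rejected2 = dedupe_deterministic(run2, ["acct"], ["cust"])
```

With the tie-break field included in the sort, both runs keep and reject the same rows regardless of arrival order. If two rows in a group also tie on the extra attribute, the choice again depends on input order, so the attribute needs to be unique within each key group.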

:D

I was not able to read the full message, since it shows as premium content. :x