Remove duplicates in Datastage MVS edition

Posted: Sat May 31, 2008 2:36 pm
by Sandeep.pendem
Hi,
I have a fixed-width file in a DataStage mainframe job. When I use an Aggregator stage, the job fails with a SORT error. Even with a Sort stage ahead of the Aggregator stage and an intermediate fixed-width file between them, the job fails with the same error message.

Can anyone tell me how to remove duplicates from a file in the mainframe edition of DataStage? We have very few processing stages available in the mainframe edition.

Posted: Sat May 31, 2008 4:14 pm
by ray.wurlod
Welcome aboard.

Add a sort ahead of the Aggregator stage, sorting by the grouping (duplicate identifier) keys. You can use a Sort stage or, if the data are coming from a relational table, specify the ordering in the extraction.

On the Output page General tab, select the Group By option rather than the Control Break option.
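
To make the idea concrete outside DataStage, here is a minimal Python sketch of the same sort-then-group approach on fixed-width records. It is only an illustration, not what the mainframe job generates: the record layout, the `KEY_SLICE` columns and the function names are all hypothetical.

```python
from itertools import groupby

# Hypothetical layout: the duplicate-identifying key sits in columns 0-5.
KEY_SLICE = slice(0, 6)


def record_key(record: str) -> str:
    """Return the grouping (duplicate identifier) key of a fixed-width record."""
    return record[KEY_SLICE]


def remove_duplicates(records):
    """Sort by the grouping key, then keep the first record of each group."""
    # sorted() is stable, so the first record seen for each key stays first.
    ordered = sorted(records, key=record_key)
    # groupby() only merges adjacent equal keys, which is why the sort comes first.
    return [next(group) for _, group in groupby(ordered, key=record_key)]


records = [
    "000002SMITH     ",
    "000001JONES     ",
    "000002SMITH     ",
    "000003BROWN     ",
]
deduped = remove_duplicates(records)
for line in deduped:
    print(line)
```

The sort and the grouping use the same key function, which mirrors sorting and grouping on the same columns in the job.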

Posted: Sun Jun 01, 2008 11:46 am
by Sandeep.pendem
Hi Ray,
Thanks for the support. I tried putting a Sort stage before the Aggregator stage, but it seems that is not allowed: when I link a Sort stage directly to an Aggregator stage I get a compilation error saying that the input to an Aggregator stage should be a file or relational table. Do I need to introduce another flat file after the Sort stage?

[quote="ray.wurlod"]Welcome aboard.

Add a sort ahead of the Aggregator stage, sorting by the grouping (duplicate identifier) keys. You can use a Sort stage or, if the data are coming from a relational table, specify ...[/quote]

Posted: Sun Jun 01, 2008 2:13 pm
by ray.wurlod
Yes, you do need to stage the data. A flat file is as good a way as any.

Posted: Mon Jun 02, 2008 8:46 am
by Sandeep.pendem
Hi,

I have already put a flat file after the Sort stage, followed by an Aggregator stage, and I still get the same sort error message. The job design is below. Do I need to have 2 separate jobs, one with a Sort stage and the other hob with an aggreagator stage, or anything else specific?

Flat file (input) ---> Sort ---> Flat file ---> Aggregator ---> Transformer ---> Lookup ---> Flat file


Thanks,
Sandeep S Pendem

Posted: Mon Jun 02, 2008 4:11 pm
by ray.wurlod
As far as I can tell without seeing the detail of your design (sort keys etc.), it should be able to remove duplicates satisfactorily. There ought to be no need for more than one job. Are you sorting and aggregating (grouping) on the same keys (the ones that identify "duplicates")? How are you getting the other fields (if any) through the Aggregator stage?
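
As an analogy only (this is Python, not DataStage, and it says nothing about what the SORT error itself means), the sketch below shows why the sort keys and the grouping keys need to match. The sample rows and the `group_keys` helper are made up for illustration.

```python
from itertools import groupby

# Hypothetical rows: (customer, sequence, payload).
rows = [
    ("A", 3, "x"),
    ("B", 2, "y"),
    ("A", 1, "z"),
]


def group_keys(rows, sort_col, group_col):
    """Sort on one column, group on another, and return the group keys found."""
    ordered = sorted(rows, key=lambda r: r[sort_col])
    # Grouping only merges adjacent rows, so it depends on the sort order.
    return [key for key, _ in groupby(ordered, key=lambda r: r[group_col])]


# Sorted on sequence but grouped on customer: "A" shows up as two groups.
mismatched = group_keys(rows, sort_col=1, group_col=0)

# Sorted and grouped on customer: one group per customer.
matched = group_keys(rows, sort_col=0, group_col=0)

print(mismatched)
print(matched)
```

When the two key lists differ, rows that should collapse into one group are not adjacent, so duplicates survive the grouping step.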

Posted: Mon Jun 02, 2008 5:23 pm
by rameshrr3
A hob is what I need to bake an alleagator cut..

Posted: Mon Jun 02, 2008 5:27 pm
by ray.wurlod
It's the Spanish pronunciation of "job".
:lol:

Do you get a lot of alligators in California?

Posted: Mon Jun 02, 2008 5:29 pm
by rameshrr3
Louisiana is the place to go..