Remove duplicates in DataStage MVS edition

Sandeep.pendem
Participant
Posts: 27
Joined: Fri May 02, 2008 8:01 am
Location: Mumbai

Post by Sandeep.pendem »

Hi,
I have a fixed-width file in a DataStage mainframe job. When I use an Aggregator stage, the job fails with a SORT error. Even with a Sort stage ahead of the Aggregator stage and an intermediate fixed-width file between them, the job fails with the same error message.

Can anyone tell me how to remove duplicates from a file in the mainframe edition of DataStage? There are very few processing stages available in the mainframe edition.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

Add a sort ahead of the Aggregator stage, sorting by the grouping (duplicate identifier) keys. You can use a Sort stage or, if the data are coming from a relational table, specify the ordering in the extraction.

On the Output page General tab, select the Group By option rather than the Control Break option.
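The sort-then-group logic described here can be illustrated outside DataStage. The following is only a minimal Python sketch of the idea, not DataStage code; the record layout (a 5-character key at the start of each fixed-width record) and the sample data are assumptions made up for the illustration.

```python
from itertools import groupby

# Hypothetical layout: the first 5 characters of each fixed-width
# record are the key that identifies a duplicate.
KEY_SLICE = slice(0, 5)


def record_key(record: str) -> str:
    """Return the duplicate-identifying key of a fixed-width record."""
    return record[KEY_SLICE]


def remove_duplicates(records):
    """Sort by the grouping key, then keep the first record of each group."""
    # groupby only merges ADJACENT records with equal keys, which is why
    # the sort on the same key has to come first.
    ordered = sorted(records, key=record_key)
    return [next(group) for _, group in groupby(ordered, key=record_key)]


records = ["00002BETA ", "00001ALPHA", "00002BETA ", "00003GAMMA", "00001ALPHA"]
deduped = remove_duplicates(records)
```

Grouping on the key collapses each run of equal keys to one record, which is the same effect the Group By option is meant to achieve on sorted input.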
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Sandeep.pendem
Participant
Posts: 27
Joined: Fri May 02, 2008 8:01 am
Location: Mumbai

Post by Sandeep.pendem »

Hi Ray,
Thanks for the support. I tried putting a Sort stage before the Aggregator stage, but it seems that is not allowed: when I link a Sort stage directly to an Aggregator stage, I get a compilation error saying that the input to the Aggregator stage should be a file or a relational table. Do I need to introduce another flat file after the Sort stage?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, you do need to stage the data. A flat file is as good a way as any.
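As a rough illustration of what staging means here, the sketch below writes the sorted records to an intermediate flat file and then runs the grouping step against that file rather than against the sort output directly. It is plain Python, not DataStage; the 5-character key width, the file name `sorted.dat`, and the sample records are all hypothetical.

```python
import os
import tempfile
from itertools import groupby

KEY_WIDTH = 5  # hypothetical: key occupies the first 5 characters


def key_of(record: str) -> str:
    """Return the duplicate-identifying key of a fixed-width record."""
    return record[:KEY_WIDTH]


def sort_to_staging_file(records, path):
    """Step 1: sort by key and land the result in a flat file."""
    with open(path, "w", encoding="ascii") as staging:
        for record in sorted(records, key=key_of):
            staging.write(record + "\n")


def group_from_staging_file(path):
    """Step 2: read the staged file and keep one record per key."""
    with open(path, encoding="ascii") as staging:
        # Strip only the newline so trailing pad spaces are preserved.
        lines = (line.rstrip("\n") for line in staging)
        return [next(group) for _, group in groupby(lines, key=key_of)]


source = ["00003GAMMA", "00001ALPHA", "00003GAMMA"]
with tempfile.TemporaryDirectory() as workdir:
    staging_path = os.path.join(workdir, "sorted.dat")
    sort_to_staging_file(source, staging_path)
    staged_result = group_from_staging_file(staging_path)
```

Both steps still live in one script, which mirrors the point that staging the data does not by itself require a second job.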
Sandeep.pendem
Participant
Posts: 27
Joined: Fri May 02, 2008 8:01 am
Location: Mumbai

Post by Sandeep.pendem »

Hi,

I have already put a flat file after the Sort stage, followed by the Aggregator stage, but I still get the same sort error message. The job design is shown below. Do I need two separate jobs, one with a Sort stage and the other hob with an Aggregator stage, or is there anything else specific I need to do?

Flat file (input) ---> Sort ---> Flat file ---> Aggregator ---> Transformer ---> Lookup ---> Flat file


Thanks,
Sandeep S Pendem
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

As far as I can tell without seeing the detail of your design (sort keys, etc.), it should be able to remove duplicates satisfactorily. There ought to be no need for more than one job. Are you sorting and aggregating (grouping) on the same keys (the ones that identify "duplicates")? How are you getting the other fields (if any) through the Aggregator stage?
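Why the sort keys and grouping keys need to match can be seen in a small Python sketch. This is an illustration of the general technique only, not of the Aggregator stage itself; the two-field rows (an id and an amount) are invented, and carrying the non-key field through by taking the first row of each group is an assumption made for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (customer_id, amount). The id alone identifies a duplicate.
rows = [("C1", 30), ("C2", 10), ("C1", 20), ("C2", 40)]

by_id = itemgetter(0)
by_amount = itemgetter(1)


def group_first(data, sort_key, group_key):
    """Sort on sort_key, then keep the first row of each adjacent group_key run."""
    ordered = sorted(data, key=sort_key)
    return [next(group) for _, group in groupby(ordered, key=group_key)]


# Sort key and grouping key agree: one row per id survives, and the
# non-key field (amount) comes from the first row in each group.
matched = group_first(rows, sort_key=by_id, group_key=by_id)

# Sort key differs from grouping key: equal ids are no longer adjacent,
# so the same id can appear in more than one group.
mismatched = group_first(rows, sort_key=by_amount, group_key=by_id)
```

In the mismatched case the id `C2` survives twice, which is the kind of result to expect when the data is ordered by something other than the duplicate-identifying keys.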
rameshrr3
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

A hob is what i need to bake an alleagator cut..
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It's the Spanish pronunciation of "job".
:lol:

Do you get a lot of alligators in California?
rameshrr3
Premium Member
Posts: 609
Joined: Mon May 10, 2004 3:32 am
Location: BRENTWOOD, TN

Post by rameshrr3 »

Louisiana is the place to go..