to get duplicate records


abhay10
Participant
Posts: 31
Joined: Tue Nov 20, 2007 11:39 pm
Location: Bangalore

Post by abhay10 »

ray.wurlod wrote:The job design I posted will give you what you want. ...
No, it doesn't hold good when there are more than two identical duplicate records...
koolnitz
Participant
Posts: 138
Joined: Wed Sep 07, 2005 5:39 am

Post by koolnitz »

Abhay, your question itself is not very clear.
Do you want to capture all the records that occur more than once, along with the count of occurrences? If yes, then try the following logic:

Use a Copy stage to split the incoming records into two streams. One stream goes to an Aggregator stage that groups the records by the key field(s), counts the records in each group, and writes the result to a COUNT field. Join the Aggregator output to the other stream with a Join stage on the key field(s), then pass the result to a Transformer stage. In the Transformer, add a constraint such as COUNT > 1 so that only records whose key occurs more than once pass through.
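For anyone who wants to check the logic outside DataStage, here is a rough Python sketch of the same data flow. It is not DataStage code. The function name `duplicate_rows` and the sample fields `id` and `name` are made up for illustration. It counts rows per key (the Aggregator branch), attaches that count back to every source row (the Join), and keeps the rows whose key occurs more than once (the Transformer constraint).

```python
from collections import Counter


def duplicate_rows(rows, key_fields):
    """Return every row whose key occurs more than once, with a COUNT field added."""

    def key_of(row):
        return tuple(row[f] for f in key_fields)

    # "Aggregator" branch: number of rows per key.
    counts = Counter(key_of(row) for row in rows)

    # "Join" + "Transformer": every source row gets its key's count,
    # and only rows with a count above 1 are kept.
    return [
        {**row, "COUNT": counts[key_of(row)]}
        for row in rows
        if counts[key_of(row)] > 1
    ]


# Hypothetical sample data: key 1 occurs three times, keys 2 and 3 once each.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
    {"id": 3, "name": "d"},
    {"id": 1, "name": "e"},
]

dups = duplicate_rows(rows, ["id"])
for row in dups:
    print(row)
```

Because the count is attached to every source row rather than to one row per group, a key that occurs three or more times still yields all of its rows in the output.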
Nitin Jain | India

If everything seems to be going well, you have obviously overlooked something.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

abhay10 wrote:
ray.wurlod wrote:The job design I posted will give you what you want. ...
No, it doesn't hold good when there are more than two identical duplicate records...
Yes, it does, because every row from the source appears on the left input of the Join stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abhay10
Participant
Posts: 31
Joined: Tue Nov 20, 2007 11:39 pm
Location: Bangalore

Post by abhay10 »

ray.wurlod wrote:
abhay10 wrote:
ray.wurlod wrote:The job design I posted will give you what you want. ...
No, it doesn't hold good when there are more than two identical duplicate records...
Yes it does. Because ...
I got it working using a different query. Thanks anyway...