Trap Duplicate record

Posted: Thu Sep 11, 2008 11:37 pm
by prasson_ibm
Hi,
I have records like this:
Col1 Col2
1 a
1 b
1 c
2 d
2 e
2 f
I want to trap duplicate records in PX.
My Output 1 should look like this:
Col1 Col2
1 b
1 c
2 e
2 f

and Output 2 should look like this:
Col1 Col2
1 a
2 d
Kindly help me to implement this in a parallel job.
Thanks in advance.

Posted: Thu Sep 11, 2008 11:50 pm
by ray.wurlod
Have you undertaken a Search?

This question has been answered previously.

Re: Trap Duplicate record

Posted: Fri Sep 12, 2008 1:26 am
by talk2shaanc
prasson_ibm wrote: I want to trap duplicate records in PX. ... kindly help me to implement this in a parallel job.
Which of the two is the key column? I don't see a duplicate in your input if I assume that both Col1 and Col2 form the key.

Posted: Fri Sep 12, 2008 2:24 am
by ray.wurlod
... on which basis you'd be safe enough to deduce that Col1 is the key for the purposes of identifying duplicates.

Re: Trap Duplicate record

Posted: Fri Sep 12, 2008 6:14 am
by prasson_ibm
Col1 is my key column. I want to send the first record of each repeated group to Output 2 and the rest of the records to Output 1, based on this key column. I have implemented this using server stages, but in a parallel job I am running into problems. :(

Posted: Fri Sep 12, 2008 6:34 am
by mahadev.v
Are we supposed to guess the problem you are facing in parallel jobs?

Posted: Fri Sep 12, 2008 7:18 am
by gabrielac
If I understood the problem correctly, I would divide it into two jobs:
1. Create Output 2 using a Remove Duplicates stage (keeping the first record per key).
2. Use Output 2 as the reference in a Lookup, and create Output 1 from the leftover records.
HTH,
Gaby
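The two-job approach above can be sketched in plain Python (this is an illustration of the logic, not DataStage code; it assumes Col1 is the key and the rows arrive sorted on it):

```python
# Sample input: (Col1, Col2) pairs, sorted on the key Col1
rows = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]

# Job 1: "Remove Duplicates" keeping the first row per key -> Output 2
first_per_key = {}
for col1, col2 in rows:
    if col1 not in first_per_key:
        first_per_key[col1] = (col1, col2)
output2 = list(first_per_key.values())

# Job 2: lookup each input row against Output 2; the rows that are NOT
# in Output 2 are the leftovers -> Output 1
kept = set(output2)
output1 = [r for r in rows if r not in kept]

print(output2)  # [(1, 'a'), (2, 'd')]
print(output1)  # [(1, 'b'), (1, 'c'), (2, 'e'), (2, 'f')]
```

The lookup here matches on the whole (Col1, Col2) row, which works because the first row per key was kept intact in Output 2.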

Posted: Fri Sep 12, 2008 5:40 pm
by ray.wurlod
It can be done with one parallel job, and the technique has been explained (fully, I believe) in the past. In outline: split the stream of sorted rows using a Copy stage, send the key through an Aggregator stage that counts the rows per key, then bring the streams back together using a Join stage. In the upstream Sort stage you generate a key-change column; you use this to identify the first row of each group versus the remaining rows, using a Switch, Filter or Transformer stage.
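The outline above can be sketched in plain Python (again an illustration of the stage-by-stage logic, not DataStage code; it assumes the rows are already sorted on the key Col1):

```python
from collections import Counter

# Sample input: (Col1, Col2) pairs, sorted on the key Col1
rows = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]

# Aggregator branch (the Copy stage forks the stream): count rows per key
counts = Counter(col1 for col1, _ in rows)

# Sort stage's key-change column: 1 on the first row of each key group, else 0
key_change = []
prev = object()  # sentinel that never equals a real key
for col1, _ in rows:
    key_change.append(1 if col1 != prev else 0)
    prev = col1

# Join stage: attach the per-key count back onto each row
joined = [(col1, col2, counts[col1]) for col1, col2 in rows]

# Switch/Filter/Transformer: split on the key-change column
output1, output2 = [], []
for (col1, col2, count), kc in zip(joined, key_change):
    if kc == 1:
        output2.append((col1, col2))   # first row of each group
    else:
        output1.append((col1, col2))   # the remaining (duplicate) rows

print(output2)  # [(1, 'a'), (2, 'd')]
print(output1)  # [(1, 'b'), (1, 'c'), (2, 'e'), (2, 'f')]
```

The per-key count carried by the Join is not strictly needed for this split, but it lets the same job distinguish keys that occur only once from keys that genuinely have duplicates, should that be required.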