It's funny how this component acts on data. I am passing data with a four-column key to this component, and it doesn't catch the duplicates.
Example: we have an input stream with keys col1, col2, col3, col4.
Inside the Remove Duplicates component, I sort the incoming stream on those keys, and I have defined those four keys as my uniqueness key.
Is this component not composite-key friendly?
Thanks
Remove duplicate
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 10
- Joined: Tue Nov 20, 2007 7:15 am
- Location: CANADA
E.M
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Composite keys are fine. Is your data partitioned, as well as sorted, on these key fields?
Not being partitioned on the keys would seem to manifest as "missing (some) duplicates" if the duplicates ended up on different partitions as a result of, say, Round Robin partitioning, because each partition is de-duplicated independently.
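To make the partitioning point concrete, here is a minimal Python sketch. It is not DataStage code; it only simulates the idea under stated assumptions: two partitions, a four-column composite key, and de-duplication that runs separately inside each partition. The function names (`round_robin`, `hash_partition`, `remove_duplicates`) and the sample rows are hypothetical and exist only for illustration.

```python
# Illustration only: simulates partitioning, it does not call any DataStage API.
KEYS = (0, 1, 2, 3)  # positions of col1..col4, the composite key

rows = [
    ("a", "b", "c", "d", "first"),
    ("a", "b", "c", "d", "second"),   # duplicate key of the row above
    ("x", "y", "z", "w", "third"),
    ("x", "y", "z", "w", "fourth"),   # duplicate key of the row above
]


def key_of(row, key_idx):
    """Build the composite key from the chosen column positions."""
    return tuple(row[i] for i in key_idx)


def round_robin(rows, n):
    """Deal rows to n partitions in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key_idx):
    """Send every row with the same composite key to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key_of(row, key_idx)) % n].append(row)
    return parts


def remove_duplicates(parts, key_idx):
    """Sort each partition on the key and keep the first row per key.

    Each partition is processed on its own, so a duplicate that sits
    in another partition is never compared and therefore survives.
    """
    kept = []
    for part in parts:
        previous = None
        for row in sorted(part, key=lambda r: key_of(r, key_idx)):
            current = key_of(row, key_idx)
            if current != previous:
                kept.append(row)
            previous = current
    return kept


rr_result = remove_duplicates(round_robin(rows, 2), KEYS)
hash_result = remove_duplicates(hash_partition(rows, 2, KEYS), KEYS)
print(len(rr_result), len(hash_result))
```

With this sample data, round robin splits each pair of duplicates across the two partitions, so all four rows survive, while hash partitioning on the key keeps each pair together and leaves two rows.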
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 10
- Joined: Tue Nov 20, 2007 7:15 am
- Location: CANADA
Hi,
Thanks for the response. Yes, I did partition the data and hash-sorted the records on the same key. I guess I am seeing the data differently now.
It's correct now.
Thanks
ray.wurlod wrote:Composite keys are fine. Are you data partitioned, as well as sorted, on these key fields?
Not being partitioned on the keys would seem to manifest as "missing (some) duplicates" if the duplicates ...
E.M
-
- Premium Member
- Posts: 236
- Joined: Sun Apr 01, 2007 7:41 am
- Location: Michigan
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
No.
(Auto) leads to Round Robin except:
- on reference input to Lookup stage - Entire
- on inputs to Join and Merge stages - Hash on join key(s)
- on DB2/UDB Enterprise stages - DB2
- on other parallel to parallel with same degree of parallelism - Same
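The exceptions listed above can be restated as a small lookup, shown here as a rough Python sketch. It only encodes the rules from this post; the context labels (`lookup_reference` and so on) and the function name `auto_partitioning` are made up for illustration and are not DataStage identifiers.

```python
# Illustration only: restates the (Auto) rules from the post above.
# The context labels are hypothetical, not DataStage names.
AUTO_EXCEPTIONS = {
    "lookup_reference": "Entire",
    "join_or_merge_input": "Hash on join key(s)",
    "db2_enterprise": "DB2",
    "parallel_to_parallel_same_degree": "Same",
}


def auto_partitioning(context):
    """Return the method (Auto) resolves to; Round Robin unless an exception applies."""
    return AUTO_EXCEPTIONS.get(context, "Round Robin")


print(auto_partitioning("join_or_merge_input"))
print(auto_partitioning("anything_else"))
```

None of the listed exceptions hashes on the keys of a de-duplication step, which is why the key-based partitioning has to be chosen explicitly.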
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 10
- Joined: Tue Nov 20, 2007 7:15 am
- Location: CANADA
So, do you recommend hash partitioning the data on the key before we sort it and then remove the duplicates?
ray.wurlod wrote:No.
(Auto) leads to Round Robin except:
- on reference input to Lookup stage - Entire
on inputs to Join and Merge stages - Hash on join key(s)
on DB2/UDB Enterprise stages - DB2
on ot ...
E.M
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: