Remove duplicates problem
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 18
- Joined: Wed Feb 14, 2007 10:04 am
Remove duplicates problem
Hi,
I am having problems with removing duplicates stage in PX. I have a job as follows
Dataset ---- Sort ----- Removed Duplicates -------- Dataset
Here is a sample of the data
UIDCONTRACT COMMISSION_TYPE PRIORITY
694430 LS Commission 1
694430 LS Percentage 2
694430 LS Fixed Fee 3
I sort the data by contract and priority both ascending nulls first.
I then remove duplicates based on the UIDCONTRACT in order to retain the first which I believe should be the LS Commission Priority 1 as I have sorted the data in this way.
Why is the single row that is outputted in the Dataset LS Percentage Prioriy 2 ? I would of thought it should be either 1 or 3 depending on either keeping first or last rows?
I am very confused by this and am having a terrible time trying to work out what is going wrong.
I am having problems with removing duplicates stage in PX. I have a job as follows
Dataset ---- Sort ----- Removed Duplicates -------- Dataset
Here is a sample of the data
UIDCONTRACT COMMISSION_TYPE PRIORITY
694430 LS Commission 1
694430 LS Percentage 2
694430 LS Fixed Fee 3
I sort the data by contract and priority both ascending nulls first.
I then remove duplicates based on the UIDCONTRACT in order to retain the first which I believe should be the LS Commission Priority 1 as I have sorted the data in this way.
Why is the single row that is outputted in the Dataset LS Percentage Prioriy 2 ? I would of thought it should be either 1 or 3 depending on either keeping first or last rows?
I am very confused by this and am having a terrible time trying to work out what is going wrong.
-
- Premium Member
- Posts: 18
- Joined: Wed Feb 14, 2007 10:04 am
-
- Premium Member
- Posts: 18
- Joined: Wed Feb 14, 2007 10:04 am
You can achieve the same with justboris_karloff wrote:What I want to do is to remove the duplicates based on UIDCONTRACT, and obtain the first row i.e. the row with priority 1.
I thought about using an aggregator stage bu that just brings back all the rows as I cant leave out the description.
Dataset-->Sort-->Dataset
Sort can also be used as a remove duplicate stage, by selecting the option to either retain first or last, with remove duplicate option.
Just my thought
Re: Remove duplicates problem
looking at UIDCONTRACT COMMISSION_TYPE, combination of both seems to be unique, please provide more data for input and what exactly output you want.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 18
- Joined: Wed Feb 14, 2007 10:04 am
What settings do you have for sorting and partitioning in the remove duplicates stage?
Partition by UIDCONTRACT and sort by UIDCONTRACT and PRIORITY in the sort stage.
Make sure that the data is not then repartitioned or sorted in the remove duplicates stage (use SAME partitioning if it is being), and the row you specify to keep will depend on whether you sorted PRIORITY ascending or descending in the previous sort stage. The remove duplicate stage should only specify UIDCONTRACT as the key.
You can, if you like, remove the sort stage and do it on the input link to the remove duplicates stage. Partition by UIDCONTRACT and sort by UIDCONTRACT and PRIORITY.
Partition by UIDCONTRACT and sort by UIDCONTRACT and PRIORITY in the sort stage.
Make sure that the data is not then repartitioned or sorted in the remove duplicates stage (use SAME partitioning if it is being), and the row you specify to keep will depend on whether you sorted PRIORITY ascending or descending in the previous sort stage. The remove duplicate stage should only specify UIDCONTRACT as the key.
You can, if you like, remove the sort stage and do it on the input link to the remove duplicates stage. Partition by UIDCONTRACT and sort by UIDCONTRACT and PRIORITY.
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
What partition are you using? Hash ?
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Premium Member
- Posts: 18
- Joined: Wed Feb 14, 2007 10:04 am