Hi All,
I am trying to handle Duplicates with Sort Stage using Create Key Change Column = True.
However, I have some issues:
1) Say I have 103 Columns out of which 6 are duplicate, and I want to Capture these 6 records.
Instead, with Sort stage, it retains the first column and discards the other.
2) Also, I want to retain only those records with the following condition:
a. I have 7 columns out of which 2 are key columns and 5 are value columns.
Emp_id Dept_id Qtr YTD One_YY Three_YY Five_YY Ten_YY
12 20 123.44 -99.00 -99.00 -99.00 125.55 -99.00
12 20 -99.00 -99.00 145.55 567.88 012.45 -99.00
The above is some sample duplicates data.
Requirement is to retain only the record with less number of -99.00 values and discard the others. How can I attain this.
Any Ideas!!
Regards.
Handling Duplicate Data
Moderators: chulett, rschirm, roy
Re: Handling Duplicate Data
Use a transformer to count the number of -99.00 columns and add that count as an output column, then sort using the count as the last sort key column (don't partition on it tho). If you sort in ascending order your desired record will be the first within a key group.
Regards,
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
All generalizations are false, including this one - Mark Twain.