I want to compare only the duplicate records in a file.
For example:
Key  Start Date  End Date
1    1/1/2010    1/1/2012
1    1/1/2011    1/1/2012
The data is sorted on Key and Start Date in ascending order.
If the End Date of the first record is greater than the Start Date of the second record, then both records should be rejected and captured.
Please tell me how to achieve this.
Compare Duplicate Records in a File
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 4
- Joined: Fri Sep 02, 2011 9:56 am
Thanks,
Isha Sharma
Seems to me the first question back to you is - did it work? I'm guessing the answer is no, especially with this little wrinkle in your requirements:
If EndDate of the First Record is greater than the StartDate of Second Record, then reject both the records and capture it.
The rub here is the need to reject the first record only after it has been processed and you are looking at the second record. So it seems to me you can either learn about the Save/Fetch Input Records functions (if they are in your version and we are only talking about pairs of records) or go with the traditional fork-join design. Meaning, one branch does the whole stage variable compare and outputs the key plus a reject indicator based on what it finds with regard to the date ranges. Then the main input streams through, joins to that by key and decides what to do based on said reject indicator.
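For anyone who wants to see the fork-join idea outside of the tool, here is a rough Python sketch of the logic only, not a DataStage job. The function names are made up for illustration, and the month/day/year date format is an assumption, since the sample data does not make the order clear.

```python
from datetime import datetime


def parse(text):
    # Assumption: dates are month/day/year; the thread does not say.
    return datetime.strptime(text, "%m/%d/%Y")


def overlapping_keys(rows):
    """Branch 1: walk the sorted rows and collect keys that have an overlap."""
    flagged = set()
    prev_key, prev_end = None, None
    for key, start, end in rows:
        # Compare the previous row's end date with this row's start date.
        if key == prev_key and prev_end > parse(start):
            flagged.add(key)
        prev_key, prev_end = key, parse(end)
    return flagged


def fork_join(rows):
    """Branch 2: stream the main input again and route each row by its key."""
    flagged = overlapping_keys(rows)
    valid = [r for r in rows if r[0] not in flagged]
    rejects = [r for r in rows if r[0] in flagged]
    return valid, rejects


# Hypothetical sample: key 1 overlaps, key 2 does not.
rows = [
    (1, "1/1/2010", "1/1/2012"),
    (1, "1/1/2011", "1/1/2012"),
    (2, "1/1/2010", "1/1/2011"),
    (2, "1/1/2011", "1/1/2012"),
]
valid, rejects = fork_join(rows)
```

Because the join is on key alone, every record for a flagged key ends up in the rejects, which is fine for pairs but may be too broad if a key can have three or more records.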
-craig
"You can never have too many knives" -- Logan Nine Fingers
Hi,
I used 4 stage variables: KeyCheck, PrevKey, DtCheck and Enddt.
PrevKey = KeyCheck
in.Key = PrevKey
StgEndDt = DtCheck
in.EndDt = StgEndDt
By applying a constraint in the Transformer, I got the current record, which is rejected. I then left joined all the valid records with this reject record and got both records in the reject file.
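The stage variable part can be mimicked in Python to show why only the current (second) record comes out of the constraint. This is one possible reading of the four variables, not the actual derivations from the job: the comparison logic, the function names and the month/day/year format are all assumptions.

```python
from datetime import datetime


def to_date(text):
    # Assumption: dates are month/day/year.
    return datetime.strptime(text, "%m/%d/%Y")


def flag_current_records(rows):
    """Single pass over sorted rows, emulating stage variable evaluation order."""
    prev_key = None  # plays the role of PrevKey
    prev_end = None  # plays the role of StgEndDt
    flagged = []
    for key, start, end in rows:
        # The checks must run before the "previous" values are overwritten,
        # just as stage variables are evaluated top to bottom.
        key_check = key == prev_key                          # KeyCheck
        dt_check = key_check and prev_end > to_date(start)   # DtCheck
        # Only now remember this row for the next iteration.
        prev_key = key
        prev_end = to_date(end)
        if dt_check:  # the constraint
            flagged.append((key, start, end))
    return flagged


sample = [
    (1, "1/1/2010", "1/1/2012"),
    (1, "1/1/2011", "1/1/2012"),
]
flagged = flag_current_records(sample)
```

With the sample data, only the second record is flagged, since the first one has already gone through by the time the overlap is seen. That is why a join back on the key is still needed to pick up the first record.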
Thanks,
Isha Sharma
So... this question was posted again on another forum with a slight tweak to the requirements:
If EndDate of the First Record is greater than the StartDate of Second Record, then reject the first record and capture it.
I don't have any time left this morning but putting it out there in case someone wants to take a stab at it.
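As a starting point for the tweaked requirement, here is a small Python sketch of the logic, again not a DataStage design. Rejecting only the first record means each row has to be compared with the row that follows it, so the sketch simply looks one row ahead. The function names and the month/day/year format are assumptions.

```python
from datetime import datetime


def as_date(text):
    # Assumption: dates are month/day/year.
    return datetime.strptime(text, "%m/%d/%Y")


def reject_first_of_pair(rows):
    """Reject a row when its end date is greater than the next row's start date."""
    valid, rejects = [], []
    for i, (key, start, end) in enumerate(rows):
        # Look one row ahead; the last row has no successor.
        nxt = rows[i + 1] if i + 1 < len(rows) else None
        overlaps_next = (
            nxt is not None
            and nxt[0] == key
            and as_date(end) > as_date(nxt[1])
        )
        if overlaps_next:
            rejects.append((key, start, end))
        else:
            valid.append((key, start, end))
    return valid, rejects


sample = [
    (1, "1/1/2010", "1/1/2012"),
    (1, "1/1/2011", "1/1/2012"),
]
valid, rejects = reject_first_of_pair(sample)
```

In a job, the same look-ahead would need some way of seeing the next row, for example by processing the data in descending order so that the "next" row becomes the previous one.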
-craig
"You can never have too many knives" -- Logan Nine Fingers