Hi, I am new in DS. I need to finish this real fast otherwise I am fired.
I extracted the data but I need to sorted it with the most current record, then just send out those records.
The header (first record) need to have a hard-coded value 'Header' and a current yyyymm, the trailer (last record) need to have a hard-code
the 'Trailer' and the record countds (total count of this file excluded header and trailer).
I almost done except the sorting out the duplicate and get the
most current one out. What seems to be the cleanest way to get this
done except using "select max(xxx) from xxx where ...group by...." because I have spent 2 day to make this work but it did not. I think
it is the company setup between VMS and DS, it is something the date is not converting correctly. Therefore, all other method except select
max(date) where xxx group by yyy to take care of the duplicate record.
Thanks for all.
Simple job - Extract, check dup, write header and trailer
Moderators: chulett, rschirm, roy
No yet. But I think tomorrow, I will try the aggregator with group byGazelle wrote:Sorry. I missed some vital information about your environment.
I'm using v7.x. I do not know if v6 has a "Remove Duplicates" stage.
Someone else will have to help you on this one.
Have you tried the derivation options of the Aggregator stage?
the provider number and max on date. (basically, a lot of provider information field but I only want the most recent record) and that should work. I have a SQL which does that but it is not working in DS.
I think this 2 step should do it.
Thanks.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
If the meaning of "duplicate" means "same key", write to hashed file then read from hashed file. Updates to hashed files are destructive overwrites, so this is an easy way to remove duplicates.
If you need to record duplicates, this is slightly more complex, in that you have to check the same hashed file to determine whether you've seen this key before. Select "lock for update" in the HF stage that writes to the hashed file. Both HF stages access the same hashed file.
Iin this design, the output link is constrained on the result of the lookup
If you need to record duplicates, this is slightly more complex, in that you have to check the same hashed file to determine whether you've seen this key before. Select "lock for update" in the HF stage that writes to the hashed file. Both HF stages access the same hashed file.
Code: Select all
HFStage1
|
V
------> Transfomer ----->
|
V
HFStage2
- if the lookup failed, this key has not been seen yet
if the lookup succeeded, this key has already been seen
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.