Hi,
I am creating a job to find duplicate values. The source is a seq file and the target is also a seq file. The job uses a hash file to detect duplicates. If a record is a duplicate, it is written to the target seq file; otherwise a new record is added to the hash file. In total there are 2 hash files: one for the target and the other for lookup. I have made all the settings, such as Enabled, Lock for updates, for the lookup hash file.
Now I want to implement the following logic in the transformer as a constraint.
I want to write a record to the target file if it is a duplicate, i.e. if that empid has already been extracted from the source file.
Write a record to the target hash file if it is a new record.
How should this be implemented as a constraint in the transformer? Should I use an if-else statement or rejectcode? Please help me out.
Thanks!!!
Read-Write from Same hash file
If it's already in the hashed file, it's a duplicate.
Keep writing to the hashed file, so that later duplicates will be detected.
To capture only the duplicates, test whether it was found. For example
Code:
Not(RefInputLink.KeyCol.NOTFOUND)
or
Code:
Not(IsNull(RefInputLink.KeyCol))
The first example uses the input link variable NOTFOUND, the second relies on the (documented) fact that DataStage returns NULL for all columns on the reference input link if the key value sought is not found.
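To make the flow of the suggestion above easier to follow, here is a minimal sketch in Python rather than DataStage. It only simulates the logic: the dict `seen` stands in for the hashed file, the list `duplicates` stands in for the target seq file, and the function name `split_duplicates` and the sample rows are made up for illustration. It assumes rows are processed one at a time and that each write to the hashed file is visible to the next lookup.

```python
def split_duplicates(rows, key="empid"):
    """Simulate the lookup-then-write pattern on a list of dict rows.

    Returns (duplicates, seen), where `seen` plays the role of the
    hashed file and `duplicates` the role of the target seq file.
    """
    seen = {}        # stand-in for the hashed file, keyed on empid
    duplicates = []  # stand-in for the target sequential file
    for row in rows:
        k = row[key]
        # Lookup: roughly what the "was it found?" constraint tests.
        found = k in seen
        if found:
            duplicates.append(row)
        # Keep writing to the hashed file so later duplicates are detected.
        seen[k] = row
    return duplicates, seen


# Hypothetical sample data, for illustration only.
rows = [
    {"empid": 1, "name": "a"},
    {"empid": 2, "name": "b"},
    {"empid": 1, "name": "c"},
    {"empid": 1, "name": "d"},
]
duplicates, seen = split_duplicates(rows)
```

With this sample, the first row for each empid goes only to the hashed file, and every later row with the same empid is captured as a duplicate.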
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks, Ray. I will try both methods.