Read-Write from Same hash file

Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Post by Bilwakunj »

Hi,
I am creating a job to find duplicate values. The source is a sequential file and the target is also a sequential file. The job uses a hashed file to detect duplicates: if a record is a duplicate it is written to the target sequential file, otherwise a new record is added to the hashed file. In total there are two hashed file stages, one as the target and the other for lookup. I have made all the settings on the lookup hashed file, such as Enabled, Lock for updates.
Now I want to implement the following logic in the Transformer as a constraint.
I want to write a record to the target file if it is a duplicate, i.e. if that empid has already been extracted from the source file.
Write a record to the target hashed file if it is a new record.
How should this be implemented as a constraint in the Transformer? Should I use an if-else statement or rejectcode? Please help me out.
Thanks!!!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

If it's already in the hashed file, it's a duplicate.

Keep writing to the hashed file, so that later duplicates will be detected.

To capture only the duplicates, test whether it was found. For example

Not(RefInputLink.NOTFOUND)
or

Not(IsNull(RefInputLink.KeyCol))
The first example uses the input link variable NOTFOUND; the second relies on the (documented) fact that DataStage returns NULL for all columns on the reference input link if the key value sought is not found.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
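
For anyone who finds the row-by-row behaviour easier to follow outside the Transformer, here is a rough Python sketch of the same idea. It is not DataStage code: the dict merely stands in for the hashed file, the list stands in for the target sequential file, and the function name `split_duplicates` and the sample rows are made up for illustration. It assumes the lookup sees rows written earlier in the same run, which is what the "Lock for updates" setup mentioned above is intended to give.

```python
def split_duplicates(rows, key="empid"):
    """Return (duplicates, seen) for an iterable of row dicts.

    seen       - stand-in for the hashed file, keyed on `key`
    duplicates - stand-in for the target sequential file
    """
    seen = {}
    duplicates = []
    for row in rows:
        # Reference lookup: was this key already written to the "hashed file"?
        found = row[key] in seen
        if found:
            # Constraint on the duplicates link: only rows that were found.
            duplicates.append(row)
        # Keep writing to the "hashed file" unconditionally, so that
        # later duplicates of this key are detected as well.
        seen[row[key]] = row
    return duplicates, seen


# Hypothetical sample data, for illustration only.
sample = [
    {"empid": 1, "name": "A"},
    {"empid": 2, "name": "B"},
    {"empid": 1, "name": "A again"},
]
dups, hashed = split_duplicates(sample)
print(dups)
print(sorted(hashed))
```

Note that only the duplicates output is conditional here; the write to the lookup structure has no condition at all, which matches the advice to keep writing to the hashed file.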
Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Post by Bilwakunj »

Thanks, Ray. I will try both methods.