Read-Write from Same hash file

Bilwakunj · Post by **Bilwakunj** » Wed Oct 27, 2004 2:20 pm

Hi,
I am creating a job to find duplicate values. The source is a sequential file and the target is also a sequential file. The job uses a hash file to detect duplicates: if a record is a duplicate, it is written to the target sequential file; otherwise a new record is added to the hash file. There are two hash file stages in total, one as the target and the other for the lookup. I have made the settings on the lookup hash file, such as "Enabled, Lock for updates".
Now I want to implement the following logic in the transformer as constraints:
Write a record to the target sequential file if it is a duplicate, i.e. if that empid has already been extracted from the source file.
Write a record to the target hash file if it is a new record.
How should this be implemented as a constraint in the transformer? Should I use an if-else statement or a reject code? Please help me out.
Thanks!!!

ray.wurlod · Post by **ray.wurlod** » Wed Oct 27, 2004 3:52 pm

If it's already in the hashed file, it's a duplicate.

Keep writing to the hashed file, so that later duplicates will be detected.

To capture only the duplicates, test whether it was found. For example

Not(RefInputLink.KeyCol.NOTFOUND)

or

Not(IsNull(RefInputLink.KeyCol))

The first example uses the input link variable NOTFOUND, the second relies on the (documented) fact that DataStage returns NULL for all columns on the reference input link if the key value sought is not found.
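To see how the lookup-then-write flow behaves row by row, here is a minimal Python sketch that simulates it outside DataStage. It is only an illustration under stated assumptions: the dictionary stands in for the hashed file, the list stands in for the target sequential file, and the function name `split_duplicates` and the sample rows are hypothetical. It also assumes each write to the hashed file is visible to the very next lookup.

```python
def split_duplicates(rows, key="empid"):
    """Simulate the job: look each row up in the hashed file, then write it there.

    Hypothetical helper for illustration only; not DataStage code.
    Returns (duplicates, hashed_file).
    """
    hashed_file = {}  # stands in for the hashed file, keyed on empid
    duplicates = []   # stands in for the target sequential file

    for row in rows:
        # Reference lookup: "found" plays the role of Not(NOTFOUND).
        found = row[key] in hashed_file

        # Constraint on the duplicates link: pass the row only if it was found.
        if found:
            duplicates.append(row)

        # Keep writing to the hashed file so later duplicates are detected.
        hashed_file[row[key]] = row

    return duplicates, hashed_file


# Made-up sample data: empid 101 appears twice.
source = [
    {"empid": 101, "name": "A"},
    {"empid": 102, "name": "B"},
    {"empid": 101, "name": "C"},
]
duplicates, hashed_file = split_duplicates(source)
print(duplicates)
```

With this sample input, only the second occurrence of empid 101 reaches the duplicates output. The first occurrence is not found at lookup time and is simply written to the hashed file.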

Bilwakunj · Post by **Bilwakunj** » Wed Oct 27, 2004 5:08 pm

Thanks, ray. I will try both methods.
