Writing data into two hash files

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Luk
Participant
Posts: 133
Joined: Thu Dec 02, 2004 8:35 am
Location: Poland

Writing data into two hash files

Post by Luk »

Hi!

I have a strange problem. I have a Transformer stage whose output is connected to two different hash files.

As input to the transformer I have a sequence of surrogate keys (made with GetKeyNextValue) plus colA, colB and colC. I am writing the key column together with colA and colB into the first hash file, with colA and colB as the PK, and the key column together with colB and colC into the second one, with colB and colC as the PK.

When the job finishes, the log shows 2000 rows written to hash1 and 2000 rows written to hash2.

But I have noticed that some integer keys that are in the first hash file aren't in the second.
I used a UniVerse stage to run SQL against the hash files, and there are 2000 rows in the first hash file but only 1900 in the second!

Do you have any idea why this happens?
LUK
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Luk,

as you stated, both links had 2000 rows go down them. Hashed files work in such a way that a WRITE to an already existing key overwrites the existing record, so you need to look at your PK of ColB and ColC on the second file.

I would suggest you change the stage to drop & re-create the hash file and make sure you are using both ColB and ColC as keys in the output to that stage. You have some periodicity to the overwrites, so if you do a select of your second file and order by ColB and ColC you should, within the first couple of pages, find a "missing" row which might help debug the problem. Which integer keys are missing? Could you have leading "0"s?
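As a sketch of that check through the UniVerse stage (assuming the hashed files are known to the account as HASH1 and HASH2 and that the dictionary column names are COLB and COLC - adjust these to your actual names):

SELECT COLB, COLC FROM HASH2 ORDER BY COLB, COLC;

And, assuming your SQL dialect accepts a subquery here, the keys that made it into the first file but not the second:

SELECT COLB FROM HASH1 WHERE COLB NOT IN (SELECT COLB FROM HASH2);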
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Try writing them into sequential files, which can then be used to compare. A hashed file by default overwrites duplicate keys and hence may contain fewer rows than you wrote to it.
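If the same rows are also available somewhere you can query (for instance loaded into a staging table, called STAGE_ROWS here purely as a placeholder), a duplicate check on the key pair will show exactly which rows collapse into one hashed file record:

SELECT COLB, COLC, COUNT(*) FROM STAGE_ROWS GROUP BY COLB, COLC HAVING COUNT(*) > 1;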
Luk
Participant
Posts: 133
Joined: Thu Dec 02, 2004 8:35 am
Location: Poland

Post by Luk »

Is it possible for the surrogate key column (made by GetKeyNextValue) to be overwritten in the hash file when my PK is built from two columns (the key is unique only when you take both columns together; if you take only one, they won't be unique)?
LUK
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Luk,

if you ran the job and created the hash file with just one column as the PK, and later added the second column, the file will still use the original definition; that's why I suggested you force delete and re-create the file. This is a relatively common source of problems. Does it work now?
Luk
Participant
Posts: 133
Joined: Thu Dec 02, 2004 8:35 am
Location: Poland

Post by Luk »

Yes, that is true - I have noticed that the pair of PK columns is not 100% unique!!!
ArndW wrote: "if you ran the job and created the hash file with just one column as the PK, and later added the second column, the file will still use the original definition"
I am using the "create file" checkbox and the "delete file before creation" checkbox in the hash file options. Is that enough to re-create the file with the new definition?
LUK
Luk
Participant
Posts: 133
Joined: Thu Dec 02, 2004 8:35 am
Location: Poland

Post by Luk »

OK :)

The problem is solved.

Thank you all!!

Regards
LUK
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Luk,

in order for this forum to work, it would be nice if you told us what the problem and/or the solution was, so that others searching this thread can find the answer.
Luk
Participant
Posts: 133
Joined: Thu Dec 02, 2004 8:35 am
Location: Poland

Post by Luk »

:) You already gave the solution - as I mentioned, you were right!!

The set of columns I used as the PK in the hash file wasn't 100% unique (a few records in the hash were being updated/overwritten). I added one more column to the key, and now everything is unique and the number of rows in the hashes is correct!
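For anyone who finds this thread later, the sanity check was simply re-running the counts through the UniVerse stage (HASH1 and HASH2 standing in for the real file names):

SELECT COUNT(*) FROM HASH1;
SELECT COUNT(*) FROM HASH2;

Both now return the same number of rows as the link counts in the job log (2000 in this case).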

Regards
LUK