
Doubt in Hash file

Posted: Fri Feb 17, 2006 1:57 pm
by somu_june
Hi ,


I am using a hash file in a job to look up against a DB2 table. I read that a hash file removes duplicates, and I want to know whether that will affect my job. For example, I have Material, Mandt, and Variant as keys in the hash file:

Material    Mandt     Variant   (all three are keys)
XXXX        YYYYYY    uuuuu
XXXX        YYYYYY    ''
XXXX        YYYYYY    wwww
RRRR        yyyyyy    wwww

I think I can achieve my output if the DB2 table looks like the one above: since each key combination is distinct, all the records will be written to the hash file. Is that right?



If, instead, the table looks like this:

Material    Mandt     Variant   (all three are keys)
XXXXX       YYYYY     uuuuuu
XXXXX       YYYYY     uuuuuu
SSSSS       TTTTT     RRRRRR
SSSSS       TTTTT     RRRRRR

then I will get only one record per key from the DB2 table into the hash file, because of the key. Please correct me if I am wrong.

Thanks,
Somaraju.

Posted: Fri Feb 17, 2006 2:26 pm
by sjhouse
When writing to a hash file, if you write a record with a duplicate key, the last record written is the one that remains in the file.

For your second example, only two records would exist in the hash file:

Material    Mandt     Variant   (all three are keys)
XXXXX       YYYYY     uuuuuu
SSSSS       TTTTT     RRRRRR
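This last-write-wins behaviour is not specific to DataStage; a minimal sketch in Python (not DataStage BASIC) shows it, using a dict keyed on the three key columns to stand in for the hash file. The column values are taken from the example above:

```python
# Sketch only: a Python dict keyed on (Material, Mandt, Variant) mimics
# a hashed file, where a write with an existing key replaces the record.
rows = [
    ("XXXXX", "YYYYY", "uuuuuu"),
    ("XXXXX", "YYYYY", "uuuuuu"),  # duplicate key: overwrites the first
    ("SSSSS", "TTTTT", "RRRRRR"),
    ("SSSSS", "TTTTT", "RRRRRR"),  # duplicate key: overwrites the first
]

hashed_file = {}
for material, mandt, variant in rows:
    # all three columns form the key, so only distinct triples survive
    hashed_file[(material, mandt, variant)] = (material, mandt, variant)

print(len(hashed_file))  # 2 records remain, as in the table above
```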


Stephen

Posted: Fri Feb 17, 2006 5:51 pm
by ray.wurlod
Writes to hashed files (note: it's "hashed", not "hash") are destructive overwrites.

Posted: Sat Feb 18, 2006 5:28 pm
by I_Server_Whale
Hi Ray,

I'm just curious. What is the difference between a normal overwrite and a destructive overwrite?

I'm thinking that a write to a hashed file is a destructive write, not a destructive overwrite. Isn't an overwrite always destructive?

Please correct me if I'm wrong. Your help is very much appreciated.

Many Thanks,
Naveen.

Posted: Sun Feb 19, 2006 1:04 am
by ray.wurlod
When a Hashed File stage (or DataStage BASIC routine) writes to a hashed file, if the key value is one that already exists on the hashed file, the old record is destroyed and the new one replaces it completely. That's why it's called destructive. However, this is all one operation in hashed files (not delete followed by insert, which you might do using one of the SQL-based stages, and which is a double operation). That's why it's called overwriting.
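The key point, that the old record is replaced completely rather than merged column by column, can be sketched in Python (again, a stand-in for the hashed file, with made-up column names), where assigning a new value to an existing dict key discards the old value entirely:

```python
# Sketch only: destructive overwrite replaces the whole record in one
# assignment. Columns present in the old record but absent from the new
# one do not survive. The key and column names here are hypothetical.
hashed_file = {("MAT1", "100", "V1"): {"price": 10, "plant": "A"}}

# Write a new record with the same key but without the "plant" column.
hashed_file[("MAT1", "100", "V1")] = {"price": 20}

print(hashed_file[("MAT1", "100", "V1")])  # {'price': 20} - 'plant' is gone
```

This is one operation, unlike the delete-then-insert pair you might issue through an SQL-based stage.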