Doubt in Hash file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
somu_june
Premium Member
Posts: 439
Joined: Wed Sep 14, 2005 9:28 am
Location: 36p,reading road

Doubt in Hash file

Post by somu_june »

Hi ,


I am using a hash file in a job for looking up a DB2 table. I read that a hash file removes duplicates. I want to know whether using the hash file will affect my job. For example, I have material, mandt, and variant as keys in the hash file.

Material ..................mandt....................variant as keys
XXXX......................YYYYYY......................uuuuu
XXXX......................YYYYYY........................''
XXXX......................YYYYYY.......................wwww
RRRR......................yyyyyy.........................wwww

I think I can achieve my output if the DB2 table looks like the one above, because then all the records can be written to the hash file. Is that right?



If I have like this

Material................Mandt.......................Variant as keys
XXXXX...............YYYYY.........................uuuuuu
XXXXX...............YYYYY.........................uuuuuu
SSSSS...............TTTTT.........................RRRRRR
SSSSS...............TTTTT.........................RRRRRR

Then I can have only one record per key from the DB2 table in the hash file, because of the duplicate keys. Please correct me if I am wrong.

Thanks,
Somaraju.
sjhouse
Premium Member
Posts: 43
Joined: Tue Nov 02, 2004 12:11 pm
Location: Minneapolis, MN

Post by sjhouse »

When writing to a hash file, if you write a record with a duplicate key, the last record written is the one that remains in the file.

For your second example, only two records would exist in the hash file:

Material................Mandt.......................Variant as keys
XXXXX...............YYYYY.........................uuuuuu
SSSSS...............TTTTT.........................RRRRRR
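The behaviour described above can be sketched with a plain Python dictionary, since a hashed file keyed on (Material, Mandt, Variant) works like keyed storage where the last write for a given key wins. This is a hypothetical illustration, not DataStage code:

```python
# Sketch: a hashed file behaves like a dict keyed on its key columns.
# Writing a record whose key already exists replaces the old record.

rows = [
    ("XXXXX", "YYYYY", "uuuuuu"),
    ("XXXXX", "YYYYY", "uuuuuu"),
    ("SSSSS", "TTTTT", "RRRRRR"),
    ("SSSSS", "TTTTT", "RRRRRR"),
]

hashed_file = {}
for material, mandt, variant in rows:
    key = (material, mandt, variant)
    hashed_file[key] = (material, mandt, variant)  # last write wins

print(len(hashed_file))  # 2 records survive, one per distinct key
```

With all three columns as the key, the four input rows collapse to the two distinct key values shown above.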


Stephen
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Writes to hashed files (note: it's "hashed", not "hash") are destructive overwrites.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I_Server_Whale
Premium Member
Posts: 1255
Joined: Wed Feb 02, 2005 11:54 am
Location: United States of America

Post by I_Server_Whale »

Hi Ray,

I'm just curious: what is the difference between a normal overwrite and a destructive overwrite?

I'm thinking that a write to a hashed file is a destructive write, not a destructive overwrite. Isn't an overwrite always destructive?

Please correct me if I'm wrong. Your help is very much appreciated.

Many Thanks,
Naveen.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

When a Hashed File stage (or DataStage BASIC routine) writes to a hashed file, if the key value is one that already exists on the hashed file, the old record is destroyed and the new one replaces it completely. That's why it's called destructive. However, this is all one operation in hashed files (not delete followed by insert, which you might do using one of the SQL-based stages, and which is a double operation). That's why it's called overwriting.
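The distinction above can be sketched in Python (hypothetical, not DataStage BASIC): a destructive overwrite replaces the whole record in a single assignment, whereas the SQL-style alternative is a delete followed by an insert, i.e. two operations on the same key:

```python
# Sketch: destructive overwrite vs delete-then-insert on keyed storage.

store = {"K1": {"qty": 5, "note": "old"}}

# Destructive overwrite: one operation; the old record is destroyed
# and the new record replaces it completely ("note" does not survive).
store["K1"] = {"qty": 9}

# Delete followed by insert: the same end state, but a double operation.
store.pop("K2", None)
store["K2"] = {"qty": 3}

print(store["K1"])  # {'qty': 9} -- the old record is gone entirely
```

Note that the overwrite replaces the entire record, not just the fields supplied: any column present in the old record but absent from the new one is lost.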
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply