
What is the difference between Hashed Files and Lookup File Sets?

Posted: Thu Sep 27, 2007 7:44 am
by nexus2me
Hi,

Can anyone tell me the difference between Hashed Files and Lookup File Sets?

With regards,
Nexus

Posted: Thu Sep 27, 2007 8:50 am
by ameyvaidya
Hi!

Welcome to DSXchange!!


Difference 1:
Hashed files do not exist in PX (parallel jobs); Lookup File Sets do.

Apologies, but I do not see why the two would ever need to be compared.
:?

Posted: Thu Sep 27, 2007 4:48 pm
by ray.wurlod
A Hashed File is only available in server jobs. It uses a hashing algorithm (without building an index) to determine the location of keys within its structure. It is not amenable to parallelism. The contents of a hashed file may be cached in memory when using the Hashed File stage to service a reference input link. New rows to be written to a hashed file may first be written to a memory cache, then flushed to disk. All writes to a hashed file using an existing key overwrite the previous row. Duplicate key values are not permitted.
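To make the hashed file behaviour described above concrete, here is a minimal Python sketch. It is a toy model only, not DataStage's actual on-disk format or algorithm: the class name `ToyHashedFile`, the `modulus` parameter, and the byte-sum hash are all assumptions made for illustration. It shows three points from the description: the storage location is computed directly from the key with no separate index, writes may be buffered in a cache before being flushed, and a write to an existing key replaces the previous row.

```python
# Toy model only: NOT DataStage's real hashed file implementation.
# All names here (ToyHashedFile, modulus, flush) are hypothetical.

class ToyHashedFile:
    def __init__(self, modulus=7):
        self.modulus = modulus
        # Each "group" stands in for a disk location holding key -> row.
        self.groups = [dict() for _ in range(modulus)]
        # Rows waiting to be flushed to the groups.
        self.write_cache = {}

    def _group_for(self, key):
        # The location is computed from the key itself; no index is consulted.
        return sum(str(key).encode()) % self.modulus

    def write(self, key, row):
        # Buffered write. A later write to the same key replaces the earlier one.
        self.write_cache[key] = row

    def flush(self):
        # Move cached rows to their groups, overwriting any existing key.
        for key, row in self.write_cache.items():
            self.groups[self._group_for(key)][key] = row
        self.write_cache.clear()

    def read(self, key):
        # In this toy, reads see only rows that have been flushed.
        return self.groups[self._group_for(key)].get(key)


hashed_file = ToyHashedFile()
hashed_file.write("C001", {"name": "Ann"})
hashed_file.write("C001", {"name": "Bob"})  # same key: replaces the first row
hashed_file.flush()
stored_row = hashed_file.read("C001")
```

After the flush, only one row exists for key `C001`, and it is the last one written.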

A Lookup File Set is only available in parallel jobs. It uses an index (based on a hash table) to determine the location of keys within its structure. It is a parallel structure; it has its records spread over the processing nodes specified when it was created. The records in the Lookup File Set are loaded into a virtual Data Set before use, and the index is also loaded into memory. Duplicate key values are (optionally) permitted. If the option is not selected, duplicates are rejected when writing to the Lookup File Set.
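The Lookup File Set behaviour described above can be sketched in the same spirit. This is again a toy model under stated assumptions, not the real file format: `ToyLookupFileSet`, `node_count`, and `allow_duplicates` are hypothetical names, and the byte-sum partitioning is chosen only to keep the example deterministic. It illustrates records spread over nodes, a hash-table index per node, and duplicates being either kept or rejected at write time depending on an option.

```python
# Toy model only: NOT the real Lookup File Set format.
# All names here are hypothetical and chosen for illustration.
from collections import defaultdict


class ToyLookupFileSet:
    def __init__(self, node_count, allow_duplicates=False):
        self.allow_duplicates = allow_duplicates
        # One hash-table index per processing node: key -> list of rows.
        self.partitions = [defaultdict(list) for _ in range(node_count)]
        # Rows refused because their key already exists.
        self.rejects = []

    def _node_for(self, key):
        # Decide which node's partition holds this key.
        return sum(str(key).encode()) % len(self.partitions)

    def write(self, key, row):
        index = self.partitions[self._node_for(key)]
        if key in index and not self.allow_duplicates:
            # Duplicate option not selected: reject the row.
            self.rejects.append((key, row))
            return False
        index[key].append(row)
        return True

    def lookup(self, key):
        index = self.partitions[self._node_for(key)]
        # .get avoids creating an empty entry for a missing key.
        return list(index.get(key, []))


strict_set = ToyLookupFileSet(node_count=2)
strict_set.write("K1", "first")
strict_set.write("K1", "second")          # rejected: duplicates not allowed

multi_set = ToyLookupFileSet(node_count=2, allow_duplicates=True)
multi_set.write("K1", "first")
multi_set.write("K1", "second")           # kept: duplicates allowed
```

Contrast this with a hashed file, where a second write to the same key silently replaces the first row instead of being rejected or stored alongside it.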

Re: What is difference between Hashed Files and Lookup Files

Posted: Fri Oct 12, 2007 12:55 am
by mcs@rajesh
Hi,
Hashed file: this is used in server jobs. It serves much the same function there as a Data Set does in parallel jobs.
Lookup File Set: it is used in parallel jobs to improve performance.
I don't think these two have ever been compared...

Posted: Sat Oct 13, 2007 2:29 pm
by dwblore
Hi,
Hashed files exist only in server jobs and are not supported in PX.
In server jobs, writing a reference table to a hashed file and then reading (referencing) it for lookups gives better performance than looking up the table directly.
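The pattern described here, loading the reference data once into a keyed structure and then probing it per input row, can be sketched roughly as follows. This is plain Python standing in for the idea, not DataStage code; the table contents, `reference_table`, `input_rows`, and `enrich` are all made up for illustration.

```python
# Illustration of "load the reference table once, then look up by key".
# The data and names below are hypothetical.

reference_table = [
    ("US", "United States"),
    ("IN", "India"),
    ("DE", "Germany"),
]

input_rows = [
    {"id": 1, "country": "IN"},
    {"id": 2, "country": "FR"},  # no match in the reference data
]

# Step 1: read the reference table once into a keyed structure.
lookup = {code: name for code, name in reference_table}


def enrich(rows, lookup):
    """Attach the looked-up value to each row; None marks a failed lookup."""
    enriched = []
    for row in rows:
        new_row = dict(row)
        # Keyed probe instead of querying the source table for every row.
        new_row["country_name"] = lookup.get(row["country"])
        enriched.append(new_row)
    return enriched


enriched_rows = enrich(input_rows, lookup)
```

The source table is read a single time, and each input row costs only one keyed probe rather than a round trip to the table.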