hash file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

loe_ram13
Participant
Posts: 35
Joined: Thu Apr 12, 2007 1:17 am

hash file

Post by loe_ram13 »

What is the hashing algorithm used in a hash lookup?
What is the difference between static and dynamic hashing in DataStage?
Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Beware of the terminology. It's hashed file, therefore it's hashed lookup.
The hashing algorithm used in lookup is the one that is specified for the hashed file. There are two choices (GENERAL or SEQ.NUM) for dynamic hashed files, and seventeen choices for static hashed files. There are thus 19 separate algorithms that might be used, though GENERAL is very like Type 18 and SEQ.NUM is very like Type 2 - but they are not identical.
There is no such thing as static or dynamic hashing. Static hashed files have an unvarying number of groups ("pages") where records are stored; the hashing algorithm selects one of these (the correct one for the particular record key). Dynamic hashed files have a number of groups that may vary over time depending on the total volume of data stored in the file; the hashing algorithm selects one of these at the time the lookup is performed.
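To make the static/dynamic distinction concrete, here is a minimal Python sketch. It is only a toy model, not how DataStage implements hashed files: `group_for` uses CRC32 as a stand-in for whichever hashing algorithm the file specifies, the class names and the `max_per_group` threshold are invented for illustration, and the dynamic file simply doubles its group count and rehashes everything, which is a simplification of however the real file grows and shrinks.

```python
import zlib


def group_for(key: str, modulus: int) -> int:
    """Toy stand-in for a hashing algorithm: map a record key to a group number."""
    return zlib.crc32(key.encode("utf-8")) % modulus


class StaticHashedFile:
    """Number of groups is fixed when the file is created and never changes."""

    def __init__(self, modulus: int):
        self.groups = [dict() for _ in range(modulus)]

    def write(self, key: str, record) -> None:
        self.groups[group_for(key, len(self.groups))][key] = record

    def lookup(self, key: str):
        # The hash selects exactly one group; only that group is searched.
        return self.groups[group_for(key, len(self.groups))].get(key)


class DynamicHashedFile:
    """Number of groups varies with the volume of data stored (toy growth rule)."""

    def __init__(self, max_per_group: int = 4):
        self.max_per_group = max_per_group  # hypothetical load threshold
        self.groups = [dict()]
        self.count = 0

    def write(self, key: str, record) -> None:
        group = self.groups[group_for(key, len(self.groups))]
        if key not in group:
            self.count += 1
        group[key] = record
        if self.count > self.max_per_group * len(self.groups):
            self._grow()

    def _grow(self) -> None:
        # Simplification: double the group count and redistribute every record.
        old_records = [item for g in self.groups for item in g.items()]
        self.groups = [dict() for _ in range(2 * len(self.groups))]
        for key, record in old_records:
            self.groups[group_for(key, len(self.groups))][key] = record

    def lookup(self, key: str):
        # The group is chosen using the group count in effect at lookup time.
        return self.groups[group_for(key, len(self.groups))].get(key)


static_file = StaticHashedFile(modulus=4)
dynamic_file = DynamicHashedFile()
for i in range(100):
    static_file.write(f"K{i}", i)
    dynamic_file.write(f"K{i}", i)
```

After loading the same 100 records, the static file in this sketch still has the 4 groups it was created with, while the dynamic one has grown; in both cases a lookup hashes the key and searches a single group.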
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
loe_ram13

Post by loe_ram13 »

ray.wurlod wrote: Beware of the terminology. It's hashed file, therefore it's hashed lookup.
The hashing algorithm used in lookup is the one that is specified for the hashed file. There are two choices ...
How are GENERAL and SEQ.NUM different?
Can you elaborate a little?
Thanks in advance.
Thanks
ray.wurlod

Post by ray.wurlod »

GENERAL is a totally general-purpose algorithm that attempts to get "random" (= "flat") distribution of records over the available groups. SEQ.NUM is biased towards numeric characters, and operates right-to-left (much like an odometer). It works best when the keys form an unbroken integer sequence.
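As a rough illustration of that difference, the Python sketch below compares two toy hash functions. Neither is the actual GENERAL or SEQ.NUM algorithm, which this thread does not spell out: `general_like` uses CRC32 as an assumed general-purpose hash, and `seq_num_like` is an invented odometer-style function that reads the digits of the key right-to-left and ignores other characters.

```python
import zlib
from collections import Counter


def general_like(key: str, modulus: int) -> int:
    """Toy general-purpose hash: aims for a roughly flat spread for any key."""
    return zlib.crc32(key.encode("utf-8")) % modulus


def seq_num_like(key: str, modulus: int) -> int:
    """Toy numeric-biased hash: accumulate digits right-to-left, like an odometer."""
    value = 0
    place = 1
    for ch in reversed(key):
        if ch in "0123456789":  # non-numeric characters contribute nothing
            value += int(ch) * place
            place *= 10
    return value % modulus


def distribution(hash_fn, keys, modulus):
    """Count how many keys land in each group."""
    counts = Counter(hash_fn(k, modulus) for k in keys)
    return [counts.get(g, 0) for g in range(modulus)]


modulus = 10
sequential_keys = [str(n) for n in range(1, 1001)]  # unbroken integer sequence

seq_counts = distribution(seq_num_like, sequential_keys, modulus)
general_counts = distribution(general_like, sequential_keys, modulus)

print("seq_num_like:", seq_counts)
print("general_like:", general_counts)
```

With an unbroken integer sequence, the odometer-style function in this sketch fills every group with exactly the same number of keys, while the general-purpose one is only approximately flat. Feed both functions non-numeric keys and the picture reverses, which is the trade-off described above.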