Hashed file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

veera24
Premium Member
Posts: 150
Joined: Thu Feb 07, 2008 9:37 pm
Location: NewYork

Hashed file

Post by veera24 »

Hi all,

Usually, when we write data into a hashed file, the records end up in a shuffled order. Why is that? Is there a reason behind it?


Thanks in advance...
veera...
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

That is a characteristic of many databases. With hashed files the key is "hashed" into a number that corresponds to a hashed file group (or bucket). When the file is read without a sort, the keys come back group by group, and this gives you the "shuffled" order you see.
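
A rough sketch of that effect in Python, not DataStage's actual algorithm: `toy_hash`, `MODULUS` and the sample keys are made up for illustration. Keys are written in ascending order, each lands in the group its hash selects, and a group-by-group read returns them in a different order.

```python
# Toy model of a hashed file: a fixed number of groups (buckets),
# each holding the keys that hash to it.
# toy_hash and MODULUS are illustrative assumptions, not the real algorithm.

MODULUS = 5  # number of groups in this toy file


def toy_hash(key: str) -> int:
    """Map a key to a group number by summing its character codes."""
    return sum(ord(ch) for ch in key) % MODULUS


def write_records(keys):
    """Place each key into the group its hash selects."""
    groups = [[] for _ in range(MODULUS)]
    for key in keys:
        groups[toy_hash(key)].append(key)
    return groups


def scan(groups):
    """Read the file group by group, as an unsorted read would."""
    return [key for group in groups for key in group]


written = ["1001", "1002", "1003", "1004", "1005", "1006"]
groups = write_records(written)
read_back = scan(groups)

print("written:  ", written)
print("read back:", read_back)
```

The same records come back, just in group order rather than in the order they were written.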
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It is axiomatic that you ought to have no control over physical storage of any individual record; this is a function of the database server so that it can retrieve that record efficiently. Hashed files implement database tables in a number of database products, including UniVerse and UniData, and use a hashing algorithm to determine the physical storage location.

In all cases if you need ordered data retrieved, you specify that in the query.
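
As a small illustration of asking for the order in the query, here is a sketch using SQLite from Python. SQLite is not a hashed-file database and is used here only because it ships with Python; the `customers` table and its rows are invented for the example.

```python
import sqlite3

# In-memory database with a hypothetical table, loaded in no particular order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (cust_id, name) VALUES (?, ?)",
    [(3, "Carol"), (1, "Alice"), (2, "Bob")],
)

# Without ORDER BY the engine may return rows in whatever order suits it.
# With ORDER BY the required order is part of the query itself.
ordered = conn.execute(
    "SELECT cust_id, name FROM customers ORDER BY name DESC"
).fetchall()
print(ordered)
conn.close()
```

The physical placement of the rows never enters into it; only the `ORDER BY` clause determines the sequence of the result.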
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
veera24
Premium Member
Posts: 150
Joined: Thu Feb 07, 2008 9:37 pm
Location: NewYork

Post by veera24 »

ArndW wrote:That is a function of many databases. With hashed files the key is "hashed" into a number which corresponds to a hashed file group (or bucket). When the file is read in unsorted order it will read key ...
Hi, I couldn't see your reply in full. Could you please help me out?
veera24
Premium Member
Posts: 150
Joined: Thu Feb 07, 2008 9:37 pm
Location: NewYork

Post by veera24 »

ray.wurlod wrote:It is axiomatic that you ought to have no control over physical storage of any individual record; this is a function of the database server so that it can retrieve that record efficiently. Hashed fil ...

Hi, I couldn't see your reply in full. Could you please help me out?

Thanks...
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

veera24 - in order to see Ray's post in its entirety, consider signing up for a membership.
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Re: Hashed file

Post by PhilHibbs »

veera24 wrote:Usually when we write data into Hashed file, then the records will be in shuffled manner. Why is it so?
That's what hashing is: it generates a number from the input key, often by adding up the ASCII codes of the characters with weighting and modulus functions applied, and uses the resulting number to decide where to put or look for the record. If you stream the contents out, you get them in numerical order of the hash value, which may well appear random. The better the hashing algorithm, the more random the positions are, and the fewer "collisions" you get, where two keys produce the same hashed value. Colliding records have to be stored in a list, which then has to be traversed linearly (usually; it could be a B-tree, though) to find the right one.
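
To make that concrete, here is a minimal Python sketch of a weighted character-sum hash with a modulus, plus a list per bucket for collisions. `weighted_hash`, `ToyHashedFile`, the weight 31 and the bucket count are my own illustrative choices, not what any particular product uses.

```python
# Weighted character-sum hash with one list (chain) per bucket.
# All names and constants here are illustrative assumptions.

BUCKETS = 7  # deliberately small so that collisions are certain


def weighted_hash(key: str) -> int:
    """Weight each character by its position, then reduce with a modulus."""
    total = 0
    for ch in key:
        total = total * 31 + ord(ch)
    return total % BUCKETS


class ToyHashedFile:
    def __init__(self):
        self.buckets = [[] for _ in range(BUCKETS)]

    def put(self, key, value):
        """Store a record in the chain for its bucket, replacing any old copy."""
        chain = self.buckets[weighted_hash(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def get(self, key):
        """Hash straight to the bucket, then walk its chain linearly."""
        for existing_key, value in self.buckets[weighted_hash(key)]:
            if existing_key == key:
                return value
        return None

    def stream(self):
        """Return keys in bucket order, which is how they come out unsorted."""
        return [key for chain in self.buckets for key, _ in chain]


hashed_file = ToyHashedFile()
keys = [f"CUST{n:03d}" for n in range(1, 11)]  # ten keys, seven buckets
for key in keys:
    hashed_file.put(key, {"id": key})

print(hashed_file.stream())
print(hashed_file.get("CUST004"))
```

With ten keys and seven buckets at least one chain must hold more than one record, so `get` has to compare keys along that chain rather than rely on the hash alone.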
Phil Hibbs | Capgemini
Technical Consultant
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

-craig

"You can never have too many knives" -- Logan Nine Fingers