Hash file size

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

yaminids
Premium Member
Posts: 387
Joined: Mon Oct 18, 2004 1:04 pm

Hash file size

Post by yaminids »

Hello there,
Is it possible to build a 'Hash file' with 20 million rows? Would it reach the hash file size limit of 2GB? Can anyone clear up my doubt?
Thanks in advance

-Yamini
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, it's possible.

Whether it reaches the 2GB limit obviously depends on the length of the rows.

If it's likely to exceed 2GB, create a hashed file that uses 64-bit internal addressing; the 2GB limit then goes away (at the cost of a small amount of additional storage overhead). Theoretical maximum size is then approximately 19 million TB. That's not a challenge! Some operating systems do not support files this large.
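For a rough sense of scale, the two limits can be worked out directly. This is only a sketch: it assumes the 2GB limit corresponds to a signed 32-bit byte offset (2**31 bytes) and that 64-bit addressing allows up to 2**64 bytes, neither of which is stated in the post. The helper name is hypothetical.

```python
# Assumption (not from the post): 32-bit addressing tops out at 2**31 bytes,
# and 64-bit addressing at 2**64 bytes.
LIMIT_32BIT_BYTES = 2**31
LIMIT_64BIT_BYTES = 2**64

DECIMAL_TB = 10**12  # one decimal terabyte, in bytes


def limit_in_million_tb(limit_bytes: int) -> float:
    """Express a byte limit in millions of decimal terabytes."""
    return limit_bytes / DECIMAL_TB / 1e6


print(LIMIT_32BIT_BYTES)                                   # 2147483648
print(round(limit_in_million_tb(LIMIT_64BIT_BYTES), 1))    # 18.4
```

Under these assumptions the 64-bit ceiling comes out at roughly 18 million TB, the same order of magnitude as the figure quoted above.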
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

20 million rows, how many average characters per row? If each row is 100 characters, then you need to store 20 million x 100 = 2,000 million characters, which I would guess exceeds 2.2 gigabytes.

So, you need to do a little "data profiling" and figure out if your average row will keep you in 32-bit sizing or send you to 64-bit heaven. You may consider a "partitioned" hash file arrangement, or go for the gusto of a "distributed" hash file. Both terms are discussed in various threads on the forum, as is configuring for 64-bit usage, which is often simpler to use.
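The profiling arithmetic above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the 2GB limit is assumed to be 2**31 bytes, one character is assumed to take one byte, and the overhead factor of 1.2 is a made-up placeholder rather than a measured figure. The real overhead has to come from inspecting an actual hashed file.

```python
# Assumption: the 2GB limit is 2**31 bytes and one character is one byte.
LIMIT_32BIT_BYTES = 2**31


def estimate_bytes(rows: int, avg_row_bytes: int, overhead_factor: float = 1.0) -> int:
    """Estimate stored size as rows x average row length x overhead factor."""
    return int(rows * avg_row_bytes * overhead_factor)


def needs_64bit(rows: int, avg_row_bytes: int, overhead_factor: float = 1.0) -> bool:
    """True if the estimate reaches the assumed 32-bit limit."""
    return estimate_bytes(rows, avg_row_bytes, overhead_factor) >= LIMIT_32BIT_BYTES


# 20 million rows of 100 characters, raw data only.
raw = estimate_bytes(20_000_000, 100)
print(raw)                                   # 2000000000
print(needs_64bit(20_000_000, 100))          # False: raw data alone is just under 2**31

# Hypothetical 20% storage overhead; replace with a measured value.
print(needs_64bit(20_000_000, 100, overhead_factor=1.2))  # True
```

With these numbers the raw data sits just below the limit, so the answer depends entirely on how much overhead the hashed file adds, which is why measuring a representative sample matters.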
Kenneth Bland

sdmy
Participant
Posts: 18
Joined: Thu Dec 09, 2004 7:16 pm

Post by sdmy »

To estimate the hash file size, how can I find the length of the record in Oracle?

Thanks in advance
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Just wanted to point out, regarding the previous post, that the space a row takes in the DB is not always the same as the space it takes in your hash file.
To get the average row size in the DB, you can ask your DBAs.
To get the size in a hash file, you can write representative rows to a hash file and use ANALYZE.FILE to get this info.

IHTH,
Roy R.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Also, if you have a tool like TOAD, it can compute the Average Record Length for you - either based on the metadata, or (more accurately) by scanning the contents of the table.
-craig
yaminids
Premium Member
Posts: 387
Joined: Mon Oct 18, 2004 1:04 pm

Re: There are ways to size a Hashed File, as it becomes too

Post by yaminids »

Hello trokosz,
Thank you very much for your posting.
-Yamini