Hash File Different

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

snassimr
Premium Member
Posts: 281
Joined: Tue May 17, 2005 5:27 am

Hash File Different

Post by snassimr »

Hi !

I need help with hash files.

I need to create a hashed file of 20,000,000 rows with 6 keys, but writing it is too slow.
I want to create the hash file with only one key, which is faster, and then go to Inputs -> Columns and mark the other 5 columns as keys, so I can use the file with all 6 keys in another place.

This does not succeed: some of the columns are empty when I do View Data.
What is the problem with this?

Has anybody solved the slow-write issue another way?
sjacobk
Participant
Posts: 9
Joined: Fri Apr 15, 2005 4:32 am
Location: India

Post by sjacobk »

If you are not doing a lookup inside your job, write to a sequential file first.
Then, later, when the hash file is actually required, create the hash file. I think you need to specify all six keys when you create the hash file. A hash file is just like a table, and its keys act like the primary key of that table.
Smitha Jacob
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

snassimr,

the hash mechanism in a default hashed file of the dynamic file type can be slow when writing a lot of data. If you ensure that the minimum modulus is not the default of 1 but a larger number when you create the file, data is written much, much faster. You can search this forum for details.
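As a rough sizing sketch (the file name MyHashFile, the ~100-byte average record length, and the resulting modulus are illustrative assumptions, not figures from this thread): 20,000,000 rows at roughly 100 bytes each is about 2 GB of data, which with GROUP.SIZE 1 (2048-byte groups) suggests a minimum modulus on the order of 1,000,000:

CREATE.FILE MyHashFile DYNAMIC MINIMUM.MODULUS 1000000 GROUP.SIZE 1

A command like this can be run from the Administrator client's Command window, or through the ExecTCL before-job subroutine, before the load job populates the file.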

If you declare a hash file using just one key and then, in another place, mark other columns as "key" fields, you will not get correct results. All of the keys need to be declared when you create the file, and you must use the same keys in every job that reads from or writes to the hash file.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Your approach to add key columns does not work with hashed files. A hashed file must, at all times, use the key, the whole key and nothing but the key to determine the location of a record.

20,000,000 rows is an awful lot for a hashed file - do you REALLY need them all? Can you be selective about which rows are loaded, for example excluding out-of-date information?

You will definitely need to create the hashed file so that it can grow beyond 2GB - that is, with 64-bit internal addressing. Search the forum for how to achieve this. Since you will need a custom command for creating the hashed file, you can easily incorporate Arnd's suggestions about either (or both) preallocating disk space or using a static - rather than dynamic - hashed file.
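To illustrate (the file name and modulus here are again assumptions), the custom command can combine 64-bit addressing with Arnd's minimum-modulus suggestion in one step:

CREATE.FILE MyHashFile DYNAMIC MINIMUM.MODULUS 1000000 64BIT

and a hashed file that already exists can usually be converted with:

RESIZE MyHashFile * * * 64BIT

Check that your installation actually supports 64-bit files before relying on either command.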

Note, too, that such a large hashed file will not be able to use write cache, because it is larger than the biggest cache that can be allocated.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.