Hash file aKey

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
koojo
Premium Member
Premium Member
Posts: 43
Joined: Sun Jul 11, 2004 1:30 pm
Location: USA

Hash file aKey

Post by koojo »

I need to do the following

Create a Hashfile with the following composite keys
Field one - Integer --Duplicates Allowed [Defined as key on creation]
Field two - Date [Defined as key on creation]
Field Three - Date [Defined as key on creation]

Use Hash File as a look up for the above created hash file as follows
Field one - Integer --Duplicates Allowed [Defined as key on reading]
Field two - Date [Not Defined as key on reading]
Field Three - Date [Not Defined as key on reading]



Will I lose records in Field1 on reading by elimination of duplicate records on the hash files or will this retain the duplicates (I need to retain the duplicates)

In a nutshell on generation of the hash file I have three composite keys(making it unique) and on reading them I need to read from one of the fields which is not unique and not lose the duplicate records.

The lookup is for a large table an is required in multiple jobs

Is it a good idea to do the above or is there a better way to do it?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard! :D

In the job that populates the hashed file, specify all three columns as key, and check the Create check box. (Optionally click on the Options... button to set sizing options, and "delete before create" choice.)

Save the hashed file column definitions into the repository, using the Save button on the Columns tab of its input link. Choose the category where you want the table definition to be saved before doing so.

When you're designing the job that uses the hashed file, you can Load the table definition from where you saved it. All three columns will be defined as keys.

I have good news and bad news. First the bad news.

While you can do all this it will not work if you use a Hashed File stage for reading. A Hashed File stage uses the key, the whole key, and nothing but the key, to locate (or fail to locate the record). A Hashed File stage can, therefore, only ever return one row (since it selects based on the entire primary key).

The good news is that you can do what you want to do, but you must use a UV stage, not a Hashed File stage. A UV stage uses SQL to access data in the hashed file, and supports multi-row return capability. Load the table definition from where you saved it earlier, then change the two date key column definitions so that they are not key.

To enable the multi-row return capability you must visit the link properties in the Transformer stage - see on-line help topic Defining Multirow Lookup for Reference Input for more details.

Performance of this lookup would be enhanced by creating an index on the single remaining key column. The commands required are CREATE.INDEX and BUILD.INDEX. Or, if you use a UV stage when populating/creating the hashed file, then the command is the SQL command CREATE INDEX.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
koojo
Premium Member
Premium Member
Posts: 43
Joined: Sun Jul 11, 2004 1:30 pm
Location: USA

Thanks

Post by koojo »

Opens a new word for me :lol:
Post Reply