Accessing Hashed files using UV stage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
dhiraj
Participant
Posts: 68
Joined: Sat Dec 06, 2003 7:03 am

Accessing Hashed files using UV stage

Post by dhiraj »

Hi,

I am using an UV stage to access a hashed file.Hash file is defined with three keys A,B and C. In which A is of type char and B & C are timestamps.My look up condition to the hashed file(using UV stage) has an equal to (=) contion on column A ,Less Than ( <) condition on column B and greater than(>) condition on column C. This look up is performing very poorly in terms of throughput.

NOw my Question is that whether the UV stage will use the Hashinhg algo rithm to do the look up or do i need to seperately define an index on these fields. Is there any other way to improve the performance. The size of the hash file is around 7 million records.

Thanks,


Dhiraj
WoMaWil
Participant
Posts: 482
Joined: Thu Mar 13, 2003 7:17 am
Location: Amsterdam

Post by WoMaWil »

Hi Dhiraj,

I don't understand what you are doing with you lookup but the way you describe it you seam to do something wrong.

Please give some further detail, what you want to do and what you did for to solve the problem, so that we can help you.

Wolfgang
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If your goal is to emulate the behaviour of a SQL Lookup using "between" you might want to check out this post by the ever popular Mr Kenneth Bland. It describes an alternate methodolgy to do what (I believe) you are trying to do.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The answer to the question is NO.

The hashing algorithm can only be used (and only makes sense) if the exact value of the key is being sought. It is the value of the key that is processed by the hashing algorithm to determine the location of the record.

You can improve performance markedly by indexing the key or its individual columns.
  • If you created the hashed file via the UV stage (that is, using CREATE TABLE), then you use the SQL form of CREATE INDEX and you can index the entire key.
    If you created the hashed file via a Hashed File stage (that is, using CREATE.FILE or mkdbfile), then you use the non-SQL form CREATE.INDEX and must index @ID to index the entire key.
Given the nature of your query it makes better sense to index the individual columns of the key than to index the entire key.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply