Oddity in Hash Files

JDionne
Participant
Posts: 342
Joined: Wed Aug 27, 2003 1:06 pm

Oddity in Hash Files

Post by JDionne »

I am seeing an oddity in hash files that I don't understand. I have one job that creates the hash files, using a directory to receive them. I then go into another job and try to use the hash files. When I look at the files in the second job, the columns are the same but the data values in them have changed. The columns seem to have switched around. Makes joins really hard, heheh.
Has anyone ever seen anything like that?
Jim
Sure I need help... But who doesn't?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Are you sure that the same columns in both jobs are checked as 'keys'?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Peytot
Participant
Posts: 145
Joined: Wed Jun 04, 2003 7:56 am
Location: France

Post by Peytot »

Another possibility: check the 'Position' field. If it has a value, that value may not be the right one, so you can clear it. Personally, I always leave this field blank, which removes that risk.

Pey
sdevashis
Participant
Posts: 54
Joined: Thu Oct 09, 2003 4:00 am
Location: India

The sequence and key matters

Post by sdevashis »

Hi,
The column sequence in hash files matters a lot. It's good practice to save the metadata when a hash file is created.

If you have a hash file defined as (CUST_KEY, CUST_ID, CUST_NAME) and you try to use it as (CUST_KEY, CUST_NAME, CUST_ID), then you have a problem.
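
To make the column-order problem concrete, here is a minimal Python sketch. It is only a simulation, not DataStage code: `read_record`, the `(name, position)` column lists, and the sample values are all hypothetical, and it assumes that position 0 means the key and positions 1 and up index the data fields.

```python
def read_record(key, data_fields, columns):
    """Map stored values to column names purely by position.

    columns is a list of (name, position) pairs; position 0 is taken
    to mean the key, positions 1..n index into data_fields (assumption).
    """
    row = {}
    for name, position in columns:
        row[name] = key if position == 0 else data_fields[position - 1]
    return row


# Column definitions used by the job that writes the file (hypothetical).
writer_columns = [("CUST_KEY", 0), ("CUST_ID", 1), ("CUST_NAME", 2)]
# Column definitions used by the job that reads it, with two columns swapped.
reader_columns = [("CUST_KEY", 0), ("CUST_NAME", 1), ("CUST_ID", 2)]

# One stored record: the key plus its data fields, in the writer's order.
key, data = "1001", ["A-77", "Acme Ltd"]

as_written = read_record(key, data, writer_columns)
as_read = read_record(key, data, reader_columns)

print(as_written)
print(as_read)
```

With the swapped definition, the reader reports the customer ID under CUST_NAME and the name under CUST_ID: same column names, values switched around.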

The keys in hash files matter a lot too. Let's say the file is (CUST_KEY [PK], CUST_ID, CUST_NAME) and you try to use it as (CUST_KEY, CUST_ID [PK], CUST_NAME); then again you are in trouble, as hash files know nothing about keys.

I hope things will go fine now.
/*Devashis*/
aaronej
Participant
Posts: 31
Joined: Mon Aug 18, 2003 9:25 am

Post by aaronej »

I would go with Peytot's suggestion. That pesky position column always causes problems.

Good luck!

Aaron
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Well, it may "cause problems", but you have to learn to live with it. That's because "position" is how columns are identified in records in hashed files; by their ordinal position.
The hashed file storage mechanism is precisely a delimited string; all values are stored as text, and the location of a particular field is determined by counting those delimiters. For example, to get the column in position four, DataStage counts past three delimiters, then starts accreting bytes until the fourth delimiter (or end-of-record) is encountered.
Position 0 is the key value. If there are multiple key columns, they each have position 0, and DataStage has the means (within the hashed file's dictionary) to determine which is the first, which is the second, and so on.
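
The delimiter counting described above can be sketched in Python. This is an illustration only, not how DataStage is actually implemented: `extract_field` is a hypothetical helper, the sample record is made up, and the use of character 254 as the field delimiter is an assumption made for the example.

```python
FIELD_MARK = "\xfe"  # assumption: character 254 stands in for the field delimiter


def extract_field(record, position):
    """Return the field at a 1-based position by counting delimiters.

    Skips past (position - 1) delimiters, then collects characters up to
    the next delimiter or the end of the record. A position beyond the
    last field yields an empty string.
    """
    if position < 1:
        raise ValueError("data field positions start at 1; 0 is the key")
    start = 0
    for _ in range(position - 1):
        idx = record.find(FIELD_MARK, start)
        if idx == -1:
            return ""  # fewer fields than requested
        start = idx + 1
    end = record.find(FIELD_MARK, start)
    if end == -1:
        end = len(record)  # end-of-record terminates the last field
    return record[start:end]


# A made-up data record with four fields; the key is held separately.
record = FIELD_MARK.join(["Acme Ltd", "Denver", "CO", "80202"])

print(extract_field(record, 4))  # counts past three delimiters
print(extract_field(record, 2))
```

Nothing in the record itself says which field is which; the position in the column definition is the only thing that ties a name to a value.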
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.