I am seeing an oddity in hash files that I don't understand. I have one job that creates the hash files, using a directory to receive them. I then go into another job and try to use the hash files. When I look at the files in the second job, the columns are the same but the data values in them have changed. The columns seem to have switched around. Makes joins really hard, heheh.
Has anyone ever seen anything like that?
Jim
Oddity in Hash Files
Sure I need help... But who doesn't?
The sequence and key matter
Hi,
Column sequence matters a lot in hash files. It's good practice to save the metadata when a hash file is created.
If you have a hash file laid out as (CUST_KEY, CUST_ID, CUST_NAME) and you try to use it as (CUST_KEY, CUST_NAME, CUST_ID), then you have a problem.
The keys in hash files matter a lot too. Say the file is (CUST_KEY [PK], CUST_ID, CUST_NAME) and you try to use it as (CUST_KEY, CUST_ID [PK], CUST_NAME); then again you are in trouble, as hash files know nothing about keys.
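To make the column-order point concrete, here is a minimal Python sketch (not DataStage code). It assumes a record is just a list of values matched to column names by position; the sample values and the helper name read_record are made up for illustration.

```python
# Hypothetical illustration: values are stored by position only,
# so the column names come entirely from the metadata the reader supplies.
stored_values = ["1001", "C-42", "Alice"]  # written as CUST_KEY, CUST_ID, CUST_NAME


def read_record(values, column_names):
    """Pair each stored value with the column name at the same ordinal position."""
    return dict(zip(column_names, values))


writer_layout = ["CUST_KEY", "CUST_ID", "CUST_NAME"]  # metadata used by the job that wrote the file
reader_layout = ["CUST_KEY", "CUST_NAME", "CUST_ID"]  # metadata used (wrongly) by the second job

as_written = read_record(stored_values, writer_layout)
as_misread = read_record(stored_values, reader_layout)

print(as_written)
print(as_misread)
```

With the second layout the same stored values end up under swapped names, which is the "columns switched around" symptom from the first post. Loading the saved metadata in the second job keeps both layouts identical.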
I hope things will go fine now.
/*Devashis*/
Well, it may "cause problems", but you have to learn to live with it. That's because "position" is how columns are identified in records in hashed files; by their ordinal position.
The hashed file storage mechanism is precisely a delimited string; all values are stored as text, and the location of a particular field is determined by counting those delimiters. For example, to get the column in position four, DataStage counts past three delimiters, then starts accreting bytes until the fourth delimiter (or end-of-record) is encountered.
Position 0 is the key value. If there are multiple key columns, they each have position 0, and DataStage has the means (within the hashed file's dictionary) to determine which is the first, which is the second, and so on.
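The delimiter counting described above can be sketched roughly as follows. This is Python for illustration only, not how the engine is implemented; the delimiter character, the sample record, and the function name get_field are assumptions, and the key is simply passed in separately to stand for position 0.

```python
FIELD_MARK = "\xfe"  # assumed delimiter for this sketch


def get_field(key, record, position):
    """Return the value at an ordinal position; position 0 is the key."""
    if position == 0:
        return key
    start = 0
    # Count past (position - 1) delimiters to find where the field starts.
    for _ in range(position - 1):
        idx = record.find(FIELD_MARK, start)
        if idx == -1:
            return ""  # the record has fewer fields than requested
        start = idx + 1
    # Collect characters up to the next delimiter or the end of the record.
    end = record.find(FIELD_MARK, start)
    return record[start:] if end == -1 else record[start:end]


# Hypothetical record with four non-key fields.
record = FIELD_MARK.join(["C-42", "Alice", "Sydney", "2002-10-23"])

print(get_field("1001", record, 0))
print(get_field("1001", record, 4))
```

Nothing in the record says what a field is called. Only its position is known, so any metadata that lists the columns in a different order will read different values under the same names.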
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.