Hash File Key Columns
Moderators: chulett, rschirm, roy
Hash File Key Columns
I am populating 13 columns into a hash file. Out of which first 6 columns are keys, as these 6 columns make the record unique.
The same hash file is used as a look up in transformer. This hash file is looked up by input source. I need to compare only first 4 columns of hash files with that of source.
So, for that, while reading from hash file, I need to make only first 4 columns of hash file as keys in transforemr because if the 5th & 6th column are made as key, there is no field in source to compare, and it gives me error.
But if i make only 1st 4 fields as keys, the incoming data from hash file is wrong, and so look up is not done properly. Wrong in the sense the values for columns are interchanged in hash file while reading.
If I make only 4 fileds as keys while populating hash file, only latest values for 5th & 6TH columns will be present as it overides the old values.
What is the solution for this. I need to compare only 1st 4 fields, but the hash file has to be populated with 6 fields
The same hash file is used as a look up in transformer. This hash file is looked up by input source. I need to compare only first 4 columns of hash files with that of source.
So, for that, while reading from hash file, I need to make only first 4 columns of hash file as keys in transforemr because if the 5th & 6th column are made as key, there is no field in source to compare, and it gives me error.
But if i make only 1st 4 fields as keys, the incoming data from hash file is wrong, and so look up is not done properly. Wrong in the sense the values for columns are interchanged in hash file while reading.
If I make only 4 fileds as keys while populating hash file, only latest values for 5th & 6TH columns will be present as it overides the old values.
What is the solution for this. I need to compare only 1st 4 fields, but the hash file has to be populated with 6 fields
Aim high
Could you please explain why you wanted to create hash file with 6 key columns when you are using only 4 for lookup.
I don't think you can create hash file with 6 key columns and using only 4 key columns. Because Hash file work in the following way.
When it stores the data, hash key will be generated for all unique key columns and based on that it will be stored / retrieved from hash file. if you use only a few as key in lookup wrong hash key will be generated which will fetch wrong values.
I don't think you can create hash file with 6 key columns and using only 4 key columns. Because Hash file work in the following way.
When it stores the data, hash key will be generated for all unique key columns and based on that it will be stored / retrieved from hash file. if you use only a few as key in lookup wrong hash key will be generated which will fetch wrong values.
It is Hahsed file and not Hash file.
And the hashing mecanism is for the whole keys rather than the keys.
Its basically stores the data based on the order of the field. The position can be still changed while reading the file using the "Position" option. But care should be take while using this. It will totally confuse the order of the fields.
And the hashing mecanism is for the whole keys rather than the keys.
Its basically stores the data based on the order of the field. The position can be still changed while reading the file using the "Position" option. But care should be take while using this. It will totally confuse the order of the fields.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
As the OP stated that input has many records when considiring first 4 keys and the duplicate rows are getting rejected during the key field mark. And would be able to get the full list only if all the six fields are marked as key.
But its a very good valid point that, if the lookup is done only based on four keys, the same duplicate issue will occur. Lets see whats the requirement of OP.
But its a very good valid point that, if the lookup is done only based on four keys, the same duplicate issue will occur. Lets see whats the requirement of OP.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Kumar I want to compare first 4 only not 3rd, 4th, 5th, 6th columns. I didnt understand the follwoing sentence.
suppose values for 6 columns are as below for 2 records:
a b c d e f g
a b c d e s t
If I make first 4 columns as keys then in hash file only the second records will be present.
a b c d e s t as it overrides the first record.
So, what should I do to satisfy my requirement
Yes, as oacvb said I am using only partial key columns. I am populating hash file with 6 key columns becauseIf so change the order, maintain the same order while populating as well as lookup operation.
Code: Select all
If I make only 4 fileds as keys while populating hash file, only latest values for 5th & 6TH columns will be present as it overides the old values
a b c d e f g
a b c d e s t
If I make first 4 columns as keys then in hash file only the second records will be present.
a b c d e s t as it overrides the first record.
So, what should I do to satisfy my requirement
Aim high
As noted, you cannot build a hashed file with X key fields and then perform a lookup with less than X key fields. Period. You have two choices here:
1) Build the hashed file with 4 key fields. Yes, the data for the 'missing' 5 & 6 key fields will because they are no longer part of the key and the 'destructive overwrite' mechanism will trump the old values. Make sure that the last one that is written to the hashed file is the one that you want back from the lookup.
2) Use a UV stage over the hashed file to do the lookup. That will allow you to use less keys during lookup that it was built with. It will also mean that each lookup could return more than one record.
The latter is a whole 'nuther conversation. You'd need a VOC record for the UV stage, and you'd need to build an index over the 4 key fields or the lookup will be dog slow. This has been discussed quite a number of times, which a search for the terms BUILD.INDEX or CREATE.INDEX would reveal.
1) Build the hashed file with 4 key fields. Yes, the data for the 'missing' 5 & 6 key fields will because they are no longer part of the key and the 'destructive overwrite' mechanism will trump the old values. Make sure that the last one that is written to the hashed file is the one that you want back from the lookup.
2) Use a UV stage over the hashed file to do the lookup. That will allow you to use less keys during lookup that it was built with. It will also mean that each lookup could return more than one record.
The latter is a whole 'nuther conversation. You'd need a VOC record for the UV stage, and you'd need to build an index over the 4 key fields or the lookup will be dog slow. This has been discussed quite a number of times, which a search for the terms BUILD.INDEX or CREATE.INDEX would reveal.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It is possible, but only using a UV stage, as noted. Performance will be helped by indexing the four columns that make up the partial key. The UV stage does support multiple reference row return (find out how by searching the forum or on-line Help).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.