Hash File Question
Moderators: chulett, rschirm, roy
I've come across a job that selects data from a database table and populates a hash file. In particular, one source column has a datatype of char(40) in the source system. However, in DataStage the column has been defined incorrectly as char(10) and is passed through as such to the hash file where the column is also defined as char(10). (It's the last column in the hash file.)
I realize that this discrepancy is incorrect and must be fixed, however... when I view the hash file, I can see all 40 characters contained in the source data, not just the first ten. Will this difference between the defined maximum column width and actual values cause the lookup results be incorrect, or is the tool fairly forgiving in these types of cases?
Thanks!
Peter
DataStage hashed file don't have datatype based limitations. Everything in the record is stored as one long noncontinguous string. The data types and lengths you specify in the metadata are mainly for use in other stages. So there is no reason to modify your Hashed file metadata to reflect accurate string lengths.
It is important to note, though, that a CHAR(10) column from a hashed file that actually contains up to 40 characters will cause problems when writing to an Oracle CHAR(10) column, since DS won't automatically or implicitly truncate the data.
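The usual fix is an explicit truncation in the transformer derivation before the database stage (in DataStage BASIC a substring such as `InLink.CUST_NAME[1,10]`). The same idea as a Python sketch, with the column name assumed for illustration:

```python
# Sketch: explicitly truncate (and pad) a value to fit a CHAR(10) target
# column, mirroring a transformer derivation like InLink.CUST_NAME[1,10].

def fit_char(value, width):
    # Truncate to the declared width, then space-pad as CHAR semantics expect.
    return value[:width].ljust(width)

print(fit_char("A name that is forty characters long....", 10))  # "A name tha"
```

Doing this explicitly also makes the (lossy) truncation visible in the job design instead of relying on the database to reject or silently clip the row.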
The key is stored separately from the data. A hashed file only has one unique key string. Multiple or compound keys as used in DS jobs are actually stored as one string with a specific separator.
Actually the key is part of the string. Each level has its own delimiter, starting with char(255), the item mark, which cannot be used in data at all. Char(254) is the field mark, also known as @FM; it separates the fields of the record. Char(253) is the value mark, @VM; a value mark separates multiple values (an array) at the column level. Char(252) is the subvalue mark, @SM. Char(251) is the text mark, @TM, and @TM is used to separate the parts of a multi-part key.
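These system delimiters are easy to demonstrate with a small sketch (Python here for convenience; the character values are the standard UniVerse dynamic-array marks):

```python
# Standard UniVerse/DataStage dynamic-array delimiters:
IM = chr(255)  # item mark    - reserved, cannot appear in data
FM = chr(254)  # field mark   (@FM) - separates fields in a record
VM = chr(253)  # value mark   (@VM) - separates values within a field
SM = chr(252)  # subvalue mark (@SM)
TM = chr(251)  # text mark    (@TM) - e.g. parts of a multi-part key

# A compound key stored as one string:
key = TM.join(["ORDER123", "LINE07"])

# A record with two fields; the second holds a multi-valued array:
record = FM.join(["Smith", VM.join(["phone1", "phone2"])])

fields = record.split(FM)
print(fields[0])              # Smith
print(fields[1].split(VM))    # ['phone1', 'phone2']
print(key.split(TM))          # ['ORDER123', 'LINE07']
```

The key names `ORDER123`/`LINE07` and the field contents are made up for the example; the mark characters themselves are the real reserved values.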
Mamu Kim
Yes if accessed natively; no if accessed via SQL, which does enforce data types and so on. If the hashed file is created as a UV table, then data types and security and integrity constraints are honoured at all times.
The only exception is where you have created triggers; these cannot be bypassed at all. But this is not something currently practised in DataStage, mainly because it's not required.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.