
Regarding the Hashed File Stage Column Length

Posted: Wed Oct 26, 2005 10:07 am
by loveojha2
Hi All,

I have a question related to the column length that we specify for VarChar columns in Hashed Files.

I have a Hashed File with more than 20 fields, all of them VarChars. Some of the fields are even longer than 4000 characters, but I don't know the exact lengths of the fields.
I am using VarChar(100) for each of the fields.

From this I am creating a new Hashed File (it is just a copy plus some transformations), and in that too I am using the same length of 100 for each column.

I don't know how, but somehow it works fine. Why? :?

Thanks in advance.

Posted: Wed Oct 26, 2005 10:30 am
by ArndW
Loveojha2,

the underlying engine in DataStage has never been typed and doesn't restrict string data types; the concept of "length" is limited to display length, and the string mechanism doesn't reserve contiguous blocks in order to store strings.

Parts of this approach have been inherited by DataStage server jobs. If you read a text file with a VarChar(32) definition but a field contains 4096 characters, DataStage server will not even notice and will process it correctly. You will get an error when writing to a database column that has a VarChar(32) limitation, but that error will come from the ODBC or database interface, not from DataStage.
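The behaviour is easiest to see in a sketch. This is a minimal, hypothetical Python model of a hashed file (DataStage is not scriptable this way; the names `write_row`/`read_row` and the dict storage are my illustration only): the declared length is metadata that is simply never consulted when a value is stored or fetched.

```python
# Hypothetical model of a hashed file: a key/value store of strings.
# The declared VarChar length exists only in the job design; the
# storage layer never enforces it.
hashed_file = {}

DECLARED_LENGTH = 32  # "VarChar(32)" in the metadata -- informational only


def write_row(key, value):
    # No truncation to DECLARED_LENGTH: the full string is stored as-is.
    hashed_file[key] = str(value)


def read_row(key):
    # Returned unchanged; no length check on the way out either.
    return hashed_file.get(key)


write_row("K1", "x" * 4096)          # 4096 chars into a "VarChar(32)" column
assert len(read_row("K1")) == 4096   # nothing was truncated or rejected
```

The error only appears at the point where a genuinely typed system (the database, via ODBC) finally inspects the value.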

Posted: Wed Oct 26, 2005 1:40 pm
by kcbland
Read up on hashed files in the most recent newsletter. There are no data types and no practical size limitations on columns.

Posted: Wed Oct 26, 2005 1:49 pm
by DeepakCorning
ArndW wrote:Loveojha2,
the underlying engine in DataStage has never been typed and doesn't restrict string data types; the concept of "length" is limited to display length, and the string mechanism doesn't reserve contiguous blocks in order to store strings.
Does that mean it doesn't really matter what you specify for the type or the length? It won't affect the form in which the data is stored in a hash file? So even if I define a column as VarChar(1) and pass a Char(20) value to it, it will store the data and not truncate it? What happens if such a file is loaded and then used as a lookup??
So that means even if I have to change from Char(5) to Char(2), I have to use a function to take a substring; defining a different length will not handle this situation.
I tried it by designing some jobs, and yes, it seems to be true. It does not throw an error until the database comes into the picture.
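That substring point can be sketched in Python (again a hypothetical illustration, not DataStage syntax; in a real Transformer you would use a substring expression such as `value[1,2]`): redeclaring the length does nothing, so the truncation has to be an explicit operation you apply yourself.

```python
# The stored value is untouched even if the column is redeclared Char(2).
value = "ABCDE"        # written through a "Char(5)" definition

redeclared = value     # merely changing the metadata to Char(2) -- no effect
truncated = value[0:2] # an explicit substring is what actually shortens it

assert redeclared == "ABCDE"
assert truncated == "AB"
```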

Posted: Wed Oct 26, 2005 2:31 pm
by ArndW
DeepakCorning,

that's the way it works. Declare it as an integer column reading from a text file, write it to a hashed file stage as a Char(10), and read it in another stage as a Decimal(10,2), and it will still contain the original contents, regardless of whether it was "hello there", Japanese text, or "1.2345".
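As a sketch of that round trip (hypothetical function names; DataStage has no such API): each stage only attaches metadata, and the stored string is handed back unchanged no matter what type it is later declared as.

```python
# Hypothetical: reading a column never casts or validates -- the declared
# type is carried alongside the value but ignored at runtime.
def read_column(raw, declared_type):
    # declared_type ("Integer", "Char(10)", "Decimal(10,2)", ...) is unused.
    return raw


original = "hello there"                 # written via an Integer declaration
assert read_column(original, "Char(10)") == original
assert read_column(original, "Decimal(10,2)") == original
```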

Posted: Wed Oct 26, 2005 2:53 pm
by kcbland
What does matter is the scale. When there is a scale setting and you perform math (just a "+0" will do) on a column whose data type is DECIMAL, the number is rounded to the scale setting. Other than that, there is no interaction with the metadata during processing until you reach a database or a Sequential File stage, where the metadata matters.
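A small Python sketch of that one exception (my illustration; the function name and the half-up rounding mode are assumptions, as the thread does not say which rounding DataStage applies): the value passes through as a string until arithmetic touches it, at which point the DECIMAL scale is enforced.

```python
from decimal import Decimal, ROUND_HALF_UP

def apply_decimal_scale(value, scale):
    # Models what a "+0" does to a column declared DECIMAL(p, scale):
    # the string becomes a number rounded to 'scale' decimal places.
    quantum = Decimal(1).scaleb(-scale)   # scale=2 -> Decimal('0.01')
    return str(Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP))

print(apply_decimal_scale("1.2345", 2))   # -> "1.23" once math is performed
print(apply_decimal_scale("1.2", 2))      # -> "1.20"
```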

Posted: Wed Oct 26, 2005 9:03 pm
by loveojha2
Thank you all, Kenneth and Arnd. Thanks for your replies :)