New column declared as integer; value stored as string

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

pdntsap
Premium Member
Posts: 107
Joined: Mon Jul 04, 2011 5:38 pm

Post by pdntsap »

We have an integer value as input in one of the columns. In the Transformer stage we declare a new integer column, apply some basic if-then-else logic to the incoming integer column, and store the integer result in the new column. The problem is that this integer is converted to a string and stored as a string (within double quotes, as we have set the Field defaults option Quote=Double). We are not sure why this is happening, and any help would be appreciated.

Thanks.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

What do you mean by storing? Where are you storing the value--database, dataset, sequential file? What is the schema of the output?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.

Post by pdntsap »

We are making use of sequential files. So the data flow is:

Sequential file---> Transformer---> Sequential file.

If the input file has 10 columns, the output file has 11 columns with a new column added in the Transformer stage.

Post by jwiles »

The sequential file operator is converting the values to string representations of the data as part of its normal operation.


Post by pdntsap »

What is the best way to avoid the conversion to strings? I ran into a similar situation where we performed a join on an integer column present in two sequential files, and the column was converted to string in the output sequential file.

Thanks.

Post by jwiles »

In your "similar situation" case, were the integers truly stored in an integer format (binary instead of character) in the sequential files, or were they simply a string of character digits? For example, consider the following record from a text (i.e. all-character) CSV file:

Code:

92,91,50,25,72
There are five 2-byte columns of data. Would you say that these are integers or that they are strings? I can treat them as either.
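To make the point concrete, here is a minimal Python sketch (not DataStage code, just an illustration) that reads the same record both ways. The variable names are made up for the example; the only thing taken from the post is the record itself.

```python
# The same text record can be read either way; the file itself does not decide.
record = "92,91,50,25,72"

# Interpretation 1: five 2-character strings.
as_strings = record.split(",")

# Interpretation 2: the same fields parsed as integers.
as_integers = [int(field) for field in as_strings]

print(as_strings)
print(as_integers)
```

Which interpretation applies is decided by the schema the reader puts on the file, not by the bytes in it.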

If you're calling your converted integers strings just because they have quotes around the values, remove the quotes option from the Field Defaults category on the Format tab in your sequential file stage. If you still want quotes around your strings, add the option at the column level.
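As a rough analogy only, Python's csv module has a similar distinction between quoting every field and quoting only the non-numeric ones. This is not the DataStage Format tab, and the helper name write_row is hypothetical, but it shows why quotes around a value say nothing about its declared type.

```python
import csv
import io


def write_row(row, quoting):
    """Write one row to an in-memory CSV buffer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue()


row = [92, "abc", 50]  # two integers and one string

# Roughly like a quote setting applied to every field by default.
quote_everything = write_row(row, csv.QUOTE_ALL)

# Roughly like quoting only the string columns.
quote_strings_only = write_row(row, csv.QUOTE_NONNUMERIC)

print(quote_everything, end="")
print(quote_strings_only, end="")
```

In both cases the values 92 and 50 are integers in the program; only the written text differs.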


Post by pdntsap »

Removing the quotes option from the Field Defaults category on the Format tab in the sequential file stage fixed the problem.

James,

I believe the integers were stored as integer but were enclosed in strings. How do you check the hex code of the file? I believe looking at the hex code gives you the exact picture, right?

Post by jwiles »

If you can read the integers when viewing the file with more, less or cat, then the values are stored as character strings. If they were stored as actual integer format fields, you would see garbage and/or your records would appear to be corrupted somehow.

more, less and cat have no concept of data layouts other than standard line terminators, and therefore cannot automatically convert non-character data (i.e. integers in binary integer format) to character.


pdntsap wrote: "I believe the integers were stored as integer but were enclosed in strings"

They were stored as character representations of integers and enclosed in quotes. You could not have visually read them if they were anything other than character strings.
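On the question of checking the hex code: the following Python sketch shows what the bytes look like in each case. It assumes 4-byte little-endian integers for the binary layout, which is only one possible format, and looks_like_text is a hypothetical helper written for this example. Command-line tools such as od or xxd will show the same kind of byte dump for a real file.

```python
import struct

values = [92, 91, 50, 25, 72]

# Character representation: what a delimited text file actually contains.
text_bytes = ",".join(str(v) for v in values).encode("ascii")

# Binary representation (assumption: 4-byte little-endian signed integers).
binary_bytes = struct.pack("<5i", *values)


def looks_like_text(data: bytes) -> bool:
    """Return True if every byte is printable ASCII, tab, LF, or CR."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


print(text_bytes.hex(" "))    # digits and commas as ASCII codes
print(binary_bytes.hex(" "))  # mostly zero bytes, unreadable in cat/more/less
print(looks_like_text(text_bytes), looks_like_text(binary_bytes))
```

If a dump of the file shows only ASCII digit codes (0x30 to 0x39) plus delimiters and quotes, the integers are stored as characters.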


Post by pdntsap »

That makes a lot of sense now. I was able to view them using cat, more, and less, so the integers were stored as character representations.

Thanks for the help.