Funny characters in Sequential file columns of a parallel job

Posted: Tue Sep 16, 2008 12:47 am
by eldonp
We are re-writing a certain job into parallel. The job creates a pipe-delimited flat file.

The problem is that for all varchar fields where the data is shorter than the field length, DataStage is not trimming the data as our trimb(column) function should, and is instead inserting 'place holder' characters that resemble a vertical rectangle.

When we do a view data on the file in the job design, we do not see these characters.

When we open the file with more or vi in a telnet session, we do not see these characters.

When we try to consume the file in another system, these characters are interpreted as end of line characters and the file is truncated at column 7.

When we view the file in Textpad, we see these 'place-holder' characters.

Has anyone else experienced this, or does anyone know how we can overcome it?

Posted: Tue Sep 16, 2008 1:02 am
by ray.wurlod
Lots of people, as a Search would have revealed. This "small rectangle" pad character is actually a NUL byte (0x00). It is the default string pad character used in DataStage parallel jobs. In C programming, this character is understood to mean "end of string", which may explain what you're seeing when consuming the data. You need to change the default string pad character by setting the new pad character as the value of the APT_STRING_PADCHAR environment variable.
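
To make the symptom concrete, here is a rough Python sketch (not DataStage code) of how one might check a pipe-delimited record for NUL bytes and see what a C-style consumer would keep. The function names and the sample record are hypothetical and purely for illustration.

```python
from typing import List


def find_nul_fields(line: bytes, delimiter: bytes = b"|") -> List[int]:
    """Return the 1-based positions of fields that contain a NUL (0x00) byte."""
    fields = line.rstrip(b"\r\n").split(delimiter)
    return [i for i, field in enumerate(fields, start=1) if b"\x00" in field]


def c_style_view(data: bytes) -> bytes:
    """Mimic a C-string consumer: everything from the first NUL onward is ignored."""
    return data.split(b"\x00", 1)[0]


# Hypothetical record: the second field holds "AB" padded with NULs to length 5.
sample = b"1001|AB\x00\x00\x00|OK\n"

nul_fields = find_nul_fields(sample)   # which fields carry pad bytes
truncated = c_style_view(sample)       # what a NUL-terminated reader would see

print(nul_fields)
print(truncated)
```

Reading the file in binary mode matters here: a text viewer may simply hide the 0x00 bytes, which would be consistent with more/vi and View Data showing nothing unusual.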

Posted: Tue Sep 16, 2008 3:26 pm
by AKUMAR21
Use APT_STRING_PADCHAR with the value 0x20, where 20 is the hex value for a space.

Posted: Fri Sep 19, 2008 11:18 pm
by ambasta
Try this.
Let's suppose the data in column A has length 10.
Take a substring of the full length, i.e. A[1,10].
Now trim it:
Trim(A[1,10]) will remove the leading and trailing spaces.

Posted: Sat Sep 20, 2008 12:20 am
by ray.wurlod
Maybe, but if the data type is Char(10), which the framework calls string[10], it will immediately be refilled with the character specified by the APT_STRING_PADCHAR environment variable.

The effect is that you can't really trim a Char data type. Trim is only really sensible with VarChar. The framework represents VarChar(10) as string[max=10].
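
The difference can be modelled with a small Python sketch. This is only an analogy for the behaviour described above, not how the framework is implemented; store_char, store_varchar and PAD_CHAR are hypothetical names.

```python
PAD_CHAR = "\x00"  # default pad character per this thread; " " (0x20) would pad with spaces


def store_char(value: str, length: int, pad: str = PAD_CHAR) -> str:
    """Model a fixed-length string[length]: truncate, then pad back to full length."""
    return value[:length].ljust(length, pad)


def store_varchar(value: str, max_length: int) -> str:
    """Model a bounded string[max=max_length]: truncate only, never pad."""
    return value[:max_length]


trimmed = "ABC   ".strip()               # the trim itself works: "ABC"

as_char = store_char(trimmed, 10)        # refilled to 10 characters with the pad character
as_varchar = store_varchar(trimmed, 10)  # stays at 3 characters

print(repr(as_char), len(as_char))
print(repr(as_varchar), len(as_varchar))
```

In this model the trim is not lost; the fixed-length type simply pads the value again when it is stored, which is why only the variable-length type keeps the shorter result.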