Fixed width ASCII file - Chinese Chars

anu123 · Post by **anu123** » Thu Feb 07, 2013 10:14 pm

Hello,

I am creating a fixed-width (ASCII) target file with the Sequential File stage. One field sometimes contains Chinese characters, and in any record where it does, the value of the next field is shifted to the right, so the record format is wrong.

I am using UTF-8 at the job properties level.
That specific field is defined as Unicode throughout the job.
Field length is 100.

Please help me fix this issue.

Thanks

ray.wurlod · Post by **ray.wurlod** » Thu Feb 07, 2013 10:43 pm

All specifications should be in CHARACTERS. So the number of bytes per character is not an issue.
Do your data contain any double-width Chinese characters (that is, characters that take up two display positions, such as "double happy")? If so that might be affecting your file format.

anu123 · Post by **anu123** » Fri Feb 08, 2013 10:21 am

ray.wurlod wrote: All specifications should be in CHARACTERS. So the number of bytes per character is not an issue.
Do your data contain any double-width Chinese characters (that is, characters that take up two display positions, such as "double happy")? If so that might be affecting your file format.

Hi Ray,

Thank you so much.

All fields are defined as CHAR, and you are right, I do have double-width Chinese characters. Is it possible to trim the value so that it fits the specified length? Is there a specific function for trimming Chinese characters?

eph · Post by **eph** » Fri Feb 08, 2013 11:24 am

Hi,

I suggest you take a look at this technote (which also applies to 9.1 and to any application/program):
http://www-01.ibm.com/support/docview.w ... wg21455000

It is definitely not possible to read a fixed-length file that uses a variable-length character encoding (such as UTF-8 or UTF-16). You should use only a fixed-length encoding such as ISO-8859-xx.
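As a rough illustration of why the byte length varies, the following Python sketch encodes two three-character strings in UTF-8 and then tries a single-byte encoding. The sample strings are assumptions chosen for the demonstration. Note that ISO-8859-1 cannot represent Chinese characters at all, so a single-byte encoding only helps if the data fits its repertoire.

```python
text_ascii = "abc"    # hypothetical ASCII sample
text_cjk = "中文字"    # hypothetical Chinese sample, also three characters

for label, value in (("ascii", text_ascii), ("cjk", text_cjk)):
    encoded = value.encode("utf-8")
    # Same character count, different byte count under UTF-8
    print(label, "characters:", len(value), "utf-8 bytes:", len(encoded))

try:
    text_cjk.encode("iso-8859-1")
except UnicodeEncodeError as exc:
    # Single-byte ISO-8859-1 has no code points for CJK ideographs
    print("iso-8859-1 cannot encode this text:", exc.reason)
```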

Check Wikipedia for alternatives.

Eric

anu123 · Post by **anu123** » Fri Feb 08, 2013 11:32 am

Hi Eric,

I am trying to write to the (target) file, not read from it. Any suggestions? Thanks.

eph · Post by **eph** » Mon Feb 11, 2013 2:52 am

Hi,

I don't think this is possible, as noted in this technote:
http://www-01.ibm.com/support/docview.w ... wg21485843

Hence, it is not possible to generate a fixed-length text file with UTF-8.
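If the consumer of the file counts field widths in bytes rather than characters, one possible workaround outside the Sequential File stage is to truncate and pad each value to a fixed byte budget. The Python sketch below is only an illustration under that assumption; `fit_to_bytes` is a hypothetical helper, not a DataStage function, and it silently drops any characters that do not fit.

```python
def fit_to_bytes(value: str, width: int) -> bytes:
    """Return `value` as UTF-8, cut on a character boundary and space-padded to `width` bytes."""
    raw = value.encode("utf-8")[:width]
    # A byte-level cut can split a multi-byte character; decoding with
    # errors="ignore" discards the incomplete trailing sequence.
    trimmed = raw.decode("utf-8", errors="ignore").encode("utf-8")
    return trimmed + b" " * (width - len(trimmed))


# Every field comes out exactly 5 bytes long, whatever the input
for sample in ("abc", "中文", "abcdefgh"):
    field = fit_to_bytes(sample, 5)
    print(repr(sample), len(field), field)
```

Such a file is fixed-width in bytes only. It will still look misaligned in a viewer that counts characters or display columns.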

Maybe someone else could give you a way to do it.

Eric

ray.wurlod · Post by **ray.wurlod** » Mon Feb 11, 2013 1:09 pm

You cannot trim a double-width character to make it smaller. It's called double width because it has so many strokes that it occupies more space than a regular character. (Or it's deliberately double width for the purposes of beauty in typography, such as the double width space.)