Fixed width ASCII file - Chinese Chars


Moderators: chulett, rschirm, roy

anu123
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA


Post by anu123 »

Hello,

I am creating a fixed-width (ASCII) target file with the Sequential File stage. One field sometimes contains Chinese characters. In any record where that field contains Chinese characters, the value of the next field is shifted to the right, so the record layout is wrong.

I am using UTF-8 at the job properties level.
That specific field is defined as Unicode throughout the job.
The field length is 100.

Please help me fix this issue.

Thanks
Thank you,
Anu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

All specifications should be in CHARACTERS. So the number of bytes per character is not an issue.
Do your data contain any double-width Chinese characters (that is, characters that take up two display positions, such as "double happy")? If so that might be affecting your file format.
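
To make the distinction concrete, here is a minimal Python sketch (not DataStage code) that measures the same string three ways: characters, UTF-8 bytes, and approximate display columns. The helper names `display_width` and `measure` are made up for this illustration, and the column count is an approximation based on the Unicode East Asian Width property; it ignores combining marks and other zero-width characters.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate display columns: Wide/Fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def measure(text: str) -> dict:
    """Return the three different 'lengths' of the same string."""
    return {
        "characters": len(text),                  # Unicode code points
        "utf8_bytes": len(text.encode("utf-8")),  # bytes written to a UTF-8 file
        "display_columns": display_width(text),   # columns in a monospaced viewer
    }


if __name__ == "__main__":
    # "\u4e2d\u6587" is the two-character Chinese word for "Chinese (language)".
    for sample in ("AB", "\u4e2d\u6587"):
        print(repr(sample), measure(sample))
```

For an ASCII value all three numbers agree. For Chinese text they generally differ, which is why a field that is 100 characters long can still push the next field to the right when the file is viewed by byte offset or by display column.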
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anu123
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

ray.wurlod wrote:
All specifications should be in CHARACTERS. So the number of bytes per character is not an issue.
Do your data contain any double-width Chinese characters (that is, characters that take up two display positions, such as "double happy")? If so that might be affecting your file format.
Hi Ray,

Thank you so much.

I have CHAR for all fields, and you are right: I do have double-width Chinese characters. Is it possible to trim the value so that it fits the specified length? Is there a specific function for trimming Chinese characters?
Thank you,
Anu
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

I suggest you take a look at this technote (which also applies to 9.1, and to any application or program):
http://www-01.ibm.com/support/docview.w ... wg21455000

It is definitely not possible to read a file with fixed-length fields using a variable-length encoding (such as UTF-8 or UTF-16). You should use only a fixed-length encoding, such as the single-byte ISO-8859-xx family.

Check Wikipedia for alternatives.
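
As a quick way to compare candidate encodings, the following Python sketch (not DataStage code) reports how many bytes one character occupies in each. The function name `encoded_lengths` and the list of encodings are only assumptions for the illustration. `None` means the encoding cannot represent the character at all, which is what happens with ISO-8859-1 and Chinese text.

```python
def encoded_lengths(text, encodings=("utf-8", "utf-16-le", "utf-32-le", "iso-8859-1")):
    """Map each encoding name to the encoded byte length of text.

    The value is None when the encoding cannot represent the text.
    """
    result = {}
    for name in encodings:
        try:
            result[name] = len(text.encode(name))
        except UnicodeEncodeError:
            result[name] = None
    return result


if __name__ == "__main__":
    # "\u4e2d" is a single common Chinese character.
    for sample in ("A", "\u4e2d"):
        print(repr(sample), encoded_lengths(sample))
```

In this comparison only UTF-32 uses the same number of bytes for both characters. The single-byte ISO-8859 sets are fixed-length too, but they have no code points for Chinese characters, so they only help if the Chinese data can be dropped or replaced.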

Eric
anu123
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Hi Eric,

I am trying to write to the target file, not read from it. Please suggest a way to do that. Thanks.
Thank you,
Anu
eph
Premium Member
Posts: 110
Joined: Mon Oct 18, 2010 10:25 am

Post by eph »

Hi,

I don't think this is possible, as noted in this technote:
http://www-01.ibm.com/support/docview.w ... wg21485843
Hence, it is not possible to generate a fixed-length text file with UTF-8.
Maybe someone else can suggest a way to do it.
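
Outside DataStage, one general idea is to make each field fixed-length in bytes by truncating on a character boundary and then padding with spaces. The Python sketch below illustrates only that idea; it is not a Sequential File stage feature, and the function name `fit_to_bytes` is hypothetical. It assumes the consumer of the file reads fields by byte offset and accepts trailing-space padding.

```python
def fit_to_bytes(text: str, size: int, encoding: str = "utf-8", pad: bytes = b" ") -> bytes:
    """Encode text into exactly `size` bytes.

    Whole characters are kept while they fit; a character that would
    overflow the field is dropped together with everything after it,
    so a multi-byte character is never split. `pad` must be one byte.
    """
    out = bytearray()
    for ch in text:
        encoded = ch.encode(encoding)
        if len(out) + len(encoded) > size:
            break
        out += encoded
    return bytes(out) + pad * (size - len(out))


if __name__ == "__main__":
    # Two Chinese characters need 6 UTF-8 bytes; only the first fits in 5.
    field = fit_to_bytes("\u4e2d\u6587", 5)
    print(field, len(field))
```

The cost is that a field declared as 100 bytes holds at most 33 three-byte Chinese characters, so values may lose data when they are cut to fit.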

Eric
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You cannot trim a double-width character to make it smaller. It's called double-width because it has so many strokes that it occupies more space than a regular character. (Or it's deliberately double-width for the purposes of beauty in typography, such as the double-width space.)
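
What can be done instead is to drop whole characters until the value fits. Here is a minimal Python sketch of that (not DataStage code), assuming the target width is measured in display columns and that East Asian Wide and Fullwidth characters take two columns each. The names `char_columns` and `fit_to_columns` are made up for the example.

```python
import unicodedata


def char_columns(ch: str) -> int:
    """Approximate column count for one character (2 for Wide/Fullwidth)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def fit_to_columns(text: str, width: int, pad: str = " ") -> str:
    """Keep whole characters that fit in `width` columns, then pad.

    A double-width character that does not fit is dropped entirely;
    it is never shrunk or split. `pad` must be a single-column character.
    """
    kept = []
    used = 0
    for ch in text:
        cols = char_columns(ch)
        if used + cols > width:
            break
        kept.append(ch)
        used += cols
    return "".join(kept) + pad * (width - used)


if __name__ == "__main__":
    # Two double-width characters followed by "AB", fitted to 5 columns.
    print(repr(fit_to_columns("\u4e2d\u6587AB", 5)))
```

When only one column remains and the next character is double-width, the sketch fills the gap with a pad character, so the field always ends up exactly `width` columns wide.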
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.