Multi-byte field disturbing the position of next fields

ssunda6
Participant
Posts: 91
Joined: Tue Sep 19, 2006 9:32 pm

Post by ssunda6 »

Hi,

We have a few multi-byte columns. The field length (in number of characters) is fixed. For example, Address cannot be longer than 5 characters, but it can contain German special characters (umlauts), in which case it may be longer than 5 bytes.

In order to keep each field at the same position in a row, I have defined the datatype as char so that extra spaces are added to make it fixed length.

But here is the problem. If we take all columns to be char(5), you can see in the 2nd row that when Col3 contains a special character, it pushes the next column by 1 position. How can I avoid this?

Col1  Col2 Col3  Col4
XXX  YYY  TEST XXX
AAA  BBB  füR  XXX

I tried to get the length in bytes so that I could manually calculate how many spaces to pad, but no function seems to give that value. I also read other posts and tried the raw functions, but neither they nor Len on the raw string return the right result.
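
To make the character-versus-byte mismatch concrete, here is a minimal sketch outside DataStage, in Python. It assumes the data is UTF-8 and that "ü" is stored as the single precomposed code point U+00FC; the helper names (`byte_len`, `pad_to_chars`, `pad_to_bytes`) are made up for illustration and are not DataStage functions. It shows that padding to a fixed number of bytes leaves one character fewer on screen, which matches the one-position shift described above.

```python
def byte_len(text: str, encoding: str = "utf-8") -> int:
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


def pad_to_chars(text: str, width: int) -> str:
    """Pad with spaces until the text is `width` characters long."""
    if len(text) > width:
        raise ValueError("text is longer than the field width in characters")
    return text.ljust(width)


def pad_to_bytes(text: str, width: int, encoding: str = "utf-8") -> str:
    """Pad with spaces until the encoded text is `width` bytes long.

    Assumes a space encodes to exactly one byte, which holds for UTF-8
    and other ASCII-compatible encodings but not, for example, UTF-16.
    """
    missing = width - byte_len(text, encoding)
    if missing < 0:
        raise ValueError("text is longer than the field width in bytes")
    return text + " " * missing


value = "f\u00fcR"  # "füR" with a precomposed u-umlaut

print(len(value), byte_len(value))   # characters vs. UTF-8 bytes

by_bytes = pad_to_bytes(value, 5)    # 5 bytes, but only 4 characters
by_chars = pad_to_chars(value, 5)    # 5 characters, but 6 bytes

print(repr(by_bytes), len(by_bytes), byte_len(by_bytes))
print(repr(by_chars), len(by_chars), byte_len(by_chars))
```

The number of spaces needed to reach a fixed character width is the width minus the character count; the byte length only tells you how large the padded field becomes once encoded.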
ssunda6
Participant
Posts: 91
Joined: Tue Sep 19, 2006 9:32 pm

Post by ssunda6 »

One of the posts suggested using a BASIC Transformer to get the byte length. I see the LENDP function, but the mapname argument is not optional in 8.x. What value should I specify to find the byte length (display length) for UTF-8? UTF8 gives an error; it looks like it accepts some numeric value. Please advise what that value should be.

Another option seems to be the BYTELEN function, but I don't see it in the 8.x BASIC Transformer stage. Am I missing something here?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You can type BYTELEN in. It will be red but will compile, as it is part of the language. However, not every function in the language is in the list of available functions stored in DSParams.

Do you know what these characters actually are? Are you using an NLS map when reading the file and, if so, which map? Might there be a better choice of map available?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
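
As a rough illustration of why the choice of map matters, the sketch below (Python, not DataStage) encodes the same three-character value with several character sets. The Python codec names are stand-ins chosen for illustration and are not the names of DataStage NLS maps; `byte_lengths` is a hypothetical helper.

```python
def byte_lengths(text: str, codecs: tuple[str, ...]) -> dict[str, int]:
    """Return the encoded size in bytes of `text` for each codec name."""
    return {codec: len(text.encode(codec)) for codec in codecs}


value = "f\u00fcR"  # "füR" with a precomposed u-umlaut

# In a single-byte character set the byte count equals the character
# count, so a char(5) field padded by bytes keeps its column position.
sizes = byte_lengths(value, ("utf-8", "latin-1", "cp1252"))
for codec, size in sizes.items():
    print(f"{codec}: {len(value)} characters, {size} bytes")
```

Whether a single-byte map is usable depends on what characters the data actually contains, which is why the question of what the characters are comes first.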