NLS and ? Problem

Posted: Fri Mar 31, 2006 5:08 pm
by snassimr
Hi!

My data contains characters that DS can't resolve. These characters are replaced by "?", and that is what I see. But it isn't really a "?": in a Transformer I tried to find it with the Index BASIC function and it returned 0.

Does anybody have a clue how to remove these characters before they are inserted into the database?

Posted: Fri Mar 31, 2006 5:17 pm
by ray.wurlod
"?" represents a character that could not be mapped.

The correct way to get rid of it is to choose a map that correctly maps all the characters in your data.

Posted: Fri Mar 31, 2006 5:22 pm
by snassimr
I'll try. Does DS supply maps for all strange symbols?

Posted: Fri Mar 31, 2006 5:29 pm
by ray.wurlod
Different sets of characters are encoded according to different standards. For example, Chinese characters may be encoded according to standards called BIG5 or GB2312. There are many different standards for encoding Japanese characters. DataStage provides maps for most of the commonly encountered standards; the DataStage NLS manual shows how you can build your own maps if your data are encoded according to a standard that is not already supported. What is vital is that you know how your data are encoded.
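
To illustrate why knowing the encoding matters, here is a minimal sketch in Python rather than DataStage. Python's built-in codecs stand in for NLS maps here; that substitution is an assumption made for illustration only, and the helper name decode_with is hypothetical. The same bytes come back correctly only when decoded with the standard they were written in.

```python
# Illustration only: Python codecs stand in for DataStage NLS maps.
# The sample text is two Chinese characters, written as escapes.
sample_text = "\u4e2d\u6587"

# Pretend this is the external data, written by a system that uses Big5.
big5_bytes = sample_text.encode("big5")


def decode_with(raw_bytes, encoding):
    """Decode raw bytes, substituting a marker for anything unmappable."""
    return raw_bytes.decode(encoding, errors="replace")


# Correct assumption about the source encoding: the text survives.
decoded_as_big5 = decode_with(big5_bytes, "big5")

# Wrong assumption: the same bytes are read under a different standard.
decoded_as_gb2312 = decode_with(big5_bytes, "gb2312")

print(decoded_as_big5 == sample_text)
print(decoded_as_gb2312 == sample_text)
```

The second comparison fails because the bytes were interpreted under the wrong standard, which is the same situation as choosing the wrong map.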

Posted: Fri Mar 31, 2006 5:38 pm
by snassimr
Are you saying there is no way to remove the symbol after DS has replaced it with "?"?

This symbol looks very strange. It doesn't look like a letter of any language. But I'll try to map it.

Thank you! I'll return after some checking.

Posted: Fri Mar 31, 2006 11:06 pm
by ray.wurlod
"?" is NOT the invalid character. "?" is a special Unicode character that DataStage uses when it cannot convert the invalid character into Unicode (its own UTF-8 encoding of Unicode) based on your assertion of how the external character set is encoded.
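
As a rough analogy only, the same effect can be reproduced in Python. This is a sketch, not DataStage behaviour: Python marks unmappable bytes with U+FFFD, and the marker DataStage uses internally may be a different code point. The point is that the marker is not the ASCII question mark, which would explain why searching for "?" finds nothing.

```python
# Analogy only: Python substitutes U+FFFD for bytes it cannot map.
# The marker DataStage uses internally may be a different code point.
REPLACEMENT = "\ufffd"

# "cafe" with an accented final letter, encoded as ISO-8859-1 (Latin-1).
latin1_bytes = b"caf\xe9"

# Wrong assertion about the encoding: byte 0xE9 is not valid ASCII,
# so it is swapped for the replacement marker.
badly_mapped = latin1_bytes.decode("ascii", errors="replace")

# Correct assertion about the encoding: nothing needs replacing.
correctly_mapped = latin1_bytes.decode("latin-1")

# Searching for a literal question mark fails, because the marker is a
# different character that many displays merely render as "?".
question_mark_pos = badly_mapped.find("?")
marker_pos = badly_mapped.find(REPLACEMENT)

print(question_mark_pos, marker_pos)
print(REPLACEMENT in correctly_mapped)
```

Stripping the marker afterwards would only hide the loss, since the original character is already gone by then; decoding with the right map keeps it.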

You must determine precisely under what standard the external character set is encoded, and place the map for that character set's encoding between it and DataStage.

You may also need to specify a different, though related, character map between DataStage server and client, so that UNIX-to-Windows character set mapping might occur. These secondary maps have names ending in "-CS".