Hi -
We have a data migration project using only the DS Server components.
The source data arrives in the UTF-8 character set and contains some special/non-ASCII characters that we need to handle.
Solution attempted:
For this purpose, we changed the DataStage project-level NLS to UTF8. Logically, this setting should flow down to all the jobs/stages in the project, right?
Result:
However, the output contains junk characters or "?" wherever non-ASCII characters appear.
The fields containing the above-mentioned special characters are of datatype Varchar.
We are not changing the NLS at the individual job/stage level. When I check the NLS reflected in any job, I find:
- the Default map for stages is 'Project default (UTF8)'
- the Default locale category CType is 'Project (DEFAULT)'
Am I missing something here? Is there a better way of handling this requirement? I appreciate your help on this.
Additional info/details:
- DS version is 7.1
- We are deploying the server components/jobs on PX, though the configuration in use is a single node (1x1).
Thanks,
Nitin
Handling data in UTF8 Character Set
Hello dhletl,
Seeing a "?" in a DataStage NLS context does not necessarily mean that the value really is a question mark; many editors and viewers are not NLS-enabled and render non-Latin UTF-8 characters as "?" in their output. This also applies to the DataStage view-data windows, which mask undisplayable characters with a "?".
So it is important to be 100% certain that this is not the cause of the problem. Use the same editor, at the same workstation, with the same user, to look at a file that you know contains multibyte characters (I usually use a Japanese-language help file on Windows or create a dummy file on UNIX). If that known-good file displays correctly but your DataStage output still shows "?" marks, then something has really gone wrong; more often than not, though, the output turns out to display correctly.
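One way to settle the question without relying on any editor at all is to inspect the raw bytes of the output file. This is a minimal sketch (not a DataStage API; the file name is hypothetical): a literal question mark is the single byte 0x3F, whereas a genuine non-ASCII UTF-8 character begins with a lead byte in the 0xC2-0xF4 range, so counting both tells you whether the data was actually corrupted or is merely being displayed badly.

```python
def classify_bytes(data: bytes):
    """Count literal '?' bytes (0x3F) and UTF-8 multibyte lead bytes (0xC2-0xF4)."""
    literal_qmarks = data.count(b"?")
    multibyte_leads = sum(1 for b in data if 0xC2 <= b <= 0xF4)
    return literal_qmarks, multibyte_leads

# Example: 'café?' encoded as UTF-8 has one real '?' and one multibyte
# character (é = 0xC3 0xA9, whose lead byte 0xC3 falls in the range above).
sample = "caf\u00e9?".encode("utf-8")
qmarks, leads = classify_bytes(sample)
print(qmarks, leads)  # prints: 1 1
```

To check a real output file, read it in binary mode (`open("output.txt", "rb").read()`, file name hypothetical) and pass the bytes in: if `multibyte_leads` is zero where you expected accented or Japanese characters, the data really was converted to "?" somewhere upstream; if the lead bytes are present, the problem is only in the viewer.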