Datastage, Oracle and UTF-8


Paul M
Participant
Posts: 19
Joined: Mon Nov 13, 2006 11:11 am

Post by Paul M »

For a client running Siebel on Oracle 10g installed with UTF-8 support, we have to extract data from Siebel into flat XML files. The process aborts after a while; we found out this is probably caused by the export of varchar strings that are UTF-8 encoded. The DataStage server software was installed with UTF-8 support, so I can't see what is going wrong. The DataStage log file does not give any information; the job just aborts. I found messages in this forum about the same subject and read something about the Oracle NLS_LANG setting, but I guess that relates to Windows client software, not to server jobs generating files on AIX.

Thanks in advance for any reply. Regards, Paul Mulder, The Netherlands
nisaumande
Participant
Posts: 13
Joined: Fri Aug 11, 2006 11:57 am
Location: Toulouse, France

Post by nisaumande »

The NLS_LANG setting is used by any Oracle client, Windows or Unix.

Do you get any error message from DataStage?
Is NLS enabled in DataStage?
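
To check what the Oracle client on the AIX box actually sees, something like the following could be run under the same user and environment as the DataStage job. It is only a sketch: it assumes NLS_LANG follows the usual LANGUAGE_TERRITORY.CHARSET form (for example AMERICAN_AMERICA.AL32UTF8), and the helper names parse_nls_lang and is_utf8_charset are made up for illustration.

```python
import os


def parse_nls_lang(value):
    """Split an NLS_LANG value of the form LANGUAGE_TERRITORY.CHARSET.

    Missing parts come back as empty strings.
    """
    locale_part, _, charset = value.partition(".")
    language, _, territory = locale_part.partition("_")
    return {"language": language, "territory": territory, "charset": charset}


def is_utf8_charset(charset):
    """True for the two Oracle character set names that denote UTF-8."""
    return charset.upper() in {"UTF8", "AL32UTF8"}


def describe_nls_lang(environ=None):
    """Return a one-line description of NLS_LANG in the given environment."""
    environ = os.environ if environ is None else environ
    value = environ.get("NLS_LANG")
    if not value:
        return "NLS_LANG is not set"
    parts = parse_nls_lang(value)
    kind = "UTF-8" if is_utf8_charset(parts["charset"]) else "not UTF-8"
    return "NLS_LANG=%s (client charset %r is %s)" % (value, parts["charset"], kind)


if __name__ == "__main__":
    print(describe_nls_lang())
```

If the job is started by the DataStage engine, the variable has to be visible in the engine's environment, not just in your login shell, so it is worth running the check from both.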
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Re: Datastage, Oracle and UTF-8

Post by ArndW »

Paul M wrote:...we found out this is probably due to varchar string export that is UTF-8 encoded... The datastage log file does not give any information, the job just aborts...
What leads you to believe that the problem is being caused by NLS and UTF-8 settings?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

UTF-8 is not a single standard. To say that UTF-8 is a standard is like saying that UNIX is a standard. There are many eight-bit Unicode Transformation Format schemes. Even the one used inside DataStage is idiosyncratic, as it preserves dynamic array delimiter markers as single-byte characters.

It's usually a matter of experimenting with the mechanisms and maps available with both sides of the problem (in your case Oracle and DataStage) to find a combination that works. For example, for Oracle you probably need both NLS_LANG and LANG_C environment variables to be set. Do you have NLS enabled for DataStage?
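
When experimenting with different map combinations, it can help to test whether a file produced by a given combination is well-formed UTF-8, and if not, where the first bad byte sits. A minimal sketch, assuming the output can be read back as a plain file; first_invalid_utf8 and check_file are hypothetical helper names, and the whole file is read into memory, so it suits samples rather than very large extracts.

```python
def first_invalid_utf8(data):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded
        return exc.start
    return None


def check_file(path):
    """Read a file as raw bytes and report whether it decodes as UTF-8."""
    with open(path, "rb") as handle:
        data = handle.read()
    offset = first_invalid_utf8(data)
    if offset is None:
        return "%s: valid UTF-8" % path
    # Show a few bytes around the failure to help identify the source row
    context = data[max(0, offset - 20):offset + 20]
    return "%s: invalid byte at offset %d, context %r" % (path, offset, context)
```

Running check_file on a sample of the generated XML after each change of settings shows whether that combination produces clean output, and the context bytes point to the record that needs a closer look.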
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.