QualityStage standardisation problem with utf8 character set

boxtoby
Premium Member
Posts: 138
Joined: Mon Mar 13, 2006 5:11 pm
Location: UK

Post by boxtoby »

Guys,

In order to process multi-national data our DS jobs are set to character set utf8 on the NLS settings tab for the job.

I have recently changed the dsenv settings to reflect this as follows:

NLS_LANG=AMERICAN_AMERICA.UTF8
export NLS_LANG

# This entry can help with special characters
LC_CTYPE=C.utf8
export LC_CTYPE

LC_ALL=en_us; export LC_ALL
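
One thing worth checking in the dsenv fragment above: under the POSIX locale rules, LC_ALL takes precedence over every individual LC_* category, so once LC_ALL is set the LC_CTYPE=C.utf8 line no longer has any effect. Whether this is related to the crash is not established in this thread. A minimal sketch of that precedence follows; the function name effective_ctype is made up for illustration, and LANG is assumed to be unset:

```shell
#!/bin/sh
# Hypothetical helper: prints the value that decides the character-type
# locale, following POSIX precedence: LC_ALL, then LC_CTYPE, then LANG,
# and finally the default "C" locale.
effective_ctype() {
    lc_all=$1
    lc_ctype=$2
    lang=$3
    if [ -n "$lc_all" ]; then
        printf '%s\n' "$lc_all"
    elif [ -n "$lc_ctype" ]; then
        printf '%s\n' "$lc_ctype"
    elif [ -n "$lang" ]; then
        printf '%s\n' "$lang"
    else
        printf 'C\n'
    fi
}

# Values taken from the dsenv fragment; LANG is assumed to be unset.
effective_ctype "en_us" "C.utf8" ""   # prints: en_us
```

On the engine host itself, running `locale` after sourcing dsenv shows the values actually in effect, and `locale -a` lists the locale names installed on that platform (the exact spelling of UTF-8 locale names varies between operating systems).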

Having made this change the foreign characters are fine in the database (db2) but the QualityStage standardisation jobs crash with the following messages:

Standardize_23,0: Operator terminated abnormally: received signal SIGSEGV
main_program: APT_PMsectionLeader(1, node1), player 5 - Unexpected exit status 1.
main_program: Step execution finished with status = FAILED.

If I remove the dsenv changes the QualityStage jobs run fine.

Has anyone seen this sort of thing before?


Thanks,
Bob.
Bob Oxtoby
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

Bob,

If you have NLS enabled and your job is set to UTF8 (Unicode), I guess you should set the columns' Extended property to Unicode.

Try setting it only on the columns that will carry Unicode data.
Last edited by JRodriguez on Tue Feb 16, 2010 9:27 am, edited 1 time in total.
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses"

Post by boxtoby »

Hi Julio,

All columns are set to unicode as they are throughout the whole application.

I can't really choose which columns to set to unicode or not.

It seems strange to me that QS doesn't appear to handle utf-8 when it uses that code page under the hood.


Bob.
Bob Oxtoby

Post by JRodriguez »

That's correct only for server jobs... Server jobs use UTF-8, parallel jobs use UTF-16

In IIS 8.1, QS jobs are parallel jobs by default. Could you try setting the NLS tab of the Standardization job to UTF-16?

Post by boxtoby »

Hi Julio,

I tried setting the job to UTF-16 but the result was the same, unfortunately.

Bob.
Bob Oxtoby

Post by JRodriguez »

Would you mind posting your STD job design? Source?

There is a known error in QS when extracting char columns from UTF8 databases. Setting NLS_LANG to AMERICAN_AMERICA.UTF8 will extract extra bytes... the local fix was to change the char type to varchar.

Is this your case?
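
To illustrate why byte counts and character counts diverge in UTF-8 (presumably what "extra bytes" refers to; the details of the QS defect are not given here), a small sketch that counts bytes rather than characters. The helper name utf8_byte_len is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: counts the bytes in a string, independent of locale.
utf8_byte_len() {
    # wc -c counts bytes; tr strips the padding some wc implementations add.
    printf '%s' "$1" | wc -c | tr -d ' '
}

# "é" (U+00E9) built from its two UTF-8 bytes in octal, so this script
# itself stays plain ASCII.
e_acute=$(printf '\303\251')

utf8_byte_len "abc"        # 3 characters, 3 bytes
utf8_byte_len "$e_acute"   # 1 character, 2 bytes
```

A column sized in characters can therefore need more bytes than its declared length once accented or non-Latin characters are stored in it.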


Another possibility is that an NLS_LANG environment variable needs to be added to your project. Use the Administrator client to set the environment variable: NLS_LANG = AMERICAN_AMERICA.UTF8

Let us know