Regarding loading Chinese characters

manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

Hi,
I am receiving Chinese characters in a feed file and loading this file into a DB2 table. In DataStage I am using UTF-8 on the NLS tab of the job properties as well as in the stages. The data loads correctly into Table 1. The next job loads the data from Table 1 into a Dataset and then from the Dataset into Table 2; this also loads correctly. The next job creates a text file, and here I am getting a warning from APT_CombinedOperatorController,0. I am using UTF-8 in the job properties as well as at the stage level.

No records are rejected; all of them are loaded into the text file.

Can anybody suggest how to proceed to remove the warnings?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Please post your complete and unedited warning message.
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

APT_CombinedOperatorController,0: Invalid character(s) ([xE6]) found converting string (code point(s): 2009-08-12 8.25 [xE6][xAD][xA3][xE5][x9C][xA8][xE7][xAD][x89][xE5][xAE][xA2][xE6][x88][xB7][xE7][x9A][x84][xE4][xB8][xA4][xE4][xB8][xAA][xE5][x85][xAC][xE5][x8F][xB8][xE7][x9A][x84][xE6][x9D][x90][xE6][x96][x99][xE3][x80][x82]8.24 Cao Yan[xE5][xB7][xB2][xE5][x9C][xA8][xE7][xB3][xBB]...) from codepage UTF-8 to Unicode, substituting.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Now add "APT_DISABLE_COMBINATION" to your job and set the value to "true" for one run. The warning will then tell you the actual stage generating the message; please post that.
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

Db2Api_PRELOAD_PLD_COLLECTIONS_TGT,0: Invalid character(s) ([xE6]) found converting string (code point(s): 2009-08-12 8.25 [xE6][xAD][xA3][xE5][x9C][xA8][xE7][xAD][x89][xE5][xAE][xA2][xE6][x88][xB7][xE7][x9A][x84][xE4][xB8][xA4][xE4][xB8][xAA][xE5][x85][xAC][xE5][x8F][xB8][xE7][x9A][x84][xE6][x9D][x90][xE6][x96][x99][xE3][x80][x82]8.24 Cao Yan[xE5][xB7][xB2][xE5][x9C][xA8][xE7][xB3][xBB]...) from codepage UTF-8 to Unicode, substituting.


Db2Api_PRELOAD_PLD_COLLECTIONS_TGT is my source stage.
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

The database is set up as UTF-8, and in the other two jobs I am loading the data into the DB2 table as well as reading it.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

In the extended attribute for the string do you have "unicode" set?
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

Yes, the Extended attribute of the column is set to Unicode in the stage.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Ok, what about looking at it differently. What should the string be after the date? It looks like it is garbled, with single-byte extended Latin characters instead of initial bytes of multibyte Chinese ones.
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

Please find below the data I am loading that produces the above warning.

2009-08-12 8.25 正在等客户的两个公司的材料。8.24 Cao Yan已在系统中void掉发票了。8.20 客户说已退回发票,正在等OO关于合同转让的邮件。8.19 OO要给客户做合同转让,估计要下周才能有所进展,不�

The data is loaded into the sequential file, but I get the above warning. The datatype in the table is VARGRAPHIC, and in DataStage I am reading it as VARCHAR.
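
For reference, the bracketed hex values in the warning can be checked against the posted data. The following is a minimal Python sketch, not anything DataStage provides: the helper name `decode_log_bytes` is made up for illustration, and the excerpt is the run of bytes between the date and "8.24" in the warning. It suggests that this part of the string is well-formed UTF-8 matching the first sentence of the data.

```python
import re

# Bytes copied from the warning message, between "2009-08-12 8.25 " and "8.24".
warning_excerpt = (
    "[xE6][xAD][xA3][xE5][x9C][xA8][xE7][xAD][x89][xE5][xAE][xA2]"
    "[xE6][x88][xB7][xE7][x9A][x84][xE4][xB8][xA4][xE4][xB8][xAA]"
    "[xE5][x85][xAC][xE5][x8F][xB8][xE7][x9A][x84][xE6][x9D][x90]"
    "[xE6][x96][x99][xE3][x80][x82]"
)


def decode_log_bytes(excerpt: str) -> str:
    """Turn '[xNN]' tokens into raw bytes and decode them as UTF-8."""
    hex_pairs = re.findall(r"\[x([0-9A-Fa-f]{2})\]", excerpt)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    # Strict decoding: raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return raw.decode("utf-8")


decoded = decode_log_bytes(warning_excerpt)
print(decoded)
```

If a strict decode like this succeeds for the visible part of the message, the invalid byte is probably somewhere in the part the log elides with "...".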
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

What if you were to change your job and all stages in it to UTF-8 to avoid conversions altogether? Does the warning still appear?
manojbh31
Premium Member
Posts: 83
Joined: Thu Jun 21, 2007 6:41 am

Post by manojbh31 »

Hi,

This got resolved. I was using the substr function in the source SQL query. I came to know that we cannot use such functions on data in a different language in the SQL statement if the job is run as UTF-8.
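
The query itself is not shown in this thread, so the following is only a sketch of one likely mechanism, assuming the substr call counted bytes rather than characters. It is plain Python, not DB2 or DataStage code, and `truncate_utf8` is a hypothetical helper. It shows how a byte-based cut can end in the middle of a three-byte Chinese character, leaving a lone lead byte that a UTF-8 decoder has to reject or substitute.

```python
sample = "正在等客户"                # first five characters of the posted data
encoded = sample.encode("utf-8")     # each of these characters takes 3 bytes

# A byte-based cut at 10 bytes keeps 3 whole characters plus the first
# byte (0xE5) of the fourth one.
byte_cut = encoded[:10]

try:
    byte_cut.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    # The dangling lead byte is not a complete UTF-8 sequence.
    strict_ok = False

# Decoding with substitution puts U+FFFD in place of the broken tail.
substituted = byte_cut.decode("utf-8", errors="replace")


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut to at most max_bytes of UTF-8 without leaving a partial character."""
    # The input is a valid str, so only the tail can be incomplete;
    # errors="ignore" drops that partial sequence.
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


safe = truncate_utf8(sample, 10)
print(strict_ok, substituted, safe)
```

The posted sample also ends in a replacement character after "不", which would be consistent with a cut of this kind. Truncating by characters, or trimming the cut back to a character boundary as the helper does, avoids the dangling byte.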