Varchar size issue when migrating Db from ISO8859-1 to UTF8

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

eole
Participant
Posts: 3
Joined: Tue Mar 11, 2008 3:42 am


Post by eole »

Hello,

We are migrating an Oracle database from the ISO8859-1 character set to UTF-8. At the same time, the DBA changed NLS_LENGTH_SEMANTICS from BYTE to CHAR. That means a VARCHAR(20) column is now a string of 20 characters (and possibly more bytes, since UTF-8 is a multi-byte character set).
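To illustrate the difference (a small sketch in Python, independent of Oracle; the sample string is made up): the same 20-character string takes 20 bytes in ISO8859-1 but more in UTF-8, which is why CHAR semantics were needed after the migration:

```python
# A 20-character string containing accented (non-ASCII) characters.
s = "héllo wörld ábcdéfgh"
assert len(s) == 20                        # 20 characters: fits VARCHAR(20 CHAR)
assert len(s.encode("iso-8859-1")) == 20   # 1 byte per character in ISO8859-1
assert len(s.encode("utf-8")) == 24        # accented characters take 2 bytes in UTF-8
```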

The NLS_LANG parameter is set to UTF-8 (before, it was ISO8859-1).
The DataStage server supports NLS and is also set to UTF-8 at project level (before, it was ISO8859-1 too).

Since those changes, when inserting data into the database, the jobs return the warning:
Inserted value too large for column, row rejected.

How can we solve this problem?

Thanks in advance for your replies

Best Regards.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

My first guess would be that the change applies only to new tables, and that existing tables are still byte-based, which is causing your truncation warnings.
gpatton
Premium Member
Posts: 47
Joined: Mon Jan 05, 2004 8:21 am

Post by gpatton »

If you re-import the database metadata, you will find that for Char and Varchar columns the length is now a multiple of the byte length (depending upon the value set in the database). The solution is to multiply the length values in the database metadata by the appropriate factor (e.g. 4 times the current length).
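As a rough sketch of that sizing rule (Python; the multiplier is an assumption — Oracle's AL32UTF8 character set needs up to 4 bytes per character, while the older UTF8 character set needs at most 3 for BMP characters):

```python
def byte_length_for(char_length: int, max_bytes_per_char: int = 4) -> int:
    """Worst-case byte length for a column sized in characters."""
    return char_length * max_bytes_per_char

# A VARCHAR2(20 CHAR) column can need up to 80 bytes under AL32UTF8,
# or 60 bytes under the older 3-byte UTF8 character set.
assert byte_length_for(20) == 80
assert byte_length_for(20, max_bytes_per_char=3) == 60
```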
eole
Participant
Posts: 3
Joined: Tue Mar 11, 2008 3:42 am

Post by eole »

Thank you gpatton,

But when I import the database's metadata, the varchar columns have no length defined (maybe a bug in the metadata import plugin).

My goal is to make as few changes to the jobs as possible during the migration. Changing all the columns of all the Oracle OCI stages in all the jobs is not an option (or it's the worst one).
What's more, if I define a varchar whose byte size is a multiple of the CHAR size in the database, the job may fail to detect a genuinely too-long string, because UTF-8 is a variable-width multi-byte character set: some characters are 1 byte, others are 2, 3 or 4 bytes. This is not really good.
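That objection can be made concrete with a small sketch (Python; the limits are hypothetical): a byte limit inflated to 4 times the character limit no longer rejects strings that really are too long in characters:

```python
char_limit = 20
byte_limit = char_limit * 4          # the "multiply by 4" workaround: 80 bytes

too_long = "x" * 25                  # 25 ASCII characters: over the CHAR limit
assert len(too_long.encode("utf-8")) <= byte_limit   # slips past a byte-based check
assert len(too_long) > char_limit                    # but violates the real CHAR limit
```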

If I understand correctly, the varchar length in DataStage is in bytes, while in the database it can be in bytes or in characters (multi-byte)?
Is there a way to define the DataStage varchar type with a length in characters?

Best Regards.
aoriano
Participant
Posts: 15
Joined: Fri Apr 25, 2008 8:00 am

Post by aoriano »

Hi, I have the same kind of problem.
gpatton wrote: If you re-import the database metadata, you will find that for Char and Varchar columns the length is now a multiple of the byte length (depending upon the value set in the database). The solution is to multiply the length values in the database metadata by the appropriate factor (e.g. 4 times the current length).
That's true: when you re-import the metadata after changing the length semantics from BYTE to CHAR in the Oracle database, you get a multiple of the previous length.

But, for my part, I now have a new problem: "Inserted value too large for column, row rejected."

Is there a way to change byte lengths to character lengths in DataStage without importing the new metadata from Oracle?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No. Almost everything in DataStage is based on characters, not bytes.

You can, of course, manually edit the table definitions, or even update the entries in the repository if you can find them, but it's probably easier to re-import.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
aoriano
Participant
Posts: 15
Joined: Fri Apr 25, 2008 8:00 am

Post by aoriano »

I just don't understand why I can't import Chinese characters into my Oracle database. The NLS character set is UTF8 and all the varchar columns are defined as VARCHAR2(X CHAR), so lengths are stored in characters, not bytes.
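For what it's worth, Chinese characters are close to a worst case here: each one takes 3 bytes in UTF-8, so a string that fits a CHAR-sized column can easily overflow a BYTE-sized one (a sketch in Python; the sample string is made up):

```python
s = "数据迁移"   # 4 Chinese characters ("data migration")
assert len(s) == 4                       # fits VARCHAR2(4 CHAR)
assert len(s.encode("utf-8")) == 12      # but needs 12 bytes: overflows VARCHAR2(4 BYTE)
```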

Ray said:
Almost everything in DataStage is based on characters, not bytes


My Oracle stage NLS parameter is UTF8 but I still get the warning "Inserted value too large for column, row rejected." on some rows. The strangest part is that DataStage doesn't show any Oracle exception in the Director, so I guess the problem comes from DS, and more precisely from the Oracle OCI stage, but I just can't understand it.

Should I look in the Oracle NLS_DATABASE_PARAMETERS view for problems (in fact, I've already done that and couldn't find anything suspicious)? Has anyone already had this kind of problem?