Fragmented MBCS character error when loading to iSeries DB2

rjhankey · Post by **rjhankey** » Thu Mar 22, 2012 3:48 pm

I'm pulling in data from an AIX DB2 instance, and loading another iSeries DB2 instance, and am encountering warning messages that I have fragmented MBCS characters in the data.

SQL0191N Error occurred because of a fragmented MBCS character.

Invalid character(s) ([xE6][xA0]) found converting string (code point(s): Warning: TST_DEM_EIW_CUSTOMER_STG.EIW_CUST_STG_TST: CUST_NAME = [xE7][x8B][xAC][xE7][xAB][x8B][xE8][xA1][x8C][xE6][x94][xBF][xE6][xB3][x95][xE4]...) from codepage UTF-8 to Unicode, substituting.

I've tried various options such as different NLS maps, and removing the "Unicode" settings on the source and target column definitions, with the same results.

When I run the query on my PC, I see arrow symbols appearing in the customer name, which I'm guessing are the characters that DB2 is complaining about. When I run the query through the AIX interface, I see ^Z characters in vi.

One of our other DataStage developers encountered something similar in the past, and he abandoned DS and created a job natively on the iSeries DB2 instance. I'm thinking there must be something I can change in the DS job to get this running?

I could set up a message handler to ignore these warnings, but I'd like to see if there isn't something else I could do to load and/or ignore these characters that are coming in.

Thanks in advance!

-Russ

qt_ky · Post by **qt_ky** » Thu Mar 22, 2012 7:43 pm

I agree not to use a message handler. It's always better to figure out the root cause if you have time. It's been a very long time since I dealt with DB2 on AS/400. I don't know the answer, but I remember having to experiment with DB2CODEPAGE settings and CCSID settings. Hopefully someone here already knows the answer, but those are some things to search on.

rjhankey · Post by **rjhankey** » Thu Mar 29, 2012 4:43 pm

I haven't given up hope on this one yet, but did put a message handler in place to suppress the values that are coming in and failing, so I can at least load those values that don't have the error.

I'm definitely open to any and all new suggestions or ideas though!

ray.wurlod · Post by **ray.wurlod** » Thu Mar 29, 2012 5:58 pm

Hmm... Ctrl+Z is Char(26), or code point 0x1A. This is the DOS end-of-file character. Might the problem be related to ASCII transfer of text files from Windows to Unix at some point in their history? Or copy/paste from a Windows application into a DB2 application? (Just clutching at another straw here.)

rjhankey · Post by **rjhankey** » Wed Oct 10, 2012 10:16 am

Looks like I've solved this one -- I've learned to set the length of the character fields to twice as much as it seems like I would need, and I believe this is due to the double-byte characters that are present. So if I'm receiving a 100-character field, I actually need to define it as varchar(200), and that seems to resolve the errors about fragmented characters. That's a little painful for the comments field that tend to come in as 2000+ characters, but the error messages no longer appear.