Non English Characters in Fixed length Char field

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Non English Characters in Fixed length Char field

Post by william.eller@ed.gov »

I am reading a flat file from IBM/DB2 - The layout has each field fixed length CHAR. There is a Char(65) field that has characters that look not to to be English (European maybe).

Anyway - In the log Iget a "buffer overrun" warning, the next field in the sequence shows no data and I get an ODBC function error on the attempt to insert the datat (into Oracle):
SQLSTATE = 22001, Native error code = 0, Msg = [IBM(datadirect OEM)[ODBC Oracle WIRE Protocol Driver(String Data Right truncated)

Should I use a datatype other than CHAR(65)?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's probably sufficient to enable Unicode as the extended property of that particular column. Make sure that it's defined in Oracle as CHAR(65) CHARACTERS (not BYTES).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Re: Non English Charaters in Fixed length Char field

Post by william.eller@ed.gov »

Thanks -

So everywhere that field exists in the job - activate unicode?:
Input file stage
Modify stage (as a pass through column)
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Confirm for us the "semantics" of that column in Oracle - is your target column CHAR(65 Byte) or CHAR(65 Char)? It needs to be the latter for this to stand any chance of working. If you are not sure, check with your DBA.
-craig

"You can never have too many knives" -- Logan Nine Fingers
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

In Oracle - it is VArchar(65) -- Byte or Char not sure - will check with DBA

Thanks
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

ok - heres the new scenario(s): Changed each occurance of field in question to unicode - job ebended. All records rejected - following warning: "External ustring too long. Imported only 65 of 67 external charaters into a ustring of fixed length 65. Ignoring last 2 charaters of ecternal ustring up to delimeter."

Then I changed filed size in question to char(67) in all places and In oracle Varchar(67). Job sucessful (all green), warning as follows:
"Input buffer oberrun at field "ST" <this is the field following the unicode field> at offset:93

Please advise
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So now you are basically just playing 'Chicken' with the size of the target string which I'm wagering is in bytes in Oracle. You've accomodated for one multi-byte character, however should another show up you'll be back to square one.

Still would like you to confirm Byte v. Char in your target column.
-craig

"You can never have too many knives" -- Logan Nine Fingers
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

Performed "Alter" statement and made column varchar(65 Char)" type
Reran job:
File stage char 65 unicode
Modify stage char 65 unicode (i/p) - char 65 unicode (o/p)
ODBC Insert into ..... (target varchar(65 char)
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

the record in question (2 of 3 below) seems to read longer than the char 65. You will note the abscence of the last field (i.e., the FC in Recs 1 and 3):
20130320 009426 +00009 +00009 CASOSCARI ISTITUTO D'ARCHITETTURA FC
20130320 009427 +00007 +00005 JULIUS-MAXIMILIANS-UNIVERSITðT W²RZBURG
20130320 009428 +00008 +00005 TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY FC
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

Thought id'ed try something - I imported the Oracle table definition (after I altered the table to varchar (65 char)

ODBC definition as follows:
SQLType:NVarChar Length:260 Display:8

File stage for flat file:
SQL Type:Char Extended Unicode, Length 65

Modify StageL (i/p):
SQL Type:Char Extended Unicode, Length 65

Modify Stage (o/p) - just a pass through:
SQL Type:Char Extended Unicode, Length 65

Job stopped abending but no records processed ----

Log show 2 major warnings:
Field has import error and no default value: Data{D A V I S 20 & 20E L K I N S 20 ........
AND
External ustring too long. Imported only 65 out of 67 external charaters into a ustring of fixed length 65. Ignoring last 2 charaters of external ustring up to delimiter

The 2nd warning repeats for approx 1000+ rows
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

William in email wrote:I set the type in Oracle to varchar2(65 CHAR) and still get the errors. Any thoughts?
First thought - please don't send email to us for help, if and when people have something more to add to an issue, don't worry - they will.

I don't have any additional thoughts on the 65/67 issue at the moment. The first error can be searched for here. Use an exact search as the others have odd issues going back in time very far.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

OK, I lied. :wink:

Since you seem to have full control over the fate of the target column and since it's a varchar why not make it big enough so there's no problem with any of the source data fitting? For example, VARCHAR2(100 Char) or some such...
-craig

"You can never have too many knives" -- Logan Nine Fingers
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

Post by william.eller@ed.gov »

chulett wrote:
William in email wrote:I set the type in Oracle to varchar2(65 CHAR) and still get the errors. Any thoughts?
First thought - please don't send email to us for help, if and when people have something more to add to an issue, don't worry - they will.

I don't have any additional thoughts on the 65/67 issue at the moment. The first error can be searched for here. Use an exact search as the others have odd issues going back in time very far.
My apologies - will never happen again
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Thank you. Just really wanted to note that it's not really necessary to do that, we're here several times a day every day and really try to keep an eye on things as much as possible. :wink:

A couple of other thoughts... there is a Reply to topic link, there's no need to 'Reply with quote' each and every time. However, when you do you cannot have 'Disable BBCode in this post' on by default. It doesn't work so well when you do. I usually clean up stuff like that but left yours alone so you could see what I meant.

As to the 'workaround' suggested for the moment, I thought it might help to get past the "cannot load the data" phase and move into the "make sure it's loading properly" phase. Once things are loading properly we can circle back and see about a more proper "fit" if that still is desired.
-craig

"You can never have too many knives" -- Logan Nine Fingers
william.eller@ed.gov
Participant
Posts: 19
Joined: Fri Aug 03, 2012 11:06 am

How to reply and workaround

Post by william.eller@ed.gov »

no need to 'Reply with quote' each and every time. However, when you do you cannot have 'Disable BBCode in this post' on by default. It doesn't work so well when you do. I usually clean up stuff like that but left yours alone so you could see what I meant. [Note - I couldn't stand it so I fixed it! - Andy]

....I only see Reply with quotes on upper RHS - Maybe its a config issue - I'll check

As to the 'workaround' suggested for the moment, I thought it might help to get past the "cannot load the data" phase and move into the "make sure it's loading properly" phase. Once things are loading properly we can circle back and see about a more proper "fit" if that still is desired.

..... I increased varchar2 field to (67 CHAR) still no good on foreign characters
Post Reply