Display of Arabic Characters

Posted: Sat Feb 18, 2006 7:18 am
by raj_konig
Folks,

I am trying to insert data from an Access database into a database table. I am reading the Access file using an ODBC stage. All goes fine except for one column, which contains Arabic data. This is being displayed as junk data.

When I try to load the same data from a flat file, I am able to do so by changing my NLS to 'Unicode'. (For this I am getting the text itself in Unicode.)

I tried all the available NLS maps for the ODBC stage, but in vain.

Please, could someone help me out?

Thanks in advance
rajesh

Posted: Sat Feb 18, 2006 8:37 am
by ArndW
You have a couple of possible sources of your non-ASCII data being converted incorrectly; and you need to understand your data before narrowing down the cause.

1. Do you know your Access database column character set? Is it just a Unicode representation or something else? Do you have an NLS enabled installation of DataStage?

2. You are reading Access using ODBC into DataStage. If you take a known value in your Arabic column and do a SEQ(In.ArabicColumn[1,1]), does it display, in DataStage, a value that you would associate with the Arabic character?

Once you have this information you will know where the cause of the mapping problem probably lies, but until you (and we) know more the problem's solution cannot be found without making some unfounded guesses.
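The check described above (look at the numeric value of a known character rather than at how it is drawn on screen) can be sketched outside DataStage. This is only an illustration in Python, not DataStage BASIC: `ord` stands in for what SEQ/UNISEQ report, the sample string is a hypothetical known value, and the range test covers only the basic Unicode Arabic block (U+0600 to U+06FF).

```python
def code_points(text):
    """Return the decimal Unicode code point of each character in text."""
    return [ord(ch) for ch in text]


def is_arabic(ch):
    """True if ch lies in the basic Unicode Arabic block (U+0600 to U+06FF)."""
    return 0x0600 <= ord(ch) <= 0x06FF


# Hypothetical known value: the word "salam" written with escapes so the
# example does not depend on the encoding of this file.
sample = "\u0633\u0644\u0627\u0645"

print(code_points(sample))                   # [1587, 1604, 1575, 1605]
print(all(is_arabic(ch) for ch in sample))   # True
```

If the value seen inside the job for the first character is nowhere near these numbers, the data was probably already converted before it reached the column.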

Posted: Sat Feb 18, 2006 2:43 pm
by ray.wurlod
If you have NLS enabled, prefer UNISEQ to SEQ.

Posted: Sat Feb 18, 2006 2:46 pm
by ray.wurlod
Moderator: please synchronize with this - apparently duplicate - post.

Posted: Sun Feb 19, 2006 1:31 am
by raj_konig
Yes I do have NLS Enabled Datastage.

I am not sure about my Access database column character set. The Arabic data is not even coded. It is clearly visible in Access.

In Administrator 'MS1252-CS' NLS Map was selected.

I tried SEQ and UNISEQ on the Arabic column, and a number was displayed in each case.

rajesh

Posted: Sun Feb 19, 2006 1:56 am
by ArndW
Rajesh - you need to go into your mapping tables and see if the numeric value that is displayed with SEQ or UNISEQ corresponds to the Arabic characters that are displayed.

Very often the problems with NLS are actually problems with displaying the characters, which is why I asked you to display the actual character value and not its screen representation.

Posted: Sun Feb 19, 2006 2:22 am
by raj_konig
Folks,

I got a workaround for this. I exported the whole Access data into XML and from there I loaded it into the database. This process is able to display my Arabic characters.


ArndW,
The numbers that were displayed are almost all the same, namely 46. But I don't think this number is in any way related to the Arabic characters.

Thanks,
rajesh

Posted: Sun Feb 19, 2006 3:43 am
by ArndW
I'm glad you've solved your problem. The value 46 is that of "." (a full stop), and since it doesn't lie in a part of the chart where Arabic characters are, the translation error probably occurred there. But it's a moot issue now that you've solved it using other means.
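For anyone wanting to look up what a reported value corresponds to, here is a small Python sketch. It assumes the number is a decimal Unicode/ASCII code value; `describe` is a helper name made up for this example.

```python
import unicodedata


def describe(value):
    """Return the character and its Unicode name for a decimal code value."""
    ch = chr(value)
    return ch, unicodedata.name(ch, "<unnamed>")


# 46 is the value reported in this thread; 38 and 1576 are shown for comparison.
for value in (38, 46, 1576):
    print(value, describe(value))
```

A value of 1576 (U+0628) names an Arabic letter, while 46 falls in the plain ASCII punctuation range.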

Posted: Mon Feb 20, 2006 3:07 am
by raj_konig
ArndW,

I feel it's a problem during conversion in DataStage.

Because if we try to load data from a flat file that contains Arabic text into the database, we need to set the source NLS to 'Unicode' and then load. This process loads the data without any issue. If I change the source NLS from Unicode to NONE, the same issue arises with the Arabic characters.

This NLS selection (Unicode) is not available for an ODBC stage.

I just can't blame the source Access file, as I can clearly see the Arabic characters being displayed. So I think DataStage may not be able to pick up the proper information from that plugin.
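The flat-file behaviour described above is what one would expect when the same bytes are read with the right map versus the wrong one. A rough Python illustration follows; it assumes the file is UTF-8 (the thread only says 'Unicode', so the actual encoding may differ) and uses Python's `cp1252` codec as a stand-in for a Western single-byte map.

```python
# Hypothetical Arabic sample, written with escapes.
arabic = "\u0639\u0631\u0628\u064a"

# Bytes as they would sit in a UTF-8 flat file (assumption, see above).
raw = arabic.encode("utf-8")

# Read with the matching map: the text survives.
good = raw.decode("utf-8")

# Read as if every byte were one Western character: each Arabic letter
# turns into two unrelated Latin-1 style characters ("junk").
bad = raw.decode("cp1252", errors="replace")

print(good == arabic)        # True
print(len(raw), len(bad))    # 8 8
print(bad == arabic)         # False
```

The bytes are identical in both cases; only the interpretation changes.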

thanks,
rajesh

Posted: Mon Feb 20, 2006 3:27 am
by ArndW
DataStage can and does read NLS characters, but you need to know what is happening with regard to implicit and explicit conversions of character values. So far this thread has been geared towards getting you to find out what is happening and where it is happening, but as you've solved the problem otherwise there has been no follow-up.
raj_konig wrote: "I feel it's a problem during conversion in DataStage."
That isn't going to solve anything.

1. What is the character set of your source, and what is the numeric representation of the specific Arabic character in your source?

2. You have set the DataStage character set to 'MS1252-CS', which is a single-byte representation for English/Latin. What is the numeric representation of the specific Arabic character when read from the source?

Once you do this you will see what conversion, if any, has been done. Also, although I am not at an NLS site right now, can you not specify the mapping on a per-column basis in the source stage and turn any conversions off?
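To illustrate why a Western single-byte map cannot carry Arabic text, here is a hedged Python sketch. It uses Python's `cp1252` and `cp1256` codecs as stand-ins for Windows Western and Windows Arabic code pages; whether these match the DataStage map names exactly is an assumption, and `can_encode` is a helper invented for this example.

```python
def can_encode(text, codec):
    """Return True if every character of text has a representation in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


letter = "\u0628"  # ARABIC LETTER BEH

print(can_encode(letter, "cp1252"))   # False: no slot for Arabic letters
print(can_encode(letter, "cp1256"))   # True: the Windows Arabic code page

# A lossy conversion substitutes a placeholder instead of failing, so every
# Arabic letter collapses to the same byte value.
lossy = letter.encode("cp1252", errors="replace")
print(lossy, lossy[0])                # b'?' 63
```

Seeing the same small number for every character, as reported earlier in the thread, is consistent with this kind of substitution, although the exact placeholder depends on the map in use.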