Not able to View Data. -- NLS Enabled

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

View Data is known to be a bit flaky where NLS is involved. Are the data being processed correctly? Can you view them using other techniques (such as text editor for text files, SQL for database tables, Data Set Management for Data Sets)?
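For text files, a quick way to check what you actually received is to look at the raw bytes rather than trust any viewer's rendering. A minimal Python sketch; the sample bytes below stand in for `open("yourfile.txt", "rb").read()`, since the real file name is not known here:

```python
# When a GUI viewer is unreliable, inspect the file's raw bytes directly.
# The sample bytes stand in for open("yourfile.txt", "rb").read().
chunk = b"Price \xa3100\n"  # 0xA3 is the pound sign in ISO-8859-1

def hex_dump(data: bytes, width: int = 16) -> list:
    """Format bytes as offset + hex rows, like a simple hex dump."""
    rows = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        rows.append("{:04x}  {}".format(offset, row.hex(" ")))
    return rows

for line in hex_dump(chunk):
    print(line)
```

The dump makes it obvious whether a pound sign arrived as the single byte 0xA3 (ISO-8859-1) or as the pair 0xC2 0xA3 (UTF-8).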
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

Yes,
I am able to view the data when I do Import --> Table Definitions --> Sequential File Definitions; the Preview there shows the value above.

Post by ray.wurlod »

The Importer's Preview does not use osh - it is quite different from View Data in parallel jobs, so this result does not really come as a surprise.

Post by swades »

Ray,
Thanks for that new lesson :idea:

Now, to answer your earlier question: we are getting the flat file from an external client, and they confirm that it contains UK English characters.

Since that file is our only source, how can we read it? Any suggestions, please?

Thanks

Post by ray.wurlod »

You can read it with any text viewer/editor.

Having no commercial relationship with them, I recommend UltraEdit, which can FTP the file to your client machine for viewing.

Post by swades »

ray.wurlod wrote:View Data is known to be a bit flaky where NLS is involved.
So does that mean we can't use View Data at all?

If so, how can I load that data into Oracle using DataStage, which is my main goal? (Sorry, I forgot to mention this in the original post :( )

So the job will be:

Code: Select all

 Seq File Stage ---> Transformer ---> Oracle Stage 

Post by ray.wurlod »

You can load it quite happily. What you can't do is view it with View Data. But you can view it with TOAD, sqlplus and the like.

Post by swades »

Ray,

Thanks for your time.

Yes, I am able to load the data, but I am marking this topic as a workaround because the main issue is still unresolved :x

Thanks

Post by ray.wurlod »

AMERICAN won't know about the pound sterling symbol. You need to find the appropriate setting, probably one of the ISO8859 ones, that does know about this character.
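The byte-level difference is easy to demonstrate outside DataStage. A short Python illustration (these are Python codec names, not DataStage NLS map names):

```python
pound = "\u00a3"  # the pound sterling sign

# ISO-8859-1 (Latin-1) stores it as the single byte 0xA3.
print(pound.encode("iso-8859-1").hex())  # a3

# UTF-8 needs two bytes for the same character.
print(pound.encode("utf-8").hex())  # c2a3

# A plain ASCII mapping has no slot for it at all.
try:
    pound.encode("ascii")
except UnicodeEncodeError:
    print("no ASCII representation for the pound sign")
```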

Post by swades »

Thanks for your reply.
ray.wurlod wrote:AMERICAN won't know about the pound sterling symbol.
So doesn't it matter what NLS_LANG value has been set up at the database level?

ray.wurlod wrote: You need to find the appropriate setting, probably one of the ISO8859 ones, that does know about this character.
How do we decide on the proper character set? Or do we need to use trial and error?

Thanks for your time
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

swades wrote:So doesn't it matter what NLS_LANG value has been set up at the database level?
Of course it matters. You need to find out what character set your database is using; if it doesn't support what you are trying to load, no amount of DataStage NLS magic will make it happen.

Your session parameters are not the key here, the database ones are. What is the result of this query?

Code: Select all

select * from nls_database_parameters
Not just LANGUAGE and TERRITORY but what does CHARACTERSET say?
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by swades »

chulett wrote:What is the result of this query?

Code: Select all

select * from nls_database_parameters
Not just LANGUAGE and TERRITORY but what does CHARACTERSET say?
It is:
NLS_CHARACTERSET = AL32UTF8
NLS_LANGUAGE = AMERICAN
NLS_TERRITORY = AMERICA
I used NLS_LANG=AL32UTF8 and set the NLS map in the Oracle target stage to $NLS_LANG, but the job is aborting :(
The log shows (event type = Info):
FATAL ERROR (APT_CharsetConverterRep::initConv()): AL32UTF8 is not a valid codepage.
Can you please guide me on how to load the data correctly?

Thanks

Post by chulett »

Well, I'm not really the Codepage God that Ray or Arnd are, but the next part of the exercise is this:

What 'codepage' or character set was your data encoded with? This will tell you whether a conversion needs to happen between the source and the target or whether the data can be loaded directly. So, depending on the answer, some general guidelines:

1) Identical. In this case you need to ensure the character set DataStage is using matches the target database character set. When both 'ends' match, no conversion takes place and the data is loaded 'intact', i.e. with the encoding from the source, right or wrong. This is typically what leads to 'garbage' characters in your target: different encodings moved over without conversion and thus misinterpreted in the end.

2) Different. This is where it is critical that your DataStage job be set to the character set of the source data; the database will handle its part automatically. When it sees they are different, it will convert from one to the other during the load, hopefully correctly, assuming you've set the source side correctly. This too can lead to 'garbage' if done incorrectly.

Bottom line: does your database character set support the pound sterling symbol? Your DBA should be able to help answer that question if you are unsure. Too late for this advice in a sense, but this URL talks about character sets in Oracle and how to choose the right one. And this one specifically mentions the AL32UTF8 character set, based on the Unicode UTF-8 encoding, which you are using, and notes that it is a 'universal' character set. It also discusses the same conversion phenomena that I did. It seems you should be able to accomplish what you need once you ascertain the source character set.
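Both cases can be mimicked in plain Python, outside DataStage; the bytes below are illustrative, not taken from the actual file:

```python
# Source file encoded in ISO-8859-1; target database using UTF-8.
source_bytes = b"Total: \xa3250"  # 0xA3 is the pound sign in ISO-8859-1

# The "different" case done right: decode with the *source* charset,
# then re-encode with the target's.
converted = source_bytes.decode("iso-8859-1").encode("utf-8")
print(converted)  # b'Total: \xc2\xa3250'

# The failure mode: treating the source bytes as if they were already
# UTF-8 raises an error here (with other byte values it can instead
# silently produce mojibake, i.e. 'garbage' characters).
try:
    source_bytes.decode("utf-8")
except UnicodeDecodeError:
    print("0xA3 on its own is not valid UTF-8")
```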

Then there's the whole topic of viewing the loaded data in a manner that doesn't make it look like it failed. We'll save that for later. :wink:

ps. NLS_LANG is more than just the character set. It is a combination of all three elements I asked you to check on for Oracle, so (assuming your source matches your target) set it thus:

Code: Select all

NLS_LANG=AMERICAN_AMERICA.AL32UTF8
And see what happens!
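The three pieces of that string can be pulled apart mechanically. The helper below is purely illustrative (my own sketch, not an Oracle API), and it assumes simple values; some NLS language names contain spaces:

```python
def split_nls_lang(nls_lang):
    """Split an Oracle NLS_LANG string of the form
    LANGUAGE_TERRITORY.CHARSET into its three components."""
    lang_territory, charset = nls_lang.split(".")
    language, territory = lang_territory.split("_")
    return language, territory, charset

print(split_nls_lang("AMERICAN_AMERICA.AL32UTF8"))
# ('AMERICAN', 'AMERICA', 'AL32UTF8')
```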

Post by chulett »

Funny, but in thinking about those pearls of wisdom I wrote up earlier, I realized there really aren't two different paths to take, only one: know the encoding of your source data and make sure the DataStage environment for the loading job reflects it. Both NLS_LANG and LC_CTYPE can easily be set up as user-defined environment variables set to $ENV in the Administrator and then added (and thus overridden) in any job that needs them set differently from the default.

If you are unsure, ask your vendor / supplier / file pusher.
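If the supplier cannot tell you, a crude trial decode over a few candidate encodings at least narrows it down. A Python sketch; the candidate list and sample bytes are illustrative, and note that ISO-8859-1 accepts every byte sequence, so a match there is a hint, not proof:

```python
def candidate_encodings(data, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Return the candidate encodings that decode the data without error."""
    ok = []
    for enc in candidates:
        try:
            data.decode(enc)
            ok.append(enc)
        except UnicodeDecodeError:
            pass
    return ok

print(candidate_encodings(b"Price \xa3100"))      # ['iso-8859-1']
print(candidate_encodings(b"Price \xc2\xa3100"))  # ['utf-8', 'iso-8859-1']
```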