Unmappable characters

asitagrawal
Premium Member
Posts: 273
Joined: Wed Oct 18, 2006 12:20 pm
Location: Porto

Post by asitagrawal »

Hi,

I am having an issue with character translation.

The job under test runs against exactly the same database and with the same input job parameters; the only difference is the operating system of the Ascential server.

In case #1 the job runs on a Windows 2000 Pro edition server,
and in case #2 on Windows Server 2003 Enterprise Edition.

I get "Unmappable characters" warnings in case #2 but not in case #1. You may refer to the log files (see the URLs below).

Can anyone shed some light on this situation?
Case #1 log file: http://asit.agrawal.googlepages.com/Local.txt
Case #2 log file: http://asit.agrawal.googlepages.com/Production.txt

Warm Regards,
Asit
Share to Learn, and Learn to Share.
asitagrawal
Premium Member
Posts: 273
Joined: Wed Oct 18, 2006 12:20 pm
Location: Porto

Post by asitagrawal »

On the DB2 stage I have NLS set to the project default (MS1252), while for writing to the flat file I have UNICODE.

Also note that the same job, with the same settings, produces different logs when run on servers with different operating systems!
asitagrawal
Premium Member
Posts: 273
Joined: Wed Oct 18, 2006 12:20 pm
Location: Porto

Post by asitagrawal »

I tried UTF8 for the DB2 stage, but no luck.

How did you resolve it? Did you start getting the correct characters in the file?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DataStage is not a translation tool. Use a consistent mapping throughout. The other important aspect is that you use the correct map. Lacking other information, this choice is likely to be an educated guess. Think about ISO 8859-1 as a possibility.
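
To illustrate why the choice of map matters, here is a minimal sketch in Python. It assumes Python's `cp1252` and `iso-8859-1` codecs are reasonable stand-ins for the MS1252 and ISO 8859-1 NLS maps; the DataStage maps themselves may differ in detail, and the helper name `unmappable_chars` is made up for this example. It lists the characters of a string that a given map cannot represent, which is the situation an "unmappable characters" warning describes.

```python
from typing import List


def unmappable_chars(text: str, map_name: str) -> List[str]:
    """Return the characters of `text` that cannot be encoded with `map_name`."""
    bad = []
    for ch in text:
        try:
            ch.encode(map_name)
        except UnicodeEncodeError:
            # This character has no code point in the chosen map.
            bad.append(ch)
    return bad


sample = "10 € café"
print(unmappable_chars(sample, "cp1252"))      # [] : the euro sign exists in Windows-1252
print(unmappable_chars(sample, "iso-8859-1"))  # ['€'] : ISO 8859-1 has no euro sign
print(unmappable_chars(sample, "utf-8"))       # [] : UTF-8 can encode any Unicode character
```

The same data can therefore be clean under one map and produce warnings under another, which is why a single, correct map should be used from source to target.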
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rafik2k
Participant
Posts: 182
Joined: Wed Nov 23, 2005 1:36 am
Location: Sydney

Post by rafik2k »

asitagrawal wrote: I tried UTF8 for the DB2 stage, but no luck. How did you resolve it? Did you start getting the correct characters in the file?

In my case I used UTF8 for the entire job.
Try changing to some other map.
The invalid, unhandled characters were truncated to some other character.
asitagrawal
Premium Member
Posts: 273
Joined: Wed Oct 18, 2006 12:20 pm
Location: Porto

Post by asitagrawal »

I am now testing to find the correct NLS map for my case.

I am reading from the database and writing to a sequential file (Seq #1).

In the next job, I read from Seq #1 and write its data to a new sequential file (Seq #2).

The NLS setting for the Sequential File stages is UNICODE, with the byte-swapped and byte-order mark options selected.

When Seq #1 contains any data, both jobs run fine, i.e. they finish OK.

The problem is that if Seq #1 does not contain any data, i.e. Seq #1 is empty, the first job (the one that creates Seq #1) runs OK, but the second job, which reads from Seq #1 and writes to Seq #2, aborts.
The error message in the Director is: ds_seqgetnext: Unable to read byte order mark - The operation completed successfully.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DataStage NLS relies upon a byte order mark being present in the file header. This tells DataStage whether the byte order of the machine is big-endian or little-endian. Byte order is critically important when there can be more than one byte per character.

For example, in a hashed file the first two bytes are 0xacef or 0xefac depending on the byte-order of the machine.

Clearly a totally empty file cannot have a byte order mark.

Your solution will involve pre-checking the file's size and only executing the job when it is greater than or equal to 2 bytes.
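
The pre-check described above could be sketched as follows in Python. This is an illustration only, not DataStage code: the function names `bom_byte_order` and `safe_to_read` are hypothetical, and in a real project the equivalent test would probably live in a before-job routine or a job sequence. Besides the size test, the sketch also inspects the first two bytes for a UTF-16 byte order mark, which goes slightly beyond the size check suggested in the post.

```python
import codecs
import os
import tempfile

# The two possible UTF-16 byte order marks and what they indicate.
UTF16_BOMS = {
    codecs.BOM_UTF16_LE: "little-endian",  # bytes FF FE
    codecs.BOM_UTF16_BE: "big-endian",     # bytes FE FF
}


def bom_byte_order(path):
    """Return the byte order named by a leading UTF-16 BOM, or None if there is none."""
    if os.path.getsize(path) < 2:
        # Too short to hold a BOM, e.g. a totally empty file.
        return None
    with open(path, "rb") as f:
        return UTF16_BOMS.get(f.read(2))


def safe_to_read(path):
    """True only when the file starts with a recognisable UTF-16 BOM."""
    return bom_byte_order(path) is not None


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "seq1_empty.txt")
    open(empty, "wb").close()  # zero-byte file, like the blank Seq #1

    with_data = os.path.join(tmp, "seq1_data.txt")
    with open(with_data, "wb") as f:
        f.write(codecs.BOM_UTF16_LE + "row1\n".encode("utf-16-le"))

    print(safe_to_read(empty))        # False
    print(bom_byte_order(with_data))  # little-endian
```

The downstream job would then be run only when the check succeeds, so an empty Seq #1 is skipped rather than causing an abort.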
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

"UTF8" is like "UNIX" - every vendor has their own idea about what it should be. There are many eight-bit encodings of Unicode, all of which are entitled to call themselves "UTF8". There is no single UTF8.