Fixed Width Files and NLS

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

nick.bond
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London


Post by nick.bond »

Hi,

Has anyone else had problems with fixed width files and NLS?

When I use the Row Merger or Row Splitter, it logs warnings whenever special characters are found, because DS thinks the records are too long (or have an extra column). I think this happens because the merger/splitter stages use UTF8 internally and then check the record size in bytes rather than characters. The splitter/merger don't have an NLS tab, so I can't change the code page they use.

If you have seen this problem have you been able to overcome it somehow?

Thanks, Nick.
Regards,

Nick.
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
I never had too many problems with NLS fixed width files.
Then again, I never used the Row Merger/Splitter on them.
That said:
Check that the files really are fixed width.
The stages involved write exactly what they read, with no conversion or interpretation, so NLS should not be an issue with them (at least as documented).

IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
davidnemirovsky
Participant
Posts: 85
Joined: Fri Jun 04, 2004 2:30 am
Location: Melbourne, Australia

Post by davidnemirovsky »

I think DataStage uses UTF8 internally as its own character map. Other NLS character maps store characters with 2 bytes as opposed to 1.

I had a similar problem with the TIS620 Thai character map and record sizes being too small. I doubled the length of whichever field contained Thai characters and then it stopped complaining.

Another interesting thing is that I am using DS 5.2 on-site at the moment and there IS an NLS tab on the Merge/Splitter stage. I wonder why they removed it in more recent versions?
Cheers,
Dave Nemirovsky
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
I guess they dropped it from the stage(s) because, when working with NLS, the data is already converted to UTF8 once it is read (at the sequential file stage). If the merger/splitter stages don't modify or convert anything, there is no real need for them to have an NLS tab.

IHTH,
Roy R.
nick.bond
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Post by nick.bond »

It just seems to be a bug. If a sequential file stage is used in fixed width format with NLS set to UTF8, records read that contain special chars cause warnings. The internal function nls_readfixedwith (or whatever it is) seems to check the size in bytes not chars.

If NLS tab had been available this could have been overcome.
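
To illustrate the bytes-versus-characters mismatch described above, here is a minimal Python sketch (not DataStage code). The record content and the 10-character width are made up for the example; it only shows why a length check done in bytes on UTF8 data would see a "too long" record.

```python
# Hypothetical 10-character fixed-width record with one accented character.
record = "Jos\u00e9".ljust(10)

# Length in characters: what the fixed-width layout defines.
char_len = len(record)

# Length in bytes once the record is held as UTF-8:
# the accented character takes 2 bytes, so the record grows by 1.
byte_len = len(record.encode("utf-8"))

print(char_len, byte_len)
```

A check that compares byte_len against the declared width of 10 would flag this record, even though it has exactly 10 characters.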
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
Your NLS map should be set to an appropriate value for the file, not left at UTF8 by default.
There is also the off chance that you have a combination of NLS maps in different columns.
First you need to identify which NLS map you need and whether it is consistent across the file or defined per column.
If no existing NLS map supports your file's data, solutions might be:
1. build a custom NLS map
2. use per-column NLS mapping (slows performance down)
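
As a rough illustration of why the map has to match the file, here is a small Python sketch (an analogy only, not DataStage). It assumes a file written in a single-byte code page, ISO-8859-1, which is a made-up choice for the example: decoded with the matching map the record keeps its fixed width, while decoding the same bytes as UTF8 fails.

```python
# 10 bytes on disk, written in an assumed single-byte code page (ISO-8859-1).
raw = "Jos\u00e9".ljust(10).encode("iso-8859-1")

# Decoding with the matching map: one byte per character, width preserved.
as_latin1 = raw.decode("iso-8859-1")

# Decoding the same bytes as UTF-8: the lone 0xE9 byte is not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(len(raw), len(as_latin1), utf8_ok)
```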

IHTH,
Roy R.
nick.bond
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Post by nick.bond »

Just to resolve this, the workaround if you are having the same issue is to use a sequential file stage instead of the row_merge/row_split stages, just with different metadata on the write and read side.
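
The idea behind the workaround can be sketched in Python (an analogy, not the actual stage behaviour): merge and split on character positions of already-decoded text, so multi-byte characters never shift the column boundaries. The function names and the [6, 4] layout are hypothetical.

```python
def merge_fixed(values, widths):
    """Pad (or truncate) each value to its column width, counted in characters."""
    return "".join(v[:w].ljust(w) for v, w in zip(values, widths))


def split_fixed(line, widths):
    """Cut a decoded line into fields by character position."""
    fields, pos = [], 0
    for w in widths:
        fields.append(line[pos:pos + w])
        pos += w
    return fields


widths = [6, 4]  # hypothetical column layout
line = merge_fixed(["Jos\u00e9", "42"], widths)   # "write side": one wide column
fields = split_fixed(line, widths)                # "read side": separate columns

print(repr(line), fields)
```

The line stays 10 characters wide even though it would be 11 bytes in UTF8, which is the property the byte-based check in the merger/splitter seems to miss.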