Fixed Width Files and NLS

nick.bond · Post by **nick.bond** » Sat Apr 02, 2005 8:23 am

Hi,

Has anyone else had problems with fixed width files and NLS?

When I use the row merger or splitter, it logs warnings whenever special characters are found, because DS thinks the records are too long (or have an extra column). I think this happens because the merger/splitter stages use UTF8 and then check the size of the record in bytes rather than characters. The splitter/merger stages don't have an NLS tab, so I can't change the code page they are using.
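To illustrate the suspected byte-versus-character mismatch outside DataStage, here is a minimal Python sketch. The sample record is made up, and this only shows the general UTF-8 behaviour, not what the stages actually do internally.

```python
# Hypothetical 10-character fixed-width record with two accented characters
# (u-umlaut and e-diaeresis, written as escapes so the source stays ASCII).
record = "M\u00fcller Zo\u00eb"

char_len = len(record)                  # length counted in characters
byte_len = len(record.encode("utf-8"))  # length counted in UTF-8 bytes

# A check that compares byte_len against a width declared in characters
# would see this record as "too long".
print(char_len, byte_len)
```

Each accented character takes two bytes in UTF-8, so a record that is exactly the declared width in characters no longer matches when measured in bytes.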

If you have seen this problem, have you been able to overcome it somehow?

Thanks, Nick.

roy · Post by **roy** » Sun Apr 03, 2005 2:19 am

Hi,
I never had too many problems with NLS fixed width files.
Then again, I never worked with the row Merger/Splitter on them.
Having said that:
Check that the files really are fixed width.
The stages involved write exactly what they read, with no conversion or interpretation, so NLS should not be an issue with these stages (at least as documented).

IHTH,

davidnemirovsky · Post by **davidnemirovsky** » Sun Apr 03, 2005 5:16 pm

I think DataStage uses UTF8 internally as its own character map. Other NLS character maps store characters with 2 bytes as opposed to 1.

I had a similar problem with the TIS620 Thai character map and record sizes being too small. I doubled the length of whichever field contained Thai characters, and then it stopped complaining.
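As a rough illustration of why the field had to grow (a sketch in Python, not DataStage; the sample string is made up): TIS-620 stores each Thai character in one byte, while UTF-8 needs three bytes for each character in the Thai block.

```python
# Hypothetical field value: three Thai characters, written as escapes.
thai_field = "\u0e44\u0e17\u0e22"

tis620_bytes = len(thai_field.encode("tis_620"))  # single-byte Thai code page
utf8_bytes = len(thai_field.encode("utf-8"))      # multi-byte encoding

print(len(thai_field), tis620_bytes, utf8_bytes)
```

A field that is entirely Thai can therefore need up to three times its TIS-620 width when measured in UTF-8 bytes, so doubling may only be enough when the field is partly ASCII.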

Another interesting thing is that I am using DS 5.2 on-site at the moment and there IS an NLS tab on the Merge/Splitter stage. I wonder why they removed it in more recent versions?

roy · Post by **roy** » Mon Apr 04, 2005 2:25 am

Hi,
I guess they dropped it from the stage(s) because, when working with NLS, the data is already converted to UTF8 once it is read (at the seq file stage), and if the merger/splitter stages don't modify or convert anything, there is no real need for them to have an NLS tab.

IHTH,

nick.bond · Post by **nick.bond** » Mon Apr 04, 2005 3:08 am

It just seems to be a bug. If a sequential file stage is used in fixed width format with NLS set to UTF8, records read that contain special chars cause warnings. The internal function nls_readfixedwith (or whatever it is) seems to check the size in bytes not chars.
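A small Python sketch of the difference I mean, with made-up helper names (`byte_length_ok` and `split_by_chars` are hypothetical, not DataStage functions): the first check measures the raw record in bytes, the second decodes first and then slices by character positions.

```python
def byte_length_ok(raw: bytes, widths) -> bool:
    """Byte-based check: compares the raw byte count with the declared widths."""
    return len(raw) == sum(widths)


def split_by_chars(raw: bytes, widths, encoding: str = "utf-8"):
    """Decode first, then validate and slice by character positions."""
    text = raw.decode(encoding)
    if len(text) != sum(widths):
        raise ValueError(f"expected {sum(widths)} characters, got {len(text)}")
    fields, pos = [], 0
    for width in widths:
        fields.append(text[pos:pos + width])
        pos += width
    return fields


# Hypothetical record: a 5-character name column and a 4-character code column.
widths = (5, 4)
raw = "Zo\u00eb  0042".encode("utf-8")

print(byte_length_ok(raw, widths))  # byte count includes the extra UTF-8 byte
print(split_by_chars(raw, widths))  # character count matches the metadata
```

The same record fails the byte-based check but splits cleanly once the length is measured in characters, which is the behaviour I would expect from a fixed width read.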

If the NLS tab had been available, this could have been overcome.

roy · Post by **roy** » Mon Apr 04, 2005 5:07 am

Hi,
Your NLS should be set to an appropriate value, not UTF8 by default.
There is also the off chance that you have a combination of NLS maps in different columns.
First you need to identify which NLS you need and whether it is consistent or defined per column.
In case you don't have an existing NLS that supports your file's data, solutions might be:
1. build a custom NLS
2. use per-column NLS mapping (slows performance down)

IHTH,

nick.bond · Post by **nick.bond** » Sun Apr 22, 2007 11:13 pm

Just to resolve this, the workaround if you are having the same issue is to use a sequential file stage instead of the row_merge/row_split stages, just with different metadata on the write and read side.