Special Character reading issue
Moderators: chulett, rschirm, roy
When you have an issue like this, start a new post rather than replying to anything similar you can find. I've taken the liberty of splitting your post out on its own and deleting the others so we can have the conversation in one place... here.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 20
- Joined: Tue Jun 22, 2010 9:02 am
You will first need to find out which EBCDIC the original data is encoded in. There are many variants and without knowing which one you will not be able to convert the special characters into any ASCII representation.
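A quick way to do that variant hunting, sketched in Python using the EBCDIC code pages it ships as codecs. The byte value 0x7C below is only a placeholder; substitute the actual byte taken from your source file:

```python
# Decode one suspect byte under each EBCDIC codec Python provides and
# see which variant produces the glyph you expect.
sample = bytes([0x7C])  # hypothetical byte value from the source file

for codec in ("cp037", "cp273", "cp500", "cp875", "cp1026", "cp1140"):
    print(f"{codec}: {sample.decode(codec)!r}")
```

Whichever codec prints the character you expect is a candidate for the source's EBCDIC variant; note that Python's codec list covers only a handful of the IBM code pages, so a miss here is not conclusive.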
Without knowing which EBCDIC character set you are sourcing from, you cannot correctly convert your characters. Do you know which character/glyph the "?" is supposed to map to? If so, you can check the source EBCDIC binary value and look through the various EBCDIC implementations to see which one matches.
It will be easier and quicker to get your host people to tell you which character set they used.
The binary value of the '@' which is not getting read is "0110110 0110100". So how can this help?
I can read all the data; the only problem is with this kind of character, using a Sequential File stage as source with the settings below:
NLS Map: ISO_8859-1:1987
Record Length = fixed
Field Defaults: Delimiter = None
Type Defaults ==> General:
Character Set = EBCDIC
Byte Order = Big-endian
Data Format = Binary
I have asked the mainframe resource about the "EBCDIC character set" and, as per them, it is an EBCDIC fixed-width file. So what do they need to check on the mainframe to understand which EBCDIC character set is in use? Any help from anyone?
Last edited by pran.praveen on Thu Oct 17, 2013 3:45 am, edited 1 time in total.
Praveen
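For debugging outside DataStage, the fixed-width EBCDIC read described above can be mimicked in a few lines of Python. This is a minimal sketch under stated assumptions: the record bytes, the field offsets, and the cp037 code page are all stand-ins, since the thread's whole point is that the real variant is still unknown:

```python
# Decode one hypothetical fixed-width EBCDIC record by slicing fields
# at fixed offsets, then decoding them with an assumed code page.
record = b"\xd7\x99\x81\xa5\x85\x85\x95\x40\x40\x40"  # hypothetical 10-byte record

# Text field occupying the whole record; 0x40 is the EBCDIC space.
name = record[0:10].decode("cp037").rstrip()
print(name)
```

If the decoded text comes out with "?" or control characters in place of the special characters, that is the same symptom as in the Sequential File stage: the bytes exist, but the chosen code page has no glyph for them.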
There are at least 20 different EBCDIC variants around, probably many more. The standard LATIN-1 characters are the same in most of them, but the extra characters aren't.
The binary code you gave is 54 decimal, which the common EBCDIC maps as a special code NBS (numeric backspace). The default position of the "@" character is decimal 124.
If you are certain that in your EBCDIC the "@" sign is represented by 54 decimal, then you need to find an EBCDIC variant that has that mapping, and when you define that EBCDIC as your input then DataStage will correctly convert the character.
As it is, there is no glyph for EBCDIC 54 and thus it correctly gets mapped to "?" by DataStage.
I don't know how often I need to repeat this answer - nobody can give you a definitive answer. The characters you posted are not part of the standard characters in EBCDIC (those are the letters a-z, A-Z, 0-9, and a couple of punctuation characters). All the rest can be different.
As mentioned before, you need to find one of these characters in your source, get the numeric value of that character and then check the EBCDIC table to see if you have a match.
As stated earlier, the "@" character, if it really is mapped at 54 decimal in your EBCDIC is non-standard and you need to find out which EBCDIC is being used.
Does your data come from a DB2 database on the Host? From a text editor? Which OS and version are you using? I am sure that if you speak with an operator that he or she can determine which EBCDIC variant is being used.
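One way to "get the numeric value of that character", as suggested above, is simply to hex-dump the raw bytes of the suspect field and read the value off directly. A small Python sketch; the byte string is a hypothetical stand-in for bytes copied from the source file, and cp037 is only one candidate decoding:

```python
raw = b"\x7c\xc1\xc2\xc3"  # hypothetical slice of the raw EBCDIC record

for offset, value in enumerate(raw):
    glyph = raw[offset:offset + 1].decode("cp037")  # assumed code page
    print(f"offset {offset}: dec {value:3d}  hex {value:02X}  cp037 -> {glyph!r}")
```

The decimal column is what you then look up across the published EBCDIC code-page tables.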
Arnd - Yes, the data is coming from a DB2 database, and I see these characters there when I query it. It is surprising that DataStage cannot read them properly, since DataStage and DB2 are both IBM products and highly compatible. I need to check with the mainframe guy to see which variant of EBCDIC it is. Hopefully I will get an answer ASAP. I appreciate your time in looking at this issue.
Praveen
Hi Guys,
I have found a solution to this problem, or rather a workaround. I was not able to read these special characters in EBCDIC format, so I had the mainframe side send me the file in ASCII format instead. I did face a problem reading the decimal fields of this ASCII file, but some setting changes sorted that out. So finally it looks good. Thanks to the people who helped me with some clues... Have a great day and a greater week ahead...
Praveen
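For anyone hitting the same decimal-field problem: mainframe decimal columns are often packed decimal (COMP-3), which must be unpacked nibble by nibble rather than decoded as text. A hedged Python sketch of that format (the sample byte strings are invented for illustration):

```python
def unpack_comp3(data, scale=0):
    """Decode an IBM packed-decimal (COMP-3) field: two BCD digits per
    byte, with the low nibble of the final byte holding the sign
    (0xD = negative; 0xC and 0xF = positive)."""
    digits = "".join(f"{b >> 4}{b & 0x0F}" for b in data[:-1])
    digits += str(data[-1] >> 4)                  # high nibble of last byte
    sign = -1 if (data[-1] & 0x0F) == 0x0D else 1 # low nibble is the sign
    value = sign * int(digits)
    return value / 10**scale if scale else value

print(unpack_comp3(b"\x12\x34\x5c"))       # 12345
print(unpack_comp3(b"\x01\x2d", scale=1))  # -1.2
```

Note that COMP-3 bytes survive an EBCDIC-to-ASCII text conversion badly, which is one likely cause of the decimal trouble described above; binary fields should be transferred untranslated.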
Hi Guys,
I have found one more solution to this problem. Just wanted to share it, as it may help some of you. Read the EBCDIC file with the following settings: in the Format tab set String ==> Export EBCDIC as ASCII, and in General set Character Set = EBCDIC and Byte Order = Big-endian. Read the decimal columns as binary and packed; columns with the special characters should be read as ASCII. Later, in a Transformer, use StringToUString(Sourcecolumn, 'IBM01142') for the columns with the special characters, and the job NLS should be "ISO_8859-1:1987". Hope this helps..
Praveen
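DataStage specifics aside, the core of the fix above is decoding the troublesome columns with an explicitly named EBCDIC code page instead of the job default. The same idea in Python terms; cp1140 (EBCDIC US with euro) stands in here only because Python does not ship an IBM01142 codec, and the bytes are invented for illustration:

```python
raw = b"\x7c\xc8\x85\x93\x93\x96"  # hypothetical column bytes including the '@'
print(raw.decode("cp1140"))
```

The default decoding replaced the unmapped byte with "?"; naming the right code page recovers the intended glyph instead.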