junk data issue using sequential stage

nikhil_bhasin · Post by **nikhil_bhasin** » Tue Oct 23, 2012 6:21 am

Hi,
I am facing an issue while creating a sequential pipe-delimited file using a DataStage job. My job design is: seq stg -> trfm -> seq stg.
The target is pipe delimited. The job runs fine without any warnings, and when I view the data with the View Data utility in the Sequential File stage I can see it properly, but when I try to view the data at the UNIX prompt, it shows junk characters. Please let me know if anyone has faced such an issue.

chulett · Post by **chulett** » Tue Oct 23, 2012 6:41 am

They're not junk. How are you viewing the data "in unix prompt" and what exactly are you seeing? Best to identify the characters you are having an issue with before you decide what (if anything) needs to be done about them.

nikhil_bhasin · Post by **nikhil_bhasin** » Tue Oct 23, 2012 8:22 am

I am using vim <file name> in UNIX. Normally, for an ASCII file, we can see the actual data. But in this case I am seeing the entire data as junk. Following is an example:
201210|^@Z|^@^@'^S|^@^@^@^\|^@^@^@~| |^@^@^@^@|^@^@^@^@| |^@^@^@^@| | |^@^@^@^A|^@^B|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@
^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@
^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|
^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@
^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@
^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^E|^@^@^@^@^@!@^\|^@^@^@^M|^@^@^@^@^A8C\|^@^@^@^@|^@^@^@^@|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L

<edited to add some returns so the lines are not so dang long>

ArndW · Post by **ArndW** » Tue Oct 23, 2012 8:47 am

The data is not junk, it is just not part of the displayable ASCII characters.

You should look into "cat -v" and/or "od -x" to see what your characters are. Those you pasted are "^@", which is vi's way of showing the character Control-@, and in your editor that might mean null (0x00); in this case I'd use "od -x" to get the hexadecimal values and then look them up in an ASCII table.
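For anyone without those tools to hand, here is a minimal Python sketch of the same inspection. The helper names `caret_notation` and `hex_bytes` and the sample bytes are made up for illustration; the output is byte-by-byte, so it is closer to `od -t x1` than to the 2-byte words printed by `od -x`.

```python
def caret_notation(data: bytes) -> str:
    """Render bytes roughly the way vi or `cat -v` displays them."""
    out = []
    for b in data:
        if b < 0x20:
            # Control bytes: 0x00 -> ^@, 0x01 -> ^A, 0x0C -> ^L, ...
            # (simplification: newline and tab are shown this way too)
            out.append("^" + chr(b + 0x40))
        elif b == 0x7F:
            out.append("^?")
        elif b < 0x7F:
            out.append(chr(b))            # printable ASCII is shown as-is
        else:
            out.append(f"\\x{b:02x}")     # simplification: cat -v uses M- notation here
    return "".join(out)


def hex_bytes(data: bytes) -> str:
    """One two-digit hex value per byte, space separated."""
    return " ".join(f"{b:02x}" for b in data)


# Hypothetical sample resembling the start of the pasted record.
sample = b"201210|\x00Z|\x00\x00\x00\x01|"
print(caret_notation(sample))
print(hex_bytes(sample))
```

The first line printed uses the same `^@` and `^A` notation as the paste, and the second shows the underlying byte values to look up in an ASCII table.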

ArndW · Post by **ArndW** » Tue Oct 23, 2012 8:49 am

Going by your first post, I'd guess that these are null characters. But there is also the ^A in there. Is your data actually expected to be ASCII or is it perhaps a COBOL file with some binary data?

ray.wurlod · Post by **ray.wurlod** » Tue Oct 23, 2012 3:50 pm

There's also a Ctrl-Z in there, fairly early. This is the DOS end-of-file mark. You shouldn't be reading past this character (on Windows) but, since you're on UNIX where Ctrl-Z has no particular meaning, the operating system is merrily reading bytes until it encounters a UNIX end-of-file character (Ctrl-D) or some other reason to stop reading.
That said, everything else looks like binary data. For example ^@^@^A is 001, ^@^B is 02, ^@^@^@^@^@ is 0000, and so on.
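The binary reading can be checked with a small sketch that turns the caret notation back into bytes and interprets each field as a number. The helpers `parse_caret` and `field_as_int` are hypothetical, and treating the fields as unsigned big-endian integers is an assumption, not something the thread confirms.

```python
def parse_caret(text: str) -> bytes:
    """Turn caret notation (e.g. '^@^@^A') back into raw bytes.

    Assumption: every '^' followed by a character in '@'..'_' stands for
    a control byte; a literal '^' in the data would be misread here.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text) and "@" <= text[i + 1] <= "_":
            out.append(ord(text[i + 1]) - 0x40)   # '^@' -> 0x00, '^A' -> 0x01
            i += 2
        else:
            out.append(ord(text[i]))              # ordinary printable character
            i += 1
    return bytes(out)


def field_as_int(field: str) -> int:
    """Interpret one field as an unsigned big-endian integer (assumed layout)."""
    return int.from_bytes(parse_caret(field), "big")


# Fields copied from the first line of the pasted record.
for field in ["^@^@^@^A", "^@^B", "^@^@^@^@", "^@Z", "^@^@'^S"]:
    print(field, parse_caret(field).hex(), field_as_int(field))
```

Under those assumptions the five fields come out as 1, 2, 0, 90 and 10003. Note that `^@Z` parses as a NUL followed by a literal `Z` (0x5A), giving 90; a Ctrl-Z byte would have been displayed as `^Z`.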

nikhil_bhasin · Post by **nikhil_bhasin** » Tue Oct 23, 2012 11:53 pm

The input to the job is a sequential pipe-delimited file, which I can read properly at the UNIX prompt.
So this rules out the possibility of having EBCDIC data. Moreover, the input consists of integer and decimal values and the transformations are simple, so I doubt that any non-readable ASCII character could come through.

chulett · Post by **chulett** » Wed Oct 24, 2012 6:45 am

nikhil_bhasin wrote: The input to the job is a sequential pipe-delimited file, which I can read properly at the UNIX prompt. So this rules out the possibility of having EBCDIC data.

No, it really doesn't. You can easily read in ASCII data and output it as EBCDIC, which the View Data utility understands and will translate properly for you. Double-check your properties in the writing sequential file stage and make sure you haven't accidentally changed something - like perhaps setting Character Set to EBCDIC.
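A quick way to see that readable input says nothing about the output encoding is to encode the same text both ways. This is only a sketch: `cp037` is one common EBCDIC code page picked for illustration, and the code page DataStage would actually use is not stated in this thread.

```python
text = "201210"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")   # assumed code page, for illustration only

print(ascii_bytes.hex(" "))    # 32 30 31 32 31 30
print(ebcdic_bytes.hex(" "))   # f2 f0 f1 f2 f1 f0

# A tool that knows the encoding can translate the bytes back losslessly...
assert ebcdic_bytes.decode("cp037") == text

# ...but a viewer that assumes ASCII cannot decode the very same bytes.
try:
    ebcdic_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    print("not ASCII:", exc.reason)
```

The same six digits produce completely different bytes, which is why a viewer that applies the stage's metadata can show clean data while an ASCII editor shows unreadable characters.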

nikhil_bhasin · Post by **nikhil_bhasin** » Wed Oct 24, 2012 10:37 am

Got the root cause of this issue: the file schema that I loaded into the target Sequential File stage was imported from a mainframe copybook (it had level numbers, groups, etc.). I think the column definitions were the culprit behind the junk display, as the definitions were for an EBCDIC file.
What I did was remove the file schema definition and manually enter the column names and data types. This resolved the issue.
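Copybook-derived column definitions often carry binary or packed-decimal storage types, which would explain fields such as `^@^@^@^@^@^@^L`. The sketch below assumes, without confirmation from the job itself, that those fields are packed decimal (COMP-3 style); `unpack_comp3` is a hypothetical helper and it ignores any implied decimal point from the PIC clause.

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3 style) field into an integer.

    Each byte holds two 4-bit digits; the final nibble is the sign
    (0xB or 0xD = negative, 0xA, 0xC, 0xE or 0xF = positive/unsigned).
    """
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)      # high nibble
        nibbles.append(b & 0x0F)    # low nibble
    sign = nibbles.pop()            # last nibble carries the sign
    if sign < 0x0A or any(d > 9 for d in nibbles):
        raise ValueError("not a valid packed-decimal field")
    value = int("".join(str(d) for d in nibbles))
    return -value if sign in (0x0B, 0x0D) else value


# Bytes behind two fields from the pasted record:
# ^@^@^@^@^@^@^L  -> six 0x00 bytes followed by 0x0C
# ^@^@^@^@^A8C\   -> 00 00 00 00 01 38 43 5C
print(unpack_comp3(b"\x00" * 6 + b"\x0c"))
print(unpack_comp3(bytes.fromhex("000000000138435c")))
```

If that assumption holds, the first field is a positive zero and the second is the digits 138435, so the values are intact and merely stored in a mainframe numeric format rather than as text.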

chulett · Post by **chulett** » Wed Oct 24, 2012 1:20 pm

Again, not junk, just EBCDIC.

Glad you got it sorted out.