junk data issue using sequential stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
Hi,
I am facing an issue while creating a sequential pipe-delimited file using a DataStage job. My job design is: seq stg -> trfm -> seq stg
The target is pipe delimited. The job runs fine without any warnings, and when I view the data with the View Data utility in the Sequential File stage it displays properly. But when I view the data at the UNIX prompt, it shows junk characters. Please let me know if anyone has faced this issue.
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
I am using vim <file name> on UNIX. Normally, for an ASCII file, we can see the actual data. But in this case I am seeing the entire data as junk. The following is an example:
201210|^@Z|^@^@'^S|^@^@^@^\|^@^@^@~| |^@^@^@^@|^@^@^@^@| |^@^@^@^@| | |^@^@^@^A|^@^B|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@
^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@
^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|
^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@
^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@
^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^E|^@^@^@^@^@!@^\|^@^@^@^M|^@^@^@^@^A8C\|^@^@^@^@|^@^@^@^@|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L
<edited to add some returns so the lines are not so dang long>
The data is not junk, it is just not part of the displayable ASCII characters.
You should look at "cat -v" and/or "od -x" to see what your characters are. For example, the "^@" you pasted is vi's way of showing Control-@, which in your editor probably means null (0x00). In this case I'd use "od -x" to get the hexadecimal values and then look them up in an ASCII table.
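For anyone without those tools handy, the same inspection can be sketched in Python. This is only an illustration: the helper names are made up, and the sample bytes are reconstructed from the pasted vim display on the assumption that ^@ is 0x00 and ^S is 0x13.

```python
def caret_notation(data: bytes) -> str:
    """Render bytes roughly the way vi or `cat -v` shows 7-bit data.

    Unlike `cat -v`, this also converts tab and newline to ^I and ^J.
    """
    out = []
    for b in data:
        if b < 0x20:
            out.append("^" + chr(b + 0x40))  # 0x00 -> ^@, 0x0C -> ^L
        elif b == 0x7F:
            out.append("^?")
        elif b < 0x7F:
            out.append(chr(b))               # printable ASCII as-is
        else:
            out.append(f"\\x{b:02x}")        # high bytes shown as hex escapes
    return "".join(out)


def hex_bytes(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


# Assumed reconstruction of the start of the pasted record.
sample = b"201210|\x00Z|\x00\x00'\x13|"

print(caret_notation(sample))  # 201210|^@Z|^@^@'^S|
print(hex_bytes(sample))       # 32 30 31 32 31 30 7c 00 5a 7c 00 00 27 13 7c
```

The hex line is the part worth looking up in an ASCII table: 0x7c is the pipe delimiter, and anything below 0x20 is a control character.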
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
There's also a Ctrl-Z in there, fairly early. This is the DOS end-of-file mark. You shouldn't be reading past this character (on Windows) but, since you're on UNIX where Ctrl-Z has no particular meaning, the operating system is merrily reading bytes until it encounters a UNIX end-of-file character (Ctrl-D) or some other reason to stop reading.
That said, everything else looks like binary data. For example, ^@^@^A is 001, ^@^B is 02, ^@^@^@^@ is 0000, and so on.
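As a rough check of the binary-data reading, the short fields can be decoded as big-endian integers in Python. This is a sketch under assumptions: the byte values are reconstructed from the vim display (^@ taken as 0x00, printable characters taken literally), and nothing in the thread confirms the actual column types or byte order.

```python
def as_big_endian_int(raw: bytes) -> int:
    """Interpret raw bytes as a signed big-endian binary integer."""
    return int.from_bytes(raw, byteorder="big", signed=True)


# Fields as displayed in vim, mapped to the bytes they are assumed to contain.
fields = {
    "^@Z": b"\x00Z",
    "^@^@'^S": b"\x00\x00'\x13",
    "^@^@^@^\\": b"\x00\x00\x00\x1c",
    "^@^@^@^A": b"\x00\x00\x00\x01",
    "^@^B": b"\x00\x02",
    "^@^@^@^@": b"\x00\x00\x00\x00",
}

for shown, raw in fields.items():
    print(f"{shown:10} {raw.hex():8} -> {as_big_endian_int(raw)}")
```

One caveat when picking such a file apart: a binary field can legitimately contain the byte 0x7c, so splitting the record on "|" is not reliable once the columns are no longer text.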
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
The input to the job is a sequential pipe-delimited file, which I can read properly at the UNIX prompt.
So this rules out the possibility of having EBCDIC data. Moreover, the input consists of integer and decimal values and the transformations are simple, so I doubt that any non-readable ASCII character could get in.
nikhil_bhasin wrote: The input to the job is a sequential pipe delimited file, which i can read properly in UNIX prompt. So this rules out the possibility of having ebcdic data.
No, it really doesn't. You can easily read in ASCII data and output it as EBCDIC, which the View Data utility understands and will translate properly for you. Double-check your properties in the writing sequential file stage and make sure you haven't accidentally changed something - like perhaps setting Character Set to EBCDIC.
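To illustrate why a readable input says nothing about the output encoding, here is a minimal Python sketch. It assumes cp037 as the EBCDIC code page purely for illustration; the thread does not say which code page, if any, the stage would use, and the helper name and sample text are made up.

```python
def encode_both(text: str) -> tuple[bytes, bytes]:
    """Return the same text encoded as ASCII and as EBCDIC (cp037 assumed)."""
    return text.encode("ascii"), text.encode("cp037")


ascii_bytes, ebcdic_bytes = encode_both("201210|42")

print(ascii_bytes.hex(" "))   # what a UNIX terminal expects
print(ebcdic_bytes.hex(" "))  # same characters, different byte values

# A tool that knows the encoding still recovers the original text.
print(ebcdic_bytes.decode("cp037"))
```

A viewer that decodes with the right code page shows identical text for both, while vi or cat on an ASCII terminal would show the second form as unreadable characters.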
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 50
- Joined: Tue Jan 19, 2010 4:14 am
Got the root cause of this issue. The file schema that I loaded into the target Sequential File stage was imported from a mainframe copybook (it had level numbers, groups, etc.). I think the column definitions were the culprit behind the junk display, as the definition was meant for an EBCDIC file.
What I did was remove the file schema definition and manually enter the column names and data types. This resolved the issue.
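Copybook-derived definitions often carry packed decimal (COMP-3) columns, and several fields in the sample above end in ^L (0x0C), which is what a positive packed decimal looks like. That the target columns were actually packed decimal is an assumption, not something confirmed in this thread. Under that assumption, a minimal Python sketch of decoding such a field (the function name and the scale parameter are hypothetical):

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed decimal (COMP-3) field.

    Each byte holds two decimal digits, except the last byte, whose low
    nibble is the sign: 0xB or 0xD means negative, 0xA/0xC/0xE/0xF positive.
    `scale` is the number of implied decimal places (an assumption here,
    since it normally comes from the column definition).
    """
    if not raw:
        raise ValueError("empty field")
    digits = []
    for b in raw[:-1]:
        digits.extend((b >> 4, b & 0x0F))
    digits.append(raw[-1] >> 4)
    sign = raw[-1] & 0x0F
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not a valid packed decimal field")
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)


# Byte values assumed from the vim display: ^@ = 0x00, ^L = 0x0C, ^\ = 0x1C.
print(unpack_comp3(b"\x00\x00\x00\x00\x00\x00\x0c"))      # field shown as ^@^@^@^@^@^@^L
print(unpack_comp3(b"\x00\x00\x00\x00\x00!@\x1c"))        # field shown as ^@^@^@^@^@!@^\
print(unpack_comp3(b"\x00\x00\x00\x00\x018C\\", scale=2)) # scale=2 is a guess
```

If the values decode to sensible numbers this way, it supports the idea that the stage was writing the numeric columns in their mainframe storage format rather than as text, which is consistent with the fix of re-entering the columns with plain data types.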