junk data issue using sequential stage

nikhil_bhasin
Participant
Posts: 50
Joined: Tue Jan 19, 2010 4:14 am

Post by nikhil_bhasin »

Hi,
I am facing an issue while creating a sequential pipe-delimited file using a DataStage job. My job design is: Sequential File stage -> Transformer -> Sequential File stage.
The target is pipe delimited. The job runs fine without any warnings, and when I view the data with the View Data utility in the Sequential File stage it looks correct. But when I view the file at the UNIX prompt, it shows junk characters. Please let me know if anyone has faced such an issue.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

They're not junk. How are you viewing the data "in unix prompt" and what exactly are you seeing? Best to identify the characters you are having an issue with before you decide what (if anything) needs to be done about them.
-craig

"You can never have too many knives" -- Logan Nine Fingers
nikhil_bhasin
Participant
Posts: 50
Joined: Tue Jan 19, 2010 4:14 am

Post by nikhil_bhasin »

I am using vim <file name> in UNIX. Normally, for an ASCII file, we can see the actual data. But in this case I am seeing the entire file as junk. The following is an example:
201210|^@Z|^@^@'^S|^@^@^@^\|^@^@^@~| |^@^@^@^@|^@^@^@^@| |^@^@^@^@| | |^@^@^@^A|^@^B|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@
^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@
^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|
^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@
^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@
^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^E|^@^@^@^@^@!@^\|^@^@^@^M|^@^@^@^@^A8C\|^@^@^@^@|^@^@^@^@|^@^@
^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@|^@^@^@^@|
^@^@^@^@|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L|^@^@^@^@^@^@^@^L

<edited to add some returns so the lines are not so dang long>
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The data is not junk, it is just not part of the displayable ASCII characters.

You should look into "cat -v" and/or "od -x" to see what your characters are. Those you pasted are "^@", which is vi's way of showing Control-@, and in your editor that might mean null (0x00). In this case I'd use "od -x" to get the hexadecimal values and then look them up in an ASCII table.
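
For readers without those tools at hand, the same inspection can be sketched in Python. This is only an illustration: the helper names `caret_notation` and `hex_dump` are made up for this example, and the sample bytes are typed in from the start of the pasted record, assuming each ^X really is a single control byte.

```python
def caret_notation(data: bytes) -> str:
    """Render bytes roughly the way vim or `cat -v` displays them."""
    out = []
    for b in data:
        if b < 0x20:
            out.append("^" + chr(b + 0x40))  # 0x00 -> ^@, 0x01 -> ^A, 0x13 -> ^S
        elif b == 0x7F:
            out.append("^?")                 # DEL
        elif b < 0x7F:
            out.append(chr(b))               # printable ASCII is shown as-is
        else:
            out.append(f"\\x{b:02x}")        # high bytes shown as hex escapes
    return "".join(out)


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


# First few fields of the pasted record, typed in as raw bytes (an assumption).
sample = b"201210|\x00Z|\x00\x00'\x13|"
print(caret_notation(sample))
print(hex_dump(sample))
```

The hex view removes the guesswork: every ^@ shows up as 00, which is a null byte and not a printable character.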
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Going by your first post, I'd guess that these are null characters. But there is also the ^A in there. Is your data actually expected to be ASCII or is it perhaps a COBOL file with some binary data?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's also a Ctrl-Z in there, fairly early. This is the DOS end-of-file mark. You shouldn't be reading past this character (on Windows) but, since you're on UNIX where Ctrl-Z has no particular meaning, the operating system is merrily reading bytes until it encounters a UNIX end-of-file character (Ctrl-D) or some other reason to stop reading.
That said, everything else looks like binary data. For example ^@^@^A is 001, ^@^B is 02, ^@^@^@^@^@ is 0000, and so on.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
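
To illustrate the binary-data reading above, here is a minimal Python sketch that decodes a few of the pasted fields as integers. The byte order (big-endian) and signedness are assumptions, not something the thread confirms, and `decode_int` is a hypothetical helper name.

```python
def decode_int(field: bytes) -> int:
    """Interpret a fixed-width field as a signed big-endian integer (assumed layout)."""
    return int.from_bytes(field, byteorder="big", signed=True)


# Fields typed in from the pasted record: ^@^@^@^A, ^@^B and ^@^@^@^@.
fields = [b"\x00\x00\x00\x01", b"\x00\x02", b"\x00\x00\x00\x00"]
for field in fields:
    print(field.hex(), "->", decode_int(field))
```

Note that splitting such a record on "|" is not reliable, because a binary field can itself contain the byte 0x7C.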
nikhil_bhasin
Participant
Posts: 50
Joined: Tue Jan 19, 2010 4:14 am

Post by nikhil_bhasin »

The input to the job is a sequential pipe-delimited file, which I can read properly at the UNIX prompt.
So this rules out the possibility of having EBCDIC data. Moreover, the input consists of integer and decimal values and the transformations are simple, so I doubt that any non-readable ASCII characters could be introduced.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

nikhil_bhasin wrote: The input to the job is a sequential pipe-delimited file, which I can read properly at the UNIX prompt. So this rules out the possibility of having EBCDIC data.
No, it really doesn't. You can easily read in ASCII data and output it as EBCDIC, which the View Data utility understands and will translate properly for you. Double-check your properties in the writing sequential file stage and make sure you haven't accidentally changed something - like perhaps setting Character Set to EBCDIC.
-craig

"You can never have too many knives" -- Logan Nine Fingers
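
The point that readable ASCII input can still end up as EBCDIC output can be sketched in Python. This assumes code page 037 (Python's built-in `cp037` codec) as the EBCDIC variant; the thread does not say which code page, if any, the job used.

```python
# Readable ASCII text, as it might appear in the input file.
text = "201210"

# Re-encode the same characters as EBCDIC (code page 037 assumed).
ebcdic = text.encode("cp037")
print(" ".join(f"{b:02x}" for b in ebcdic))  # digits map to 0xF0..0xF9

# A tool that knows the character set can translate it back for display,
# while a plain ASCII viewer would show these bytes as unreadable.
print(ebcdic.decode("cp037"))
```

The content is the same in both encodings; only the bytes on disk differ, so a viewer that translates the character set and one that does not will show very different things.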
nikhil_bhasin
Participant
Posts: 50
Joined: Tue Jan 19, 2010 4:14 am

Post by nikhil_bhasin »

Found the root cause of this issue: the file schema that I loaded into the target Sequential File stage was imported from a mainframe copybook (it had level numbers, groups, etc.). I think the column definitions were the culprit behind the junk display, as they were defined for an EBCDIC file.
I removed the file schema definition and manually entered the column names and data types. This resolved the issue.
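
A side note on the pasted output: many of its fields end in ^L (0x0C), which would be consistent with copybook-style packed decimal (COMP-3) columns, where the last half-byte holds the sign. That is an assumption, since the thread does not show the column definitions. A minimal Python sketch of how such a field would be decoded, with `unpack_comp3` as a hypothetical helper name:

```python
from decimal import Decimal


def unpack_comp3(field: bytes, scale: int = 0) -> Decimal:
    """Decode a packed decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for b in field[:-1]:
        digits.append(b >> 4)      # high nibble
        digits.append(b & 0x0F)    # low nibble
    digits.append(field[-1] >> 4)  # last byte: one digit ...
    sign = field[-1] & 0x0F        # ... followed by the sign nibble
    if any(d > 9 for d in digits) or sign not in (0x0C, 0x0D, 0x0F):
        raise ValueError("not a packed decimal field")
    value = int("".join(str(d) for d in digits))
    if sign == 0x0D:               # 0xD marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)  # apply the implied decimal places


# Two fields typed in from the pasted record, under the COMP-3 assumption.
print(unpack_comp3(bytes.fromhex("0000000000000c")))    # the repeated ^@...^L field
print(unpack_comp3(bytes.fromhex("000000000021401c")))  # the field shown as ^@^@^@^@^@!@^\
```

The number of implied decimal places (`scale`) comes from the column definition and cannot be read from the bytes themselves.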
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Again, not junk, just EBCDIC. ;)

Glad you got it sorted out.
-craig

"You can never have too many knives" -- Logan Nine Fingers