reading fixed width file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

ajithaselvan
Participant
Posts: 75
Joined: Mon Jul 12, 2010 4:11 am
Location: Chennai


Post by ajithaselvan »

Hi,
I'm receiving fixed-width input in the format below:
PCR2D11 D 93 G08010CR201
PCR2D2 DC REP G08010CR2 03-01-2011 DC115509435
PCR2DD SOURCE: DEPOSITORY TRUST CO.

I'm reading it as char(1) for Col1, char(4) for Col2, char(1) for Col3, and char(50) for Col4 into a sequential file.
When I view the data, I get the following:
Col1 Col2 Col3 Col4
P CR2D 1 1
P CR2D 2
P CR2D D

I can view the first three columns, but for the fourth column only the first character of the first row is displayed. I need the complete values to appear in the fourth field.

Kindly help me resolve this.


Thanks in advance
Ajitha S
BI-RMA
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Re: reading fixed width file

Post by BI-RMA »

Hi ajithaselvan,

Check the hex values of the characters in position 8 (row 1) and position 7 (rows 2-3). Are these blanks, or possibly something like your record delimiter?
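
A minimal sketch of that hex check in Python, in case a hex-capable viewer is not at hand. The function names and the sample record are hypothetical; in particular, the x'00' byte in the sample is an assumption for illustration, not what the real file necessarily contains.

```python
def hex_at(record: bytes, position: int) -> str:
    """Return the hex value of the byte at a 1-based position."""
    return f"{record[position - 1]:02x}"


def hex_dump(record: bytes) -> str:
    """Return the whole record as space-separated hex pairs."""
    return record.hex(" ")


# Hypothetical record: the byte at position 8 is assumed to be x'00'.
sample = b"PCR2D11\x00D 93"
print(hex_at(sample, 8))   # hex value of the byte at position 8
print(hex_dump(sample))    # full record in hex
```

When applying this to the real file, open it in binary mode (`open(path, "rb")`) so that no bytes are translated before you inspect them.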
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Based on your description, I assume that your input record is fixed-width (56 bytes) with no delimiters between your columns, and that the last (fourth) column contains free-form text. Sounds like a file that was generated by a mainframe source.
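
To illustrate the layout assumed here, a small Python sketch that slices a 56-byte record into the four columns from the original post (1 + 4 + 1 + 50). The names are hypothetical, and the sample row is copied from the post, where the forum may have collapsed the spacing, so it is padded to the full width.

```python
# Column widths taken from the original post: char(1), char(4), char(1), char(50).
WIDTHS = (1, 4, 1, 50)


def split_fixed(record: str, widths=WIDTHS) -> list:
    """Slice a fixed-width record into fields by consecutive widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(record[start:start + width])
        start += width
    return fields


# Row 2 from the post, padded with blanks to the full record length.
record = "PCR2D2 DC REP G08010CR2 03-01-2011 DC115509435".ljust(sum(WIDTHS))
for name, value in zip(("Col1", "Col2", "Col3", "Col4"), split_fixed(record)):
    print(name, repr(value))
```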

What is your input schema?

Have you checked the format tab for the output link options to make sure that the field delimiters option is set correctly?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ajithaselvan

Post by ajithaselvan »

I found that there is a junk value (a special character) at position 8.
I believe that is why DataStage cannot read the rest of the record.
Can we handle junk values in DataStage while reading the file?

jwiles: Yes, the file is generated from a mainframe.
I checked the Format tab and set the field defaults (delimiter and quote) to none.


Regards,
Ajitha S
jwiles

Post by jwiles »

As BI-RMA suggests, view the records in hex mode to see what the junk data actually is. I'm going to guess that it's either binary zero (x'00') or maybe a tab character (x'09').

If possible, ask the data supplier if they can remove non-printable characters from the data prior to transmission to your server. If this isn't possible, you may be able to do the same using an external filter in the sequential file stage, calling a sed or awk script. DSGuru28 could suggest an appropriate script, I bet.
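
As a rough illustration of such a filter, here is a sketch in Python rather than sed or awk. It assumes the data is plain ASCII with newline-terminated records, and every name in it is hypothetical. It replaces non-printable bytes with a space instead of deleting them, because deleting a byte would shift every later column in a fixed-width record.

```python
import io

# Keep printable ASCII plus CR/LF; map every other byte to a space.
# Assumption: the data is plain ASCII text with newline-terminated records.
KEEP = set(range(0x20, 0x7F)) | {0x0A, 0x0D}
TABLE = bytes(b if b in KEEP else 0x20 for b in range(256))


def clean(data: bytes) -> bytes:
    """Replace non-printable bytes with spaces, preserving record length."""
    return data.translate(TABLE)


def run_filter(src, dst) -> None:
    """Copy src to dst line by line, cleaning each record."""
    for line in src:
        dst.write(clean(line))


# Demo on an in-memory stream; the x'00' and x'09' bytes are hypothetical.
out = io.BytesIO()
run_filter(io.BytesIO(b"PCR2D11\x00D 93\nPCR2D2\tDC REP\n"), out)
print(out.getvalue())
```

To use it as a stream filter, the call would be `run_filter(sys.stdin.buffer, sys.stdout.buffer)` after `import sys`. Note that this does not help if the junk byte turns out to be the record delimiter itself, since the line would already be split at that point.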

Regards,
- james wiles


BI-RMA

Post by BI-RMA »

ajithaselvan wrote: I found that there is a junk value (a special character) at position 8.
I believe that is why DataStage cannot read the rest of the record.
Can we handle junk values in DataStage while reading the file?

Hello ajithaselvan,

Yes, we can!

You might, for example, use a filter command on the Sequential File stage. sed might be a good choice. Of course, you will have to make sure that you replace only the intended values.
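
One way to keep the replacement narrow is to touch only the known position, and only when it holds the known bad byte. The sketch below shows the idea in Python instead of sed; the function name is hypothetical, and the x'00' value is an assumption until the hex check confirms what the byte really is.

```python
def patch_position(record: bytes, position: int, bad: bytes, good: bytes = b" ") -> bytes:
    """Replace the byte at a 1-based position only if it equals `bad`."""
    idx = position - 1
    if record[idx:idx + 1] == bad:
        return record[:idx] + good + record[idx + 1:]
    return record


# Hypothetical records: x'00' is assumed to be the junk byte.
print(patch_position(b"PCR2D11\x00D 93", 8, b"\x00"))   # position 8 is patched
print(patch_position(b"AB\x00DEFGH", 8, b"\x00"))       # x'00' elsewhere is left alone
```

Because the record length never changes, the column offsets of the fixed-width layout stay intact.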