reading from sequential file
-
- Premium Member
- Posts: 263
- Joined: Fri Sep 23, 2005 6:49 am
reading from sequential file
Hi All,
I am trying to read a file through the Sequential File stage. The file is pipe-delimited, with the double quote as the quote character.
My data arrives under the following requirement:
If an attribute value contains both a pipe and double quotes, the entire attribute is enclosed in double quotes and each double quote that is part of the value is prefixed with another double quote.
Can anyone please suggest how I can read such a file with the Sequential File stage?
Regards
Mark
Re: reading from sequential file
Could you post one line from your file here as an example?
Re: reading from sequential file
Here it is:
It is a pipe-delimited file.
If the data for a column has an embedded pipe, the entire column is enclosed in double quotes, such as |"ZZXXT|05157"|. However, this is not my problem.
Records like this one are causing the problem:
XXXX|413862|"ZZXXT|05157"|"ZZXX1|GN""FK""130183333"|062120|
This is the data in one column: |"ZZXX1|GN""FK""130183333"|
Thanks
Mark
Re: reading from sequential file
You could use a fixed-length format if each field always has the same length:
XXXX
413862
ZZXXT|05157
ZZXX1|GN""FK""130183333
062120
Re: reading from sequential file
It is a variable-length record that I receive from my client. Is there any way to remove those embedded double quotes and read the entire file?
Thanks
Mark
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Parallel jobs don't handle this. You could solve it by pre-processing the file (for example using a sed or awk command) to convert the doubled double-quote characters to something else, then converting these back to a single double-quote character within the job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
reading from sequential file
ray.wurlod wrote: Parallel jobs don't handle this. You could solve it by pre-processing the file (for example using sed or awk command) to convert the double double-quote characters to something else, th ...
Can anyone please suggest how I can accomplish this with awk or sed?
Thanks a lot in advance.
Thanks
Mark
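One way the suggested pre-processing might look with sed is sketched below. The placeholder `~~` and the function name `mask_doubled_quotes` are assumptions for illustration only; any placeholder works as long as it can never occur in the real data. Only the doubled quotes are replaced, so the enclosing quotes stay in place and the stage can still use them as the quote character.

```shell
#!/bin/sh
# Sketch only: mask each doubled double quote ("") with a placeholder.
# "~~" is an assumed placeholder; choose one that never appears in the data.
mask_doubled_quotes() {
    sed 's/""/~~/g'
}

# Sample record taken from this thread.
printf '%s\n' 'XXXX|413862|"ZZXXT|05157"|"ZZXX1|GN""FK""130183333"|062120|' |
    mask_doubled_quotes
# prints: XXXX|413862|"ZZXXT|05157"|"ZZXX1|GN~~FK~~130183333"|062120|
```

Inside the job the placeholder would then be converted back to a single double quote. Note that this simple substitution cannot tell an escaped quote from an empty quoted field (`|""|`) or from a value that begins with a quote, so it is only safe if those cases do not occur in the file.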
Re: reading from sequential file
If all five fields have the same length in every record, you could proceed like this.
Create a schema file like this:
record
{record_delim='\r', record_length=fixed, delim=none}
(
A:nullable string[4] {width=4};
B:nullable string[1] {width=1};
C:nullable string[6] {width=6};
D:nullable string[1] {width=1};
E:nullable string[13] {width=13};
F:nullable string[1] {width=1};
G:nullable string[25] {width=25};
H:nullable string[1] {width=1};
I:nullable string[6] {width=6}
)
and use it in a Sequential File stage. The one-character fields B, D, F and H hold the pipe delimiters, so in the next stage (for example a Transformer stage) you keep only the fields A, C, E, G and I.
I think this will work.
Re: reading from sequential file
ray.wurlod wrote: Parallel jobs don't handle this. You could solve it by pre-processing the file (for example using sed or awk command) to convert the double double-quote characters to something else, th ...
pavan_test wrote: can anyone please suggest me how do i accomplish this with awk or sed. thanks a lot in advance. Thanks Mark
To substitute " with # using sed, use this command:
$ sed 's/"/#/g' yourfile.txt > newfile.txt
If the content of yourfile.txt is:
XXXX|413862|"ZZXXT|05157"|"ZZXX1|GN""FK""130183333"|062120|
the content of newfile.txt will be:
XXXX|413862|#ZZXXT|05157#|#ZZXX1|GN##FK##130183333#|062120|
This should fix your issue.
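One caveat with replacing every double quote: the enclosing quotes are replaced too, so a pipe inside a value can no longer be told apart from a field delimiter. A possible alternative, sketched here with awk, is to walk each record character by character, track whether the current position is inside quotes, and write the fields out with a different delimiter. The function name `requote_to_tabs` and the choice of tab as the new delimiter are assumptions for illustration; the sketch also assumes that a tab never occurs in the data and that no quoted value spans more than one line.

```shell
#!/bin/sh
# Sketch only: convert pipe-delimited, double-quoted records to
# tab-delimited records without quoting.
requote_to_tabs() {
    awk '{
        out = ""; inq = 0; n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                if (inq && substr($0, i + 1, 1) == "\"") {
                    out = out "\""   # doubled quote inside a value: keep one
                    i++              # skip the second quote of the pair
                } else {
                    inq = !inq       # enclosing quote: toggle state, drop it
                }
            } else if (c == "|" && !inq) {
                out = out "\t"       # real delimiter: replace with a tab
            } else {
                out = out c          # ordinary character, or a pipe in quotes
            }
        }
        print out
    }'
}

# Sample record taken from this thread.
printf '%s\n' 'XXXX|413862|"ZZXXT|05157"|"ZZXX1|GN""FK""130183333"|062120|' |
    requote_to_tabs
```

With the sample record, the fourth field comes out as ZZXX1|GN"FK"130183333, with the embedded pipe and the two literal quotes intact. The resulting file could then be read with tab as the delimiter and no quote character.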