Splitting variable number of fields in a QBCSV file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
harshada
Premium Member
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Splitting variable number of fields in a QBCSV file

Post by harshada »

Hi,

I have input file having data as

1,abc,pqr,"hhhh"
2,"abc,yyy",pqr
3,lmn,"uuu,vvv"

i.e. number of fields are variable. I know the maximum number of fields eg 4 in this case. Data is comma seperated and some data has quotes and some donot have. This is what is defined as QBCSV(quote bound CSV) file by our clients.

I have to split the data into 4 fields.

I tried doing it by 2 ways

Way 1

1. Reading the entire row in 1 column
2. Using field() to cut the columns depending on delimiter as comma(,).

Problem here is data also has comma in it so it reads incorrectly.
I also cannot use delimiter as double quote comma double quote (",")
as not all fields have double quotes

Way 2

1. Reading the entire row in 1 column
2. Add extra number of commas in the record so that all records have 3 commas(4 fields).

Here also problem faced is I used data:str(",", 3-count(data,",'))

i.e. append extra commas after counting actual number of commas in the data so that total commas become 3 (number of fields become 4)
Here also when there is comma in data the logic goes wrong.

There is nothing consistant in the data which can be counted or used to split the data into 4 fields.

Can anyone help me on this?

Thanks
Harshada
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What are the business rules around generating four fields when the client does not provide four fields? Resist stupid requirements.

Delete your duplicate post before anyone replies to it.

Use a routine to parse the record, taking account of whether or not the character is within quotes or not, and change the genuine delimiters to some character that does not appear in the data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
harshada
Premium Member
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Hi Ray,

4 fields was an example. Maximum number of fields are needed as some records can have 4 and some have lesser than that number of fields. Need to convert all the variable records to fixed number of fields so as to process further.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What are you supposed to do, then, if the number of fields supplied is insufficient? (That's the same question, with mention of any particular number removed. And that's the problem you need to address before designing anything.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply