CSV file issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

new2ds2011
Premium Member
Posts: 2
Joined: Fri Jan 21, 2011 10:38 am

Post by new2ds2011 »

I have a source file in .csv format with the field delimiter set to ',', but one of the column values contains ',' as part of the value. That record fails. How should I fix this?

For example, the file name is company.csv and the delimiter is set to ','.

The values are

company id company name
======= ==========
1 "ABC Corp"
2 "John, Smith, Son'

The 2nd record failed because it has ',' in the company name.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The second record may have failed because it has mismatched quote characters.
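A quick way to check for this kind of problem before the file reaches the job is to count the quote characters per record. The sketch below is only an illustration in Python, not DataStage functionality; the helper name `has_balanced_quotes` is made up, and the sample lines assume the file really is comma delimited with the values shown in the question.

```python
def has_balanced_quotes(line: str, quote: str = '"') -> bool:
    """Return True if the quote character occurs an even number of times."""
    return line.count(quote) % 2 == 0


# Assumed comma-delimited form of the two records from the question.
records = [
    '1,"ABC Corp"',
    "2,\"John, Smith, Son'",  # opens with " but ends with '
]

# Collect the records whose double quotes do not pair up.
bad = [r for r in records if not has_balanced_quotes(r)]
print(bad)
```

An odd count does not say where the quote is missing, only that the record deserves a closer look.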
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Also, the data you presented is not even comma delimited, so record 1 should have failed as well.
mansoor_nb
Participant
Posts: 48
Joined: Wed Jun 01, 2005 7:10 am

Post by mansoor_nb »

If you are sure that the Company Name field will contain values with embedded commas, then read the entire record as a single column and split it based on the field length. Remember when splitting that the starting position of the second column will be n+1, because a comma separates the first and second columns.

Another option is to count the delimiters in each record and then use the Field function to split the columns, provided you know which column contains the embedded commas.

Or ask the people responsible for generating the source file to provide it in a format other than comma delimited, for example pipe ("|") delimited.
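The delimiter-counting idea can be sketched outside DataStage as follows. This is a rough Python illustration under stated assumptions, not the job design itself: the function name `split_with_embedded_delimiter` and its parameters are hypothetical, the expected number of columns is assumed to be known, and only one known column is assumed to contain extra commas.

```python
from typing import List


def split_with_embedded_delimiter(
    line: str, n_columns: int, wide_index: int, delimiter: str = ","
) -> List[str]:
    """Split a record read as a single string, folding surplus delimiters
    back into the one column (wide_index) known to contain them."""
    parts = line.split(delimiter)
    extra = len(parts) - n_columns  # number of delimiters inside the value
    if extra < 0:
        raise ValueError("record has fewer fields than expected")
    # Rejoin the pieces that belong to the column with embedded delimiters.
    merged = delimiter.join(parts[wide_index:wide_index + extra + 1])
    return parts[:wide_index] + [merged] + parts[wide_index + extra + 1:]


# Two expected columns; the company name (index 1) may contain commas.
print(split_with_embedded_delimiter("1,ABC Corp", 2, 1))
print(split_with_embedded_delimiter("2,John, Smith, Son", 2, 1))
```

This only works while a single column can contain the delimiter; if two columns can, the split becomes ambiguous and a different delimiter or proper quoting is the safer route.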
prakashdasika
Premium Member
Posts: 72
Joined: Mon Jul 06, 2009 9:34 pm
Location: Sydney

Post by prakashdasika »

In the Sequential File stage you have to define text fields as enclosed in double or single quotes. As Ray pointed out, there is some inconsistency in the data you provided. Once the quoting is defined properly, the ',' inside the quotes will not be an issue.
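To show the general behaviour of a quote-aware reader, here is a small sketch using Python's standard csv module rather than DataStage. The sample data is an assumption: it is the data from the question rewritten as comma delimited, with the mismatched quote on the second record corrected.

```python
import csv
import io

# Assumed corrected input: comma delimited, text values in double quotes.
sample = '1,"ABC Corp"\n2,"John, Smith, Son"\n'

# With a quote character declared, commas inside quotes are treated as data.
reader = csv.reader(io.StringIO(sample), delimiter=",", quotechar='"')
rows = list(reader)

for company_id, company_name in rows:
    print(company_id, "|", company_name)
```

Both records come back with exactly two fields, and the second keeps its commas inside the company name.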

PD