Page 1 of 1

Reading Flat file (CSV)

Posted: Tue Apr 19, 2011 6:16 pm
by h4harry1
Hi All,
I have a Flat file with comma as delimiter in the columns. I am reading data from this file using sequential file stage (Delimiter Comma). But the only problem is one of the column value includes comma in it. Don't know how to handle this. Any Help.
Thanks in advance.

Posted: Tue Apr 19, 2011 6:52 pm
by ray.wurlod
Are the character strings quoted?

Re: Reading Flat file (CSV)

Posted: Wed Apr 20, 2011 1:14 am
by Sampath
You can take full row as a one record and put transform stage and use Field Function

Posted: Wed Apr 20, 2011 4:31 am
by h4harry1
ray.wurlod wrote:Are the character strings quoted? ...
Thanks for reply...

NO .... Not Quoted...

Posted: Wed Apr 20, 2011 4:37 am
by blewip
This has to be fixed in the input file.

Possibly use a different delimiter.

DataStage cannot guess which is the delemiter and which is data.

Posted: Wed Apr 20, 2011 6:36 am
by jwiles
Is the <non-delimiter> comma always present in the column, or just sometimes?

If always present, one workaround is to just adjust the schema to import that column as two columns and then concatenate them later in a transformer.

If no always present, read the entire record as a single column (suggested above), count the number of commas present and parse accordingly.

Regards,

Posted: Wed Apr 20, 2011 7:11 am
by h4harry1
jwiles wrote:Is the <non-delimiter> comma always present in the column, or just sometimes?

If always present, one workaround is to just adjust the schema to import that column as two columns and then concatenate them later in a transformer.

If no always present, read the entire record as a single column (suggested above), count the number of commas present and parse accordingly.

Regards,
Thanks for the reply
Comma (non-delimiter) is not present in all of the the columns, only in few records. My CSV contains 90 columns and if i go for your second suggestion i think it takes lot of time....
Any suggestion ?
Regards

Posted: Wed Apr 20, 2011 7:13 am
by h4harry1
blewip wrote:This has to be fixed in the input file.

Possibly use a different delimiter.

DataStage cannot guess which is the delemiter and which is data.
Any other suggestion plz. I m not allowed to change source file at all.
Thanks

Posted: Wed Apr 20, 2011 7:19 am
by zulfi123786
your fields are not quoted
you cant change the input file
you have delimiters within the columns
Its is not a fixed width file, (its comma delimited)

How can we make datastage to guess where the string ends?

Posted: Wed Apr 20, 2011 7:31 am
by blewip
h4harry1 wrote:
blewip wrote:This has to be fixed in the input file.

Possibly use a different delimiter.

DataStage cannot guess which is the delemiter and which is data.
Any other suggestion plz. I m not allowed to change source file at all.
Thanks
The source file is in error, simple as that.

You cannot use a delimiter that is also valid data for the fields.

Posted: Wed Apr 20, 2011 7:35 am
by chulett
h4harry1 wrote:Any other suggestion plz. I m not allowed to change source file at all.
Thanks
Whomever is producing the file needs to "change" it. What you are receiving right now is invalid.

Posted: Wed Apr 20, 2011 8:01 am
by jwiles
h4harry1 wrote:if i go for your second suggestion i think it takes lot of time....
Not necessarily...you can either parse it out within a transformer (which would be less efficient) or just split the stream to two different column importers and follow one with a transformer to "fix" the extra column. Then funnel the streams back together. Thinking it takes a lot of time is not the same as knowing...you won't know for certain

Craig's suggestion is the preferred method...the data should be fixed at the source before you receive it. Make a request that the string columns be quoted: "Last Name, First Name","City Name","State" so you can properly import the data with a minimum of extra work and processing time. The worst they can say is "no", at which point you'll need to implement one of the workarounds that's been suggested.

Regards,

Posted: Wed Apr 20, 2011 10:19 am
by blewip
Second option wouldn't work if there were two commas in a field.

If it were free text and they allowed commas why wouldn't there be two.

Posted: Wed Apr 20, 2011 10:24 am
by MarkB
blewip wrote:Second option wouldn't work if there were two commas in a field.

If it were free text and they allowed commas why wouldn't there be two.
...which is why, in this case, the only logical solution is Craig's - the file is bad ... go back to the source and have them give you the file in the correct format.