Reading Flat file (CSV)
Moderators: chulett, rschirm, roy
Hi All,
I have a flat file with a comma as the column delimiter. I am reading data from this file using the Sequential File stage (delimiter: comma). The problem is that one of the column values contains a comma itself. I don't know how to handle this. Any help?
Thanks in advance.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Re: Reading Flat file (CSV)
You can read the full row as a single record, then add a Transformer stage and use the Field function to split it.
Is the <non-delimiter> comma always present in the column, or just sometimes?
If always present, one workaround is to just adjust the schema to import that column as two columns and then concatenate them later in a transformer.
If not always present, read the entire record as a single column (as suggested above), count the number of commas present and parse accordingly.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
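Outside DataStage, the "read the whole record, count the commas, parse accordingly" workaround above can be sketched in Python. The column count and the index of the offending column here are hypothetical, purely for illustration (the real file has 90 columns):

```python
# Sketch of the single-column workaround: split the record, and if it
# yields too many fields, re-join the pieces that spilled out of the one
# column known to contain embedded commas.
EXPECTED_FIELDS = 5   # hypothetical; the real file has 90
BAD_COL = 2           # hypothetical index of the comma-bearing column

def parse_record(line, expected=EXPECTED_FIELDS, bad_col=BAD_COL):
    """Split one record, merging any extra pieces back into bad_col."""
    parts = line.rstrip("\n").split(",")
    extra = len(parts) - expected
    if extra > 0:
        # Re-join the spilled pieces into the offending column.
        parts[bad_col:bad_col + extra + 1] = [
            ",".join(parts[bad_col:bad_col + extra + 1])
        ]
    return parts

print(parse_record("1,Smith,Main St, Apt 4,London,UK"))
# → ['1', 'Smith', 'Main St, Apt 4', 'London', 'UK']
print(parse_record("2,Jones,High St,Paris,FR"))
# → ['2', 'Jones', 'High St', 'Paris', 'FR']
```

This handles any number of embedded commas in that one column, but it becomes ambiguous as soon as more than one column can contain them.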
Thanks for the reply.
jwiles wrote:
Is the <non-delimiter> comma always present in the column, or just sometimes?
If always present, one workaround is to just adjust the schema to import that column as two columns and then concatenate them later in a transformer.
If not always present, read the entire record as a single column (as suggested above), count the number of commas present and parse accordingly.
Regards,
The comma (non-delimiter) is not present in all of the columns, only in a few records. My CSV contains 90 columns, and if I go with your second suggestion I think it will take a lot of time...
Any suggestions?
Regards
-
- Premium Member
- Posts: 730
- Joined: Tue Nov 04, 2008 10:14 am
- Location: Bangalore
blewip wrote:
This has to be fixed in the input file. Possibly use a different delimiter. DataStage cannot guess which comma is the delimiter and which is data.
h4harry1 wrote:
Any other suggestions, please? I'm not allowed to change the source file at all. Thanks.
The source file is in error, simple as that. You cannot use a delimiter that is also valid data for the fields.
Modern Life is Rubbish - Blur
h4harry1 wrote:
If I go with your second suggestion I think it will take a lot of time....
Not necessarily... you can either parse it out within a Transformer (which would be less efficient), or split the stream to two different Column Import stages, follow one with a Transformer to "fix" the extra column, and then funnel the streams back together. Thinking it takes a lot of time is not the same as knowing... you won't know for certain until you test it.
Craig's suggestion is the preferred method: the data should be fixed at the source before you receive it. Request that the string columns be quoted, e.g. "Last Name, First Name","City Name","State", so you can properly import the data with a minimum of extra work and processing time. The worst they can say is "no", at which point you'll need to implement one of the workarounds that have been suggested.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
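As a quick illustration of why quoting fixes the problem: a standard CSV reader (Python's csv module here, just for demonstration; the sample rows are made up) keeps an embedded comma inside a quoted field:

```python
import csv
import io

# Made-up sample in the quoted format suggested above.
data = '"Last Name, First Name","City Name","State"\n"Doe, Jane","Sydney","NSW"\n'
rows = list(csv.reader(io.StringIO(data)))
print(rows[1])  # → ['Doe, Jane', 'Sydney', 'NSW']
```

The embedded comma survives as part of the field, so every record parses to the same column count. In DataStage, the equivalent is setting the quote character on the Sequential File stage once the source quotes its string columns.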
blewip wrote:
The second option wouldn't work if there were two commas in a field. If it were free text and they allowed commas, why wouldn't there be two?
...which is why, in this case, the only logical solution is Craig's: the file is bad ... go back to the source and have them give you the file in the correct format.