CSV
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 4
- Joined: Mon Oct 11, 2010 1:29 am
CSV
I want to load the data of a CSV file into a DB2 database. The CSV file is quite complex as it contains different section and each sections has a different number of columns. So records of a csv file could be like:
a,b,c
d,f
g,h,i,j,k,l,m
Currently I am making use of the Sequential File Stage, I've specified the columns to the maximum number of columns. In the example above, that would be 7 columns of datatype Varchar(255) with Nullable ste to Yes.
But now I'm having the problem that only those records are coming through which actually have 7 columns. So the records that do not have 7 columns are not being read.
Why is this?
I want to load all data from the CSV file into DB2 no matter how many columns. How can I do this?
a,b,c
d,f
g,h,i,j,k,l,m
Currently I am making use of the Sequential File Stage, I've specified the columns to the maximum number of columns. In the example above, that would be 7 columns of datatype Varchar(255) with Nullable ste to Yes.
But now I'm having the problem that only those records are coming through which actually have 7 columns. So the records that do not have 7 columns are not being read.
Why is this?
I want to load all data from the CSV file into DB2 no matter how many columns. How can I do this?
The PX Sequential File stage is quite militant about file structure and I'm not sure there is a way to handle that with the properties it has. A Server job would handle that quite easily, however. Be curious what others say.
For PX, you may need to read the file as one long string and then parse the fields out "in job". Is there some way to recognize each section, the first value or counting the delimiters?
For PX, you may need to read the file as one long string and then parse the fields out "in job". Is there some way to recognize each section, the first value or counting the delimiters?
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
I'd look at doing the detection in a Transformer. You'll need to recognize the section you are in and then control your outputs via constraints. Stage variables would help with that. Each section specific output link could then use a Column Import stage to parse them.
What's the bottom line here, how does it all come together? Does each section have a different target table?
What's the bottom line here, how does it all come together? Does each section have a different target table?
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 4
- Joined: Mon Oct 11, 2010 1:29 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Server jobs are much more "sensible" (less militant) about handling data like this.
Or you can use a server Sequential File stage in a server Shared Container in a parallel job to leverage the same flexibility.
Or you can use a server Sequential File stage in a server Shared Container in a parallel job to leverage the same flexibility.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 4
- Joined: Mon Oct 11, 2010 1:29 am
I have created a Server Shared Container with a Sequential File Stage to read my flat file. But I get an error, which says that I have a required column missing.
I have specified 41 columns, because the longest record in the CSV file is 41 fields long. But the first record for example contains only one field, so now he's complaiining that he he columns missing.
How can I make the Sequential File Stage in a way that he doesn't care if he find only one column or 41 columns?
Thank you kindly for your help.
I have specified 41 columns, because the longest record in the CSV file is 41 fields long. But the first record for example contains only one field, so now he's complaiining that he he columns missing.
How can I make the Sequential File Stage in a way that he doesn't care if he find only one column or 41 columns?
Thank you kindly for your help.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Scroll to the right in the Columns grid. There you will find more rules, including "contains terminator" and "missing column". For example you could have the stage output NULL if the column is missing.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.