Hi,
I have a .csv format file and it comes in the following way
For ex it has 4 fields
a,345,cd,678
b,"6,878",cd,"9,870"
c,"7,989",fg,"7,880"
d,567,cd,989
The second and the fourth field are numeric fields and if it is more than 3 digit number the it comes with a comma but within double quotes.
When i try view data, i am not able to view the 2nd and 3rd record. The issue is because how the final delimiter string is stored.
If i give Final delimiter as end then 1st and 4th record are viewable. If make Final delimiter as \n then 2nd and 3rd record are viewable. Any idea how to handle it?
Problem in reading .csv file
Moderators: chulett, rschirm, roy
Not seeing off the top of my head how this is related to the final delimiter. However, I'm curious how did you define the metadata? To me they should all be defined as strings and then the 2nd and 4th fields will need to be converted inside the job post-read...
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Yes, they are all defined as Strings. And i will be converting those fields as required inside the job.
But now the problem is with reading the file itself. While i view data in sequential file stage the 2nd and 3rd record get dropped if i specify the Final delimiter as "end" or "none". If i specify the Final delimiter as "\n" then 1st and 4th record get dropped.
The issue is definitely in the source file as they said that we will be sending the file with whatever formats they already have. But is there a way to handle this?
But now the problem is with reading the file itself. While i view data in sequential file stage the 2nd and 3rd record get dropped if i specify the Final delimiter as "end" or "none". If i specify the Final delimiter as "\n" then 1st and 4th record get dropped.
The issue is definitely in the source file as they said that we will be sending the file with whatever formats they already have. But is there a way to handle this?
several things you may be able to do...
- you could not use numeric commas (123456 is written 123456 not 123,456)
- you could use something other than , for the fields (tab, |, #, whatever)
- you could parse the entire line by hand
the first 2 options may mean writing a pre-processor to convert the files. This is probably a simple line or 2 of unix magic commands or could be a short C program or a datastage transformer (which becomes the 3rd option).
- you could not use numeric commas (123456 is written 123456 not 123,456)
- you could use something other than , for the fields (tab, |, #, whatever)
- you could parse the entire line by hand
the first 2 options may mean writing a pre-processor to convert the files. This is probably a simple line or 2 of unix magic commands or could be a short C program or a datastage transformer (which becomes the 3rd option).
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: