Determining if sequential file has correct number of columns
Posted: Thu Feb 19, 2009 11:13 pm
Hi,
Say I have a CSV file like so:
Foo,Bar,"Unit 1, 234 Main St, Sydney",Blah
With the correct sequential file link format parameters, this will become:
Col1=Foo
Col2=Bar
Col3=Unit 1, 234 Main St, Sydney
Col4=Blah
I have a requirement to ensure the input file has the correct number of columns.
My validation code reads the entire line as one long string and counts commas, like so:
ColumnCount = Dcount(Extract.DataLine,",") (ColumnCount is a stage var)
BadRow = ColumnCount ne $SoureFileColumnCount (Job parameter == 72)
You see my quandry above - I reject rows where commas are embedded within the quote characters.
Now, I may be able to get the source system to create the file with another delimiter, say tilde, where I'm guaranteed that it will ONLY be EVER used as a delimiter only. If so, my functional problem is solved.
However, I still have these questions:
1) Is the above a good approach (counting delimiters in the incoming file?)
2) I note with great sadness that regular expressions are not supported (at least OOTB) in DS 7.x. I have read for example viewtopic.php?t=107882.
If regex's were supported, there are some complex regex's that will return "all hits for char X except when between quotes". It's been a while since I've coded one of these, but Google is my friend...
So, is there the functional equivalent in DS of "return a count of char X unless it is between quotes" without the use of reqular expressions?
Thanks,
Scott
Say I have a CSV file like so:
Foo,Bar,"Unit 1, 234 Main St, Sydney",Blah
With the correct sequential file link format parameters, this will become:
Col1=Foo
Col2=Bar
Col3=Unit 1, 234 Main St, Sydney
Col4=Blah
I have a requirement to ensure the input file has the correct number of columns.
My validation code reads the entire line as one long string and counts commas, like so:
ColumnCount = Dcount(Extract.DataLine,",") (ColumnCount is a stage var)
BadRow = ColumnCount ne $SoureFileColumnCount (Job parameter == 72)
You see my quandry above - I reject rows where commas are embedded within the quote characters.
Now, I may be able to get the source system to create the file with another delimiter, say tilde, where I'm guaranteed that it will ONLY be EVER used as a delimiter only. If so, my functional problem is solved.
However, I still have these questions:
1) Is the above a good approach (counting delimiters in the incoming file?)
2) I note with great sadness that regular expressions are not supported (at least OOTB) in DS 7.x. I have read for example viewtopic.php?t=107882.
If regex's were supported, there are some complex regex's that will return "all hits for char X except when between quotes". It's been a while since I've coded one of these, but Google is my friend...
So, is there the functional equivalent in DS of "return a count of char X unless it is between quotes" without the use of reqular expressions?
Thanks,
Scott