Variable column numbers (metadata) during File Import

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Vikas Jain
Participant
Posts: 15
Joined: Tue Dec 13, 2005 12:38 am

Variable column numbers (metadata) during File Import

Post by Vikas Jain »

Hi all,
I have to import a file with an unknown number of columns in a parallel job. By an unknown number of columns, I mean that the metadata for the same file may differ from one run to the next. If I use a Sequential File stage with RCP enabled and define only one column, all the other columns are rejected. If I define the maximum possible number of columns (say 60), the whole record is rejected whenever it contains fewer than 60. I am not sure whether any of the other file stages can handle this either.
Is there any way I can get this working?
I also read a couple of posts in the forum but could not find a solution. Kindly help if you have any approach.
sbass1
Premium Member
Posts: 211
Joined: Wed Jan 28, 2009 9:00 pm
Location: Sydney, Australia

Post by sbass1 »

Caveat: I only have a DS 7.5.x Server perspective.

One approach: read the entire line as a single long string, then use a loop plus the Field function to extract each delimited field, exiting the loop when no fields remain.
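
A minimal sketch of that first approach, assuming a Server BASIC routine where the whole record has been read into a single string called InRec and the delimiter is a tilde (both are assumptions for illustration, not from the original post):

Code:

* Sketch only: walk the delimited fields of one long input string.
* InRec and the "~" delimiter are assumed names/values for illustration.
Delim = "~"
NumFields = Dcount(InRec, Delim)
For i = 1 To NumFields
   ThisField = Field(InRec, Delim, i)
   * ... per-column processing for ThisField goes here ...
Next i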

Second approach: use a sed or awk script to normalize your file so that every record has the same (maximum) number of columns. If your delimiter is a tilde, the snippet below prints the number of tildes in each of the first 5 lines of a file; adjust as necessary.

Code:

for f in "$@"
do
   # print the file name, then the tilde count for each of its first 5 lines
   printf '%s\n' "$f"
   head -5 "$f" | awk -F'~' '{print NF-1}'
done
Then append enough tildes to each short record to pad it out to the desired number of columns.
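
As a hedged illustration of that padding step (the target of 60 columns and the file names are placeholders, not from the original post):

Code:

# Append tildes to each record until it has maxf fields.
# maxf=60, input.txt and padded.txt are placeholder values; adjust to
# the column count found with the counting snippet above.
awk -F'~' -v maxf=60 '{
   out = $0
   for (i = NF; i < maxf; i++) out = out "~"
   print out
}' input.txt > padded.txt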

HTH...
Some people are the Michael Jordans of DataStage. I'm more like Muggsy Bogues :-)