Page 1 of 1

Reading variable number of columns in a file

Posted: Thu Oct 08, 2009 10:31 pm
by craviraj
Hi,
I have a requirement to read a file in the format as below:

A,F1,F2,F3 (Comma delimited and column A's position is fixed with 3 columns following column A). 3 is an arbitrary number. It can vary from 1 to 25.

So, I do not know the number of fields that will come in the input file.

1) How can I read such a file?
-- Assumption Runtime Column Propogation may work to read the data, but how would I know the number of columns read?

2) Second scenario is the fields will not be in the specific order. For eg. I get the file with 4 fields one time as A,F1,F2,F3 then second time we may get records as A,F3,F1,F2. In such case, how can we identify the fields.

Note: First line will have the column names, but if the number of columns are not fixed and the order is not fixed - is there a way to read these files?

The first requirement is to read this file and Pivot it with the values we receive to read the file as
A1 F1 X1
A1 F2 X2
A1 F3 X3

assuming that the file contained only one record as shown below:
A,F1,F2,F3
A1,X1,X2,X3

Any help is highly appreciated.

Thanks,
Raviraj

Posted: Fri Oct 09, 2009 1:54 am
by robjones
To split the data into columns, you can read the file as if it contained only one column, with all data for each record in that single column.

You can then pass this into a transform stage and use the field function (with the comma character as delimiter) in 26 stage variables to split the data into separate columns.

eg.

svA = Field(link.inputcolumn, ",", 1)
svData1 = Field(link.inputcolumn, "," , 2)
svData2 = Field(link.inputcolumn, "," , 3)
...
svData25 = Field(link.inputcolumn, "," , 26)

I do however think that you're going to need to identify a rule with which to determine how the split columns are then mapped to your F column outputs.

Can you provide examples of the data which you need to process?

It would be useful to see some rows like "A, F1, F2, F3" and "A, F3, F5, F7" to see if there's anything distinctive about each data column that you could base your decision upon.

Once you've finally got the data mapped into the correct columns, the action you wish to perform is known as a vertical pivot and is quite straight forward.

If you search this forum for "vertical pivot" you'll find many examples of how to do this.


Rob

Reading variable number of columns in a file

Posted: Mon Oct 12, 2009 2:03 pm
by craviraj
Thanks Rob for your response.

Some file formats would be: (First Line in each file is column name - please note that all the files will have the similar patterned name and the name will not help in identifying how many columns each file has)

1st example File:
A,F1,F2,F3 (Column Names)
A1,X1,X2,X3
A2,X4,X5,X6

2nd example file:
A,F1,F5,F6,F7
A1,X1,X2,X3,X4
A2,X5,X6,X7,X8
A3,X9,X0,X10,X11

3rd example file:
A,F5,F6,F2,F1,F3
A4,X6,X7,X4,X8
A5,X8,X9,X7,X3
A6, X1,X2,X3,X4

The Values starting with A and X are arbitrary values and can be replaced with any text. If the values have repeated it does not mean that same value would come in. The first line is column name and the names given can be treated as repeated if the same value is repeated.

Posted: Mon Oct 12, 2009 4:48 pm
by kduke
I would use a column import.

Reading variable number of columns in a file

Posted: Wed Oct 14, 2009 12:05 am
by craviraj
As far as I know, Column import stage also expects the columns in a fixed position.

Posted: Thu Oct 15, 2009 3:44 am
by Sreenivasulu
Use Schema files. It would help in selecting the target columns dynamically but they should be defined in the schema file.

Regards
Sreeni