use of column import/transformer
Posted: Tue Sep 06, 2005 1:14 am
Hi,
As an input for a job I have one table, that contains 120million records per day. In these records there are 10 straigthforward fields(char,number,...) and 4 complicated varchars.
These varchars have a structure like :
"var1"=type:"stringvalue","var2"=type:numbervalue,..... , "var_n"=type:"value";
(with different variables in each field, all records however do have the same variables in the same string).
These fields need to be extracted to seperate columns.
The design I was thinking about would be:
-select from oracle stage
-column import stage where the first column is split this will split on "," so output fields will still hold in the fields strings like "var1"=type:"stringvalue"
- since only one column can be split in a column import there will be 4 consecutive column importers
-after this a transformer with some substring-functions to get only the value out of the strings.
-get the values back in an oracle table.
This seems like a lot of stages and we are talking about 120.000.000 records/day.
Any ideas for a simpler design?
As an input for a job I have one table, that contains 120million records per day. In these records there are 10 straigthforward fields(char,number,...) and 4 complicated varchars.
These varchars have a structure like :
"var1"=type:"stringvalue","var2"=type:numbervalue,..... , "var_n"=type:"value";
(with different variables in each field, all records however do have the same variables in the same string).
These fields need to be extracted to seperate columns.
The design I was thinking about would be:
-select from oracle stage
-column import stage where the first column is split this will split on "," so output fields will still hold in the fields strings like "var1"=type:"stringvalue"
- since only one column can be split in a column import there will be 4 consecutive column importers
-after this a transformer with some substring-functions to get only the value out of the strings.
-get the values back in an oracle table.
This seems like a lot of stages and we are talking about 120.000.000 records/day.
Any ideas for a simpler design?