use of column import/transformer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

use of column import/transformer

Post by jasper »

Hi,
As an input for a job I have one table, that contains 120million records per day. In these records there are 10 straigthforward fields(char,number,...) and 4 complicated varchars.
These varchars have a structure like :
"var1"=type:"stringvalue","var2"=type:numbervalue,..... , "var_n"=type:"value";
(with different variables in each field, all records however do have the same variables in the same string).
These fields need to be extracted to seperate columns.

The design I was thinking about would be:
-select from oracle stage
-column import stage where the first column is split this will split on "," so output fields will still hold in the fields strings like "var1"=type:"stringvalue"
- since only one column can be split in a column import there will be 4 consecutive column importers
-after this a transformer with some substring-functions to get only the value out of the strings.
-get the values back in an oracle table.

This seems like a lot of stages and we are talking about 120.000.000 records/day.
Any ideas for a simpler design?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Can you do all the decompositions in stage variables and load the necessary columns from these?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Agree, since you already plan to have a transformer in your job I would do it all in that transformer. The FIELD command should be able to pick the columns out of your varchars quite easily and you may need stage variables or you may be able to do it straight in the derivation fields.

The FIELD function with , as a seperate gives you each column. Running another FIELD function using " as the seperator for string fields and : as the seperator for numeric fields gives you the value.
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

Post by jasper »

thanx, I'll give this a try.
Post Reply