Data cleansing functionality

Posted: Thu Dec 04, 2008 3:07 pm
by leomauer
I need a generic cleansing functionality.
The rule is this:
When a field value is empty or Null: if it is an alphanumeric field (Varchar and the like), default it to Null; if it is a numeric field, default it to zero.
It is easy to code, but I do not want to code it for every field in every job.
I would like to have a stage that accepts any input and output and gets the work done.
Does anybody know how to create this (custom or buildop)? Or maybe somebody has something similar already?
Thanks.
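
For reference, the rule described above can be sketched outside DataStage as a small schema-driven function. This is only an illustration in Python, not a DataStage stage: the type names in NUMERIC_TYPES, the helper names cleanse_value and cleanse_record, and the sample schema are all hypothetical, and treating whitespace-only strings as "empty" is an assumption.

```python
from typing import Any, Dict

# Hypothetical type names; a real job would take these from its own metadata.
NUMERIC_TYPES = {"integer", "decimal", "float"}


def cleanse_value(value: Any, col_type: str) -> Any:
    """Apply the default rule to a single field value."""
    # Assumption: a whitespace-only string counts as "empty".
    is_empty = value is None or (isinstance(value, str) and value.strip() == "")
    if not is_empty:
        return value
    if col_type in NUMERIC_TYPES:
        return 0  # empty numeric field defaults to zero
    return None  # alphanumeric (and any other) field defaults to null


def cleanse_record(record: Dict[str, Any], schema: Dict[str, str]) -> Dict[str, Any]:
    """Cleanse every column named in the schema, whatever the columns are."""
    return {
        name: cleanse_value(record.get(name), col_type)
        for name, col_type in schema.items()
    }


# Example schema and row, made up for illustration.
schema = {"name": "varchar", "qty": "integer", "price": "decimal"}
row = {"name": "  ", "qty": None, "price": "12.50"}
print(cleanse_record(row, schema))
```

The point of the sketch is that the rule is written once and driven by the column metadata, so nothing has to be coded per field or per job.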

Posted: Thu Dec 04, 2008 3:55 pm
by Mike
Not exactly what you're looking for... but you can highlight multiple columns and choose "Derivation Substitution..." from the right-click menu. A definite productivity help when doing repetitive derivations.

Mike

Posted: Thu Dec 04, 2008 3:59 pm
by leomauer
Mike wrote:Not exactly what you're looking for... but you can highlight multiple columns and choose "Derivation Substitution..." from the right-click menu. A definite productivity help when doing repetitive derivations.

Mike
Thanks, but that is exactly what I am not looking for. I know how to do that. I am looking for canned functionality.

Posted: Thu Dec 04, 2008 6:54 pm
by ray.wurlod
There's nothing "out there" of which I'm aware as a parallel routine.

It may be best to create a Build stage, since you're going to have to discover the metadata as well as to process the data records.

One tip (from IOD2008): unless you have reason not to, always write BuildOps as combinable.