I have a couple of options to achieve a result, and I wonder whether either offers a significant performance advantage.
Option 1: Two Datasets, each through a Transformer, into a Funnel.
Code: Select all
DS_A-----TR_A
             \
              FU--->
             /
DS_B-----TR_B
Option 2: Two Datasets, each through two Column Exports and a Column Generator, into a Funnel.
Code: Select all
DS_A-----CE_A1-----CE_A2-----CG_A1
                                  \
                                   FU--->
                                  /
DS_B-----CE_B1-----CE_B2-----CG_B1
I have multiple columns in each dataset. I need some of these to combine into a single value forming an ID_SET, and the remaining columns to combine into a single value forming an ATTRIBUTE_SET. This information is funnelled together, but I want to keep the attributes separate, e.g.:
ID_SET
ATTRIBUTE_SET_A
ATTRIBUTE_SET_B
So I create an empty column in stream A to match stream B, and vice versa. I can accomplish all of this in a single Transformer stage per stream (option 1), but I have now hit the issue of NULLs in my attributes.
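To make the requirement concrete, here is a rough sketch in Python of what each stream's Transformer is doing, purely as a conceptual model (DataStage expression syntax is not shown, and the column names key1/key2/attr1/attr2 and the "|" delimiter are made up for illustration):

```python
# Conceptual model of Option 1: one Transformer per stream builds the
# ID_SET and ATTRIBUTE_SET strings before the streams are funnelled.
# Column names and delimiter are hypothetical.

def transform_stream_a(row):
    return {
        "ID_SET": "|".join([row["key1"], row["key2"]]),          # key columns combined
        "ATTRIBUTE_SET_A": "|".join([row["attr1"], row["attr2"]]),
        "ATTRIBUTE_SET_B": "",   # dummy column so both streams share one layout
    }

def transform_stream_b(row):
    return {
        "ID_SET": "|".join([row["key1"], row["key2"]]),
        "ATTRIBUTE_SET_A": "",   # dummy column for the funnel
        "ATTRIBUTE_SET_B": "|".join([row["attr1"], row["attr2"]]),
    }

# The Funnel is then just a union of the two transformed streams.
rows_a = [{"key1": "K1", "key2": "K2", "attr1": "x", "attr2": "y"}]
rows_b = [{"key1": "K1", "key2": "K2", "attr1": "p", "attr2": "q"}]
funnelled = [transform_stream_a(r) for r in rows_a] + \
            [transform_stream_b(r) for r in rows_b]
```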
I could assess each value that makes up the ATTRIBUTE_SET individually with NullToValue. But I wonder whether using the built-in null value settings in the Column Export stages would be quicker than having a Transformer assess every column. My option 2 plan is to:
- Create ID_SET in the first Column Export,
- Create ATTRIBUTE_SET in the second Column Export, then
- Create the dummy column(s) in a Column Generator for input to the funnel stage.
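Logically, the two approaches perform the same substitution; the question is only where it happens. A sketch of that equivalence (Python, conceptual only; the NULL default value and column names are assumptions, not DataStage settings):

```python
# Joining attribute columns where some values are NULL (None here).
# Per-column NullToValue in a Transformer (option 1) and the Column
# Export stage's null-value setting (option 2) both amount to this
# substitution; which stage does it is the performance question.

NULL_DEFAULT = ""  # assumed value to substitute for NULL

def null_to_value(value, default=NULL_DEFAULT):
    # Equivalent of wrapping each column in NullToValue in a Transformer.
    return default if value is None else value

def build_attribute_set(row, attr_cols, default=NULL_DEFAULT):
    # A Column Export with a per-column null default would do this
    # substitution as part of the export itself.
    return "|".join(null_to_value(row[c], default) for c in attr_cols)

row = {"attr1": "x", "attr2": None, "attr3": "z"}
print(build_attribute_set(row, ["attr1", "attr2", "attr3"]))  # x||z
```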
Would this be quicker than a single Transformer? I am dealing with data volumes in the high millions, and this function will be repeated many times across many data sources. Any thoughts?