Column Generator Vs Transformer

Posted: Thu Feb 16, 2006 3:35 am
by Nageshsunkoji
Hi Folks,

I have a requirement to create a new column called Empno and populate it with the parameter EMPNO_INT_PARM.
Now, I have two options to create Empno:
1) Use the Transformer to create the column and populate it with the value.
2) Use the Column Generator stage to create the new column and generate it with the parameter value.
I am using the Transformer in my job only to create this column, and I am not performing any other functions in it.

Please suggest which one is the better option to create the new column (I am using DataStage EE v7.5).

Thanks and Regards
Nagesh.

Posted: Thu Feb 16, 2006 4:06 am
by ray.wurlod
In theory at least the Column Generator stage should be more efficient than a Transformer stage, (a) because it does so much less work and (b) because it doesn't need to be compiled into a separate C++ function.

Re: Column Generator Vs Transformer

Posted: Thu Feb 16, 2006 4:30 am
by manishsk
I believe the Transformer will be the better option, as far as performance and simplicity of the code are concerned.

In a Transformer it's very simple to just add one column.
Also, from a maintenance perspective, if you later need to add some processing for that column, it will be better to use a Transformer.

And for the Column Generator I believe you need to use a schema file. I have not used this stage much, but I think the Transformer will be the better option.

All> please share your thoughts, as this will also help us learn about the Column Generator stage and its uses.

Thanks & Regards,
Manish

Posted: Thu Feb 16, 2006 4:45 am
by ArndW
Manish,

You have made an assertion that the Transformer is better, but then you are stating that other people should test this for you.

The Column Generator was created for a purpose and should be used. Ray has already stated the major difference: a Transformer stage needs to be code-generated, compiled and linked into a job, which doesn't need to be done for a Column Generator. I don't know what a "schema file" is; you can use a schema within PX for column definitions, but you may also modify them in most places.

From a maintenance perspective, a Transformer can be used for many things, so the only way I know what a given object does is by its name. A Column Generator stage is pretty explicit, and I know what is probably going on "inside". Generating columns in a Transformer in PX jobs is, in my opinion, a bad maintenance option.

If you want, you can write a job to compare the performance of both and report back - but please don't ask others to do that work for you.

Posted: Thu Feb 16, 2006 4:53 am
by manishsk
Arnd,

Sorry, but maybe you got it wrong. I was not insisting that others do the work. I just asked for thoughts on the same, as I have not used the stage extensively and so was doubtful about it.

I was not aware of Ray's note, as I think Ray posted at the same time I did. :wink:

Apologies if I did something wrong.

Now, from Ray's comment, yes, the Column Generator stage will be the more effective option.
Still, I felt that for simplicity of the code and less development work the Transformer stage would be good.

Thanks & Regards,
Manish

Posted: Thu Feb 16, 2006 5:09 am
by ArndW
I just wrote a test job to compare the relative speeds of generating a single VarChar(32) column and filling it with the constant value of "Test" between using a Transform and a Column Generator. After several runs of 5 minutes the average speeds worked out to

Transform stage  - 515,463 rows per second.
Column Generator - 555,565 rows per second.

The Column Generator is about 7.8% faster on this machine (put the other way, the Transform is about 7.2% slower).
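
The percentage depends on which of the two rates is taken as the baseline. As a quick check of the arithmetic, here is a minimal Python sketch; the helper name `pct_diff` is only illustrative and has nothing to do with DataStage itself. It uses the two rates reported above.

```python
# Rates reported in the test above, in rows per second.
transform_rate = 515_463
column_gen_rate = 555_565


def pct_diff(value: float, baseline: float) -> float:
    """Relative difference of value against baseline, as a percentage."""
    return (value - baseline) / baseline * 100.0


# How much faster the Column Generator is, relative to the Transform.
faster = pct_diff(column_gen_rate, transform_rate)

# How much slower the Transform is, relative to the Column Generator.
slower = -pct_diff(transform_rate, column_gen_rate)

print(f"Column Generator is {faster:.1f}% faster than the Transform")  # 7.8%
print(f"Transform is {slower:.1f}% slower than the Column Generator")  # 7.2%
```

Either way the gap is modest, and it was measured on a single machine with a single VarChar(32) constant column, so other jobs may see a different margin.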

Posted: Thu Feb 16, 2006 5:18 am
by Nageshsunkoji
Hi Arnd & ray,

Thanks a ton for your inputs.

I don't have huge data to test; that is the reason I posted the query here.

Thanks for your statistics.

Thanks and Regards,
Nagesh.

Posted: Thu Feb 16, 2006 9:05 am
by kumar_s
Nageshsunkoji wrote: "I don't have huge data to test; that is the reason I posted the query here."

The Row Generator stage could also be used to produce a large volume of clean test data.

-Kumar