Difference between Transformer Stage and all other stages

Posted: Tue Jul 04, 2006 5:27 am
by opdas
Hi,
An interviewer asked me the difference between the Transformer stage and the rest of the stages.
I couldn't answer. Does anybody know the answer?

Posted: Tue Jul 04, 2006 5:47 am
by vmcburney
What a pity! You know, you might have aced that interview if only you were a subscriber to my blog! I seem to remember a post on that very topic.

I wrote a blog post, "How do I get started with big expensive tools?", about how difficult it is to get DataStage employment when you don't have experience with the tool. You need to emphasise your skills with programming languages and databases, and a willingness to learn: the building blocks that help an ETL programmer. Then you need a bit of luck. Hopefully they are impressed by your honesty that you don't know the tool that well but are highly motivated to learn it.

Re: Difference between Transformer Stage and all other stages

Posted: Tue Jul 04, 2006 8:19 am
by prabu
opdas wrote:
Hi,
One interviewer asked me the difference between transformer Stage and the rest stages.
I couldnt answer this if anybody know about this.

The rest stage is used to "relax" :D while the Transformer stage does all the hard work.

Jokes aside, the Transformer stage slows things down, and it is a holdover from server jobs. Use it as a last resort, when no other stage will meet your needs. That is, the Transformer can do everything any other PX stage does; the converse may not be true.

I remember reading something like "the Transformer stage requires context switching", similar to the SQL to PL/SQL context switching in Oracle.

I hope the DS gurus will explain further.

regards,
Prabu

Posted: Tue Jul 04, 2006 8:41 am
by kumar_s
Including a Transformer stage in a parallel job makes the job take longer to compile, because source code has to be generated and compiled.
There is also a small overhead at run time for the initial loading of any libraries it needs. That is not the case for stages such as Modify, Switch and Copy, which can replace the Transformer in some places. It has been found that a job runs 25% faster when the Transformer is replaced by one of these stages (where possible).

Posted: Tue Jul 04, 2006 9:49 am
by DSguru2B
A basic transformer in PX is like putting a Corvette in a school zone. Use it only if you absolutely have to. I think you are better off going through Vincent's blog posts or doing a bit more searching on this site.

Posted: Wed Jul 05, 2006 1:43 pm
by pneumalin
Hi Guys,
I don't agree with the statement that the PX Transformer stage performs badly. I posted a similar statement in defence of the PX Transformer a couple of months ago, but probably no one paid much attention to it at the time. In fact, an Ascential senior engineer confirmed to me that the advice in the Advanced Guide to avoid the PX Transformer was outdated and applied only to 7.0. They promised to remove that statement in the next document refresh.
From my experience, I can't see any significant difference in performance between the PX Transformer and other stages. I am interested to know how Kumar came up with the 25% figure. I agree with Kumar that compilation might take longer, since the source has to be generated and passed to the compiler when a Transformer is used. But once it is compiled as a shared object (SO), that SO is loaded at run time as requested by the DS engine, just like the SOs that contain the other stages such as Modify, Copy, etc. Therefore, in theory, I don't think using the PX Transformer makes a big difference. Please comment if you have other views on this.
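
The point about one-time loading can be put into a back-of-envelope model. The sketch below is only an illustration in Python: the function names, the 2-second startup cost and the 50,000 rows/sec rate are all made-up assumptions, not measurements from DataStage. It shows how a fixed startup cost shrinks as a share of total run time when the row count grows. It says nothing about per-row cost, which is where the posts in this thread disagree.

```python
def elapsed_seconds(rows, rows_per_second, startup_seconds):
    """Total run time: a fixed one-time startup cost plus per-row processing time."""
    return startup_seconds + rows / rows_per_second


def startup_share(rows, rows_per_second, startup_seconds):
    """Fraction of the total run time spent on the one-time startup cost."""
    return startup_seconds / elapsed_seconds(rows, rows_per_second, startup_seconds)


# Hypothetical numbers, chosen only for illustration.
RATE = 50_000   # rows processed per second (assumed)
STARTUP = 2.0   # seconds to load libraries at startup (assumed)

for rows in (100_000, 10_000_000):
    share = startup_share(rows, RATE, STARTUP)
    print(f"{rows:>10} rows: startup is {share:.1%} of the run")
```

With these assumed numbers, the startup cost is half of a small run but around one percent of a large one, which is why a one-time load by itself would not explain a steady percentage difference on large volumes.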

Posted: Wed Jul 05, 2006 4:05 pm
by vmcburney
As I said in "Is the DataStage parallel transformer evil?", the benefits of the Transformer stage far outweigh any time overhead. I would always design with Transformers and switch them out later if extra performance is required. There was some bad press about Transformers a couple of AscentialWorlds ago, and they didn't do a very good job of correcting that information.

One big improvement I've noted is when a Transformer with constraints is replaced by a Filter stage followed by a Transformer. The Transformer does a transform followed by a filter, so you are transforming a number of rows that then get filtered out. The Filter stage reverses the order and gives you a filter followed by a transform. This is good when you are removing a large number of rows, such as in a range-based lookup.
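
The ordering argument above can be sketched outside DataStage. The Python below is a minimal illustration, not a model of the engine: `expensive_transform`, `keep` and the sample rows are hypothetical, and it assumes the filter condition depends only on columns that exist before the transform. It counts how many rows get transformed under each ordering.

```python
def expensive_transform(row, counter):
    """Stand-in for per-row derivation work; counts how often it is called."""
    counter["transforms"] += 1
    return {"id": row["id"], "value": row["value"] * 2}


def keep(row):
    """Hypothetical constraint: keep only rows whose id is a multiple of 10."""
    return row["id"] % 10 == 0


def transform_then_filter(rows):
    """Transform every row, then discard the ones that fail the constraint."""
    counter = {"transforms": 0}
    out = []
    for row in rows:
        new_row = expensive_transform(row, counter)
        if keep(new_row):
            out.append(new_row)
    return out, counter["transforms"]


def filter_then_transform(rows):
    """Discard failing rows first, then transform only the survivors."""
    counter = {"transforms": 0}
    out = [expensive_transform(row, counter) for row in rows if keep(row)]
    return out, counter["transforms"]


rows = [{"id": i, "value": i} for i in range(1000)]
out_a, work_a = transform_then_filter(rows)
out_b, work_b = filter_then_transform(rows)

assert out_a == out_b  # same result either way
print(work_a, work_b)  # 1000 transforms versus 100
```

Both orderings produce the same output here, but filtering first does a tenth of the transform work because nine rows in ten are dropped. The saving only applies when the rows can be tested before they are transformed, and it ignores the cost of running an extra stage.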

Posted: Thu Jul 06, 2006 2:06 am
by opdas
I posted a topic a few days back comparing the performance of the Filter stage with a constraint in the PX Transformer stage. I noted that the PX Transformer was many times faster at filtering, and that the Filter stage's performance dwindled as time progressed.
The filter was based on a string match.

This was measured in rows/sec.
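
When a rows/sec figure drops over time, it helps to know whether it is a running average since the start or the rate over the latest interval. The Python sketch below is a hypothetical illustration: the `throughput` function and the sample readings are invented, and it does not claim to show how DataStage reports its figures. It computes both rates from a series of (seconds since start, total rows so far) readings.

```python
def throughput(samples):
    """Compute rates from (seconds_since_start, total_rows_so_far) readings.

    Readings must be in time order. Returns one (cumulative_rate, interval_rate)
    pair per reading, both in rows per second.
    """
    results = []
    prev_t, prev_rows = 0.0, 0
    for t, rows in samples:
        cumulative = rows / t                         # average since the start
        interval = (rows - prev_rows) / (t - prev_t)  # rate since last reading
        results.append((cumulative, interval))
        prev_t, prev_rows = t, rows
    return results


# Hypothetical readings, invented for illustration.
samples = [(10, 50_000), (20, 80_000), (30, 90_000)]

for cumulative, interval in throughput(samples):
    print(f"cumulative={cumulative:.0f} rows/s  interval={interval:.0f} rows/s")
```

In this made-up series the running average falls from 5000 to 3000 rows/sec, while the interval rate falls from 5000 to 1000, so the running average understates how much the stage has slowed down.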