
Transformer Tuning

Posted: Tue Jul 13, 2010 1:22 pm
by dougcl
Hi folks, I've got a 15-column Transformer stage, about 400 bytes per row, and I am doing a NullToValue() replacement on each column (nothing else). My throughput drops from 108K rows/sec to 58K rows/sec as a result of adding this stage.

This seems like a big hit. Are there tuning options, or is this pretty much an expected overhead?

Thanks,
Doug

Posted: Tue Jul 13, 2010 2:35 pm
by ray.wurlod
NullToValue() is a Transformer stage function. The equivalent in a Modify stage is handle_null(). The manual is wrong.
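To make the two forms concrete (the column name and default value below are illustrative, not from the original job): in a Transformer derivation you would write

```
NullToValue(inLink.CUST_NAME, 'UNKNOWN')
```

while the equivalent line in a Modify stage specification would be

```
CUST_NAME = handle_null(CUST_NAME, 'UNKNOWN')
```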

Posted: Tue Jul 13, 2010 2:52 pm
by dsedi
ray.wurlod wrote:NullToValue() is a Transformer stage function. The equivalent in a Modify stage is handle_null(). The manual is wrong. ...
Ray, thanks for the correction. Truly appreciated. :)

Posted: Tue Jul 13, 2010 3:56 pm
by dougcl
Hi guys, thanks for your interest. While I am only doing null handling at the moment, there will be column derivations too, and possibly other things. So is it fair to ask whether this is a typical performance hit for a Transformer stage?

Posted: Wed Jul 14, 2010 2:37 am
by ArndW
It is very hard to generalize about performance impact. If your system were I/O bound, adding complexity to Transformer or Modify stages would make no difference in throughput. In this case your system is CPU bound, so adding NullToValue() does make an impact. Perhaps reducing the number of processing nodes might bring the rows/sec back up (I'm not recommending doing this, but in some cases it might impact throughput positively).

Posted: Wed Jul 14, 2010 2:53 am
by Sainath.Srinivasan
Try running with different config files - i.e. with high and low numbers of nodes.
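For anyone new to this: the configuration file is the one pointed to by $APT_CONFIG_FILE, and the number of node entries in it sets the degree of parallelism. A minimal two-node sketch (hostname and resource paths are placeholders, not from the original poster's system) might look like:

```
{
    node "node1" {
        fastname "etlhost"
        pools ""
        resource disk "/data/ds/datasets" {pools ""}
        resource scratchdisk "/data/ds/scratch" {pools ""}
    }
    node "node2" {
        fastname "etlhost"
        pools ""
        resource disk "/data/ds/datasets" {pools ""}
        resource scratchdisk "/data/ds/scratch" {pools ""}
    }
}
```

Keeping a family of such files (two, four, eight, sixteen nodes) and switching $APT_CONFIG_FILE per run makes the comparison straightforward.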

Posted: Wed Jul 14, 2010 4:20 am
by Shaanpriya
I have faced similar issues. Mine was a Transformer with 300+ columns, and it did take a performance hit. Your best bet is to try to use the Modify stage.

Posted: Wed Jul 14, 2010 5:49 pm
by ray.wurlod
You may have to use both (Modify and Transformer). The Modify stage, for all its simplicity, is very slick.
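As a sketch of that split (column names and values are made up for illustration): push the mechanical null handling into a Modify stage specification, e.g.

```
CUST_NAME = handle_null(CUST_NAME, 'UNKNOWN')
ORDER_QTY = handle_null(ORDER_QTY, 0)
```

and keep the Transformer only for derivations the Modify stage cannot express, such as conditional logic along the lines of If inLink.ORDER_QTY > 100 Then 'BULK' Else 'RETAIL'.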

Posted: Thu Jul 15, 2010 1:23 pm
by dougcl
Hi folks, ramping up the number of partitions (two to sixteen) did indeed help considerably. I am now back up to 96K rows/sec.

Unfortunately, though, running this many partitions affects Lookup stages that happen to be using the "Entire" partitioning strategy, which required some rework in this job. Since all of our transformation jobs will likely require Transformer stages (!), this answer compromises the use of Lookup stages, I think, at least where "Entire" is being used in them.

In general, though, does breaking a single Transformer stage into a Transformer plus a Modify (as suggested above) provide performance gains? Is this something to try? In other words, should I reserve for the Modify stage those transformations it can do, and leave only the more involved work for the Transformer stage? I'm not sure why this would help, as the Transformer has to read all the rows anyway, and usually picking something up once is better than twice.

I can imagine ending up with some involved parallel Transformer paths, each Transformer operating only on the rows it needs to, followed by a Funnel, with Modify handling the bulk of the work.

Doug

Posted: Thu Jul 15, 2010 2:37 pm
by ray.wurlod
If you're using SMP architecture, Entire partitioning is essentially free - one copy of the Data Set is placed in shared memory and used by all nodes.

Posted: Fri Jul 16, 2010 2:33 pm
by dougcl
ray.wurlod wrote:If you're using SMP architecture, Entire partitioning is essentially free - one copy of the Data Set is placed in shared memory and used by all nodes.
Good to know Ray, thanks. It appears that our system is not, uh, utilizing this feature at the moment.