Switch vs. Filter stage

Posted: Fri Nov 17, 2006 1:14 am
by adasgupta123
Hi,


I have to split the records from a dataset into 14 file sets, depending on the 14 distinct values in one column.

There are 82,460,040 records. They are not evenly distributed across the output links; that is, the output links do not all carry the same number of records.

Could you please advise me which stage is preferable in this case, Switch or Filter, from a performance point of view?
This job runs once a day.


Thanks and Regards

Avik Dasgupta

Posted: Fri Nov 17, 2006 5:08 am
by ArndW
In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appropriate, but if every row that comes into the stage is passed on according to the values in one column, then there is nothing better than the switch stage for the job.
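
To make the distinction concrete, here is a rough conceptual model in Python. It is not how DataStage implements either stage; the function names, the `region` column, and the link names are all made up for illustration. The switch-style router does one lookup on a single column per row, while the filter-style router tests a row against one predicate per output until one matches (first match wins in this sketch).

```python
from collections import defaultdict


def switch_route(rows, selector, cases, default=None):
    """Route each row to one output by looking up a single column's value."""
    outputs = defaultdict(list)
    for row in rows:
        link = cases.get(row[selector], default)
        if link is not None:  # no matching case and no default: row is dropped
            outputs[link].append(row)
    return dict(outputs)


def filter_route(rows, predicates):
    """Test each row against one predicate per output; first match wins."""
    outputs = defaultdict(list)
    for row in rows:
        for link, predicate in predicates.items():
            if predicate(row):
                outputs[link].append(row)
                break
    return dict(outputs)


# Hypothetical sample data: three rows, two distinct values in "region".
rows = [{"id": 1, "region": "A"}, {"id": 2, "region": "B"}, {"id": 3, "region": "A"}]
cases = {"A": "link_a", "B": "link_b"}
predicates = {
    "link_a": lambda r: r["region"] == "A",
    "link_b": lambda r: r["region"] == "B",
}

by_switch = switch_route(rows, "region", cases)
by_filter = filter_route(rows, predicates)
```

Both routers put the same rows on the same links here; the difference is only in how much work is done per row, which is the point being made about a purpose-built stage.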

Posted: Fri Nov 17, 2006 7:50 am
by ray.wurlod
You might also consider using a parallel Transformer stage if you are using version 7.5.1 or later. That way your constraint expressions may be easier to construct.

Posted: Sat Nov 18, 2006 1:00 pm
by Kirtikumar
But is using the PX Transformer just for filtering a good idea where performance is concerned?

Posted: Sat Nov 18, 2006 1:40 pm
by ArndW
The PX transformer stage will most likely use more CPU than the switch or filter stages. But chances are that the program will not be bottlenecked by CPU so it won't make that much of a difference.

Posted: Sun Nov 19, 2006 1:38 pm
by adasgupta123
ArndW wrote:In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appro ...
Hi,

Thank you very much for your advice.

Could you please explain why the switch stage is preferable?

Regards

Avik

Posted: Sun Nov 19, 2006 4:31 pm
by ArndW
Avik,
what you are doing is exactly what the switch stage was designed for: using one column's values to direct output. As noted earlier, you can use either a filter or a transformer to do the same thing. Barring bad design or implementation issues, it makes sense to use a purpose-written stage rather than a more generic one. The number of CPU cycles should be lowest in the switch stage. If you have doubts about this, it is easy enough to test.
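
The real test is to build both job designs and run them against the actual data. Purely to show the shape of such a comparison, the Python sketch below times two ways of bucketing a skewed stream of 14 key values: a single lookup per value versus a chain of equality tests. Everything in it (the key names, the weights, the row count) is an assumption for illustration, and the timings say nothing about DataStage itself.

```python
import random
import timeit
from collections import Counter

random.seed(0)
KEYS = [f"K{i:02d}" for i in range(14)]   # 14 hypothetical column values
WEIGHTS = [2 ** i for i in range(14)]     # deliberately uneven distribution
values = random.choices(KEYS, weights=WEIGHTS, k=50_000)


def count_by_lookup(values):
    """One dictionary lookup per value decides the output link."""
    link_for = {k: i for i, k in enumerate(KEYS)}
    counts = Counter()
    for v in values:
        counts[link_for[v]] += 1
    return counts


def count_by_predicates(values):
    """Each value is tested against the conditions in turn until one matches."""
    tests = [(i, (lambda v, k=k: v == k)) for i, k in enumerate(KEYS)]
    counts = Counter()
    for v in values:
        for i, test in tests:
            if test(v):
                counts[i] += 1
                break
    return counts


# Both strategies must agree before the timings mean anything.
lookup_counts = count_by_lookup(values)
predicate_counts = count_by_predicates(values)
same_result = lookup_counts == predicate_counts

lookup_secs = timeit.timeit(lambda: count_by_lookup(values), number=3)
predicate_secs = timeit.timeit(lambda: count_by_predicates(values), number=3)
print(f"lookup: {lookup_secs:.3f}s  predicates: {predicate_secs:.3f}s  same result: {same_result}")
```

Checking that both approaches produce identical per-link counts first is the important habit; the same check applies when comparing the row counts on the 14 output links of the two real jobs.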

Posted: Mon Nov 20, 2006 2:48 am
by adasgupta123
Hi,

Thank you all for your advice.
I got clear and satisfactory answers. How can I close this topic?

Posted: Mon Nov 20, 2006 4:42 am
by kumar_s
You should be able to find an attractive button at the top of the page. Don't you?