Switch vs. filter stage

adasgupta123 · Post by **adasgupta123** » Fri Nov 17, 2006 1:14 am

Hi,

I have to split the records from a dataset into 14 file sets, one for each of the 14 different values in a column.

There are 82,460,040 records. They are not evenly distributed across the output links, so the links will not all carry the same number of records.

Can you please advise me which stage is preferable in this case, switch or filter, from a performance point of view?
This job runs once a day.

Thanks and Regards

Avik Dasgupta

ArndW · Post by **ArndW** » Fri Nov 17, 2006 5:08 am

In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If a large percentage of your records are dropped, then the filter stage will be more appropriate, but if each row that comes into the stage is passed on depending upon the values in one column, then there is nothing better than the switch stage to do it.
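To make the distinction concrete, here is a rough Python sketch of the two routing styles. It is not DataStage code and says nothing about how the real stages are implemented; the function names, the `code` column and the sample rows are all made up for illustration. The switch-style function does one lookup on the selector column per row, while the filter-style function tests one predicate after another until something matches.

```python
from collections import defaultdict


def route_by_switch(rows, selector, case_map):
    """Switch-style: a single lookup on one column picks the output link."""
    outputs = defaultdict(list)
    for row in rows:
        link = case_map.get(row[selector])
        if link is not None:  # rows with an unmapped value are dropped here
            outputs[link].append(row)
    return dict(outputs)


def route_by_filter(rows, predicates):
    """Filter-style: try each link's predicate in order; first match wins."""
    outputs = defaultdict(list)
    tests = 0  # number of predicate evaluations, as a crude cost measure
    for row in rows:
        for link, predicate in predicates:
            tests += 1
            if predicate(row):
                outputs[link].append(row)
                break  # rows that match nothing fall through and are dropped
    return dict(outputs), tests


# Hypothetical sample data: three distinct values, unevenly distributed.
rows = [{"code": c} for c in "ABCCC"]
case_map = {"A": 0, "B": 1, "C": 2}
predicates = [
    (link, lambda row, value=value: row["code"] == value)
    for value, link in case_map.items()
]

switched = route_by_switch(rows, "code", case_map)
filtered, tests = route_by_filter(rows, predicates)
```

Both functions produce the same three groups for this input, but the filter-style version needs more than one test for any row that does not match the first predicate. With 14 output links the gap would be wider in this toy model; whether it matters in a real job is something only a test run can show.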

ray.wurlod · Post by **ray.wurlod** » Fri Nov 17, 2006 7:50 am

You might also consider using a parallel Transformer stage if you are using version 7.5.1 or later. That way your constraint expressions may be easier to construct.

Kirtikumar · Post by **Kirtikumar** » Sat Nov 18, 2006 1:00 pm

But is it a good idea, where performance is concerned, to use the PX transformer just for filtering?

ArndW · Post by **ArndW** » Sat Nov 18, 2006 1:40 pm

The PX transformer stage will most likely use more CPU than the switch or filter stages. But chances are that the job will not be bottlenecked by CPU, so it won't make that much of a difference.

adasgupta123 · Post by **adasgupta123** » Sun Nov 19, 2006 1:38 pm

ArndW wrote:In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appro ...

Hi,

Thank you very much for your advice.

Can you please explain why the switch stage is preferable?

Regards

Avik

ArndW · Post by **ArndW** » Sun Nov 19, 2006 4:31 pm

Avik,
what you are doing is exactly what the switch stage was designed for: using one column's values to direct output. As noted earlier, you can use both a filter and a transformer to do the same thing. Barring bad design or implementation issues, it makes sense to use a purpose-written stage to do something instead of a more generic stage. The number of CPU cycles should be lowest in the switch stage. If you have doubts about this, it is easy enough to test.

adasgupta123 · Post by **adasgupta123** » Mon Nov 20, 2006 2:48 am

Hi,

Thanks to all of you for your advice.
I got clear and satisfactory answers. How can I close this topic?

kumar_s · Post by **kumar_s** » Mon Nov 20, 2006 4:42 am

You should be able to find an attractive button at the top of the page. Don't you?