Switch v/s filter stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 42
- Joined: Fri Oct 20, 2006 1:58 am
Switch v/s filter stage
Hi,
I need to split the records from a dataset into 14 file sets, depending on the 14 different values in a column.
There are 82460040 records, and they are not evenly distributed across the output links; that is, the output links do not all carry the same number of records.
From a performance point of view, which stage is preferable in this case: Switch or Filter?
The job runs once a day.
Thanks and Regards
Avik Dasgupta
In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appropriate, but if each row that comes in to the stage is passed on depending upon the values in one column then there is nothing better than the switch stage to do it.
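To make the distinction concrete, here is a rough model of the two routing behaviours in plain Python. It is not DataStage code and says nothing about how the engine implements either stage; the names `switch_route` and `filter_route` are made up for illustration, and the filter is simplified to send a row to the first matching output only.

```python
from collections import defaultdict


def switch_route(rows, selector):
    """Model of a switch: every row goes to the output named by the
    value of its selector column, so nothing is dropped."""
    outputs = defaultdict(list)
    for row in rows:
        outputs[row[selector]].append(row)
    return dict(outputs)


def filter_route(rows, predicates):
    """Model of a filter (simplified, first match wins): a row goes to
    the first output whose predicate is true, and rows that match no
    predicate are dropped."""
    outputs = {name: [] for name in predicates}
    dropped = 0
    for row in rows:
        for name, predicate in predicates.items():
            if predicate(row):
                outputs[name].append(row)
                break
        else:
            # No predicate matched this row.
            dropped += 1
    return outputs, dropped
```

In the case described in the question, every row is passed on according to one column, which is the first shape; the second shape earns its keep when many rows match nothing and are discarded.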
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You might also consider using a parallel Transformer stage if you are using version 7.5.1 or later. That way your constraint expressions may be easier to construct.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 437
- Joined: Fri Oct 15, 2004 6:13 am
- Location: Pune, India
-
- Participant
- Posts: 42
- Joined: Fri Oct 20, 2006 1:58 am
ArndW wrote: In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appro ...
Hi,
Thank you very much for your advice.
Could you please explain why the switch stage is preferable?
Regards
Avik
Avik,
what you are doing is exactly what the switch stage was designed for - using one column's values to direct output. As noted earlier, you can use either a filter or a transformer to do the same thing. Barring bad design or implementation issues, it makes sense to use a purpose-written stage instead of a more generic one. The number of CPU cycles should be lowest in the switch stage. If you have doubts about this, it is easy enough to test.
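One way to reason about the CPU-cycle argument before testing is a back-of-the-envelope count. The sketch below is plain Python, not DataStage; `count_predicate_checks` is a hypothetical helper, and it assumes a generic stage tests the 14 values one after another and stops at the first match, while a value-keyed dispatch does one lookup per row. The real engine may behave differently, so only timing the actual job settles the question.

```python
def count_predicate_checks(values, ordered_keys):
    """Count the equality checks a first-match predicate chain performs.

    A row whose value sits at position i (1-based) in ordered_keys
    costs i checks. Assumes every value appears in ordered_keys.
    """
    position = {key: index + 1 for index, key in enumerate(ordered_keys)}
    return sum(position[value] for value in values)


# 14 possible column values, tested in this order.
keys = list(range(14))

# Skewed input: every row carries the value that is tested last.
worst_case = [13] * 1000
checks = count_predicate_checks(worst_case, keys)   # 14 checks per row
lookups = len(worst_case)                           # 1 lookup per row
```

Because the records in the question are unevenly distributed, the order of the tests matters in this model: putting the most frequent values first reduces the number of checks, whereas a single lookup costs the same regardless of skew.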
-
- Participant
- Posts: 42
- Joined: Fri Oct 20, 2006 1:58 am