Switch v/s filter stage

adasgupta123
Participant
Posts: 42
Joined: Fri Oct 20, 2006 1:58 am

Switch v/s filter stage

Post by adasgupta123 »

Hi,


I have to split the records from a dataset into 14 file sets, one for each of the 14 distinct values in a column.

There are 82,460,040 records, and they are not evenly distributed across the output links; that is, the links do not each carry the same number of records.

Can you please advise me which stage is preferable in this case, Switch or Filter, from a performance point of view?
The job runs once a day.


Thanks and Regards

Avik Dasgupta
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

In this case the Switch stage is tailored to exactly what you want to do and will be more efficient. If a large percentage of records were being dropped, the Filter stage would be more appropriate, but if every row that comes into the stage is passed on according to the values in one column, there is nothing better than the Switch stage for the job.
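
To illustrate the difference conceptually, here is a rough sketch in plain Python. It is not DataStage code and says nothing about how the stages are implemented internally; the function names, link names, and sample records are all made up for the example. The switch-style router does a single lookup on one column per record, while the filter-style router evaluates a separate predicate per output link until one matches.

```python
from collections import defaultdict


def switch_route(records, selector, cases, default=None):
    """Switch-style routing: one lookup on a single column picks the output link."""
    outputs = defaultdict(list)
    for rec in records:
        link = cases.get(rec[selector], default)
        if link is not None:  # in this sketch, unmatched rows with no default are dropped
            outputs[link].append(rec)
    return dict(outputs)


def filter_route(records, predicates):
    """Filter-style routing: each link has its own predicate, tested in order."""
    outputs = defaultdict(list)
    for rec in records:
        for link, pred in predicates:
            if pred(rec):
                outputs[link].append(rec)
                break  # first matching link wins in this sketch
    return dict(outputs)


# Hypothetical sample data: "region" plays the role of the selector column.
records = [
    {"id": 1, "region": "A"},
    {"id": 2, "region": "B"},
    {"id": 3, "region": "A"},
    {"id": 4, "region": "Z"},  # matches no case, so it is dropped by both routers
]

switched = switch_route(records, "region", {"A": "out_a", "B": "out_b"})
filtered = filter_route(
    records,
    [
        ("out_a", lambda r: r["region"] == "A"),
        ("out_b", lambda r: r["region"] == "B"),
    ],
)
```

Both routers produce the same output here; the point is only that with 14 values the switch-style version still does one lookup per record, whereas the filter-style version may test up to 14 conditions.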
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You might also consider using a parallel Transformer stage if you are using version 7.5.1 or later. That way your constraint expressions may be easier to construct.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

But is using the PX Transformer just for filtering a good idea where performance is concerned?
Regards,
S. Kirtikumar.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

The PX Transformer stage will most likely use more CPU than the Switch or Filter stages, but chances are that the job will not be bottlenecked by CPU, so it won't make that much of a difference.
adasgupta123
Participant
Posts: 42
Joined: Fri Oct 20, 2006 1:58 am

Post by adasgupta123 »

ArndW wrote:In this case the switch stage is exactly tailored to what you want to do and will be more efficient. If you have a large percentage of records that are dropped then the filter stage will be more appro ...
Hi,

Thank you very much for your advice.

Can you please explain why the Switch stage is preferable?

Regards

Avik
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Avik,
What you are doing is exactly what the Switch stage was designed for: using one column's values to direct output. As noted earlier, you can use either a Filter or a Transformer stage to do the same thing. Barring bad design or implementation issues, it makes sense to use a purpose-written stage for a task instead of a more generic one. The number of CPU cycles should be lowest in the Switch stage. If you have doubts about this, it is easy enough to test.
adasgupta123
Participant
Posts: 42
Joined: Fri Oct 20, 2006 1:58 am

Post by adasgupta123 »

Hi ,

Thank you all for your advice.
I got clear and satisfactory answers. How can I close this topic?
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

You should be able to find an attractive button at the top of the page. Don't you?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'