Switch vs Transform ?

SettValleyConsulting
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

I've been developing in PX for about 8 weeks, so I'm still a bit of a novice. Having got the application working, I am now looking at optimising its performance.

One situation that crops up repeatedly is where I want to send rows that meet condition A or B down one stream and all other rows down another. (E.g. after a Left Outer Join: where a match is found, go one way; where there is no match, go another.)

At the moment (my background is Server) I am generally using a Transformer with constraints to achieve this, but I have read here and elsewhere that Transformers are bad news performance-wise and should be avoided where possible.

So I looked at the 'native' PX equivalents - Switch and/or Filter. Seems to me that the options are:

- Use a Switch stage. My problem with Switch is the 'Else' path: rows that do not meet any of the selection criteria go down the 'Reject' link. It goes against the grain to describe rows as rejects when they aren't really. More seriously, you cannot change the mapping or metadata on a reject link, so if that is required it must be done by an additional Modify or Copy stage.

- Use a Copy stage to split the stream into two, then add a Filter to each to remove unwanted rows. This requires a minimum of three stages and duplicates the data being moved before the unwanted rows are filtered out.

Does anyone have any thoughts on which of these is more efficient? Should I stay with my Transformer, or is there another, more elegant way of achieving this?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

The Filter stage is a Copy stage, only rather than copying down each output link unconditionally you can put a filter on each link. Refer to the Parallel Job Developer's Guide for the correct filter syntax. Unlike server jobs, where DataStage BASIC is used throughout, the Filter stage has its own syntax, which is different from that of the Transformer stage, which is different again from that of the Modify stage. They all use a C-like language but with different functions.
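
To picture the "copy with a filter on each link" idea, here is a rough sketch in plain Python. It is not Filter stage syntax and not how DataStage is implemented; the function name `route_rows`, the column name `ref_key`, and the choice to collect unmatched rows in a separate list are all assumptions made for illustration. Each row is read once and tested against every link's condition, so no separate Copy stage is needed to feed two filters.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical row shape: column name -> value (None stands in for a NULL).
Row = Dict[str, Optional[str]]
Predicate = Callable[[Row], bool]


def route_rows(
    rows: Sequence[Row], link_predicates: Sequence[Predicate]
) -> Tuple[List[List[Row]], List[Row]]:
    """Send each row down every output link whose predicate it satisfies.

    Rows that satisfy no predicate are returned separately; what the real
    stage does with such rows is not modelled here.
    """
    outputs: List[List[Row]] = [[] for _ in link_predicates]
    unmatched: List[Row] = []
    for row in rows:
        matched = False
        for link, predicate in zip(outputs, link_predicates):
            if predicate(row):
                link.append(row)
                matched = True
        if not matched:
            unmatched.append(row)
    return outputs, unmatched


# Illustrative input: the result of a left outer join, where ref_key is
# None when no match was found on the reference side.
joined = [
    {"id": "1", "ref_key": "A"},
    {"id": "2", "ref_key": None},
    {"id": "3", "ref_key": "B"},
]

# Two complementary conditions, one per output link.
(found, not_found), leftover = route_rows(
    joined,
    [
        lambda r: r["ref_key"] is not None,  # link 0: match found
        lambda r: r["ref_key"] is None,      # link 1: no match
    ],
)
```

Because the two conditions are complementary, every row lands on exactly one link and nothing has to be treated as a reject.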
SettValleyConsulting
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

Embarrassingly swift reply.

Note to self: RTFM before posting.

Thanks, :oops: