Combinability mode. What is it used for?

splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Can anyone explain? The help says that it saves a significant amount of data copying and preparation when passing data between operators. Which operators are we talking about? Stages? How does it save data copying and preparation time?
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Each stage can be of single operator or of many. So if you have n stages in a job, each with one operator, running on m nodes, you will end up with roughly n*m processes running simultaneously.
Some operators can be combined at run time. If you allow combination, those operators run together in a single process, so the memory and resources that each individual process would have needed are shared by one.
I won't say that when 2 processes are combined the required CPU and memory will be halved, but they will be less than before.
However, you cannot expect a very detailed log from combined operators during fault analysis.
So for debugging purposes it is advisable to turn combination off.
I recall there was a discussion on this by roy. Search for it.
In some cases it can work the other way too.
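
To make the n*m arithmetic above concrete, here is a minimal Python sketch. It is only a toy model, not how the engine actually plans a job: it assumes a linear flow with one operator per stage, assumes that a run of adjacent combinable operators folds into one process per node, and ignores the extra coordinating processes a real job starts. The function name count_processes and its parameters are made up for illustration.

```python
def count_processes(combinable, nodes, combine=True):
    """Estimate operator processes for a linear flow (toy model).

    combinable: one bool per operator, True if it may be combined
                with a neighbouring combinable operator.
    nodes:      number of nodes the job runs on (m in the post above).
    combine:    False models a run with combination turned off.
    """
    groups = 0
    prev_combinable = False
    for flag in combinable:
        # An operator folds into the previous group only when both it
        # and the operator before it are combinable.
        if not (combine and flag and prev_combinable):
            groups += 1
        prev_combinable = combine and flag
    # Each group (or each operator, if not combined) runs once per node.
    return groups * nodes


flow = [True, True, True, False, True]  # 5 operators, the 4th not combinable
print(count_processes(flow, nodes=4, combine=False))  # 20 (n*m = 5*4)
print(count_processes(flow, nodes=4, combine=True))   # 12 (3 groups * 4 nodes)
```

In this model the first three operators collapse into one process per node, which is where the saving comes from, and also why the log for that combined process is harder to attribute to a single stage.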
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

By default the stage operators can be combined. If you want to override this for a specific stage, you can set the Combinability mode for that stage to Don't Combine.

For fault analysis, set the environment variable APT_DISABLE_COMBINATION to True and run the job. The job log will then show exactly which stage is causing the problem.

HTH
--Rich
splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

kumar_s, can you explain the statement "Each stage can be of single operator or of many"? What is the difference between an operator and a stage?
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Stage is the GUI term used in the front end; operator is the term used in the underlying design of DataStage.
For example, the sort, funnel, and sortfunnel operators can be available explicitly as stages, or DataStage can supply them implicitly along with other stages such as the Funnel, Join, or Sequential File stage.
You can refer to the Orchestrate Operators Reference for more details.