about partitioning

vmachvava
Participant
Posts: 9
Joined: Tue May 31, 2011 2:38 am
Location: India

Post by vmachvava »

Hi all,

I have a small doubt. We have the following scenarios:

Oracle Enterprise -----> dataset

DB2 -----> dataset

XML Input -----> dataset

Sequential File -----> dataset

Which partitioning technique will DataStage use while loading the data from the source systems (Oracle DB, DB2, XML file) in the above scenarios if I set partitioning to Auto? Does it vary based on the source, or based on the amount of data?

Thanks & regards,
vasu
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ

Post by mhester »

The framework does a pretty decent job of determining the best partitioning and sorting methods to use at runtime. If keys are defined on the link, hash or a similar keyed method will likely be used. If there are no keys, round robin will likely be used. If there are keyed operators, it will partition and sort by the keys the operator requires.
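To make the difference between the two methods concrete, here is a rough Python sketch. It is only an illustration of the general idea, not how the parallel framework implements its partitioners: the function names, the `cust_id` column, the CRC32 hash, and the four partitions are all assumptions made for the example.

```python
from zlib import crc32


def round_robin_partition(rows, num_partitions):
    """Deal rows out one at a time, like cards: even sizes, keys ignored."""
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        partitions[index % num_partitions].append(row)
    return partitions


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # CRC32 is used only because it is deterministic across runs;
        # it is an assumption for this sketch, not the framework's hash.
        bucket = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[bucket].append(row)
    return partitions


# Hypothetical sample data: 20 rows spread over 5 customer ids.
rows = [{"cust_id": i % 5, "amount": i * 10} for i in range(20)]

round_robin_parts = round_robin_partition(rows, 4)
hash_parts = hash_partition(rows, "cust_id", 4)

for number, part in enumerate(hash_parts):
    ids = sorted({row["cust_id"] for row in part})
    print(f"hash partition {number}: {len(part)} rows, cust_ids {ids}")
```

Round robin gives evenly sized partitions but scatters a given key across all of them. Hash keeps each key value together, which is what keyed operators need, at the cost of possibly uneven partition sizes.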

I do not believe it has anything to do with source or data volumes.

You can also see which partitioners and sort operators will be inserted by reviewing the job's score, which you can get by adding the environment variable APT_DUMP_SCORE=1 to your job.