Hello Forum,
When my source is a sequential file and I pass it through a Transformer stage with the default partitioning method specified, i.e. 'Auto', what exactly is done? Is the round robin method used, or does it perform hash partitioning on the key fields from my column metadata?
Thanks,
Greg
Default Partitioning method
Moderators: chulett, rschirm, roy
The default partitioning method is round robin.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
By default it is round robin, but it varies depending on the stages you use. If you use the same Auto partitioning on a DB2 stage, DB2 partitioning is used. If you use it on a Sort or Join stage, a hash partition is inserted automatically. Since Auto is not predictable, it is always advisable to choose a deliberate partitioning method.
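To illustrate the difference between the two methods discussed above, here is a minimal sketch in plain Python. This is a hypothetical illustration, not DataStage code; the row and partition structures are made up for the example:

```python
def round_robin_partition(rows, n_partitions):
    """Deal rows out one at a time, cycling through the partitions.
    Balances row counts evenly but ignores key values entirely."""
    parts = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        parts[i % n_partitions].append(row)
    return parts


def hash_partition(rows, n_partitions, key):
    """Route each row by hashing its key column, so every row with
    the same key value always lands in the same partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts
```

Round robin gives an even spread of rows, which is fine for a Transformer; hash guarantees that equal keys are colocated, which is why key-based stages like Sort and Join insert it under Auto.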
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
What if the data is already partitioned? Will the Join stage automatically repartition it on my join key?
For example, suppose I have two sequential file inputs written to their respective datasets using hash partitioning on key1. Then in a second job I have these two datasets as inputs to a Join stage, joining on key2. Will the Join stage automatically repartition the datasets on key2?
- Greg
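The scenario in the question above can be sketched to show why colocation on the join key matters. This is hypothetical Python with made-up row data, not DataStage behaviour verbatim; hash_partition is redefined here so the sketch stands alone:

```python
def hash_partition(rows, n_partitions, key):
    """Route rows so equal key values share a partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts


def partitioned_join(left_parts, right_parts, key):
    """Join each partition pair independently, the way a parallel
    join works; matches in different partitions are never seen."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        index = {}
        for r in rp:
            index.setdefault(r[key], []).append(r)
        for l in lp:
            out.extend({**l, **r} for r in index.get(l[key], []))
    return out


left = [{"key1": i, "key2": (i + 1) % 2, "lv": i} for i in range(4)]
right = [{"key1": i + 10, "key2": i % 2, "rv": i} for i in range(4)]

# Partitioned on key1 but joined on key2: matching rows can sit in
# different partitions, so the per-partition join silently drops them.
bad = partitioned_join(hash_partition(left, 2, "key1"),
                       hash_partition(right, 2, "key1"), "key2")

# Repartitioned on the join key first: all matches are found.
good = partitioned_join(hash_partition(left, 2, "key2"),
                        hash_partition(right, 2, "key2"), "key2")
```

In this made-up run `good` recovers every key2 match while `bad` loses matches, which is exactly the situation the automatic hash insertion on a Join stage is there to prevent. Whether it actually repartitions your datasets depends on the partitioning method set on the Join's input links: Auto lets the engine insert the hash on key2, while an explicit Same would preserve the key1 partitioning and risk missed matches.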