Partitioning

G SHIVARANJANI · Post by **G SHIVARANJANI** » Sat May 31, 2008 6:11 am

Hi,

I have a job that runs on a 4-node configuration.

It has a dataset followed by a Transformer; the Transformer routes the data to the same table based on three different constraints.

I haven't applied any partitioning on the dataset or the Transformer.

But for the three instances of the table, I have set the partitioning to 'Same'.

Can this lead to record lock errors?

ray.wurlod · Post by **ray.wurlod** » Sat May 31, 2008 4:32 pm

Possibly. Not if the constraints involve mutually exclusive conditions (that is, any one input row can be directed to only one table).

Partitioning is not relevant to this question; partitioning only ensures that any one row will be processed on only one processing node.

G SHIVARANJANI · Post by **G SHIVARANJANI** » Sun Jun 01, 2008 3:39 am

You mean partitioning can never cause record locking when the constraints are mutually exclusive?

If I put 'Same' at the very beginning and continue with 'Same' for the rest of the stages, without applying any other partitioning, will this lead to any errors?

As per my understanding 'Same' is faster, and since I do not have a key field for partitioning, I just kept 'Same' as the partitioning type.

ray.wurlod · Post by **ray.wurlod** » Sun Jun 01, 2008 2:31 pm

(Auto) is the same as "Same" when the link is between two parallel stages (neither of which is a reference link).

Partitioning guarantees that each row is processed on only one node.

Mutually exclusive constraints guarantee that each input row (on any particular node) is directed to only one output link. It is this that guarantees that self-locking will not occur: partitioning is irrelevant.
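
To illustrate the point about constraints, here is a minimal Python sketch rather than DataStage code. The link names, the `amount` column, and the thresholds are all hypothetical; the only idea taken from the thread is that mutually exclusive constraints send each row to exactly one output link, while overlapping constraints can send one row down two links to the same table.

```python
from collections import defaultdict

# Hypothetical constraints standing in for the three Transformer output links.
# They are mutually exclusive: exactly one is true for any value of "amount".
CONSTRAINTS = {
    "link_low": lambda row: row["amount"] < 100,
    "link_mid": lambda row: 100 <= row["amount"] < 1000,
    "link_high": lambda row: row["amount"] >= 1000,
}


def route(rows, constraints):
    """Send each row to every link whose constraint it satisfies."""
    links = defaultdict(list)
    for row in rows:
        for name, predicate in constraints.items():
            if predicate(row):
                links[name].append(row["id"])
    return links


def ids_on_multiple_links(links):
    """Return the ids that reached more than one output link."""
    seen = defaultdict(int)
    for ids in links.values():
        for row_id in ids:
            seen[row_id] += 1
    return sorted(row_id for row_id, count in seen.items() if count > 1)


rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}, {"id": 3, "amount": 5000}]

# Mutually exclusive constraints: no row reaches two links.
exclusive = route(rows, CONSTRAINTS)

# Overlapping variant (assumed for contrast): link_mid loses its upper bound,
# so a large amount satisfies both link_mid and link_high.
overlapping = route(rows, {**CONSTRAINTS, "link_mid": lambda row: row["amount"] >= 100})
```

With the exclusive constraints, `ids_on_multiple_links(exclusive)` is empty. In the overlapping variant, row 3 reaches both `link_mid` and `link_high`, which is the case where two writers to the same table could touch the same record.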

keshav0307 · Post by **keshav0307** » Sun Jun 01, 2008 8:14 pm

I think the partitioning will be the same as 'Same' if you select Auto partitioning and preserve partitioning.
I would use Hash partitioning to avoid the record lock.

ray.wurlod · Post by **ray.wurlod** » Sun Jun 01, 2008 8:56 pm

The only partitioning algorithms that could cause a self-deadlock (assuming you're not updating the same table as the one from which you are reading) are Entire or a key-based algorithm (Hash or Modulus) where the partitioning key differs from the table primary key.

Every other partitioning algorithm puts every row uniquely on a single processing node.
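
A rough way to picture this, again as a Python sketch and not DataStage itself: the `cust_id` and `region` columns are hypothetical, `cust_id` is assumed to be the table primary key, and a simple modulus stands in for any key-based partitioner. The sketch only checks whether rows sharing a primary key end up on more than one node, which is the situation in which two nodes could contend for the same table row.

```python
from collections import defaultdict

NODES = 4  # matches the 4-node configuration in the question


def modulus_partition(rows, key):
    """Assign each row to node row[key] % NODES (stand-in for a key-based partitioner)."""
    return [(row[key] % NODES, row) for row in rows]


def entire_partition(rows):
    """Copy every row to every node, as Entire does."""
    return [(node, row) for node in range(NODES) for row in rows]


def keys_on_multiple_nodes(placed, primary_key):
    """Return primary-key values that appear on more than one node."""
    nodes_per_key = defaultdict(set)
    for node, row in placed:
        nodes_per_key[row[primary_key]].add(node)
    return sorted(k for k, nodes in nodes_per_key.items() if len(nodes) > 1)


# Two input rows target the same table row (cust_id 10).
rows = [
    {"cust_id": 10, "region": 1},
    {"cust_id": 10, "region": 2},
    {"cust_id": 11, "region": 3},
]

by_primary_key = keys_on_multiple_nodes(modulus_partition(rows, "cust_id"), "cust_id")
by_other_key = keys_on_multiple_nodes(modulus_partition(rows, "region"), "cust_id")
by_entire = keys_on_multiple_nodes(entire_partition(rows), "cust_id")
```

Partitioning on the primary key keeps both `cust_id` 10 rows on one node, so `by_primary_key` is empty. Partitioning on `region` splits them across two nodes, and Entire places every key on every node.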