Partitioning

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

G SHIVARANJANI
Participant
Posts: 137
Joined: Sun Jan 07, 2007 11:17 pm
Location: VISAKHAPATNAM

Partitioning

Post by G SHIVARANJANI »

Hi,

I have a job that runs on a 4-node configuration.

It has a Data Set stage followed by a Transformer; the Transformer routes the data to the same table through three output links, each with a different constraint.

I haven't applied any partitioning on the Data Set or the Transformer.

But for the three instances of the table, I have set the partitioning to Same.

Can this lead to record lock errors?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Possibly. Not if the constraints involve mutually exclusive conditions (that is, any one input row can be directed to only one table).

Partitioning is not relevant to this question; partitioning only ensures that any one row will be processed on only one processing node.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
G SHIVARANJANI
Participant
Posts: 137
Joined: Sun Jan 07, 2007 11:17 pm
Location: VISAKHAPATNAM

Post by G SHIVARANJANI »

Do you mean that partitioning can never cause record locking when the constraints are mutually exclusive?

If I set Same at the very beginning and continue with Same for the rest of the stages, without applying any other partitioning, will this lead to any errors?

As I understand it, Same is faster, and since I do not have a key field for partitioning, I just kept Same as the partitioning type.

ray.wurlod wrote:Possibly. Not if the constraints involve mutually exclusive conditions (that is, any one input row can be directed to only one table).

Partitioning is not relevant to this question; partitioning o ...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

(Auto) is the same as "Same" when the link is between two parallel stages (neither of which is a reference link).

Partitioning guarantees that each row is processed on only one node.

Mutually exclusive constraints guarantee that each input row (on any particular node) is directed to only one output link. It is this that guarantees that self-locking will not occur: partitioning is irrelevant.
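
To make the point about constraints concrete, here is a minimal Python sketch (not DataStage code). The link names, the "amount" column and the threshold values are all hypothetical, chosen only for illustration. It evaluates three constraints per row and records which output links accept it; with mutually exclusive constraints every row reaches exactly one link, whereas overlapping constraints can send one row to two links that write to the same table.

```python
def route(row, constraints):
    """Return the names of every output link whose constraint accepts the row."""
    return [link for link, accepts in constraints.items() if accepts(row)]

# Hypothetical constraints on an assumed "amount" column.
exclusive = {
    "link_low": lambda r: r["amount"] < 100,
    "link_mid": lambda r: 100 <= r["amount"] < 1000,
    "link_high": lambda r: r["amount"] >= 1000,
}
overlapping = {
    "link_low": lambda r: r["amount"] < 500,
    "link_mid": lambda r: r["amount"] >= 100,  # overlaps link_low for 100..499
    "link_high": lambda r: r["amount"] >= 1000,  # overlaps link_mid
}

rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 250},
    {"id": 3, "amount": 5000},
]

# Map each row id to the list of links that would receive it.
exclusive_routes = {r["id"]: route(r, exclusive) for r in rows}
overlapping_routes = {r["id"]: route(r, overlapping) for r in rows}
```

In the overlapping case, rows 2 and 3 are each delivered to two links, which is the situation in which two writers to the same table could contend for the same record. The sketch models only routing, not the database's locking behaviour.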
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

I think the partitioning will be the same as Same if you select Auto partitioning and preserve partitioning.
I would use Hash partitioning to avoid the record lock.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The only partitioning algorithms that could cause a self-deadlock (assuming you're not updating the same table as the one from which you are reading) are Entire or a key-based algorithm (Hash or Modulus) where the partitioning key differs from the table primary key.

Every other partitioning algorithm puts every row uniquely on a single processing node.
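
The difference can be illustrated with a small Python sketch of row placement (again, not DataStage code). The column names "cust_id" (assumed to be the table primary key) and "region_id", the sample rows and the helper functions are hypothetical. It uses a Modulus-style rule on 4 nodes and reports, for each primary-key value, which nodes hold a row for it.

```python
from collections import defaultdict

def modulus_partition(rows, key, node_count):
    """Key-based partitioning: node number = key value mod node_count."""
    partitions = [[] for _ in range(node_count)]
    for row in rows:
        partitions[row[key] % node_count].append(row)
    return partitions

def entire_partition(rows, node_count):
    """Entire partitioning: every node receives a copy of every row."""
    return [list(rows) for _ in range(node_count)]

def nodes_per_primary_key(partitions, pk):
    """Map each primary-key value to the set of nodes holding a row for it."""
    seen = defaultdict(set)
    for node, part in enumerate(partitions):
        for row in part:
            seen[row[pk]].add(node)
    return dict(seen)

# Two rows share the assumed primary key cust_id = 10.
rows = [
    {"cust_id": 10, "region_id": 1},
    {"cust_id": 10, "region_id": 2},
    {"cust_id": 11, "region_id": 3},
]

by_pk = nodes_per_primary_key(modulus_partition(rows, "cust_id", 4), "cust_id")
by_other = nodes_per_primary_key(modulus_partition(rows, "region_id", 4), "cust_id")
by_entire = nodes_per_primary_key(entire_partition(rows, 4), "cust_id")
```

Partitioning on the primary key keeps each key value on a single node. Partitioning on a different column spreads cust_id 10 across two nodes, and Entire places every key on all four, so in both of those cases more than one node could try to update the same record at once.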