Partitioning techniques

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ramesh_inform
Participant
Posts: 57
Joined: Mon Dec 03, 2007 12:43 am
Location: hyderabad

Partitioning techniques

Post by ramesh_inform »

Best partitioning techniques for different stages
Can anyone list the partitioning techniques that best suit different stages?
ramesh.n.

Post by ramesh_inform »

I think Hash partitioning is best for the Join, Aggregator and Change Capture stages,
and Entire for the reference link of a Lookup stage.
Any opinions?
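
As a rough illustration of why Entire is often suggested for a Lookup reference link, here is a small Python sketch. It is not DataStage code: the helper names (`round_robin`, `entire`, `lookup`) and the sample rows are made up for the example. It counts how many stream rows find a match when the reference data is copied to every partition, compared with when it is split across partitions without regard to the key.

```python
def round_robin(rows, n_parts):
    """Deal rows out to partitions in turn, ignoring their content."""
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts

def entire(rows, n_parts):
    """Give every partition a full copy of the rows."""
    return [list(rows) for _ in range(n_parts)]

def lookup(stream_parts, ref_parts, key):
    """Count stream rows that find a match inside their own partition."""
    matched = 0
    for stream, ref in zip(stream_parts, ref_parts):
        index = {r[key]: r for r in ref}  # per-partition lookup table
        matched += sum(1 for row in stream if row[key] in index)
    return matched

# Hypothetical sample data: 8 stream rows, 3 reference rows.
stream = [{"code": c} for c in ["A", "B", "C", "A", "C", "B", "A", "B"]]
reference = [
    {"code": "A", "name": "Alpha"},
    {"code": "B", "name": "Beta"},
    {"code": "C", "name": "Gamma"},
]

# Reference copied everywhere: each partition can resolve any key locally.
hits_entire = lookup(round_robin(stream, 2), entire(reference, 2), "code")
# Reference split without regard to the key: some lookups miss.
hits_split = lookup(round_robin(stream, 2), round_robin(reference, 2), "code")
```

With the reference copied to every partition, the stream can be partitioned any way at all and each row still finds its match. The cost is memory, since every partition holds the whole reference set, so this suits small reference tables.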
ramesh.n.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The best is (Auto) in general. For a large reference table into a Lookup stage you may get some benefit from hash partitioning both the stream and the reference input based upon the lookup key.
Any stage that requires key adjacency (the previous poster mentioned some of them) requires partitioning on the join/grouping/remdup key using a key-based partitioning algorithm. Modulus is more efficient than hash, but is only available for a single-column integer key.
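
The idea behind key-based partitioning can be sketched in a few lines of Python. This is only an illustration, not DataStage code: the function names and the `cust_id` sample rows are hypothetical, and `zlib.crc32` merely stands in for whatever hash function the engine really uses. The point is that both algorithms send every row with a given key value to a single partition, which is what a join, aggregation or duplicate removal relies on.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key_cols, n_parts):
    """Assign each row to a partition from a hash of its key column values."""
    parts = defaultdict(list)
    for row in rows:
        key_bytes = "|".join(str(row[c]) for c in key_cols).encode("utf-8")
        parts[zlib.crc32(key_bytes) % n_parts].append(row)
    return parts

def modulus_partition(rows, key_col, n_parts):
    """Assign each row by key value modulo partition count (integer key only)."""
    parts = defaultdict(list)
    for row in rows:
        key = row[key_col]
        if not isinstance(key, int):
            raise TypeError("modulus partitioning needs an integer key")
        parts[key % n_parts].append(row)
    return parts

def partitions_holding_key(parts, key_col, value):
    """Return the set of partition numbers that contain the given key value."""
    return {p for p, rows in parts.items()
            if any(r[key_col] == value for r in rows)}

# Hypothetical sample data: 20 rows spread over 5 customer ids.
rows = [{"cust_id": i % 5, "amount": 10 * i} for i in range(20)]
hashed = hash_partition(rows, ["cust_id"], 4)
modded = modulus_partition(rows, "cust_id", 4)
```

Modulus skips the hashing step entirely, which is where its efficiency comes from, but it only makes sense when the key is one integer column. Hash accepts any combination of key columns.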
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by ramesh_inform »

Partitioning depends on the business logic you apply in your project.
So I think that, rather than letting DataStage select the partitioning technique, it is useful to specify the partitioning technique for each stage ourselves.
ramesh.n.

Post by ray.wurlod »

Partitioning does NOT depend on business logic. Can you please justify that claim?

As to appropriate partitioning algorithms, I believe I covered that in my previous post on this thread.
subrat
Premium Member
Posts: 77
Joined: Tue Dec 11, 2007 5:54 am
Location: UK

Post by subrat »

Hi,
I also have the same impression, that partitioning does not depend upon the business logic. But it has a big impact on the column data on which u mentioned the partition key.

Hope u got it....

Subrat
ramesh_inform wrote:Partitioning depends on the business logic you apply in your project.
So I think that, rather than letting DataStage select the partitioning technique, it is useful to specify the partitioning technique for each stage ourselves.
RavishankarMS
Participant
Posts: 8
Joined: Tue Oct 23, 2007 5:12 am

Post by RavishankarMS »

Hi,

What I believe is that neither the stage nor the business logic is as important as the data itself. The more deeply you understand the data, the easier it is to find a suitable partitioning technique.
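
One way to see why the data matters: a key-based partitioner can only be as balanced as the key values allow. The Python sketch below is an illustration under assumptions, not DataStage code. The function names are hypothetical, `zlib.crc32` stands in for the real hash function, and the `"UNKNOWN"` default value is an invented example of a heavily repeated key.

```python
import zlib
from collections import Counter

def hash_partition_sizes(keys, n_parts):
    """Return the row count each partition receives under hash partitioning."""
    sizes = Counter({p: 0 for p in range(n_parts)})
    for key in keys:
        sizes[zlib.crc32(str(key).encode("utf-8")) % n_parts] += 1
    return [sizes[p] for p in range(n_parts)]

def skew_ratio(sizes):
    """Largest partition divided by the average partition size (1.0 = balanced)."""
    return max(sizes) / (sum(sizes) / len(sizes))

# Hypothetical data: 1000 distinct keys versus 900 rows sharing one default value.
even_keys = list(range(1000))
skewed_keys = ["UNKNOWN"] * 900 + list(range(100))

even_sizes = hash_partition_sizes(even_keys, 4)
skewed_sizes = hash_partition_sizes(skewed_keys, 4)
```

All 900 rows with the repeated value must land in the same partition, so one node does most of the work no matter which key-based algorithm is chosen. Checking the value distribution of a candidate key before partitioning on it is what reveals this.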
Regards,
Ravishankar M S

Post by ray.wurlod »

subrat wrote:Hi,
I also have the same impression, that partitioning does not depend upon the business logic. But it has a big impact on the column data on which u mentioned the partition key.

Hope u got it....

Subrat
U hasn't posted in a while. Why do you hope U got it? As far as I can tell U never mentioned anything of the kind (U hasn't posted very often, so a search by author was quite short).

Incidentally, the second person personal pronoun in English is "you".