Question regarding partitions

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ramesh_c
Participant
Posts: 27
Joined: Thu Dec 14, 2006 3:37 am
Location: delhi

Question regarding partitions

Post by ramesh_c »

Hi All,

I am new to DataStage. I need some information about the partitioning that should be applied on the following stages.

1) Aggregator
2) Stored Procedure
3) Filter
4) Funnel
5) Lookup
6) Transformer

Please tell me which partitioning methods are normally applied on these stages.
We are applying Same partitioning on the Transformer stage and, in the Lookup stage, Entire partitioning on the reference table and Hash partitioning on the main table.
Please let me know if the partitioning I am applying is incorrect.

Thanks,
Ramesh
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Key-based partitioning involves the Hash algorithm unless the partitioning key is a single, essentially unbroken, integer sequence, in which case Modulus consumes fewer resources.
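To make the difference between the two key-based algorithms concrete, here is a minimal Python sketch. It is not DataStage code and does not reproduce the engine's real hash function; `hash_partition`, `modulus_partition` and the use of CRC32 are assumptions chosen only for illustration.

```python
import zlib


def hash_partition(key, num_partitions):
    """Pick a partition from a hash of the key (works for any key type).

    CRC32 is only a stand-in here; the real engine uses its own hash.
    """
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return digest % num_partitions


def modulus_partition(key, num_partitions):
    """Pick a partition as key mod N (integer keys only, no hashing step)."""
    return key % num_partitions


# An unbroken integer sequence spreads perfectly evenly under Modulus.
keys = list(range(1, 13))
mod_counts = [0] * 4
for k in keys:
    mod_counts[modulus_partition(k, 4)] += 1

# Hash accepts non-integer keys and is deterministic for equal keys.
same_node = hash_partition("CUST-001", 4) == hash_partition("CUST-001", 4)
```

Modulus skips the hashing step entirely, which is where the resource saving comes from, but it only makes sense when the key is an integer whose values are reasonably contiguous.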
(1) Aggregator: partition on grouping keys
(2) Stored procedure: usually executed in sequential mode
(3) Filter: any partitioning algorithm will do
(4) Funnel: any partitioning algorithm will do
(5) Lookup stage: see below
(6) Transformer stage: any partitioning algorithm will usually do
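
For item (1), a rough Python simulation (not DataStage code) of why the Aggregator wants its input partitioned on the grouping keys. The helper names and the sample rows are made up for illustration; each list in `parts` stands for one processing node.

```python
from collections import defaultdict


def partition_rows(rows, num_partitions, choose):
    """Distribute rows over partitions; choose(index, row) returns an integer."""
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[choose(i, row) % num_partitions].append(row)
    return parts


def aggregate(part):
    """Sum the amount per grouping key within a single partition."""
    totals = defaultdict(int)
    for key, amount in part:
        totals[key] += amount
    return dict(totals)


def round_robin(i, row):
    return i


def by_key(i, row):
    return row[0]


# (grouping key, amount)
rows = [(1, 10), (2, 5), (1, 20), (2, 7), (3, 1), (1, 30)]

rr_results = [aggregate(p) for p in partition_rows(rows, 2, round_robin)]
key_results = [aggregate(p) for p in partition_rows(rows, 2, by_key)]
```

With round robin, key 1 ends up on both nodes, so each node emits only a partial sum for it. With key-based partitioning every group is complete on exactly one node.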

When any partitioning algorithm will do, one ideally chooses one that gives the most even spread of processing over the available processing nodes, at the same time trying to minimize re-partitioning, particularly in an MPP environment.

With a Lookup stage there are two possibilities.
(a) Any kind of partitioning on the stream input with Entire on the reference input(s)
(b) Identical key-based partitioning on the stream and reference inputs
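
The two possibilities can be simulated in a few lines of Python. This is only a sketch under assumed names (`split`, `lookup`, the sample data), not anything the Lookup stage exposes. It also shows what goes wrong when the stream and reference inputs are partitioned differently.

```python
def split(rows, n, choose):
    """Distribute rows over n partitions; choose(index, row) returns an integer."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[choose(i, row) % n].append(row)
    return parts


def lookup(stream_parts, ref_parts):
    """Per partition, look up each stream key in that partition's reference rows."""
    out = []
    for s_part, r_part in zip(stream_parts, ref_parts):
        table = dict(r_part)
        for key, value in s_part:
            out.append((key, value, table.get(key)))  # None means no match
    return sorted(out)


def round_robin(i, row):
    return i


def by_key(i, row):
    return row[0]


stream = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
reference = [(1, "one"), (2, "two"), (3, "three"), (4, "four")]
N = 2

# (a) any partitioning on the stream, Entire (a full copy per node) on the reference
result_a = lookup(split(stream, N, round_robin), [reference] * N)

# (b) identical key-based partitioning on both inputs
result_b = lookup(split(stream, N, by_key), split(reference, N, by_key))

# Mismatch: stream key-partitioned, reference round robin
result_bad = lookup(split(stream, N, by_key), split(reference, N, round_robin))
```

Both (a) and (b) find every match. In the mismatched case the reference rows sit on the other node from the stream rows that need them, so the lookups fail even though the keys exist.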

Other stage types, such as Join, Merge and Remove Duplicates, also need their data to be key partitioned.

The Transformer stage usually does not care about how the data are partitioned. However, if you are using stage variables to compare one row with the previous one, then key-based partitioning must be used to ensure that the things being compared are on the same processing node.
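
As an illustration of the stage variable case, here is a small Python sketch (again a simulation, not Transformer code). The `prev` variable plays the role of a stage variable holding the previous row's key; the function names and data are hypothetical.

```python
def split(keys, n, choose):
    """Distribute keys over n partitions, preserving arrival order in each."""
    parts = [[] for _ in range(n)]
    for i, key in enumerate(keys):
        parts[choose(i, key) % n].append(key)
    return parts


def count_key_changes(part):
    """Count rows whose key differs from the previous row in the same partition."""
    changes = 0
    prev = None  # stands in for a stage variable holding the previous key
    for key in part:
        if key != prev:
            changes += 1
        prev = key
    return changes


def round_robin(i, key):
    return i


def by_key(i, key):
    return key


keys = [1, 1, 2, 2, 3, 3]  # already sorted on the key

rr_changes = sum(count_key_changes(p) for p in split(keys, 2, round_robin))
key_changes = sum(count_key_changes(p) for p in split(keys, 2, by_key))
```

With key-based partitioning the number of detected key changes equals the number of distinct keys (3). With round robin the two rows of each key land on different nodes, so every row looks like the start of a new group (6).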
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ramesh_c
Participant
Posts: 27
Joined: Thu Dec 14, 2006 3:37 am
Location: delhi

Post by ramesh_c »

Hi Ray,
Thanks for the response. But I am not a premium member, so I am not able to read the answer you provided. If you don't mind, could you please provide me the answer?

Thanks,
Ramesh.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

No, for that would undermine the entire reason for having premium membership, which is one means by which the bandwidth costs of DSXchange are met. And without those being met there would be no more DSXchange. It's less than 30 cents per day.
ramesh_c
Participant
Posts: 27
Joined: Thu Dec 14, 2006 3:37 am
Location: delhi

Post by ramesh_c »

OK Ray, thanks a lot for replying to my post.

Thanks,
ramesh.