How the Entire Partitioning works

pklcnu
Premium Member
Posts: 50
Joined: Wed Aug 06, 2008 4:39 pm

How the Entire Partitioning works

Post by pklcnu »

Dear Experts

In parallel jobs, how does the Entire partitioning property work? Suppose the data is divided into 5 partitions: all 5 partitions will have the same data as the original dataset.

Isn't splitting the original single dataset into 5 datasets an overhead in itself? Instead of copying the same data into 5 identical datasets, would it not be better to keep the original dataset as it is? How exactly does it work?

I have seen a few posts on this site, but none of them explains it clearly, and I am not convinced.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Entire is entire just like it sounds, so your first paragraph is correct. I have no idea what you are asking in the second one, however.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

All 5 partitions will have the same data as the original dataset.

Regards
Sreeni
pklcnu
Premium Member
Posts: 50
Joined: Wed Aug 06, 2008 4:39 pm

Post by pklcnu »

Sreenivasulu wrote: All 5 partitions will have the same data as the original dataset.

Yes, I understand what you and chulett said, but what is the purpose of partitioning if every partition holds the same set afterwards?
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

It is the way the parallel logic works.
If the other stages in your job run in parallel and one stage needs to see the whole data set, 'Entire' lets that stage still run in parallel mode.

I think I am not being clear, but I hope you have got the message.

Regards
Sreeni
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Are you trying to ask under what circumstances Entire partitioning would be appropriate to use?
-craig
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I will try to be a bit more verbose and see if that helps.


Entire is typically used for the reference link to a Lookup stage, and typically only for non-sparse lookups with a small to medium number of values in the lookup set (hundreds or thousands).

When you select "Entire", a full copy of all of the data values is made available for each partition in the lookup stage. This means the incoming data does not need to be partitioned in any particular manner, as all partitions have access to all lookup values.
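To make that concrete, here is a rough sketch in plain Python (not DataStage code, just an illustration). The helper names `round_robin` and `entire` and the sample data are made up for the example. The input stream is spread across 5 partitions with no regard to key, while the reference data is copied whole to every partition, so every lookup finds its match locally.

```python
# Illustration only: plain Python standing in for a parallel job.
NUM_PARTITIONS = 5

def round_robin(rows, n):
    """Spread rows across n partitions with no regard to key values."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def entire(rows, n):
    """Give every one of the n partitions a full copy of the rows."""
    return [list(rows) for _ in range(n)]

# Hypothetical sample data: (key, value) reference rows and (id, key) stream rows.
reference = [(1, "gold"), (2, "silver"), (3, "bronze")]
stream = [(order_id, 1 + order_id % 3) for order_id in range(10)]

stream_parts = round_robin(stream, NUM_PARTITIONS)
ref_parts = entire(reference, NUM_PARTITIONS)

matched = []
for stream_part, ref_part in zip(stream_parts, ref_parts):
    lookup = dict(ref_part)  # each partition builds its own lookup table
    for order_id, tier_id in stream_part:
        # Always succeeds: this partition holds every reference key.
        matched.append((order_id, lookup[tier_id]))
matched.sort()
```

The cost is the one raised in the original question: the reference rows exist once per partition, which is why this suits small to medium lookup sets.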

The other option is to ensure that the reference link and the input link are partitioned identically, so that each partition has access to the subset of reference values that could possibly match its subset of incoming data values.
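A similar rough sketch of this second option, again in plain Python rather than anything DataStage-specific. The helper `key_partition` and the sample data are assumptions for illustration, and key modulo partition count stands in for whatever key-based partitioner the job would really use. Both links are partitioned on the lookup key, so each partition holds only a slice of the reference data yet still finds every match. The last few lines show what goes wrong if the stream is partitioned differently from the reference.

```python
# Illustration only: plain Python standing in for a parallel job.
NUM_PARTITIONS = 5

def key_partition(rows, key_of, n):
    """Send each row to the partition chosen by its key (key modulo n)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[key_of(row) % n].append(row)
    return parts

# Hypothetical sample data: (key, value) reference rows and (id, key) stream rows.
reference = [(k, f"value-{k}") for k in range(1, 21)]
stream = [(row_id, 1 + (row_id * 7) % 20) for row_id in range(50)]

# Same partitioning rule on both links, applied to the lookup key.
ref_parts = key_partition(reference, lambda r: r[0], NUM_PARTITIONS)
stream_parts = key_partition(stream, lambda r: r[1], NUM_PARTITIONS)

matched = []
for stream_part, ref_part in zip(stream_parts, ref_parts):
    lookup = dict(ref_part)  # only this partition's slice of the reference data
    for row_id, key in stream_part:
        matched.append((row_id, lookup[key]))  # the key always lives here

# Counter-example: partition the stream by row id instead of by lookup key.
bad_stream_parts = key_partition(stream, lambda r: r[0], NUM_PARTITIONS)
misses = 0
for stream_part, ref_part in zip(bad_stream_parts, ref_parts):
    lookup = dict(ref_part)
    misses += sum(1 for _, key in stream_part if key not in lookup)
```

Here every reference row is stored once in total rather than once per partition, at the price of having to repartition the input on the lookup key.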

Side note: At release 8.0, if you are on an SMP (single box) configuration you can reduce memory usage by selecting "Auto". By default it will select "Entire" and use a single copy stored in shared memory that can be accessed by all partitions.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

asorrell wrote:Side note: At release 8.0, if you are on an SMP (single box) configuration you can reduce memory usage by selecting "Auto". By default it will select "Entire" and use a single copy stored in shared memory that can be accessed by all partitions.
This happens in 7.x as well, from what I've seen.
-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

To summarize, extremely briefly, what Andy contributed:
Entire guarantees that every lookup will work on any partition.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.