~~~Warning in job~~~

vigneshra
Participant
Posts: 86
Joined: Wed Jun 09, 2004 6:07 am
Location: Chennai

~~~Warning in job~~~

Post by vigneshra »

Hi

My job runs perfectly well with smaller volumes, in the order of 2 or 3 million records. But once the volume shoots up, my job throws the following warning continuously, though it runs to completion without aborting.

APT_ParallelSortMergeOperator,2: Unbalanced input from partition 2: 10000 records buffered

How can I avoid this warning? Any ideas welcome!

Vignesh.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Well, go to the Search page here and enter "Unbalanced input from partition" and check the Exact match option. You'll find it's been covered before.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Vignesh,

This is due to incorrect partitioning. You have done HASH partitioning twice. Once you have hash-partitioned the data, use SAME partitioning for the remaining stages until you repartition again (for example with ROUND ROBIN). Instead of SAME partitioning you have hash-partitioned a second time, and that is why you get this warning.
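
To illustrate the idea outside DataStage: the plain-Python sketch below (not DataStage syntax; the column name and four-way layout are made up) shows that re-hashing rows already hash-partitioned on the same key just reproduces the layout they are in, which is exactly what SAME partitioning gives you for free.

import zlib

def hash_partition(rows, key, n_parts=4):
    # Distribute rows across n_parts partitions by hashing the key column.
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[zlib.crc32(str(row[key]).encode()) % n_parts].append(row)
    return parts

rows = [{"cust_id": i, "amt": i * 10} for i in range(1000)]

once = hash_partition(rows, "cust_id")                       # the first HASH partitioner
# Hashing again on the same key just rebuilds the same layout at extra cost;
# SAME partitioning would leave the rows where they already are.
again = hash_partition([r for p in once for r in p], "cust_id")
assert [len(p) for p in once] == [len(p) for p in again]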

HTH
--Rich

Pride comes before a fall
Humility comes before honour
mandyli
Premium Member
Premium Member
Posts: 898
Joined: Wed May 26, 2004 10:45 pm
Location: Chicago

Post by mandyli »

Yes Rich, you are correct: it is the incorrect partitioning that causes this warning. Check your partitioning method.


Thanks
Man
ganive
Participant
Posts: 18
Joined: Wed Sep 28, 2005 7:06 am

Post by ganive »

Can this error happen even if you're using HASH partitioning twice, but on two different key values?
richdhan wrote:Hi Vignesh,

This is due to incorrect partitioning. You have done HASH partitioning twice. Once you have hash-partitioned the data, use SAME partitioning for the remaining stages until you repartition again (for example with ROUND ROBIN). Instead of SAME partitioning you have hash-partitioned a second time, and that is why you get this warning.

HTH
--Rich

Pride comes before a fall
Humility comes before honour
--------
GaNoU
--------
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes, of course, particularly if you are depending on the partitioning being the same, for example to perform a lookup.
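
A rough plain-Python sketch of the failure mode (not DataStage syntax; the column names and four-way layout are invented): if the reference is hashed on one key and the stream is probed on another, matching rows land in different partitions and a partitioned lookup never sees them.

import zlib

N = 4
part = lambda v: zlib.crc32(str(v).encode()) % N

stream    = [{"cust_id": i, "region": i % 3} for i in range(100)]
reference = [{"cust_id": i, "region": i % 3, "name": "c%d" % i} for i in range(100)]

# Reference hashed on "region", but the lookup probes by "cust_id":
ref_parts = [dict() for _ in range(N)]
for r in reference:
    ref_parts[part(r["region"])][r["cust_id"]] = r

misses = sum(1 for s in stream
             if s["cust_id"] not in ref_parts[part(s["cust_id"])])
print(misses)   # many rows miss: the match exists, just in another partition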
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ganive
Participant
Posts: 18
Joined: Wed Sep 28, 2005 7:06 am

Post by ganive »

Wow... so that means you can't use the same Dataset if you have to perform Join/Lookup/Merge operations on different key values?

I mean, each time I had to do that, I would repartition my reference Dataset by hashing it on another key, depending on the data I wanted to join with :o
ray.wurlod wrote:Yes, of course, particularly if you are depending on the partitioning being the same, for example to perform a lookup.
--------
GaNoU
--------
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No, it doesn't mean that at all. You can use the same Data Set, but you do have to ensure that the lookup will find all the keys it needs to, and that means that the stream and reference inputs must be partitioned identically (and sorted identically for Join and Merge stages) on the keys being used for the lookup. Or, if it's a small enough Data Set, use Entire partitioning.
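
Continuing the plain-Python sketch from above (still not DataStage syntax, invented column names): hash both inputs on the lookup key and every probe lands in the partition that holds its match; Entire partitioning instead gives every partition a full copy of the reference, which also works when the reference is small.

import zlib

N = 4
part = lambda v: zlib.crc32(str(v).encode()) % N

stream    = [{"cust_id": i, "amt": i} for i in range(100)]
reference = [{"cust_id": i, "name": "c%d" % i} for i in range(100)]

# Both inputs hashed on the lookup key: a stream row and its reference row
# always land in the same partition, so each partition resolves its own lookups.
ref_parts = [dict() for _ in range(N)]
for r in reference:
    ref_parts[part(r["cust_id"])][r["cust_id"]] = r

misses = sum(1 for s in stream
             if s["cust_id"] not in ref_parts[part(s["cust_id"])])
assert misses == 0

# Entire partitioning: the same full copy of the reference on every partition.
entire = {r["cust_id"]: r for r in reference}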
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ganive
Participant
Posts: 18
Joined: Wed Sep 28, 2005 7:06 am

Post by ganive »

OK... I thought I had used PX for 2 years without understanding anything... Saved! :wink:

So let's get back to the subject: :twisted:
Earlier, Rich said that the warning "Unbalanced Input From partition 0..." is due to hash partitioning being done twice.
Is that the real problem then? :?:
From my own usage, and from what I can read, you can re-hash whenever you want, even if it is not optimal (SAME partitioning should be used if you are keeping the same hash keys).

In my case, I have a stream input coming from a flat file. I'm hash-partitioning it on the same key as my reference Dataset (the reference input uses SAME partitioning, as it was partitioned earlier).
Every other link uses SAME... except for a write to a Dataset where I use Round Robin.

And I'm still getting the famous warning "Unbalanced Input From Partition 0...".

Any ideas? :?:
ray.wurlod wrote:No, it doesn't mean that at all. You can use the same Data Set, but you do have to ensure that the lookup will find all the keys it needs to, and that means that the stream and reference inputs must be partitioned identically (and sorted identically for Join and Merge stages) on the keys being used for the lookup. Or, if it's a small enough Data Set, use Entire partitioning.
--------
GaNoU
--------
ganive
Participant
Posts: 18
Joined: Wed Sep 28, 2005 7:06 am

Post by ganive »

Okay, just solved the problem.

It seems that PX version 7.1 automatically sorts your input data when you use the Join stage.
I usually partition my data before the Join stage and tick the Sort box as well... that's why I got the warning!

All I had to do to solve it was to uncheck the Sort box.
Maybe this will help some of you.
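
In plain-Python terms (a toy analogy, not the actual operator, with a made-up key column), the explicit Sort box was just re-doing work that the sort inserted by the Join stage does anyway:

rows = [{"cust_id": i % 7, "amt": i} for i in range(20)]

pre_sorted  = sorted(rows, key=lambda r: r["cust_id"])        # the explicit Sort box
join_sorted = sorted(pre_sorted, key=lambda r: r["cust_id"])  # the sort the stage inserts anyway
assert join_sorted == pre_sorted                              # the second sort adds nothing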
--------
GaNoU
--------
madhukar
Participant
Posts: 86
Joined: Fri May 20, 2005 4:05 pm

Post by madhukar »

Check whether the order of the keys in the partitioning and in the sort are different...
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

ganive wrote:It seems that PX version 7.1 automatically sorts your input data when you use the Join stage... All I had to do to solve it was to uncheck the Sort box.
Hi,

As far as I know, the Join stage won't sort automatically unless Auto partitioning is specified.

-Kumar