Hi,
I have a job with many stages.
Here is where I need some help:
Sort -> Transformer -> Aggregator -> Transformer -> Pivot
Currently, in the Sort stage I am sorting on the key field "Product_ID".
All the stages use (Auto) partitioning.
I would like to performance-tune the job.
I tried using Hash partitioning on the Sort stage and then Same partitioning all the way up to the Pivot stage.
When I use Same partitioning with the Transformer I get this warning:
"Input dataset 0 has a partitioning method other than entire specified; disabling memory sharing".
Please advise which partitioning to use to get the best performance.
Help to use correct partitioning
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
This is an unusual message. What is your Transformer stage doing? Can you please post the exact, and entire, message, so we can be certain before offering advice?
(Auto) will probably achieve optimum partitioning in this job design. I'm only curious about where it thinks that Entire might be appropriate - this is normally only on the reference input of a Lookup stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod wrote: This is an unusual message. What is your Transformer stage doing? Can you please post the exact, and entire, message, so we can be certain before offering advice?
(Auto) will probably achieve optimum partitioning in this job design. I'm only curious about where it thinks that Entire might be appropriate - this is normally only on the reference input of a Lookup stage.
My key column in this job is "product_id".
My Transformer uses the key change column, defined in the Sort stage on the field "product_id", to do counting logic.
I was under the impression that the Sort and Aggregator stages should NEVER use (Auto) partitioning and should use Hash partitioning on the grouping field. I am using the Sort method in the Aggregator.
So, to improve performance, I am using Hash partitioning on the Sort stage, and all other stages use Same partitioning.
Then I get this warning:
xTransform: Input dataset 0 has a partitioning method other than entire specified; disabling memory sharing.
I would appreciate your help in achieving the best performance.
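For readers following along, here is a minimal Python sketch of the kind of key-change counting described above. It is only an illustration, not the actual Transformer derivation: the column names `keyChange` and `row_in_group` and both helper functions are assumptions, and the input is assumed to be already sorted on `product_id` within a single partition.

```python
def add_key_change(sorted_rows, key):
    """Flag the first row of each key group with 1, as a key change column would."""
    out = []
    previous = object()  # sentinel that never equals a real key value
    for row in sorted_rows:
        flag = 1 if row[key] != previous else 0
        out.append({**row, "keyChange": flag})
        previous = row[key]
    return out


def running_count(rows):
    """Restart a counter whenever keyChange is 1 (hypothetical counting logic)."""
    count = 0
    out = []
    for row in rows:
        count = 1 if row["keyChange"] == 1 else count + 1
        out.append({**row, "row_in_group": count})
    return out


# Rows already sorted on product_id, all in one partition.
rows = [{"product_id": pid} for pid in ["A", "A", "B", "C", "C", "C"]]
counted = running_count(add_key_change(rows, "product_id"))
positions = [row["row_in_group"] for row in counted]
```

The counter is only meaningful if every row for a given `product_id` arrives contiguously in the same partition, which is why the partitioning of the Sort stage matters for this logic.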
-
Ignore the message. It's only alerting you to the fact that shared memory will not be used because not all keys are on all nodes. Use a message handler to demote to informational.
Next step is to look at the score, to see what partitioning (Auto) is actually giving you. The Sort stage requires its input to be hash partitioned on the first sort key. Assuming that you are grouping by product_id, then Same should be used to carry that partitioning through the job.
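To illustrate why key-based partitioning matters for grouping, here is a minimal Python simulation. It is a sketch under stated assumptions, not how the parallel engine is implemented: the function names, the two-partition setup, and the use of `zlib.crc32` as the hash are all made up for the example.

```python
import zlib
from collections import Counter


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is used here only because it is stable across runs.
        index = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def round_robin_partition(rows, num_partitions):
    """Deal rows out in turn, ignoring the key."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def count_per_key(partitions, key):
    """Aggregate each partition independently; return (partition, key, count) rows."""
    results = []
    for index, part in enumerate(partitions):
        counts = Counter(row[key] for row in part)
        for value, count in sorted(counts.items()):
            results.append((index, value, count))
    return results


rows = [{"product_id": pid} for pid in ["A", "A", "B", "B", "A", "C"]]
hashed = count_per_key(hash_partition(rows, "product_id", 2), "product_id")
dealt = count_per_key(round_robin_partition(rows, 2), "product_id")
```

With hash partitioning each `product_id` produces exactly one group row. With the key-agnostic split, "A" is counted separately in both partitions, so the per-key totals come out fragmented. Carrying the hash partitioning forward with Same avoids reshuffling the rows between stages.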
-
highpoint wrote: My company doesn't like demoting messages and also does not accept any warnings.
You're between a rock and a hard place. Entire partitioning will yield incorrect results in your job design. The response to "doesn't like" therefore has to be "tough".
Check partitioning in the score. Partitioning is indicated between pairs of data sets in that section of the score.
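A small Python simulation shows why Entire would give wrong results for a grouping job. This is a hedged sketch, not engine behaviour reproduced exactly: the function names and the two-partition setup are assumptions made for the example.

```python
from collections import Counter


def entire_partition(rows, num_partitions):
    """Copy the full data set to every partition."""
    return [list(rows) for _ in range(num_partitions)]


def aggregate_per_partition(partitions, key):
    """Count rows per key in each partition and collect every partition's output."""
    output = []
    for part in partitions:
        counts = Counter(row[key] for row in part)
        output.extend(sorted(counts.items()))
    return output


rows = [{"product_id": pid} for pid in ["A", "A", "B"]]
result = aggregate_per_partition(entire_partition(rows, 2), "product_id")
```

Because every partition sees every row, each group is emitted once per partition, so the downstream stages receive duplicated group rows.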
Re: Help to use correct partitioning
There is another post related to this problem:
viewtopic.php?p=179155&sid=3c15e5709b0c ... 6f43f257e2
I sometimes got the same message on a Transformer, and I set "Preserve partitioning" to "Clear" in the previous stage. In your case the previous stage is a Sort stage, so I am not sure whether that would cause another problem.