Partitions and Nodes configuration problem
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 16
- Joined: Fri Oct 27, 2006 6:27 am
Re: Partitions and Nodes configuration problem
Please elaborate on the problem.
-
- Participant
- Posts: 16
- Joined: Fri Oct 27, 2006 6:27 am
I'm not sure what the previous poster meant about data always having to be partitioned. In most cases the default partitioning and the default configuration file work just fine, and the developer is not forced to think about partitioning for performance reasons. Only when doing things such as lookups and sorts does the designer have to understand the implications of partitioning.
The most common case is doing a lookup in a job while using more than 1 node in the configuration file. Unless the data is partitioned on the lookup key, or the designer explicitly specifies "entire" partitioning for the lookup reference link, the result of the job will be wrong.
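To illustrate the second option (partitioning on the lookup key), here is a minimal Python sketch. It is not DataStage code: `hash_partition`, `lookup_per_node`, the sample rows and the employee names are all made up for illustration, and `zlib.crc32` merely stands in for whatever hash the engine really uses. The only point is that a stable hash of the key sends a source row and its matching reference row to the same node.

```python
import zlib

def hash_partition(rows, key_index, nodes):
    """Send each row to the node picked by a stable hash of its key column."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        node = zlib.crc32(row[key_index].encode("utf-8")) % nodes
        parts[node].append(row)
    return parts

def lookup_per_node(source_parts, reference_parts):
    """Each node looks up its own source rows in its own share of the reference rows."""
    results = []
    for src, ref in zip(source_parts, reference_parts):
        names = dict(ref)  # employee number -> name, visible to this node only
        for row_id, emp in src:
            results.append((row_id, emp, names.get(emp)))  # None marks a failed lookup
    return results

# Hypothetical sample data: (row id, employee number) and (employee number, name)
source = [("r1", "E07"), ("r2", "E03"), ("r3", "E07"), ("r4", "E11")]
reference = [("E03", "Ann"), ("E07", "Bob"), ("E11", "Cy")]

for nodes in (1, 2, 4):
    out = lookup_per_node(
        hash_partition(source, 1, nodes),     # source partitioned on its second column
        hash_partition(reference, 0, nodes),  # reference partitioned on its first column
    )
    failed = sum(1 for r in out if r[2] is None)
    print(nodes, "node(s):", failed, "failed lookups")
```

Because both links are partitioned with the same function on the same key, the failed-lookup count stays at 0 whatever the node count.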
If your job delivers different results in a 1-node configuration than in a multinode configuration, you have made a design mistake, most likely (as per my previous post) in the lookup stages.
So, as per my previous post, do you have lookups? If so, is the reference link set to "entire" partitioning, or have you ensured that you have partitioned both streams on the lookup key? (These are rhetorical questions, since I am sure that this is your problem.)
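To confirm that a job really does give different results on 1 node and on several nodes, one option is to compare the two outputs while ignoring row order, since a multinode run will usually write rows in a different order. A minimal Python sketch, assuming both outputs have already been read into lists of tuples; the function names and the sample rows are invented for illustration:

```python
from collections import Counter

def same_rows(rows_a, rows_b):
    """True when both runs produced the same rows, ignoring row order."""
    return Counter(rows_a) == Counter(rows_b)

def diff_rows(rows_a, rows_b):
    """Rows found only in the first run and rows found only in the second."""
    a, b = Counter(rows_a), Counter(rows_b)
    return list((a - b).elements()), list((b - a).elements())

# Hypothetical outputs: (employee number, looked-up name); None marks a failed lookup
one_node = [("001", "Ann"), ("002", "Bob"), ("003", "Cy")]
multi_node = [("003", "Cy"), ("001", "Ann"), ("002", None)]

print(same_rows(one_node, multi_node))
print(diff_rows(one_node, multi_node))
```

Rows that show up only in the multinode output with missing looked-up values point towards a partitioning problem on the reference link rather than an environment problem.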
-
- Participant
- Posts: 16
- Joined: Fri Oct 27, 2006 6:27 am
I know the PX jobs were not designed with partitioning in mind, and that is why the results differ with a 4-node configuration file. But in production, even with 4 nodes, the results are correct. In most places the Join stage is used.
There is no problem with the lookup.
Since we can't run the same job repeatedly in production to test it, we tried it in a different environment, where the results come out different. To avoid this, we tried with 1 node, and the result is correct.
So is this a problem with the environment?
lokesh_chopade wrote: ...There is no problem with lookup. ...
Humour me please: set the lookup reference link(s) to "entire" partitioning and see if a multinode configuration works.
Lokesh - 'auto' DOES do partitioning. What happened when you made the reference links 'entire'?
-
- Participant
- Posts: 16
- Joined: Fri Oct 27, 2006 6:27 am
lokesh - how often does this need repeating? Using "auto" partitioning will not guarantee that a job will work correctly!
Let's take a very simple example. The source file has 2 columns: a sequential numeric row number (Key) and a numeric Employee number (Emp). Let us assume that the default "auto" partitioning uses a round robin algorithm, dealing rows out in row-number order, and that you have a 2-node configuration.
Assuming source data
Code: Select all
Key,Emp
001,002
002,001
003,004
004,003
This means rows 1 and 3 go to node 0 and rows 2 and 4 go to node 1, and so on for any further rows.
Now you put in a lookup stage looking up your Employee table, which is keyed on the same employee number as column 2 of the source file and contains the employee name in another column. You also set this link to AUTO, and we'll assume it uses round robin partitioning as well.
So now node 0 on the lookup gets employees 1 and 3, and node 1 gets employees 2 and 4.
Now take the first row: node 0 looks up employee 002 in lookup node 0 and gets no match. The next row on node 0 tries to find employee 004 and also gets no match. The same thing happens twice on source node 1 (employees 001 and 003), and both lookups fail there as well.
In this example you will get 4 failed lookups using "AUTO". If you went to a 1-node configuration the job would work correctly, but only because you are masking the real design error. This is why I always recommend designing on a configuration with more than 1 node, since if it works in one multinode configuration it will work in any configuration.
The correct solution in this case is to make the lookup reference link "ENTIRE" so that each node gets a full copy of all the rows, or to partition both the source and the lookup on the Employee key.
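For anyone who wants to see the effect outside DataStage, here is a rough Python simulation of this kind of failure. It is only a sketch under assumptions: the data is illustrative (four source rows carrying employee numbers 002, 001, 004, 003 in that order), the employee names are invented, and `round_robin`, `entire` and `run_lookup` are hypothetical helpers that imitate the partitioning methods, not engine calls.

```python
def round_robin(rows, nodes):
    """Deal rows out in turn: first row to node 0, second to node 1, and so on."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire(rows, nodes):
    """Give every node a full copy of all the rows."""
    return [list(rows) for _ in range(nodes)]

def run_lookup(source_parts, reference_parts):
    """Each node looks up its source rows in its own share of the reference rows."""
    matched, failed = [], []
    for src, ref in zip(source_parts, reference_parts):
        names = dict(ref)  # employee number -> name, local to this node
        for key, emp in src:
            if emp in names:
                matched.append((key, emp, names[emp]))
            else:
                failed.append((key, emp))
    return matched, failed

# Illustrative data (assumed): (Key, Emp) source rows and (Emp, name) reference rows
source = [("001", "002"), ("002", "001"), ("003", "004"), ("004", "003")]
employees = [("001", "Ann"), ("002", "Bob"), ("003", "Cy"), ("004", "Di")]

nodes = 2
_, failed_rr = run_lookup(round_robin(source, nodes), round_robin(employees, nodes))
_, failed_entire = run_lookup(round_robin(source, nodes), entire(employees, nodes))
_, failed_one = run_lookup(round_robin(source, 1), round_robin(employees, 1))

print("round robin on both links, 2 nodes:", len(failed_rr), "failed")
print("entire on the reference link, 2 nodes:", len(failed_entire), "failed")
print("round robin on both links, 1 node:", len(failed_one), "failed")
```

With this data the 2-node round robin run fails all 4 lookups, while the "entire" run and the 1-node run fail none, which is exactly how a 1-node configuration hides the design error.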