Hash Partitioning with Multiple Keys

visvacfirvin
Premium Member
Posts: 49
Joined: Fri Dec 14, 2007 1:43 pm

Post by visvacfirvin »

Hi,
I have a question about hash partitioning on multiple keys. My logic calculates some values by comparing the current key with the previous key. Suppose I have 3 nodes and want to process the following data.

A     B    C    D    E    F
P1    1    2    3    1    2
P1    1    2    4    3    2
P1    1    2    5    2    3
P2    2    3    1    2    3
P2    2    3    2    3    4

A record is uniquely identified by the keys A, B and C, but I want to compare the current and previous keys to do some computation on all the columns up to F. What will happen if I hash partition using all 6 columns?

1. Will all the P1 records go to a single node?
2. Or will it work when I have 2 nodes?
3. Or is it necessary to hash partition using the first 3 columns, which identify a record?

Thanks in advance.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

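You will have to partition your data using A, B and C as keys. The number of nodes isn't relevant; what matters is that, on each node, the records you compare with the previous record all have identical values in those key columns, so that every record of a given key lands on the same node.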
visvacfirvin
Premium Member
Posts: 49
Joined: Fri Dec 14, 2007 1:43 pm

Post by visvacfirvin »

Can I take it that hash partitioning using all 6 keys won't work? I'm just wondering what difference it will make, if there are only 2 nodes, whether I partition using all 6 keys or only A, B and C.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

If you use all 6 columns for hashing, then D, E and F also feed the hash, so you will get a different ordering and rows with the same A, B, C values are no longer guaranteed to land on the same node. That is simply a mathematical consequence of the hashing algorithm.
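
To make the difference concrete, here is a minimal Python sketch using the sample rows from the question. It is not DataStage code: the functions `partition_of` and `nodes_per_group` are hypothetical helpers, and CRC32 is only a stand-in assumption for whatever hash function the parallel engine really uses. It shows which nodes each (A, B, C) group lands on for a given choice of hash keys.

```python
import zlib
from collections import defaultdict

# Sample rows from the question: columns A, B, C, D, E, F.
ROWS = [
    ("P1", 1, 2, 3, 1, 2),
    ("P1", 1, 2, 4, 3, 2),
    ("P1", 1, 2, 5, 2, 3),
    ("P2", 2, 3, 1, 2, 3),
    ("P2", 2, 3, 2, 3, 4),
]


def partition_of(row, key_indexes, num_nodes):
    """Pick a node from a CRC32 hash of the chosen key columns.

    CRC32 is an illustrative assumption, not the engine's real hash.
    """
    key = "|".join(str(row[i]) for i in key_indexes)
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def nodes_per_group(rows, key_indexes, num_nodes):
    """Map each (A, B, C) group to the set of nodes its rows land on."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[:3]].add(partition_of(row, key_indexes, num_nodes))
    return dict(groups)


if __name__ == "__main__":
    for num_nodes in (2, 3):
        print(num_nodes, "nodes, hash on A,B,C:",
              nodes_per_group(ROWS, (0, 1, 2), num_nodes))
        print(num_nodes, "nodes, hash on A..F :",
              nodes_per_group(ROWS, range(6), num_nodes))
```

When the hash keys are A, B and C, every row of a group produces the same hash input, so each group maps to exactly one node for any node count. When all six columns are hashed, the node also depends on D, E and F, so nothing keeps the rows of one group together, and a current/previous comparison within the group can no longer be relied on.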
visvacfirvin
Premium Member
Posts: 49
Joined: Fri Dec 14, 2007 1:43 pm

Post by visvacfirvin »

Oh, that's cool... I got it. Thanks a lot. :D