*Note* Message composed prior to OP's last update, which indicates that partitioning is not the problem, so please treat the following as a general suggestion on partitioning and not as a solution to actual problem.
ArndW wrote:To keep things simple, hash both input links to the join on "Col1" and then sort both on Col1, Col2, Col3, Col5 and Col6.
Whether or not this is good advice depends on the cardinality of Col1, by which I mean "how many distinct values the column contains", and also how long the value in the column is. The ideal column to partition on is something short that has very high cardinality. A postcode or ZIP code in a data set that covers a large geographical area is a good example of this - short, but with lots of different values. Account numbers, Employee IDs, these are good hash partitioning keys. Country Name, not so good, especially if the only values are "USA" and "Canada". If one of the values Col1 Col2 and Col3 matches the "good" criteria that I described, then pick that as your partitioning key, and sort by all three.
So if Col1 is "Country", Col2 is "ADDRESS_TEXT" and Col3 is "Zip" then Col3 is a good choice as long as it is populated for the large majority of your data set. Col1 is a reasonable choice if you have data for a lot of countries and the zip code is blank for a lot of your data. Col2 is the last resort as it is a long string value and thus expensive to calculate a hash value from, and should only be chosen if Col1 is low cardinality and Col3 is blank for a large portion of your data (more than, say, 10% at a guess).
You could partition on all three keys that the joins have in common, but in actuality you only need a subset of the keys that provide high cardinality.