Page 1 of 1

In line sort before join

Posted: Fri Aug 24, 2012 10:20 am
by soumya5891
I am joining two table in the datastage join stage. I did hash partition on the join key in both the link. And by mistake I forgot made in line sort in one of the link. In this case for the matched records also the join did not happen properly. But when I made the in line sort then it worked fine. So is it mandetory for join that if we sort in one link then we must sort in the other link also.

Posted: Fri Aug 24, 2012 12:07 pm
by ThilSe
For join to work, the sort order of records should match in both input links.

Regards,
Senthil

Posted: Fri Aug 24, 2012 12:09 pm
by ThilSe
For join to work, the sort order of records should match in both input links.

Regards,
Senthil

Posted: Fri Aug 24, 2012 12:14 pm
by bhasds
Join stage by default will insert tsort operator in both the dataset prior to join if we do not sort the datasets explicitly.This can be seen in the Job score.Please refer to the below post for more information.
viewtopic.php?p=191759

Posted: Sat Aug 25, 2012 12:04 am
by soumya5891
When I 'm partitioning in the input then same key combination records will be in same node,it doesn't depend on the sort. So is it something like that it will join not by the value directly but with some other means.That's why sorted order is mandatory in both the links.

I will also try one thing. I will make the sorted order different in the two links with same partitions.

Posted: Sat Aug 25, 2012 12:56 am
by ArndW
soumya5891,

when you experiment by changing the sort order, make sure that look at the score, since DataStage will automatically insert a tsort operator at runtime should the links be sorted differently. You can disable this behaviour using the $APT_NO_SORT_INSERTION and setting to to "true" - then you will get incorrect results.

Posted: Sat Aug 25, 2012 1:07 pm
by soumya5891
Thanks a lot ArndW for your suggestion. Surely I will keep in mind your suggestion when I will do the checking.

Posted: Mon Aug 27, 2012 9:31 am
by jwiles
For your reference, the documentation for the Join stage does specifically state that all inputs must be key partitioned and sorted in ascending order.

Regards,