Join stage warning
-
- Charter Member
- Posts: 193
- Joined: Tue Sep 05, 2006 8:01 pm
- Location: Australia
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
Have you explicitly partitioned and sorted again at the Join stage?
For the Remove Duplicates stage, the data must already have been partitioned and sorted.
Last edited by keshav0307 on Wed Jul 02, 2008 3:04 am, edited 1 time in total.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Put specific Sort stages on the inputs of stages that require sorted input.
If the data are already sorted, set the sort mode to "don't sort, previously sorted".
The presence of the Sort stages will obviate the need for DataStage to insert any tsort operators into the score.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
[quote="keshav0307"]Have you explicitly partitioned and sorted again at the Join stage?
For the Remove Duplicates stage, the data must already have been partitioned and sorted.[/quote]
Please check the job's score dump and see whether a tsort has been added again.
If one has been added, it will hurt your performance.
Try one of the following:
1. Use the same partitioning in the Join stage, because you already know the data is sorted.
or
2. Use the environment variable $APT_SORT_INSERTION_CHECK_ONLY. With it set, the automatically inserted tsort operators only check that the data is already in order instead of sorting it again. To stop the automatic insertion altogether, the variable is $APT_NO_SORT_INSERTION. (I haven't tried this. Just theoretical info) :)
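If the score is long, a small script can pick out the tsort lines for you. This is only a rough sketch: it assumes you have copied the score text (as written to the job log when $APT_DUMP_SCORE is set) into a string or file, and the sample excerpt below is made up for illustration. The stage names, key name and exact layout are hypothetical, and your real score will look different.

```python
import re
from typing import List


def find_tsort_lines(score_text: str) -> List[str]:
    """Return every line of the score text that mentions a tsort operator."""
    return [
        line.strip()
        for line in score_text.splitlines()
        if re.search(r"\btsort\b", line)
    ]


# Hypothetical score excerpt, for illustration only (not real job output).
sample_score = """\
op0[4p] {(parallel Remove_Duplicates) on nodes (node1 node2 node3 node4)}
op1[4p] {(parallel inserted tsort operator {key={value=CUST_ID}}(0)) on nodes (node1 node2 node3 node4)}
op2[4p] {(parallel Join_1) on nodes (node1 node2 node3 node4)}
"""

hits = find_tsort_lines(sample_score)
for line in hits:
    print(line)
```

Any line it prints is worth a look: if a tsort sits on a link where the data was already partitioned and sorted upstream, that is the extra sort to get rid of.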