I have hit this error in the Aggregator stage twice, in different cases.
The first time, I set the Sort columns on the input link and the job ran OK. The second
time, running another load (a slightly smaller file than the first), the error occurred again. After I removed the Sort setting, it ran OK.
My question is:
The first load (a big file) needs the column to be marked as sorted, but
the second load (a small file) does not.
Why does this happen?
How should I handle this problem?
My DataStage version is 5.2.
I would appreciate any suggestions.
Row out of sequence
Bill
Unsorted data requires the aggregator to work much harder, and there are limitations as to how much unsorted data it can aggregate. If you have 1 million rows that group to 1 million rows, the aggregator will have performance issues. If you have 1 million rows that group to 10 thousand rows, the aggregator can handle it.
Now, if you sort the data first and then tell the aggregator the sort order, it can rely on the data being pre-grouped and simply output each result as the grouping changes. Without that information it has to accumulate ALL rows before output, because it cannot know when a group is finished.
So, always sort your data if you're going to have volume concerns and give the aggregator that assistance.
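To make the difference concrete, here is a minimal Python sketch of the two approaches. It is only an illustration of the general idea, not how the Aggregator stage is actually implemented; the row shape (a key and a value) and the function names are made up for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

Row = Tuple[str, int]  # (group key, value): hypothetical row shape


def aggregate_unsorted(rows: Iterable[Row]) -> Dict[str, int]:
    """Sum values per key when the input order is unknown.

    Every distinct key stays in memory until the input ends,
    because any later row could still belong to any group.
    """
    totals: Dict[str, int] = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def aggregate_sorted(rows: Iterable[Row]) -> Iterator[Row]:
    """Sum values per key when rows arrive already grouped by key.

    Only one running total is held at a time; a group is emitted
    as soon as the key changes.
    """
    current_key = None
    running = 0
    started = False
    for key, value in rows:
        if started and key != current_key:
            # The key changed, so the previous group is complete.
            yield current_key, running
            running = 0
        current_key = key
        running += value
        started = True
    if started:
        # Emit the final group once the input is exhausted.
        yield current_key, running
```

With 1 million rows grouping to 1 million keys, the first function ends up holding 1 million totals, while the second never holds more than one.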
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Indicating on the Input link to the Aggregator that the data are sorted does NOT sort the data! This has caught others in the past.
What's happening if you indicate that data are sorted is that you are asserting to the Aggregator stage that the data are indeed already sorted as indicated, allowing it to use a much more efficient algorithm.
You are not allowed to lie; either you were lucky on your first run or its data were sorted as indicated.
The efficient algorithm fails if the data are not sorted as indicated, so the Aggregator stage keeps a check on whether the expected sorted order is being adhered to, and aborts if it is not.
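A rough Python sketch of that check, for illustration only. The exception name, the function name, and the assumption of ascending order on a single key are all invented for the example; this is not the Aggregator stage's real code.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, int]  # (group key, value): hypothetical row shape


class RowOutOfSequence(Exception):
    """Hypothetical error raised when the asserted sort order is violated."""


def aggregate_asserted_sorted(rows: Iterable[Row]) -> Iterator[Row]:
    """Sum values per key, trusting but verifying ascending key order."""
    current_key = None
    running = 0
    started = False
    for key, value in rows:
        if started and key < current_key:
            # The caller asserted sorted input; this row breaks that promise.
            raise RowOutOfSequence(
                f"key {key!r} arrived after {current_key!r}"
            )
        if started and key != current_key:
            # The key moved forward, so the previous group is complete.
            yield current_key, running
            running = 0
        current_key = key
        running += value
        started = True
    if started:
        yield current_key, running
```

Feeding it `[("a", 1), ("b", 2)]` works; feeding it `[("b", 2), ("a", 1)]` aborts on the second row. A file that happens to arrive in key order passes the check even though nothing sorted it, which is the "lucky" case.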
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.