If I'm pulling data directly from a table I'll need to switch to pull the data from a sorted view, right?rwierdsm wrote:The database will be much, much faster then DataStage.
Rob
Time Out waiting for mutex
Moderators: chulett, rschirm, roy
Thanks, DSguru for all your help. I did add order by clause in the select statement in Oracle stage. In my question I was referring to rwierdsm's statement that it will be much faster in database then DataStage ("If you can do the sort as part of your database read, do it there! The database will be much, much faster then DataStage.").
-
- Premium Member
- Posts: 209
- Joined: Fri Jan 09, 2004 1:14 pm
- Location: Toronto, Canada
- Contact:
Tatiana,
No view required, simply add an order by clause, as below
Before:
After
You can do this by setting the database stage to either 'User Defined SQL', or 'Generated SQL' and specifying your sort criteria in the proper place.
Craig,
Rob
No view required, simply add an order by clause, as below
Before:
Code: Select all
Select ColumnA, ColumnB, ColumnC
From Table1
Code: Select all
Select ColumnA, ColumnB, ColumnC
From Table1
order by ColumnA, ColumnB
Craig,
?!?!? I've used the aggregator after a sort without specifying any sort criteria. Am I writing buggy code?!?Actually, asserting the 'sort' order in the Aggregator is required or else all your database sorting will be wasted.
Rob
Rob Wierdsma
Toronto, Canada
bartonbishop.com
Toronto, Canada
bartonbishop.com
Well by specifying the sort order to the aggregator, the aggregator becomes much happier as it can blindly group by those keys without actually storing much info in the temp files. The aggregator becomes much faster (kinda like a super charged car
) when it knows how the data is coming in.
And no my name is not Chuck.![Twisted Evil :twisted:](./images/smilies/icon_twisted.gif)
![Cool 8)](./images/smilies/icon_cool.gif)
And no my name is not Chuck.
![Twisted Evil :twisted:](./images/smilies/icon_twisted.gif)
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Not buggy, per se - but you are making it do more work than necessary.
If you do not assert a sort order in the Aggregator stage, it has no idea what order the incoming data is in. It then errs on the side of caution and so needs to take all of the incoming data into the stage, order and then aggregate it. That's why you see everything go in before anything starts coming out.
When you mark those sort fields in the Aggregator, you aren't telling it how to sort the data but how the incoming data is sorted - hence my use of the term 'assert'. And much like a normal assertion, if you lie to the stage it will bust you for it with an Abort for 'row out of sequence'.
If you get it right - presort in a manner that supports the aggregation being done (another key point) and assert that sort order in the stage - it is no longer a 'bottleneck'. It basically only needs to hold one 'aggregation group' in memory at a time and when the first record from the new group comes in the old group is flushed through.
Hope that makes sense...
![Razz :P](./images/smilies/icon_razz.gif)
If you do not assert a sort order in the Aggregator stage, it has no idea what order the incoming data is in. It then errs on the side of caution and so needs to take all of the incoming data into the stage, order and then aggregate it. That's why you see everything go in before anything starts coming out.
When you mark those sort fields in the Aggregator, you aren't telling it how to sort the data but how the incoming data is sorted - hence my use of the term 'assert'. And much like a normal assertion, if you lie to the stage it will bust you for it with an Abort for 'row out of sequence'.
If you get it right - presort in a manner that supports the aggregation being done (another key point) and assert that sort order in the stage - it is no longer a 'bottleneck'. It basically only needs to hold one 'aggregation group' in memory at a time and when the first record from the new group comes in the old group is flushed through.
Hope that makes sense...
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Is there an index on the sorting column?
Because a B-tree is stored in sorted order, it can facilitate sorting in an unconstrained query, and in certain classes of constrained query.
Because a B-tree is stored in sorted order, it can facilitate sorting in an unconstrained query, and in certain classes of constrained query.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.