Regarding aggregator

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Regarding aggregator

Post by Bilwakunj »

Hello,
When we say the data is sorted in the aggregator or any other stage, where exactly is that done? Where does the data reside while it is being sorted? Could anyone please explain this process? Is it the same for Server and PX?

Thanks in advance.
THEDSKID
Premium Member
Posts: 11
Joined: Thu Apr 29, 2004 10:51 am
Location: DALLAS TX

Re: Regarding aggregator

Post by THEDSKID »

The Sort checkbox is your way of telling the aggregator stage that your incoming data is already sorted. With this information the aggregator can cut down its processing time, and you will notice sizeable performance gains on large data sets.

So the sorting is done outside of the aggregator, in your input data. I believe this is the same for Server and EE.
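
To illustrate why pre-sorted input helps, here is a minimal Python sketch of the general idea. It is not DataStage code and not how the aggregator stage is actually implemented; the function names and sample rows are made up for the example. With sorted input, an aggregation only has to hold one group at a time, while unsorted input forces it to keep every key in memory until the end.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Sum values per key, assuming rows arrive already sorted by key.

    Only the current group's running total is held at any time.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def aggregate_hash(rows):
    """Sum values per key for unsorted input by keeping every key in a dict."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


# Hypothetical sample data: (key, value) pairs in arbitrary order.
rows = [("a", 1), ("b", 5), ("a", 2), ("c", 4), ("b", 1)]

# The sort happens upstream, before the streaming aggregation sees the rows.
sorted_rows = sorted(rows, key=itemgetter(0))
sorted_totals = dict(aggregate_sorted(sorted_rows))
hash_totals = aggregate_hash(rows)

# Claiming the data is sorted when it is not yields split groups.
wrong_totals = list(aggregate_sorted(rows))
```

The last line shows the risk of declaring unsorted data as sorted: the same key comes out more than once, each time with a partial total.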

Hope this helps
-Chris
Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Re: Regarding aggregator

Post by Bilwakunj »

Thanks for your reply. But if the data is sorted in the aggregator, is it sorted in memory or on scratch disk?


vcannadevula
Charter Member
Posts: 143
Joined: Thu Nov 04, 2004 6:53 am

Re: Regarding aggregator

Post by vcannadevula »



If your data is not sorted and you check the sort option on the input link:

If the data is less than 25MB, it will be sorted in memory. Otherwise the sort uses the scratch disk in the sort pool; if that is full, it uses the default scratch disk; if that is full, it uses TMPDIR; and if that is full, it uses /tmp on the server box.
If /tmp is also full, the job will abort.
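
The fallback order described above can be sketched in Python as follows. This is only a model of the sequence as stated in this post, not DataStage's actual logic: the function name, the location labels, the `free_bytes` input, and treating "full" as "less free space than the data size" are all assumptions made for illustration, as is reading 25MB as 25 * 1024 * 1024 bytes.

```python
# The 25MB threshold quoted in the post (binary megabytes assumed).
MEMORY_LIMIT_BYTES = 25 * 1024 * 1024

# Locations tried in order once the data no longer fits in memory.
FALLBACK_ORDER = [
    "sort-pool scratch disk",
    "default scratch disk",
    "TMPDIR",
    "/tmp",
]


def choose_sort_location(data_bytes, free_bytes):
    """Return where the sort would run, following the order in the post.

    free_bytes is a hypothetical dict mapping each location label to its
    free space in bytes; a missing label is treated as full.
    """
    if data_bytes < MEMORY_LIMIT_BYTES:
        return "memory"
    for location in FALLBACK_ORDER:
        if free_bytes.get(location, 0) >= data_bytes:
            return location
    # Every location is full: the post says the job aborts.
    raise RuntimeError("all sort locations are full; job aborts")
```

For example, calling `choose_sort_location(30 * 1024 * 1024, {"default scratch disk": 100 * 1024 * 1024})` skips memory and the (full) sort pool and lands on the default scratch disk.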