
Job aborts after 754K recs with Aggregator: %s

Posted: Tue Jun 28, 2005 3:22 pm
by vinaymanchinila
Hi,

My job aborts after 754,519 records.
I have a job that extracts data from an ORACLE source (17 columns), looks it up against 8 columns, passes it through an AGGREGATOR (to get the sum of POs), and finally loads it into the target DB.


                 8 LOOKUPFILES
                       |
                       |
ORACLE/SOURCE---TRANSFORMER---AGGREGATOR---ORACLE/TARGET
                       \
                        \
                         REJECTFILE (based on condition in Tx)


I have observed that all the records accumulate on the input link of the Aggregator (I guess because it needs to group the columns and sum one of them), and then the job fails. Is this because of some space/memory issue?

When I run the job for fewer than 750,000 records it works fine!

The two warnings that appear in the Director are:

AggregatorPOBalance: %s
Abnormal termination of stage LdPOFact..XfmDWIDProcess detected

Thanks,

Posted: Tue Jun 28, 2005 3:36 pm
by Sainath.Srinivasan
Did you monitor the memory or disk usage while this agg stage was running?

What is the derivation in the agg stage?

Posted: Tue Jun 28, 2005 3:54 pm
by sjacobk
We faced a similar problem once. It was because one of the lookup links was fetching a large amount of data and the operations defined in the transformer were failing. Try watching the performance statistics in the Designer while the job runs, and see whether any of your lookup links gets stuck while processing the 754,519th record from the input.

Posted: Wed Jun 29, 2005 7:09 am
by vinaymanchinila
Hi Sai,
I did not check the memory or disk usage; let me know where to look for it. It might be the cause, as a lot of jobs are operating on the same file system.

Hi sjacobk,
I am using the utility "UtilityHashLookup('HshLkpPOXref', PO_SRC_OUTPUT.POKEY, 1)". Anyway, I will try to build a hash file on the target to process the 754,519th record from the input, but I will still be extracting all the records again. Is there a way I can start from this point? I guess I need to search this forum for checkpoints.

Thanks,

Posted: Wed Jun 29, 2005 8:04 am
by elavenil
vmstat should give the performance statistics of the UNIX server's resources. Alternatively, you can ask the UNIX admin to monitor these statistics while running this job.

Regards
Saravanan

Posted: Wed Jun 29, 2005 9:24 am
by lebos
Check your /tmp directory also.

But, I couldn't get around this problem and had to design a solution that did not include the Aggravator stage.

Good luck. (Great error msg isn't it?)

Larry

Posted: Wed Jun 29, 2005 10:46 am
by Sainath.Srinivasan
lebos wrote: Check your /tmp directory also.

But, I couldn't get around this problem and had to design a solution that did not include the Aggravator stage.

Good luck. (Great error msg isn't it?)

Larry
Yep... it is the 'Aggregator' stage and not the 'Aggravator' stage. Why do you want to aggravate his problem? :wink: (Just kidding).

Posted: Wed Jun 29, 2005 12:53 pm
by nag0143
Try sorting the data coming in from Oracle (ORDER BY)... just a thought. Ordered data requires less space for aggregation...
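
For example, adding something like this to the source query (the table and column names here are just made-up placeholders):

    SELECT po_key, po_amount
    FROM   po_source
    ORDER BY po_key;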

Posted: Wed Jun 29, 2005 3:10 pm
by Sainath.Srinivasan
Yes. Having the data sorted makes the Aggregator stage work efficiently.

Posted: Wed Jun 29, 2005 3:18 pm
by vinaymanchinila
Hi Guys,

First I extracted the data into a flat file, then sorted it (on one key) before passing it to the AGGREGATOR stage, and it still aborts!

Posted: Wed Jun 29, 2005 3:36 pm
by lebos
I don't see how pre-sorting the data will do any good, as there is no way (that I know of, anyway) to tell the Aggravator that the data has already been sorted. Some parts of DataStage are pretty good, but none are clairvoyant.

Find a solution outside of the Aggravator!

Larry

Posted: Wed Jun 29, 2005 3:42 pm
by vinaymanchinila
Hi,
I need to do an internal loop/group by and then sum a couple of columns on the extracted data before loading it into the target fact table. Hmmm, looks like this is aggravating!
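
Maybe I can push the whole thing into the source SQL and skip the Aggregator altogether, along the lines of (table and column names are placeholders only):

    SELECT   po_key, vendor_id,
             SUM(po_amount) AS total_po_amount,
             SUM(po_qty)    AS total_po_qty
    FROM     po_source
    GROUP BY po_key, vendor_id;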

Posted: Wed Jun 29, 2005 5:23 pm
by amsh76
I have encountered this problem in the past when I didn't have my data sorted before aggregation.

Larry, sorting data before aggregation is always more efficient, and there is a column available in the Aggregator stage where you specify how the data is sorted.

Now, getting back to the problem: since you tried sorting the data and are still getting the error, my guess would be a problem within the data itself, i.e. its values. Is it possible that any of the columns have NULLs?

Posted: Wed Jun 29, 2005 7:06 pm
by ray.wurlod
Sort on ALL grouping columns, and tell the Aggregator stage that you have done so (on the Inputs properties). There's no point sorting otherwise.
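
For example, if the stage groups on PO_KEY and VENDOR_ID (illustrative names only, not from the actual job), the source SQL's ORDER BY should list exactly those columns, in the same order:

    SELECT   po_key, vendor_id, po_amount
    FROM     po_source
    ORDER BY po_key, vendor_id;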

This allows the Aggregator stage to free up memory as soon as any change occurs in a grouping column.

At a pinch, you can tune the memory consumption of the Aggregator stage in DS.TOOLS, but let's not go there just yet.

Posted: Thu Jun 30, 2005 7:02 am
by lebos
Although I have looked a hundred times for how to tell the Aggregator that the input is already sorted, I didn't see it until I looked for the hundred and first time! :oops:

Sorry folks.

Larry