Aggregator performance

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Aggregator performance

Post by vijay.barani »

Hi friends,
May I have some information on the Aggregator stage? I have a job with 10 stages altogether.


3 lookups
| | |
source drs---->aggr---->trsfrmr----->trsfrmr----->target drs
| |
2 more targets


My source has approximately 10 lakh (1 million) records. In the Aggregator stage I have a derivation on one column using the Sum() function, grouped by 5 other columns. The job takes more than 02:30 hours. What might be the issue?
Warm Regards,
Vijay
vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Re: Aggregator performance

Post by vijay.barani »

The last Transformer stage has the 3 lookups and 3 targets, not the first.
Warm Regards,
Vijay
muruganr117
Participant
Posts: 40
Joined: Sun Jan 21, 2007 1:52 pm
Location: Chennai
Contact:

Re: Aggregator performance

Post by muruganr117 »

Is the input pre-sorted on the GROUP BY columns used in the Aggregator before it is processed in your job?

regards
vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Re: Aggregator performance

Post by vijay.barani »

I have taken 6 columns directly from the source table, without any conditions.


Warm Regards,
Vijay
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Encase your design in Code tags so we can understand it better.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Post by vijay.barani »

ray.wurlod wrote:Encase your design in Code tags so we can understand it better. ...

Code:

                                    3 lookups
                                      | | |
source drs---->aggr---->trsfrmr----->trsfrmr----->target drs
                                       | |
                                 2 more targets
Warm Regards,
Vijay
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Post by Pagadrai »

Hi,
You can try the following:

1) Remove the Aggregator and see whether you can fetch the SUM value from the DB itself.
2) Replace the 2 Transformers with a single Transformer.
3) I don't know your lookup logic, but see whether the 3 lookups can be combined into one stage.
4) Try replacing the target with a file to see whether that stage is causing the performance issue.
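
As a rough illustration of suggestion 1, here is a small Python sketch that lets the database compute the sum per group. It uses an in-memory SQLite database purely as a stand-in for your real source; the table name `src`, the key columns `k1` to `k5` and the `amount` column are made up, since I don't know your actual schema.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for the real source DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (k1 TEXT, k2 TEXT, k3 TEXT, k4 TEXT, k5 TEXT, amount REAL)"
)
rows = [
    ("a", "x", "p", "m", "u", 10.0),
    ("a", "x", "p", "m", "u", 5.0),
    ("b", "y", "q", "n", "v", 7.0),
]
conn.executemany("INSERT INTO src VALUES (?, ?, ?, ?, ?, ?)", rows)

# The database does the grouping and summing, so the job only
# receives one row per group instead of every source row.
pushed_down = conn.execute(
    "SELECT k1, k2, k3, k4, k5, SUM(amount) FROM src "
    "GROUP BY k1, k2, k3, k4, k5 "
    "ORDER BY k1, k2, k3, k4, k5"
).fetchall()

for row in pushed_down:
    print(row)
```

The same SELECT, adapted to your own table, would go into the user-defined SQL of the source stage, assuming your database supports GROUP BY in this form.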
vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Post by vijay.barani »

Hi,
Thank you
I have removed the Aggregator stage and taken the sum directly from the DB, but it still takes the same time.
Yes, I ran the new, modified job after removing the first job.
The three lookups are on a single Transformer stage, and only for one column.
Also, there is no change if I replace the target DRS stage with a simple sequential file.

I don't know why it is consuming so much time.
Warm Regards,
Vijay
vijay.barani
Participant
Posts: 78
Joined: Wed Jun 04, 2008 2:59 am

Post by vijay.barani »

ray.wurlod wrote:Encase your design in Code tags so we can understand it better. ...

Code:

              3 lookups
                | | |
source drs---->trsfrmr----->target drs

This is now my new job design.
But there must be some use for an Aggregator stage?
Warm Regards,
Vijay
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If you can arrange to have your source data sorted by the five grouping columns, and advise the Aggregator that this is the case, your execution time for that stage will reduce by orders of magnitude.
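
To show why sorted input helps, here is a minimal Python sketch of the general principle. It is not how the Aggregator stage is implemented internally, and the function names and sample rows are invented for the illustration. With unsorted input, a running total has to be kept for every group until the last row arrives; with input sorted on the grouping columns, each group can be emitted as soon as the key changes.

```python
from collections import defaultdict
from itertools import groupby


def aggregate_unsorted(rows, n_keys=5):
    """Hash-style aggregation: keeps one running total per group in memory
    until all input has been read."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[:n_keys])] += row[n_keys]
    return sorted(totals.items())


def aggregate_sorted(rows, n_keys=5):
    """Streaming aggregation: assumes rows are already sorted on the first
    n_keys columns, so only the current group is held at any time."""
    for group_key, group_rows in groupby(rows, key=lambda r: tuple(r[:n_keys])):
        yield group_key, sum(r[n_keys] for r in group_rows)


# Hypothetical sample data: five grouping columns followed by the value to sum.
rows = [
    ("b", "y", "q", "n", "v", 7.0),
    ("a", "x", "p", "m", "u", 10.0),
    ("a", "x", "p", "m", "u", 5.0),
]

unsorted_result = aggregate_unsorted(rows)
sorted_result = list(aggregate_sorted(sorted(rows)))
print(sorted_result)
```

Both functions give the same totals; the difference is how much has to be held in memory before the first output row can be written. In the job itself, the sort would typically come from an ORDER BY on the five grouping columns in the source SQL.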
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.