To improve job performance.

Das · Post by **Das** » Fri Apr 20, 2007 11:06 pm

Hi,
Have an issue as follows.
I having a query from an input informix stage.

select A,Max(B) from tab
Where A not null and B not null
Group by B

Here both columns A,B are indexed.
As the source table contains 5crores of records,the performance is very slow.[22row/Sec].
Please suggest a way to improve te performane.

Thanks.

chulett · Post by **chulett** » Fri Apr 20, 2007 11:26 pm

Not much you can do there, it's a full table scan regardless of the indexes, which would not be used. You sure you don't mean 'Group by A' ?

ray.wurlod · Post by **ray.wurlod** » Sat Apr 21, 2007 2:18 am

This is not a DataStage issue - it's the time required by Informix to effect the GROUP BY. Until all the rows are processed (so that all the groups can be determined) Informix can not deliver any rows. Yet the clock is still running, so rows/sec appears low. It is not a useful metric.

What else is happening in the job?

You can test the query performance by creating a simple job consisting of the same Informix stage, a Transformer stage and a Sequential File stage. Use a constraint in the Transformer stage that will never be true, such as the system variable @FALSE.

Any difference between that and your other job is related to processing with in the job itself.

Das · Post by **Das** » Mon Apr 23, 2007 1:15 am

chulett wrote:Not much you can do there, it's a full table scan regardless of the indexes, which would not be used. You sure you don't mean 'Group by A' ? ...

insted of doing group by in query If i use aggregator stage ,Will it give any performane improvement..
Suggest.....

chulett · Post by **chulett** » Mon Apr 23, 2007 6:45 am

Suggest you try and see. Too many variables specific to your environment for anyone to give a definitive answer. I doubt it, though. Perhaps a chat with your DBA is in order?

ray.wurlod · Post by **ray.wurlod** » Mon Apr 23, 2007 4:44 pm

The total amount of time will probably be about the same - after all, the same amount of work has to be done. If the Informix table has an index on the grouping column, then grouping in the database may be faster than grouping in an Aggregator stage. Conversely, the Aggregator stage forms its groups in memory, so that may be faster if the sorted rows can be delivered to it quickly enough. The answer to your most recent question on this thread is a definite "maybe".