To improve job performance.

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Das
Participant
Posts: 87
Joined: Tue Oct 24, 2006 9:58 pm
Location: india

To improve job performance.

Post by Das »

Hi,
Have an issue as follows.
I having a query from an input informix stage.

select A,Max(B) from tab
Where A not null and B not null
Group by B


Here both columns A,B are indexed.
As the source table contains 5crores of records,the performance is very slow.[22row/Sec].
Please suggest a way to improve te performane.

Thanks.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not much you can do there, it's a full table scan regardless of the indexes, which would not be used. You sure you don't mean 'Group by A' ? :?
Last edited by chulett on Sat Apr 21, 2007 6:27 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

This is not a DataStage issue - it's the time required by Informix to effect the GROUP BY. Until all the rows are processed (so that all the groups can be determined) Informix can not deliver any rows. Yet the clock is still running, so rows/sec appears low. It is not a useful metric.

What else is happening in the job?

You can test the query performance by creating a simple job consisting of the same Informix stage, a Transformer stage and a Sequential File stage. Use a constraint in the Transformer stage that will never be true, such as the system variable @FALSE.

Any difference between that and your other job is related to processing with in the job itself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Das
Participant
Posts: 87
Joined: Tue Oct 24, 2006 9:58 pm
Location: india

Post by Das »

chulett wrote:Not much you can do there, it's a full table scan regardless of the indexes, which would not be used. You sure you don't mean 'Group by A' ? :? ...

insted of doing group by in query If i use aggregator stage ,Will it give any performane improvement..
Suggest.....
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Suggest you try and see. Too many variables specific to your environment for anyone to give a definitive answer. I doubt it, though. Perhaps a chat with your DBA is in order?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The total amount of time will probably be about the same - after all, the same amount of work has to be done. If the Informix table has an index on the grouping column, then grouping in the database may be faster than grouping in an Aggregator stage. Conversely, the Aggregator stage forms its groups in memory, so that may be faster if the sorted rows can be delivered to it quickly enough. The answer to your most recent question on this thread is a definite "maybe".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply