Perfomace Improvement??

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Thu Jul 09, 2009 4:33 am

Are you saying that 'select * from table' takes 25 minutes to start !!??

ArndW · Post by **ArndW** » Thu Jul 09, 2009 4:42 am

balu536 wrote:...Query looks something like "Select * from table".This Query takes 25 mins for execution...

There has to be more there, since a SELECT * without other selection criteria would start pushing rows to DataStage almost immediately. It sounds like your query is doing a full-table-scan (or more) before passing the first row. If you post the query I am sure that you will get some suggestions.
But if your query is the bottleneck and the customer is unwilling to change either the query or the database then you cannot do anything else to make the job run faster. End of discussion on the subject, there are no other options (well, I suppose the customer could buy more hardware, but that is unlikely).

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Thu Jul 09, 2009 4:53 am

What if you place a big file (say 1 or 2Gb for now) in the Oracle server and attempt to ftp it across to DataStage server (and viceversa)?

How long does that take ?

Or if you design a job with "select dummy from dual", how long does that take to start?

chulett · Post by **chulett** » Thu Jul 09, 2009 5:48 am

What makes you guys think it doesn't "start" immediately? If it takes 25 minutes to push everything out, all you can do to improve a query like that is to use a parallel hint. And perhaps bump up the array size.

ArndW · Post by **ArndW** » Thu Jul 09, 2009 6:25 am

I thought that the poster meant it took 25 minutes before the first row arrives in DataStage - but perhaps I misunderstood.

balu536 · Post by **balu536** » Thu Jul 09, 2009 7:02 am

Sainath:
"Are you saying that 'select * from table' takes 25 minutes to start !!?? "

Balu : Here 25 minutes i mentioned implies the total time required to extract the complete data.In mean time the records are extracted(i.e for 5 minutes some 15 million,10 minutes- 29 million...etc)

chulett:
"What makes you guys think it doesn't "start" immediately? If it takes 25 minutes to push everything out, all you can do to improve a query like that is to use a parallel hint. And perhaps bump up the ... "

Balu : Chulett,thanks for the suggestion,i'll try to re-run the job after implementing the hints and will let you know if any perfomance improval is observed.

Regards,
Balakrishna

priyadarshikunal · Post by **priyadarshikunal** » Thu Jul 09, 2009 7:20 am

if it is "select * from table" try chaging the read mode to table instead of putting user defined query and see if it takes less time.

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Thu Jul 09, 2009 7:20 am

Balu,

DataStage runs pipeline parallelism mode. So the 29 minutes of Oracle extration will have impact due to stages further down the line.

Did you try applying @FALSE in the transformer stage?

If that shows any improvement, we can work on other stages to carry it till the end.