Page 2 of 2

Posted: Thu Jul 09, 2009 4:33 am
by Sainath.Srinivasan
Are you saying that 'select * from table' takes 25 minutes to start !!??

Posted: Thu Jul 09, 2009 4:42 am
by ArndW
balu536 wrote:...Query looks something like "Select * from table".This Query takes 25 mins for execution...
There has to be more there, since a SELECT * without other selection criteria would start pushing rows to DataStage almost immediately. It sounds like your query is doing a full-table-scan (or more) before passing the first row. If you post the query I am sure that you will get some suggestions.
But if your query is the bottleneck and the customer is unwilling to change either the query or the database then you cannot do anything else to make the job run faster. End of discussion on the subject, there are no other options (well, I suppose the customer could buy more hardware, but that is unlikely).

Posted: Thu Jul 09, 2009 4:53 am
by Sainath.Srinivasan
What if you place a big file (say 1 or 2Gb for now) in the Oracle server and attempt to ftp it across to DataStage server (and viceversa)?

How long does that take ?

Or if you design a job with "select dummy from dual", how long does that take to start?

Posted: Thu Jul 09, 2009 5:48 am
by chulett
What makes you guys think it doesn't "start" immediately? If it takes 25 minutes to push everything out, all you can do to improve a query like that is to use a parallel hint. And perhaps bump up the array size.

Posted: Thu Jul 09, 2009 6:25 am
by ArndW
I thought that the poster meant it took 25 minutes before the first row arrives in DataStage - but perhaps I misunderstood.

Posted: Thu Jul 09, 2009 7:02 am
by balu536
Sainath:
"Are you saying that 'select * from table' takes 25 minutes to start !!?? "

Balu : Here 25 minutes i mentioned implies the total time required to extract the complete data.In mean time the records are extracted(i.e for 5 minutes some 15 million,10 minutes- 29 million...etc)

chulett:
"What makes you guys think it doesn't "start" immediately? If it takes 25 minutes to push everything out, all you can do to improve a query like that is to use a parallel hint. And perhaps bump up the ... "

Balu : Chulett,thanks for the suggestion,i'll try to re-run the job after implementing the hints and will let you know if any perfomance improval is observed.


Regards,
Balakrishna

Posted: Thu Jul 09, 2009 7:20 am
by priyadarshikunal
if it is "select * from table" try chaging the read mode to table instead of putting user defined query and see if it takes less time.

Posted: Thu Jul 09, 2009 7:20 am
by Sainath.Srinivasan
Balu,

DataStage runs pipeline parallelism mode. So the 29 minutes of Oracle extration will have impact due to stages further down the line.

Did you try applying @FALSE in the transformer stage?

If that shows any improvement, we can work on other stages to carry it till the end.