
How to improve performance of this job?

Posted: Tue Jan 11, 2011 5:27 pm
by mavrick21
Hello,

We have a job as shown below:

Code:

                                                Hashed File
                                                   |
DRS (Source) -> IPC -> Row merger -> Trans1 ->  Trans2 -> Row splitter -> IPC -> Trans3 -> DRS (Target)

Both the source and target tables are Oracle.

The DRS (source) stage uses a SELECT statement and pulls back roughly 9 million records.

The reference hashed file also contains roughly 9 million records. Hashed file details are below:
Type 30 (64-bit)
1296359424 Jan 11 16:00 DATA.30
374102016 Jan 11 15:52 OVER.30
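
As a quick aside on the hashed file itself: the relatively large OVER.30 suggests a fair share of the data sits in overflow, which generally slows lookups. A minimal check from the sizes above (the cd path is a placeholder for wherever the hashed file lives):

Code:

# Rough overflow check on the Type 30 hashed file, using the sizes listed above.
# The directory path is a placeholder.
cd /path/to/hashed_file
ls -l DATA.30 OVER.30
# Overflow-to-data ratio: 374102016 / 1296359424 is roughly 0.29, i.e. a
# sizeable portion of the data is in overflow buffers, which tends to mean
# extra disk reads per lookup.
echo "scale=4; 374102016 / 1296359424" | bc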

This job runs every night. The problem is that it processes 10 rows/second and takes 1 hour 45 minutes to complete.

From what I could analyze, the extraction from the source and the lookup on the Hashed File stage take a long time. Are there ways to improve the run time of this job by speeding up the lookup?

Please let me know if you require any more details.

Thanks.

Posted: Tue Jan 11, 2011 5:38 pm
by mavrick21
Just another thought/question

Does increasing the priority of this job allocate more CPU time to it (if any is available)? If so, how can that be achieved?
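
(On a Unix engine host this is normally done at the operating-system level with nice/renice rather than from within DataStage. A minimal sketch, assuming a Unix box where you can identify the job's processes; the grep pattern, <PID> and the example command are placeholders, and raising priority normally requires root:)

Code:

# Sketch only: adjust scheduling priority at the OS level.
ps -ef | grep -i phantom | grep -v grep   # find the job's process IDs (pattern is a guess)
sudo renice -n -5 -p <PID>                # give an already-running process a higher priority

# Or launch something at a reduced priority from the start:
nice -n 10 ./some_command                 # placeholder command; 10 = lower priority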

Thanks.

Posted: Tue Jan 11, 2011 11:59 pm
by Kirtikumar
Are you sure it is not the DRS extraction? Have you tried running a simple job that just extracts from the DRS source and loads to a DRS target, to check how long each side takes?

When the job is running, what are the CPU and memory stats of the server?
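
(A quick way to look while it runs; a sketch assuming a Unix server -- exact tools and column layouts vary by platform, and the ps pattern is only illustrative:)

Code:

# Snapshot of the box while the job is running.
vmstat 5 5             # CPU, memory and swap activity, sampled every 5 seconds
top                    # busiest processes, interactively
ps -ef | grep -i ds    # DataStage-related processes; adjust the pattern as needed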

Posted: Wed Jan 12, 2011 12:23 am
by mavrick21
Kirtikumar,

I had missed mentioning the long-running extraction stage; I've now included it in my original post.

Re: How to improve performance of this job?

Posted: Wed Jan 12, 2011 12:50 am
by BI-RMA
Hi Mavrick21,

How fast can Oracle write the data from the DRS (Source) to /dev/null? If the SELECT statement on Oracle is slow, there is no point in analysing the overall job design.
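
(One way to time the bare SELECT outside of DataStage is to run it in SQL*Plus with the output thrown away. A sketch only -- the connect string, table name and settings are placeholders; substitute the job's actual source SELECT:)

Code:

# Time the source query on its own, discarding the rows.
time sqlplus -s scott/tiger@orcl > /dev/null <<'EOF'
set pagesize 0
set feedback off
set arraysize 500
select * from source_table;
exit
EOF
# SQL*Plus still formats every row, so treat the elapsed time as an upper
# bound on the pure extraction cost.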

Posted: Wed Jan 12, 2011 7:35 am
by chulett
The best way to check that the target writes are not the issue is to put an @FALSE constraint in the last transformer so that nothing goes to the target.

Posted: Wed Jan 12, 2011 10:39 am
by mavrick21
This is really strange! Last night the job took 54 minutes to complete, processing 2000 rows/second.

There has been no change in the job, the record count or the hardware. The database and DataStage engine are on the same box.

My thought is that this job runs slowly when CPU and memory are being used to the maximum by other processes. Last night it might have run when CPU and memory were relatively free. Now, how can I trace where the bottleneck is?

Thank you.

Posted: Wed Jan 12, 2011 4:35 pm
by ray.wurlod
Unless someone was monitoring the system at the time, you can't. It's gone. It's history.
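
(For next time, a cheap way to make sure that history exists is to log a timestamped system snapshot overnight. A sketch only; the file path, one-minute interval and Linux-style vmstat output are arbitrary assumptions -- adjust for your Unix flavour:)

Code:

# Append a timestamped vmstat line every minute; review the log after the
# next slow run. Kill the background loop when it is no longer needed.
nohup sh -c 'while true; do echo "$(date) $(vmstat 1 2 | tail -1)"; sleep 60; done' \
    > /tmp/sysstats_$(date +%Y%m%d).log 2>&1 &
# If the sysstat tools (sar) are installed, they already keep daily history
# that can be queried after the fact.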