Help on tuning the job

Posted: Mon Jul 17, 2006 11:02 pm
by rcil
Hello All,

I need some help in tuning my DataStage load job.

I am using a DRS stage as both source and target to load SQL Server tables. I am trying to load 3 million records from source to target, which is taking close to 5 hours, and there are no transformations involved in this job.

I want some ideas to reduce the load time.
Is it possible to make the job multi-instance? If so, any ideas would be appreciated.
Will there be any improvement in performance if we use an IPC stage in between?

Thanks for all the inputs.

RCil

Posted: Mon Jul 17, 2006 11:18 pm
by DSguru2B
First you will have to see whether the select or the insert is taking more time.
Pull the records to a Sequential File stage and make a note of the performance. Do it for just 100k records.
Then load from the Sequential File stage to the target and note the performance level again. This will give you an idea of where exactly the bottleneck is.
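
To make the idea concrete outside of DataStage, here is a rough Python/pyodbc sketch of that two-step timing test. The connection strings, query, table, and column names are placeholders rather than anything from the actual job; the point is only to time the read and the write separately:

# Rough sketch of the "time each half separately" test, assuming pyodbc.
# The DataStage equivalent is landing the rows in a Sequential File stage
# and comparing the rows/sec figures for each link.
import time
import pyodbc

SRC_CONN = "DSN=source_db;UID=user;PWD=secret"   # placeholder
TGT_CONN = "DSN=target_db;UID=user;PWD=secret"   # placeholder
SAMPLE = 100_000                                 # same 100k sample as above

# Step 1: time the extract on its own.
src = pyodbc.connect(SRC_CONN)
cur = src.cursor()
start = time.time()
cur.execute("SELECT col1, col2 FROM source_table")   # placeholder query
rows = cur.fetchmany(SAMPLE)
print("extract: %d rows in %.1f s" % (len(rows), time.time() - start))

# Step 2: time the load on its own, from the rows already in memory.
tgt = pyodbc.connect(TGT_CONN)
ins = tgt.cursor()
start = time.time()
ins.executemany("INSERT INTO target_table (col1, col2) VALUES (?, ?)", rows)
tgt.commit()
print("load: %d rows in %.1f s" % (len(rows), time.time() - start))

Whichever half dominates the elapsed time is the side worth tuning first.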
Increase the array size on both the source and target.
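
The array size is essentially a batching knob: it controls how many rows travel per round trip instead of one at a time. As a loose analogy only, the same idea in a hand-rolled pyodbc loader would look like the sketch below; the table, columns, and the read_source_rows generator are hypothetical placeholders:

# Loose analogy for the array-size setting: send rows in batches rather
# than one network round trip per row. Assumes pyodbc and placeholder names.
import pyodbc

conn = pyodbc.connect("DSN=target_db;UID=user;PWD=secret")   # placeholder
cur = conn.cursor()
cur.fast_executemany = True      # pack parameters into bulk arrays per call

BATCH_SIZE = 5000                # the "array size" equivalent
batch = []
for row in read_source_rows():          # hypothetical source generator
    batch.append(row)
    if len(batch) >= BATCH_SIZE:
        cur.executemany("INSERT INTO target_table (col1, col2) VALUES (?, ?)", batch)
        batch.clear()
if batch:                               # flush the final partial batch
    cur.executemany("INSERT INTO target_table (col1, col2) VALUES (?, ?)", batch)
conn.commit()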
The IPC stage won't help if you have a single stream.
I don't think parallelism will help, as I doubt SQL Server supports multiple concurrent inserts to the same table. Someone else here will have to confirm that.
The other option you have is to load the data to a sequential file, FTP it to the SQL Server machine, and use its bulk load utility, such as bcp.
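
A minimal sketch of that flat-file route, assuming the delimited file has already been copied to a path the SQL Server instance can read; the paths, names, and read_source_rows generator are placeholders, and running bcp from the command line would accomplish the same load:

# Sketch of the flat-file bulk load route, assuming pyodbc is available
# and that extract.txt has been FTP'd somewhere the SQL Server service
# can read it. All names and paths are placeholders.
import csv
import pyodbc

# Step 1: land the extract as a pipe-delimited flat file (the sequential file).
with open("extract.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="|")
    for row in read_source_rows():       # hypothetical source generator
        writer.writerow(row)

# Step 2: after transferring extract.txt to the server, drive the load in T-SQL.
conn = pyodbc.connect("DSN=target_db;UID=user;PWD=secret", autocommit=True)
conn.cursor().execute(r"""
    BULK INSERT target_table
    FROM 'D:\loads\extract.txt'
    WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n')
""")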

Posted: Tue Jul 18, 2006 1:56 am
by ray.wurlod
An IPC stage, or even inter-process row buffering, will only help if the machine has spare CPU capacity.

Other than that, is there a particular requirement for using DRS? Have you tried a comparison using the ODBC stage? It's a thinner interface to the same ODBC driver that is used in either case.

Do try the benchmark tests suggested; you are likely to find that it's the "L" part of ETL that's hurting, so a bulk load technique would appear to be a possible solution.