
DS Tuning

Posted: Tue Feb 18, 2003 4:16 pm
by devaraj4u
Dear All,

I have a source Oracle table with more than 85,000 rows, and each record is more than 450 bytes long. I want to transform this table into another (target) Oracle table. I designed the job using DataStage; it runs fine, but it is slow. Is there any parameter I can tune so that my ETL will run faster?


Regards,
K.S.Rajan.

Posted: Tue Feb 18, 2003 5:59 pm
by vmcburney
If you are using an Oracle OCI stage to write to Oracle, have a look at the Array Size field on the Input - General tab. This controls how many rows are processed before a commit is performed. Fiddle with this figure; anything from 1,000 to 50,000 should give you faster performance.

If you are using ODBC stages then you will find Array Size on the Transaction Handling tab.

The other option is to do an Oracle bulk load of the data; this is usually faster, as it disables keys and indexes during the load.
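
For illustration, here is a minimal SQL*Loader sketch; the data file, table, and column names are made up for the example, and (if I recall correctly) the Oracle bulk load stage can generate the equivalent data and control files for you:

    -- load.ctl: control file describing the flat file and the target table
    LOAD DATA
    INFILE 'target_rows.dat'
    APPEND
    INTO TABLE target_table
    FIELDS TERMINATED BY '|'
    (cust_id, cust_name, cust_addr)

You then run the load in direct path mode, which bypasses conventional INSERT processing:

    sqlldr userid=scott/tiger control=load.ctl direct=true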


Vincent McBurney
Data Integration Services
www.intramatix.com

Posted: Tue Feb 18, 2003 6:10 pm
by chulett
Some of this will depend on exactly what kind of transforms you are doing in your job, and also on whether the Oracle database is on the same server as DataStage.

One (usually) easy place to gain performance is to bump the 'array size' when reading and writing to Oracle. There is no hard-and-fast rule that I am aware of (anybody?), but if you pull something like 100 or 200 rows (or more) at a time over the network instead of one it will help. The same applies when writing, in essence 'buffering' the writes into larger batches. Give that a shot if you haven't already...
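
To put rough numbers on it for this job (assuming one network round trip per array fetch), an array size of 1 means the 85,000 source rows need about 85,000 round trips, while an array size of 200 needs roughly 85,000 / 200 = 425, with a similar saving on the insert side.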

-craig

Posted: Tue Feb 18, 2003 6:14 pm
by chulett
Dueling answers. [:)]

One small correction to Vince's post: the array size has nothing to do with commit levels; there is a separate control (the transaction size) for that. It just controls how many rows come over the network at a time.

-craig

Posted: Wed Feb 19, 2003 1:09 am
by sumitgulati
It depends on the transformation you are applying and also on the update strategy used. If it is a plain insert (without an update in any stage), then I guess the only option is fiddling with the Array Size and Transaction Size.
If any stage in the job also updates existing rows in the target table, then try creating an index on the columns that the update uses to locate those rows.
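
For example (the table, index, and column names below are hypothetical, just to show the idea), if the update locates rows by a customer key, an index like this lets Oracle avoid a full table scan on every update:

    CREATE INDEX target_table_upd_idx
        ON target_table (cust_id);  -- the column(s) used in the UPDATE ... WHERE clause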

Good Luck
Sumit

Posted: Wed Feb 19, 2003 1:45 am
by vmcburney
Thanks Craig! Got my array and transaction fields mixed up.


Vincent McBurney
Data Integration Services
www.intramatix.com

Posted: Wed Feb 19, 2003 4:23 am
by ray.wurlod
Another way to speed up DataStage throughput is to be selective about what you extract. For example, do you need every column from the source table? If not, select only those that you need. Do you need every row from the source table? If not, select only those that you need (specify a WHERE clause in the SQL and, ideally, have an index on the column or columns used in the WHERE clause).
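
As a sketch (the column names and the date filter below are invented purely for illustration), the generated or user-defined SELECT might look like:

    SELECT cust_id, cust_name, last_updated    -- only the columns the job needs
    FROM   source_table
    WHERE  last_updated >= TO_DATE('2003-02-01', 'YYYY-MM-DD')   -- only the rows the job needs

with an index on last_updated so that the WHERE clause does not force a full table scan.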


Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518

Posted: Wed Feb 19, 2003 7:51 am
by kjanes
Is it DataStage that is slow or possibly the network speed, traffic on the network, or the database server? There is a lot more to tuning than just some parameters in DataStage. We know nothing about the environment this is running in. One should evaluate the entire picture before pointing at DataStage. The input is small in rows and record length. How long does it take?

An alternative, if it is taking too long, could be to write the data to a sequential file and then use a load utility to get it into the database. Like I said, there are many variables to account for. I am not familiar with the stages available for Oracle, but do you do a direct insert into the table with intermittent commits, or can you use a bulk loader if there is a load file? Probably both, I guess....


Kevin Janes