Improving Job Performance

Posted: Tue Sep 25, 2007 1:31 am
by saikir
Hi All,

I am preparing some notes on DataStage and would like some general guidelines on ways to improve job performance. Here are some that I have noted in my project:

- Array Size in Case of OCI Stages
- Transaction Size in Case of OCI Stages
- Using HFC.exe
- Using In Process and Inter Process row buffering wherever applicable
- Using IPC stages
- Using Link Partitioner and Link Collector
- Playing around with the hash files
- If possible, try sorting at the database rather than at the ETL level
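
To illustrate the array size and transaction size items above, here is a rough Python sketch of the arithmetic involved. It assumes array size means rows sent per database call and transaction size means rows per commit, with 0 taken to mean a single commit at the end; the function names and row counts are made up for illustration and are not DataStage APIs.

```python
import math


def round_trips(total_rows: int, array_size: int) -> int:
    """Database calls needed when rows travel in batches of array_size."""
    return math.ceil(total_rows / array_size)


def commits(total_rows: int, transaction_size: int) -> int:
    """Commits issued; 0 is assumed to mean one commit at the end of the run."""
    if transaction_size == 0:
        return 1
    return math.ceil(total_rows / transaction_size)


# Hypothetical load of one million rows at a few array sizes.
for array_size in (1, 100, 1000):
    print(array_size, round_trips(1_000_000, array_size))
```

Larger values cut the number of calls and commits, but each batch holds more memory and a failure rolls back more work, so the values are worth measuring rather than guessing.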

I would be glad if people from the forum can add some additional tips.

Sai

Posted: Tue Sep 25, 2007 2:25 pm
by abhi989
In server jobs:
- Doing as much as possible in your source query (at the database level, if the source is a database).
- Going to job properties - performance - messing around with row buffers
- Eliminating redundancy
- Combining stages where possible
- Also creating indexes at the database level (if updating a large table in the database)
- etc etc... this list can go on and on..

Posted: Tue Sep 25, 2007 5:09 pm
by ray.wurlod
Eschew rows/sec as your measure of "performance". Prefer MB/min or simply elapsed time.
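
A small Python sketch of why rows/sec can mislead. The two jobs and their row widths are hypothetical numbers chosen only to show the effect; nothing here comes from a real job log.

```python
def rows_per_sec(rows: int, elapsed_seconds: float) -> float:
    """Row-count throughput, which ignores how wide each row is."""
    return rows / elapsed_seconds


def mb_per_min(total_bytes: int, elapsed_seconds: float) -> float:
    """Data-volume throughput in megabytes per minute."""
    megabytes = total_bytes / (1024 * 1024)
    minutes = elapsed_seconds / 60
    return megabytes / minutes


# Two hypothetical jobs that both run for 100 seconds.
narrow = {"rows": 1_000_000, "row_bytes": 50}    # many small rows
wide = {"rows": 100_000, "row_bytes": 2_000}     # fewer, much wider rows

for name, job in (("narrow", narrow), ("wide", wide)):
    total_bytes = job["rows"] * job["row_bytes"]
    print(name, rows_per_sec(job["rows"], 100), round(mb_per_min(total_bytes, 100), 1))
```

The narrow job looks ten times faster by rows/sec, yet the wide job moves four times as much data in the same elapsed time.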

Posted: Tue Sep 25, 2007 6:44 pm
by chulett
And eschew 'messing around with' or 'playing around with' as performance improvement techniques. :wink:

Posted: Tue Sep 25, 2007 11:29 pm
by ray.wurlod
Use stage variables rather than evaluating the same expression more than once.
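
The same idea as a Python analogy (not DataStage syntax): a local variable plays the role of the stage variable, so the shared expression is evaluated once per row instead of once per output column. The column names are hypothetical.

```python
def derive_repeated(row: dict) -> dict:
    """Evaluates the same clean-up expression three times per row."""
    return {
        "name_upper": row["name"].strip().upper(),
        "name_len": len(row["name"].strip().upper()),
        "initial": row["name"].strip().upper()[:1],
    }


def derive_once(row: dict) -> dict:
    """Evaluates the expression once and reuses it, like a stage variable."""
    clean = row["name"].strip().upper()  # stands in for the stage variable
    return {
        "name_upper": clean,
        "name_len": len(clean),
        "initial": clean[:1],
    }


print(derive_once({"name": "  sai "}))
```

Both functions return the same columns; the second simply does a third of the string work per row.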

Posted: Thu Oct 18, 2007 2:07 am
by baglasumit21
Some additions from my side, based on my experience in my current project...

- Supply pre-sorted data to an Aggregator stage.
- Avoid extraction and loading in the same job. If possible, stage the data in a sequential file.
- Avoid using "union all" in the source query. Instead, use a Link Collector to collect data from the two queries.
- Play around with transaction size and array size.
- Use the two update actions, 'Insert new and update existing' and 'Update existing and insert new', correctly: pick the one whose first action will succeed for most of your rows.
- Avoid using more than 8 look-ups in a single Transformer.
- Replace joins in the query with hashed file look-ups.
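
As a rough illustration of the first tip, here is a Python sketch of why sorted input helps an aggregation. It is an analogy, not the Aggregator stage's actual implementation: when rows arrive sorted on the grouping key, each group can be totalled and emitted as soon as the key changes, so only one group is held in memory at a time. The sample rows are made up.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Yield (key, total) pairs from rows that are already sorted by key."""
    for key, group in groupby(rows, key=itemgetter(0)):
        # groupby only merges adjacent rows, which is why sorting matters.
        yield key, sum(amount for _, amount in group)


# Hypothetical (key, amount) rows, already sorted by key.
rows = [("A", 10), ("A", 5), ("B", 7), ("C", 1), ("C", 2)]
print(list(aggregate_sorted(rows)))
```

With unsorted input the same approach would emit a key more than once, so a general aggregator has to keep every group in memory until the last row arrives.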