Hi All,
I am preparing some notes on DataStage and want to put together some general guidelines for improving job performance. Here are some that I have noted in my project:
- Array Size in Case of OCI Stages
- Transaction Size in Case of OCI Stages
- Using HFC.exe (the Hashed File Calculator) to size hashed files
- Using in-process and inter-process row buffering wherever applicable
- Using IPC stages
- Using Link Partitioner and Link Collector
- Playing around with the hashed files
- If possible, sort at the database level rather than in the ETL job
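The array-size and transaction-size tips above come down to batching database round-trips: rows are sent in groups, so the per-round-trip overhead is paid once per batch instead of once per row. A minimal Python sketch of the idea (illustrative only, not DataStage code; `send_batch` is a hypothetical stand-in for the OCI stage's bulk write):

```python
# Sketch of why a larger array size helps an OCI stage: rows are buffered and
# flushed in batches, cutting the number of network round-trips.
# Illustrative only, not DataStage internals.

def load_rows(rows, array_size, send_batch):
    """Buffer rows and flush them to the target in batches of array_size.

    Returns the number of round-trips (batches) made.
    """
    batch = []
    round_trips = 0
    for row in rows:
        batch.append(row)
        if len(batch) == array_size:
            send_batch(batch)   # one network round-trip per batch
            round_trips += 1
            batch = []
    if batch:                   # flush the final partial batch
        send_batch(batch)
        round_trips += 1
    return round_trips

rows = list(range(1000))
# array size 1: 1000 round-trips; array size 100: only 10
assert load_rows(rows, 1, lambda b: None) == 1000
assert load_rows(rows, 100, lambda b: None) == 10
```

Transaction size works the same way one level up: it controls how many rows go between commits, trading commit overhead against rollback-segment usage.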
I would be glad if people from the forum can add some additional tips.
Sai
Improving Job Performance
Moderators: chulett, rschirm, roy
In server job
- By doing most of the work in your source query (at the database level, if the source is a database)
- Going to Job Properties > Performance and tuning the row buffers
- Eliminating redundancy
- Combining stages where possible
- Also creating indexes at the database level (if updating a large table)
- etc. etc... this list can go on and on...
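The row-buffering tip above (and the IPC stage in the earlier list) works because a bounded buffer lets the extracting and loading sides of a job run concurrently instead of strictly one row at a time. A minimal Python sketch of the concept (illustrative only, not DataStage internals):

```python
# Row buffering / IPC idea: a bounded buffer between a producer (extract) and
# a consumer (load) lets both run at once; the buffer size plays the role of
# the DataStage row-buffer setting. Illustrative sketch only.
import queue
import threading

def producer(buf, n):
    for i in range(n):
        buf.put(i)          # blocks when the buffer is full (back-pressure)
    buf.put(None)           # end-of-data marker

def consumer(buf, out):
    while True:
        row = buf.get()
        if row is None:
            break
        out.append(row)

buf = queue.Queue(maxsize=128)   # bounded buffer between the two "stages"
out = []
t1 = threading.Thread(target=producer, args=(buf, 1000))
t2 = threading.Thread(target=consumer, args=(buf, out))
t1.start(); t2.start()
t1.join(); t2.join()
assert out == list(range(1000))
```

The pay-off is overlap: while the consumer is writing one batch of rows, the producer is already reading the next.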
Some additions from my side, based on my experience in my current project...
- Supply pre-sorted data to an aggregator stage
- Avoid extraction and loading in the same job; if possible, stage the data in a sequential file.
- Avoid using a "union all" in the source query; instead, use a Link Collector to combine data from the two queries.
- Play around with transaction size and array size.
- Choose correctly between the two update actions, 'Insert new and update existing' and 'Update existing and insert new'.
- Avoid using more than 8 look-ups in a single transformer.
- Replace joins in the query with hashed file look-ups
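The pre-sorted-aggregator tip above helps because key-sorted input lets the stage emit each group as soon as the key changes, instead of buffering every group in memory until end of data. A minimal Python sketch of the idea (illustrative only, not DataStage internals):

```python
# Streaming aggregation over key-sorted input: only the current group is held
# in memory, and each total is emitted as soon as the key changes.
# Illustrative sketch only, not DataStage internals.

def aggregate_sorted(rows):
    """Sum the amount per key, assuming (key, amount) rows arrive sorted by key."""
    results = []
    current_key, total = None, 0
    for key, amount in rows:
        if key != current_key:
            if current_key is not None:
                results.append((current_key, total))  # group complete: emit it
            current_key, total = key, 0
        total += amount
    if current_key is not None:
        results.append((current_key, total))          # flush the last group
    return results

rows = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
assert aggregate_sorted(rows) == [("a", 3), ("b", 5), ("c", 7)]
```

With unsorted input, an aggregator must instead keep one running total per distinct key for the whole run, which is where the memory cost comes from.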
SMB