Hi All,
I am preparing some notes on DataStage and want to put together some general guidelines for improving job performance. Here are some that I have noted in my project:
- Array Size in Case of OCI Stages
- Transaction Size in Case of OCI Stages
- Using HFC.exe (the Hashed File Calculator) to size hashed files
- Using in-process and inter-process row buffering wherever applicable
- Using IPC stages
- Using Link Partitioner and Link Collector
- Playing around with the hashed files
- If possible, sort at the database level rather than in the ETL job
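The array-size and transaction-size tips above come down to batching database round-trips: rows are sent in groups, so the per-round-trip overhead is paid once per batch instead of once per row. A minimal Python sketch of the idea (illustrative only, not DataStage code; `send_batch` is a hypothetical stand-in for the OCI stage's bulk write):

```python
# Sketch of why a larger array size helps an OCI stage: rows are buffered and
# flushed in batches, cutting the number of network round-trips.
# Illustrative only, not DataStage internals.

def load_rows(rows, array_size, send_batch):
    """Buffer rows and flush them to the target in batches of array_size.

    Returns the number of round-trips (batches) made.
    """
    batch = []
    round_trips = 0
    for row in rows:
        batch.append(row)
        if len(batch) == array_size:
            send_batch(batch)   # one network round-trip per batch
            round_trips += 1
            batch = []
    if batch:                   # flush the final partial batch
        send_batch(batch)
        round_trips += 1
    return round_trips

rows = list(range(1000))
# array size 1: 1000 round-trips; array size 100: only 10
assert load_rows(rows, 1, lambda b: None) == 1000
assert load_rows(rows, 100, lambda b: None) == 10
```

Transaction size works the same way one level up: it controls how many rows go between commits, trading commit overhead against rollback-segment usage.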
I would be glad if people from the forum can add some additional tips.
Sai
Improving Job Performance
Moderators: chulett, rschirm, roy
In server job
- By doing most of the work in your source query (at the database level, if the source is a database)
- Going to Job Properties > Performance and tuning the row buffers
- Eliminating redundancy
- Combining stages where possible
- Also creating indexes at the database level (if updating a large table)
- etc. etc... this list can go on and on...
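The row-buffering tip above (and the IPC stage in the earlier list) works because a bounded buffer lets the extracting and loading sides of a job run concurrently instead of strictly one row at a time. A minimal Python sketch of the concept (illustrative only, not DataStage internals):

```python
# Row buffering / IPC idea: a bounded buffer between a producer (extract) and
# a consumer (load) lets both run at once; the buffer size plays the role of
# the DataStage row-buffer setting. Illustrative sketch only.
import queue
import threading

def producer(buf, n):
    for i in range(n):
        buf.put(i)          # blocks when the buffer is full (back-pressure)
    buf.put(None)           # end-of-data marker

def consumer(buf, out):
    while True:
        row = buf.get()
        if row is None:
            break
        out.append(row)

buf = queue.Queue(maxsize=128)   # bounded buffer between the two "stages"
out = []
t1 = threading.Thread(target=producer, args=(buf, 1000))
t2 = threading.Thread(target=consumer, args=(buf, out))
t1.start(); t2.start()
t1.join(); t2.join()
assert out == list(range(1000))
```

The pay-off is overlap: while the consumer is writing one batch of rows, the producer is already reading the next.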
Some additions from my side, based on my experience in my current project...
- Supply pre-sorted data to an aggregator stage
- Avoid extraction and loading in the same job; if possible, stage the data in a sequential file.
- Avoid using a "union all" in the source query; instead, use a Link Collector to combine data from the two queries.
- Play around with transaction size and array size.
- Choose correctly between the two update actions, 'Insert new and update existing' and 'Update existing and insert new'.
- Avoid using more than 8 look-ups in a single transformer.
- Replace joins in the query with hashed file look-ups
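The pre-sorted-aggregator tip above helps because key-sorted input lets the stage emit each group as soon as the key changes, instead of buffering every group in memory until end of data. A minimal Python sketch of the idea (illustrative only, not DataStage internals):

```python
# Streaming aggregation over key-sorted input: only the current group is held
# in memory, and each total is emitted as soon as the key changes.
# Illustrative sketch only, not DataStage internals.

def aggregate_sorted(rows):
    """Sum the amount per key, assuming (key, amount) rows arrive sorted by key."""
    results = []
    current_key, total = None, 0
    for key, amount in rows:
        if key != current_key:
            if current_key is not None:
                results.append((current_key, total))  # group complete: emit it
            current_key, total = key, 0
        total += amount
    if current_key is not None:
        results.append((current_key, total))          # flush the last group
    return results

rows = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
assert aggregate_sorted(rows) == [("a", 3), ("b", 5), ("c", 7)]
```

With unsorted input, an aggregator must instead keep one running total per distinct key for the whole run, which is where the memory cost comes from.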
SMB