Improving performance of DB2 insert
Posted: Tue Aug 02, 2011 9:45 pm
Requirement:
We are trying to improve the performance of a simple load job:
Sequential File -> Transformer (adding some metadata columns) -> Db2Connector
The job was taking ~15 minutes to load 4,117,660 rows (the average row size is 244 bytes).
The job was run on 2 nodes, which appears to issue 2 separate INSERT statements.
In production we expect about 10,000,000 rows, so we are concerned about the runtime.
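For scale, here is a quick back-of-the-envelope projection of the production runtime from the figures above (this assumes throughput scales roughly linearly with row count, which is an approximation):

```shell
# Project production runtime from the observed test run.
# Assumption: throughput scales roughly linearly with row count.
awk 'BEGIN {
    test_rows    = 4117660
    test_seconds = 15 * 60            # ~15 minutes observed
    prod_rows    = 10000000

    rps = test_rows / test_seconds
    printf "observed throughput: %.0f rows/s\n", rps
    printf "projected runtime:   %.1f minutes for %d rows\n", prod_rows / rps / 60, prod_rows
}'
```

At the observed rate (~4,575 rows/s), the 10,000,000-row production load would take roughly 36 minutes, so the runtime concern seems justified.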
The environment is:
- DataStage v8.5
- Unix AIX v6.1
- DB2 v9.7
- DataStage and DB2 are running on separate LPARs.
- The DB2 LPAR has 12 CPUs and 49 GB of memory.
- The DataStage LPAR has 4 CPUs and 8 GB of memory.
- The DataStage LPAR has a local DB2 instance with the remote database cataloged.
- A SAN is used for storage: at the moment, only the DB2 LPAR uses high-performance disk.
What's been tried:
- Replacing the DB2Connector with a Copy stage: the job runs in 35 seconds.
- Replacing the SequentialFile with a RowGenerator (4,117,660 rows): the job still ran in 15 minutes.
- Tuning DB2 bufferpools, tablespaces, containers, and heap sizes: the job ran in 14 minutes.
- Increasing the array size and record count made little difference (although a size of 20,000 caused a timeout error with 4 nodes but not with 2 nodes).
- Increasing the number of nodes made little difference.
- The Unix sysadmin reports no bottlenecks in CPU, memory, disk, or network.
- The job uses only 2 of the 4 CPUs, but I don't know why.
- Removing the indexes: the job ran in 9 minutes.
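One further test we are considering, to isolate whether the bottleneck is DataStage or DB2 itself, is timing an equivalent bulk load run directly through the DB2 command line, bypassing DataStage entirely. A rough sketch (the database, table, and file names below are placeholders, not our real objects):

```shell
#!/bin/sh
# Sketch: time a standalone DB2 LOAD of the same data, outside DataStage.
# MYDB, MYSCHEMA.MYTABLE, and input.del are placeholders.
db2 CONNECT TO MYDB

# NONRECOVERABLE skips transaction logging for the load, which is
# common for repeatable staging loads; MESSAGES captures row errors.
time db2 "LOAD FROM input.del OF DEL \
          MESSAGES load.msg \
          INSERT INTO MYSCHEMA.MYTABLE \
          NONRECOVERABLE"

db2 CONNECT RESET
```

If the standalone LOAD is fast, the problem is on the DataStage/connector side; if it is also slow, the table or tablespace configuration is the more likely culprit.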
Current problem:
We are now trying the "Bulk Load" option of the DB2Connector.
However, the job aborts with the error:
Parallel job reports failure (code 262276)
No other errors or warnings appear in the log.
Has anyone come across this error, or can anyone suggest a solution?
Also, feel free to suggest other ways we can improve performance.