How to improve performance
Hi Everyone,
I have a job design as
DRS (DB2 database) ---> Copy stage ---> DRS (SQL Server)
                            |
                            Aggregator ---> ODBC (audit source rows)
The job takes around 15 minutes to complete in DataStage, whereas the same job takes 8 minutes in SSIS. The volume is 2 million records, and the throughput is around 1,000 rows/sec. After monitoring the job I found that 14 minutes of the time is spent in the extraction process alone. I have tried changing the DRS stage to the ODBC Connector / ODBC Enterprise stage, and varying the array size (1 to 5000), but no luck.
Can anyone please suggest how to improve the performance? (I need to reduce the processing time to 8 minutes.)
Thanks In Advance
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
What read method are you using to get the data out of DB2? Use the Resource Estimator and/or Job Monitor to give more targeted metrics. If the import really is the resource hog, this is the area upon which your investigations need to be more closely focussed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
No idea. What is your user-defined SQL doing? Are any constraints, joins, groupings or sorts supported by indexes in the database? Could this be faster using bulk unload and processing in DataStage?
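As a rough illustration of the index question above (table and column names here are hypothetical, since the actual user-defined SQL was not posted):

```sql
-- Hypothetical extract query: joins ORDERS to CUSTOMERS and sorts by date.
SELECT o.ORDER_ID, o.ORDER_DATE, c.CUST_NAME
FROM   ORDERS o
JOIN   CUSTOMERS c ON c.CUST_ID = o.CUST_ID
ORDER BY o.ORDER_DATE;

-- An index on the join column, and one supporting the ORDER BY,
-- can let the optimizer avoid full scans and an explicit sort step:
CREATE INDEX IX_ORDERS_CUST ON ORDERS (CUST_ID);
CREATE INDEX IX_ORDERS_DATE ON ORDERS (ORDER_DATE);

-- Whether they are actually used can be checked from the access plan, e.g.:
--   db2expln -d MYDB -statement "SELECT ..." -terminal
```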
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pradkumar wrote:Hi Ray,
I am using user-defined SQL as the read method to pull data from the DB2 database. In the Job Monitor I found that 11.34 minutes of the total 12 minutes is spent in the source stage. Can I play with the array size, or with Read Committed isolation? What changes do I have to make to improve it? Please suggest.
Thanks
If you are using joins in the source query, specify an alias for each table and qualify every column with its table alias (alias.column_name).
This reduces the time taken to execute the query, as the database will not have to search each table to resolve the columns.
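To illustrate the aliasing suggestion (table names hypothetical):

```sql
-- Unqualified: the parser must work out which table each column belongs to.
SELECT ORDER_ID, CUST_NAME
FROM   ORDERS, CUSTOMERS
WHERE  ORDERS.CUST_ID = CUSTOMERS.CUST_ID;

-- Aliased and fully qualified, as suggested above:
SELECT o.ORDER_ID, c.CUST_NAME
FROM   ORDERS o
JOIN   CUSTOMERS c ON o.CUST_ID = c.CUST_ID;
```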
Thanks
Deepti
Hi
Thanks for your replies.
But my source (DB2 database / ODBC Enterprise stage) has 2 million records flowing to a Copy stage; from there, one link loads directly into the target SQL Server (DRS stage), and a second link goes from the Copy stage to an Aggregator, which uses record count as the aggregation type and loads just one record into the audit table. The same job design takes 8 minutes in SSIS, whereas it takes 15 minutes in DataStage. Is there any way we can improve the performance (for example, by partitioning)?
Thanks in Advance
I don't know if this will help in your case, but we found a bottleneck in how the DB2 database was configured.
The symptoms were similar: our jobs would spend >80% of their time waiting for rows to be returned from the DB2 read.
We reduced this to about 20% by completely restructuring DB2:
- defining Indexes, Logs and Data to be on separate filesystems (ideally different physical disk spindles);
- creating multiple "containers" for the tablespaces on separate filesystems (again, ideally on different disk spindles);
- tuning some of the DB2 parameters (such as the buffer pool size), and the way that Unix manages its memory. [Sorry, but I do not remember all of the parameters.]
So this might be more of a DB2 issue rather than a DataStage issue.
Note that if DB2 is on the same server as DataStage, then there's a bit of disagreement over the optimum Unix settings. They don't really play nicely together.
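As a rough sketch of the multi-container and buffer pool ideas above (DB2 LUW syntax; the filesystem paths, names, and sizes are hypothetical):

```sql
-- Spread a DMS tablespace across containers on separate filesystems
-- (ideally separate physical spindles), so large reads can be striped:
CREATE TABLESPACE TS_DATA
  MANAGED BY DATABASE
  USING (FILE '/fs1/db2/ts_data_c1' 100000,
         FILE '/fs2/db2/ts_data_c2' 100000);

-- Enlarge the buffer pool serving the tablespace (size is in pages):
ALTER BUFFERPOOL IBMDEFAULTBP SIZE 250000;
```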