hi
How can I improve performance while fetching data from Oracle using the ODBC stage? My input has 13 lakh (1.3 million) rows, and I am looking for a speed near 1000 rows/sec. Also, if I attempt to View Data I get a metadata mismatch error followed by a 'data source empty' message, in spite of the table containing data.
Improving performance while fetching data from an Oracle stage
Moderators: chulett, rschirm, roy
- Participant
- Posts: 67
- Joined: Thu Apr 06, 2006 7:13 am

- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
First off, get rid of the metadata mismatch. Import the Oracle table definition and load that (just the columns that you need) into your DataStage job. Even that will improve the speed a little.
For the rest, the problem is what you are doing with the rows once they arrive. Chances are that the problem is not extraction from Oracle. To prove this, create a job that extracts from Oracle and writes to a text file, performing no transformation. Get performance metrics for this job. This is the upper limit; any processing you add to it will reduce throughput.
Add transformation, keeping the text file for output. Performance metrics will now reveal the impact of transformation processing.
Any other slowness is necessarily the load phase. Split this out; use sqlldr (or other bulk loader if the target is not Oracle) to load from the text file into the target table. This will be faster than INSERT statements.
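As a sketch of the bulk-load step: a SQL*Loader control file for a comma-delimited extract file might look like the following (the file, table, and column names here are hypothetical, not from the original post).

```
-- load_target.ctl  (hypothetical file, table, and column names)
LOAD DATA
INFILE 'extract.txt'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(cust_id, cust_name, order_date DATE "YYYY-MM-DD")
```

Invoked as `sqlldr userid=... control=load_target.ctl direct=true`, where `direct=true` requests a direct-path load that bypasses conventional INSERT processing; check with your DBA whether direct path is appropriate for the target table.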
Consider making the job multi-instance, each instance processing a similar proportion of the source data. This will not yield much gain on a single-CPU machine, but may well do so on a multi-CPU machine.
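One common way to split the source evenly across N instances (a sketch only; the table and key column are hypothetical) is to parameterise each instance's extraction SQL with a MOD predicate on a numeric key, using DataStage's `#Parameter#` job-parameter syntax:

```
-- Instance k of N (k = 0 .. N-1); ORDER_ID is a hypothetical numeric key
SELECT order_id, cust_id, amount
FROM   orders
WHERE  MOD(order_id, #NumInstances#) = #InstanceNum#
```

Each instance then extracts a disjoint, similarly sized slice of the table, at the cost of N separate scans unless the key is indexed or the table is partitioned on it.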
If it is still too slow, consult with your DBA about how to improve the load performance (temporarily removing constraints, indexes, and so on). By then you will have exhausted the possibilities within DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
<peeve>It's "hashed" file, please get it right! A hash file is used for filing hash.</peeve>
"Hashed" refers to the fact that keys are located using a hashing algorithm (rather than a table scan or primary key index).
"Hashed" refers to the fact that keys are located using a hashing algorithm (rather than a table scan or primary key index).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.