Speed up ETL Process in Datastage job
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 48
- Joined: Tue Jan 26, 2010 4:05 am
- Location: Indonesia
Speed up ETL Process in Datastage job
Hi, is there any tips to speed up an extract and loading process in datastage using Oracle database as source /target? Is the load method at oracle Enterprise stage equals to SQLLoader? And is append method in Oracle Enterprise stage equals to append DML in Oracle?
for your information, my table is partitioned and have indexes. Will it slow down the loading data process?
Thanks in advance,
Caterin
for your information, my table is partitioned and have indexes. Will it slow down the loading data process?
Thanks in advance,
Caterin
Re: Speed up ETL Process in Datastage job
Try disabling the indexes,constraints etc and load them and enable them back provoided you checked the duplicates beforecaterin_huang wrote:Hi, is there any tips to speed up an extract and loading process in datastage using Oracle database as source /target? Is the load method at oracle Enterprise stage equals to SQLLoader? And is append method in Oracle Enterprise stage equals to append DML in Oracle?
for your information, my table is partitioned and have indexes. Will it slow down the loading data process?
Thanks in advance,
Caterin
RK
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The local manager of VMARK (first vendor of DataStage) here was fond of overclocking the CPU, much to the annoyance of the IT people who had to replace his fried CPU from time to time, but it did go faster!
Systematically investigate your design to determine whether you're doing anything you don't need to do, both within DataStage and on its boundaries to data.
Systematically investigate your design to determine whether you're doing anything you don't need to do, both within DataStage and on its boundaries to data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 48
- Joined: Tue Jan 26, 2010 4:05 am
- Location: Indonesia
Re: Speed up ETL Process in Datastage job
Hi, thanks for answering..g_rkrish wrote:Try disabling the indexes,constraints etc and load them and enable them back provoided you checked the duplicates beforecaterin_huang wrote:Hi, is there any tips to speed up an extract and loading process in datastage using Oracle database as source /target? Is the load method at oracle Enterprise stage equals to SQLLoader? And is append method in Oracle Enterprise stage equals to append DML in Oracle?
for your information, my table is partitioned and have indexes. Will it slow down the loading data process?
Thanks in advance,
Caterin
![Smile :)](./images/smilies/icon_smile.gif)
-
- Participant
- Posts: 48
- Joined: Tue Jan 26, 2010 4:05 am
- Location: Indonesia
If your DataStage partitioning is identical to your database partitioning then, when loading to the database, each process only needs to load to one partition and is therefore much faster and more efficient than otherwise. This is not necessarily DataStage but general database and, in this case, Oracle specific. You stated that your table is partitioned and indexed, so I assumed you meant that the database is partitioned and if you are not sure you can ask your DBA for confirmation and details.
Exactly. Your questions are pretty common when starting off and are all related to loading Oracle data regardless of the tool being used. Talk to your DBA (or co-workers familiar with the subject) and see what they suggest. Then people here can help with the DataStage specific implementation details.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Re: Speed up ETL Process in Datastage job
Have a shellscript in which call the sql for drop those constrints and indexes after the job finishes try rebuliding them using a shell script.Use before after subroutine if you use at job level,use rotuine activity if you use them at the sequencer level.
RK
Another possibility to improve the load performace, but not using the DataStage, which we used on database other than Oracle.
Have your Load files exported per partition.
As others recommended drop the constraints & indexes on the table, this can happen at the same time when your load files are being written.
In the next script, load your files using bulkloader utility in Oracle (Please check with your DBA).
Rebuild indexes & constraints.
Benefit here is, you are not engaging DS Engine for such loads.
As you would have load files ready with you, it is going to be easy for the maintenance, in case any data problem arises.
Let us know, if it helps ...
Have your Load files exported per partition.
As others recommended drop the constraints & indexes on the table, this can happen at the same time when your load files are being written.
In the next script, load your files using bulkloader utility in Oracle (Please check with your DBA).
Rebuild indexes & constraints.
Benefit here is, you are not engaging DS Engine for such loads.
As you would have load files ready with you, it is going to be easy for the maintenance, in case any data problem arises.
Let us know, if it helps ...
Thanks with regards,
videsh.
videsh.
Be careful with all of the "drop all indexes and constraints" suggestions. Not always as easy as it sounds (RI) and can be problematic if any other process is accessing your table while it is being loaded. Never mind the fact that it is not guaranteed to be any kind of faster overall, you'd need to test to confirm or deny that.
Last edited by chulett on Wed Jun 02, 2010 5:57 am, edited 1 time in total.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 48
- Joined: Tue Jan 26, 2010 4:05 am
- Location: Indonesia
Yes, the database is partitioned. hmm, if I export the files per partition, how can we make datastage to read all the files? Should I make another job for one partition text file?videsh77 wrote:Another possibility to improve the load performace, but not using the DataStage, which we used on database other than Oracle.
Have your Load files exported per partition.
As others recommended drop the constraints & indexes on the table, this can happen at the same time when your load files are being written.
In the next script, load your files using bulkloader utility in Oracle (Please check with your DBA).
Rebuild indexes & constraints.
Benefit here is, you are not engaging DS Engine for such loads.
As you would have load files ready with you, it is going to be easy for the maintenance, in case any data problem arises.
Let us know, if it helps ...
btw, can we user oracle parallel or hint insert in datastage?
Thanks,
Caterin
Last edited by caterin_huang on Wed Jun 02, 2010 1:36 am, edited 1 time in total.
Nothing special. As ArndW noted before, partition your input stream identical to the database partitioning. Let datastage do the rest.
If this is a one time deal so as to load historical data, I have involved the dba who drops all the indexes for me and turns off logging. That helps a lot. But if this is an ongoing thing then you need to nail your design.
If this is a one time deal so as to load historical data, I have involved the dba who drops all the indexes for me and turns off logging. That helps a lot. But if this is an ongoing thing then you need to nail your design.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.