I have a simple job that reads a sequential file, passes it through a Transformer, and then deletes those records from a table. The sequential file has about 250 million rows. I'm experiencing very low performance reading the file (71 rows/sec, according to the stats in Designer).
I've verified that the AIX box isn't anywhere near capacity. There are only a couple of MicroStrategy reports being run at the same time.
Is there a more efficient way to delete such a large amount of data using DataStage?
Poor performance using sequential file stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 15
- Joined: Fri Mar 09, 2007 8:22 am
-
- Premium Member
- Posts: 425
- Joined: Sat Nov 19, 2005 9:26 am
- Location: New York City
- Contact:
I guess that the bottleneck is not in the sequential file but in the Oracle stage. You can test this by replacing your Oracle stage with another Sequential File stage.
What's your commit strategy?
A delete operation is database intensive and time consuming, mainly because it forces the database to generate and write to disk significant amounts of redo and undo data, so you might need to adjust your database's logging behavior.
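To make the commit-strategy point concrete, here is a minimal sketch of batched deletes with periodic commits. It uses Python's built-in sqlite3 purely as a stand-in for the real Oracle table; the table name, key column, and batch size are illustrative assumptions. In the actual job you would control this through the Oracle stage's transaction/array size settings rather than hand-written code.

```python
import sqlite3

def batched_delete(conn, keys, batch_size=10000):
    """Delete rows matching `keys`, committing every `batch_size` rows
    instead of once per row or once at the very end. This bounds the
    undo/redo the database must hold per transaction."""
    cur = conn.cursor()
    pending = 0
    for key in keys:
        cur.execute("DELETE FROM target WHERE id = ?", (key,))
        pending += 1
        if pending >= batch_size:
            conn.commit()   # release the accumulated transaction work
            pending = 0
    conn.commit()           # flush the final partial batch

# Demo with an in-memory database standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO target VALUES (?)", [(i,) for i in range(100)])
conn.commit()

batched_delete(conn, range(0, 100, 2), batch_size=25)  # delete the even ids
remaining = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(remaining)  # 50
```

Committing too often adds overhead; committing too rarely inflates undo. The batch size is the tuning knob between those extremes.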
Julio Rodriguez
ETL Developer by choice
"Sure we have lots of reasons for being rude - But no excuses
-
- Participant
- Posts: 15
- Joined: Fri Mar 09, 2007 8:22 am
I tested the bottleneck by replacing the database stage with another sequential file. It read roughly 2,500 rows per second. I'm working with the DBA to look at the stats on this table, as it has been inserted into and deleted from numerous times.
I was reading the file sequentially. I'll have to change the file to a fixed-length format before I'm able to test the multiple readers per node option.
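The reason fixed-length records matter for parallel reading can be shown with a small sketch: once every record is exactly N bytes, any reader can seek straight to record n at byte offset n*N without scanning for newlines. The record length and padding below are assumptions for illustration, not DataStage settings.

```python
def to_fixed_length(lines, record_len):
    """Pad each variable-length record to exactly `record_len` bytes so
    readers can address record n at byte offset n * record_len."""
    out = []
    for line in lines:
        if len(line) > record_len:
            raise ValueError(f"record longer than {record_len}: {line!r}")
        out.append(line.ljust(record_len))  # space-pad to fixed width
    return "".join(out)                     # no delimiters: pure fixed blocks

rows = ["1001,ACME", "7,XY", "123456,Widgets Inc"]
data = to_fixed_length(rows, record_len=20)

# Any reader can now jump directly to the second record:
second = data[20:40].rstrip()
print(second)  # 7,XY
```

With variable-length (newline-delimited) records there is no way to compute an offset, which is why a single process has to read the file front to back.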
Are you trying to delete 250 million rows from the table? How many rows are in the table prior to starting?
Would sure hope this is a one-off job!
You may be better off identifying the records you want to keep, moving them to a temporary table, truncating the existing table, and moving the retained records back.
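A minimal sketch of the keep-and-swap idea, again using sqlite3 as a stand-in for Oracle (table and column names are made up for the demo). On Oracle you would typically do this with CTAS plus TRUNCATE or a DROP/RENAME, and you'd have to recreate indexes, constraints, and grants on the new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER, flag TEXT)")
# 1,000 rows, of which only 100 should survive.
conn.executemany("INSERT INTO big VALUES (?, ?)",
                 [(i, "keep" if i % 10 == 0 else "drop") for i in range(1000)])
conn.commit()

# Instead of deleting 900 rows one by one, copy the 100 keepers out,
# drop the original table, and rename the copy into place.
conn.executescript("""
    CREATE TABLE big_keep AS SELECT * FROM big WHERE flag = 'keep';
    DROP TABLE big;
    ALTER TABLE big_keep RENAME TO big;
""")

kept = conn.execute("SELECT COUNT(*) FROM big").fetchone()[0]
print(kept)  # 100
```

When most of the table is being deleted, writing the small surviving set once is far cheaper than logging hundreds of millions of individual deletes.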