Very slow updates vs. inserts
Posted: Tue Mar 29, 2005 2:57 pm
I am seeing VERY slow times when using an update vs. an insert (3 rows/second vs. 3000 rows/second). Is this normal for DataStage to exhibit this drastic of a difference?
I am relatively new to DataStage but have successfully created an ETL sequence that works very well using a type 2 (Kimball) model where each change in pertinent data creates a new record via inserts. No updates are done to the existing rows.
However, I needed to add a new element to the target and that required a way to initialize data in that new element outside of the normal ETL sequence. I did this by creating a new job that builds a hash of the keys and initial values, then uses an MSOLEDB stage to export the data to a sequential file. Another job then runs through this sequential file and feeds the output to an MSOLEDB update stage to update the new element in each row. When done, it regenerates the CRC hash file so that I don't get a huge set of inserts on the next run of the ETL. I am using an "Update existing records only" Input Action on the update stage.
What I found was it all worked but the update stage was incredibly slow. Probably there are some settings that will help me tune this?
Thanks for any ideas or suggestions.
I am relatively new to DataStage but have successfully created an ETL sequence that works very well using a type 2 (Kimball) model where each change in pertinent data creates a new record via inserts. No updates are done to the existing rows.
However, I needed to add a new element to the target and that required a way to initialize data in that new element outside of the normal ETL sequence. I did this by creating a new job that builds a hash of the keys and initial values, then uses an MSOLEDB stage to export the data to a sequential file. Another job then runs through this sequential file and feeds the output to an MSOLEDB update stage to update the new element in each row. When done, it regenerates the CRC hash file so that I don't get a huge set of inserts on the next run of the ETL. I am using an "Update existing records only" Input Action on the update stage.
What I found was it all worked but the update stage was incredibly slow. Probably there are some settings that will help me tune this?
Thanks for any ideas or suggestions.