Hi,
I have a job like
Dataset ---> Copy Stage --> Db2Enterprise stage and my
Write Method is : Upsert.
Upsert mode is : Userdefined Update.
Configuration File: 4 Nodes.
Now, I am getting 1 million records from input and its taking 3 hours to update the records.
Is there any way to make better performance.
thanks in advance
How to get performance while using UserdefinedUpdate stmt
Moderators: chulett, rschirm, roy
Madhu, all timings are relative, so nobody here can tell you what to do or even if these values are good or bad.
What is your effective speed if you do only inserts? Is your database partitioned? Do you have multiple constraints?
What is your effective speed if you do only inserts? Is your database partitioned? Do you have multiple constraints?
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Re: How to get performance while using UserdefinedUpdate stm
Since it taking more time, you can try the follwoing option.
do the upsert process in datastage and do all records insert into the table.
do the upsert process in datastage and do all records insert into the table.
Another possibility is to take advantage of DB2's inherent parallel processing capabilities - write all of your data to a work table, then use normal DB2 sql to handle your inserts and updates (or use DB2 V8's merge syntax to do an upsert). That will allow you to take advantage of indexes that DataStage can't use.
Here's what we often do:
Hope this helps,
Brad.
Here's what we often do:
- 1) In your DataStage job, left outer join your new data (left) to the existing table's keys (right). The table read pulls keys only and an indicator field, no data. Example:
select account_nr, cast('UPDATE' as char(6) as ld_action
from schema.tablename
2) Use the indicator field to show whether a reocrd is an 'INSERT' or 'UPDATE'. Because you used a left outer join, the ld_action will be NULL if the record is new, and 'UPDATE' if not. You can use a modify stage to change NULL to 'INSERT'. Now your output record layout is input_data + ld_action.
3) Write all records to a work table. Add an index to your keys, and to the ld_action field. You could consider using the 'load' mode instead of 'write' to avoid logging.
4) Use SQL to insert where ld_action = 'INSERT' (or NULL if you didn't modify it before), and update where ld_action = 'UPDATE'. Make sure your work table and target table are indexed the same, and that you do a runstats on your work table after it is loaded and indexed.
Hope this helps,
Brad.