Query reg. Update records
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India
Query reg. Update records
Hi,
Plz answer my queries (though they are sillY :D )
1. DB2 Connector is my target stage, so whats the difference between "Insert then Update" and "Update then Insert" mode?
performance wise, logic wise etc.
2. One of my friend told me to keep "Record Count" = 1(i.e. commit after every record) while updating in the target?
Plz answer my queries (though they are sillY :D )
1. DB2 Connector is my target stage, so whats the difference between "Insert then Update" and "Update then Insert" mode?
performance wise, logic wise etc.
2. One of my friend told me to keep "Record Count" = 1(i.e. commit after every record) while updating in the target?
Thanx and Regards,
ETL User
ETL User
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Insert then update tries to insert then, if that fails because the row exist, issues an update. Update then insert tries to update then, if that fails because the row does not exist, issues an insert. Choose the one that best meets your requirements - if there will be more inserts than updates prefer the former; if there will be more updates than inserts choose the latter. Making a wrong choice will carry a performance penalty.
Record Count governs commit behaviour. Ask your friend why that particular suggestion was made. It's not always going to be the best choice. Perhaps your friend has only a little knowledge. Does your friend also recommend always running parallel jobs in sequential mode?
Record Count governs commit behaviour. Ask your friend why that particular suggestion was made. It's not always going to be the best choice. Perhaps your friend has only a little knowledge. Does your friend also recommend always running parallel jobs in sequential mode?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India
@Ray,
Yes I am having more updates than my inserts. Thats why I chose Update then Insert mode. But the issue is in both modes I get different no. of records in my target table. Performance is not an issue because in either mode my job finishes in a minute.
That's why my friend suggested to limit Record Count to 1. This will ensure that no inserts will be skipped.
Right now I have kept both Record Count and Array Size = 2000.
Please suggest that why some of my inserts are getting skiiped.
Yes I am having more updates than my inserts. Thats why I chose Update then Insert mode. But the issue is in both modes I get different no. of records in my target table. Performance is not an issue because in either mode my job finishes in a minute.
That's why my friend suggested to limit Record Count to 1. This will ensure that no inserts will be skipped.
Right now I have kept both Record Count and Array Size = 2000.
Please suggest that why some of my inserts are getting skiiped.
Thanx and Regards,
ETL User
ETL User
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India
@Pandeesh
I am just checking the count in both source and target table.
Like count(distinct primary key) in Src should be equal to count(*) in Tgt.
Also in my job I am upserting my Tgt table with the latest record i.e. I can have multiple records of my PK from thr Src(no constraints in Src table) but the latest record of the PK(based upon a column seq_no) should go in the Tgt. To achieve this I am using a Sort stage and then a Remove Duplicate Stage.
I am just checking the count in both source and target table.
Like count(distinct primary key) in Src should be equal to count(*) in Tgt.
Also in my job I am upserting my Tgt table with the latest record i.e. I can have multiple records of my PK from thr Src(no constraints in Src table) but the latest record of the PK(based upon a column seq_no) should go in the Tgt. To achieve this I am using a Sort stage and then a Remove Duplicate Stage.
Thanx and Regards,
ETL User
ETL User
Not true. The outcome of each is entirely dependant on the unique constraint that you may (or may not) on the record. Understand that the first action must fail for the second action to fire and the way inserts fail is if there is a unique constraint being violated.pandeesh wrote:I am curious why you are getting different counts update then insert mode and Insert then update mode.
Both cases the count should be the same
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India