
Warning, Insert had a high number of retries

Posted: Fri Aug 16, 2013 2:00 pm
by asorrell
I'm trying to help another developer whose job is logging an informational message I've never seen before. It's a straight "Source -> Transformer -> Target" kind of job with Oracle as the target.

The warning is "Warning, insert has a high number of retries: 1,999."

This message appears several times in the Director log. Performance started out around 5K RPS, but is now down to 280 RPS with over a million rows output.

Our DBA is currently unavailable to tell me if Oracle logs can shed any light on this, but I thought I'd toss it out to see if any of you have seen this before.

I do know that the Oracle system is very stressed right now, with lots of things hitting it. I've just never seen a case where it had to re-process an insert, much less 2,000+ times.

Posted: Wed Aug 21, 2013 4:01 am
by priyadarshikunal
It seems you are using the Oracle Connector. Can you check the number of retries defined in the connector stage? The retry count is shown when you set the properties for the link in the Oracle Connector.

Posted: Wed Aug 21, 2013 7:44 am
by asorrell
I'm not concerned with hitting the limit - I'm trying to determine what is causing the retries in the first place.

Posted: Wed Aug 21, 2013 8:31 am
by ArndW
My initial guess is that the inserts are hitting locked rows (or perhaps pages) on the table, caused either by another query or by the parallelism of the load itself.

Are you running this in parallel? What isolation level did you specify in DataStage (probably "read uncommitted"), and do you know of another process that might be locking your output table?
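
If you want to check for lock waits yourself before the DBA is back, Oracle's v$session view shows blocked sessions directly (you need SELECT privilege on the view). A minimal sketch in Python with the cx_Oracle driver; the credentials and DSN are placeholders:

    import cx_Oracle

    # Placeholder credentials and DSN -- substitute real values.
    conn = cx_Oracle.connect("scott", "tiger", "orcl")
    cur = conn.cursor()

    # Sessions currently blocked by another session; blocking_session is
    # populated whenever a session is waiting on a lock held elsewhere.
    cur.execute("""
        SELECT sid, serial#, blocking_session, event, seconds_in_wait
        FROM   v$session
        WHERE  blocking_session IS NOT NULL""")
    for sid, serial, blocker, event, waited in cur:
        print("session %s,%s blocked by %s on '%s' for %ss"
              % (sid, serial, blocker, event, waited))

    cur.close()
    conn.close()

The blocking_session column identifies the session holding the lock, which the DBA can then trace back to a specific job or query.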

Commit limit

Posted: Wed Aug 21, 2013 8:45 am
by sureshreddy2009
This kind of Oracle message generally depends on whether you are doing an API or a direct-path load. Which kind of load are you doing?
If you are not committing the records periodically, the transaction stays open indefinitely, which can cause this kind of issue.
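
For illustration, the usual pattern is to commit in fixed-size batches instead of holding one transaction open for the whole load. A rough sketch in Python with cx_Oracle; the table name, columns, batch size, and row source are all made up:

    import cx_Oracle

    BATCH = 2000  # hypothetical commit interval

    def source_rows():
        # Hypothetical stand-in for the job's source link.
        for i in range(100000):
            yield (i, "row %d" % i)

    conn = cx_Oracle.connect("scott", "tiger", "orcl")  # placeholders
    cur = conn.cursor()
    insert = "INSERT INTO target_tab (id, name) VALUES (:1, :2)"

    buf = []
    for row in source_rows():
        buf.append(row)
        if len(buf) >= BATCH:
            cur.executemany(insert, buf)  # array insert
            conn.commit()                 # release locks and undo regularly
            buf = []

    if buf:                               # flush the final partial batch
        cur.executemany(insert, buf)
        conn.commit()

    cur.close()
    conn.close()

DataStage target stages expose a comparable commit-interval setting, though the property name varies by stage type.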

Posted: Wed Aug 21, 2013 9:21 am
by asorrell
Unfortunately, there's no oversight of the group writing this code, so no best practices are being followed. I can safely assume their stages all use the default settings for ODBC stages, mainly because very few developers have done anything other than "take the defaults". They are not using Oracle Connectors (which are available) because the first template anyone got to work used an ODBC Enterprise stage.

These jobs are all parallel jobs, but they are all "transfer from table A to table B with one transformer in the middle" kind of jobs, with "Auto" partitioning. There are hundreds of jobs in dozens of sequencers active at any given time, so it is quite possible that they are "stepping" on each other by accessing the same table at the same time.

Thanks for the suggestions - I'll see if the DBAs can give me more detail from the logs.
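
In the meantime, a rough way to check whether the jobs really are colliding is to aggregate Oracle's non-idle wait events; a pile-up on 'enq: TX - row lock contention' would point at row-lock collisions between jobs. Another sketch in Python with cx_Oracle, credentials again placeholders:

    import cx_Oracle

    conn = cx_Oracle.connect("scott", "tiger", "orcl")  # placeholders
    cur = conn.cursor()

    # Count sessions per non-idle wait event; many sessions on
    # 'enq: TX - row lock contention' means jobs are blocking each other.
    cur.execute("""
        SELECT event, COUNT(*)
        FROM   v$session
        WHERE  wait_class <> 'Idle'
        GROUP  BY event
        ORDER  BY COUNT(*) DESC""")
    for event, sessions in cur:
        print("%5d  %s" % (sessions, event))

    cur.close()
    conn.close()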

Posted: Wed Aug 21, 2013 10:25 am
by chulett
As George would say... "OH MY".