
slow execution of simple job using Oracle connector

Posted: Fri Nov 15, 2013 5:16 pm
by iq_etl
I have a very simple job which reads a dataset of changes with eight columns plus a change code, and uses a PL/SQL CASE statement to apply the changes through an Oracle Connector. To process UPDATEs to about 230,000 records, it's taking about 3 1/2 hours. By contrast, a very similar job that actually has more columns processes about 50,000 records in 16 seconds.

I asked our DBA to take a look at what it was doing, and she said that for most of the time the database is just waiting. The job finishes successfully with no warnings - I just can't figure out why it's so slow!

I've tried changing it to run in sequential mode, but that had no noticeable impact. (By default it is running in parallel mode on two nodes.)

Any suggestions or debugging techniques would be much appreciated. I'd also be happy to provide any other information that might be useful.

Posted: Fri Nov 15, 2013 5:20 pm
by chulett
Can we start with an explanation of what "uses a PL/SQL CASE statement to apply the changes" means? Thanks. That, and what settings you are using in the Connector.

Posted: Mon Nov 18, 2013 12:12 pm
by iq_etl
Absolutely. In the 'Properties' tab of the Oracle Connector stage, the write mode is set to PL/SQL and the following PL/SQL block is executed:

Code:

declare 

change_code nchar(1);
ccyys number(5);
snapshot nvarchar2(3);
sri number(9);
class_unique nvarchar2(5);
pe_occurrence nchar(3);
dept_abbr nvarchar2(3);
course_number nvarchar2(6);

begin

CASE :change_code   -- '1' = insert, '2' = delete, '3' = update
  when '1' then
    INSERT into UTDBA.NR_STG_STUDENT_CLASSES 
      (CCYYS, SNAPSHOT, SRI, CLASS_UNIQUE, PE_OCCURRENCE, DEPT_ABBR, COURSE_NUMBER, UIN)
    VALUES
      (:CCYYS, :SNAPSHOT, :SRI, :CLASS_UNIQUE, :PE_OCCURRENCE, :DEPT_ABBR, :COURSE_NUMBER, :UIN);
  when '2' then
    DELETE from UTDBA.NR_STG_STUDENT_CLASSES
      where CCYYS = :CCYYS
        and SNAPSHOT = :SNAPSHOT
        and SRI = :SRI
        and PE_OCCURRENCE = :PE_OCCURRENCE;
  when '3' then
    UPDATE UTDBA.NR_STG_STUDENT_CLASSES
       SET CLASS_UNIQUE = :CLASS_UNIQUE,
           DEPT_ABBR = :DEPT_ABBR,
           COURSE_NUMBER = :COURSE_NUMBER,
           UIN = :UIN
     WHERE CCYYS = :CCYYS
       AND SNAPSHOT = :SNAPSHOT
       AND SRI = :SRI
       AND PE_OCCURRENCE = :PE_OCCURRENCE;
end case;

end;

For the run times I mentioned, all rows in the file have a change code of 3, so it is only doing UPDATEs.

The table action is set to Append, and pretty much all of the other settings are set to the default.

Posted: Mon Nov 18, 2013 12:36 pm
by chulett
Gotcha... an anonymous block. Let's start with the obvious first question: are the fields in your 'where' clause indexed in such a way that supports the update (and the delete)?

Posted: Mon Nov 18, 2013 1:19 pm
by iq_etl
I think so. The unique key is defined as the combination of CCYYS, SNAPSHOT, SRI, CLASS_UNIQUE, and PE_OCCURRENCE. As it happens, CLASS_UNIQUE is not really needed - the combination of the other four columns is still unique. I've tried it with and without 'AND CLASS_UNIQUE = :CLASS_UNIQUE' included in the query, and it does not appear to make a difference in performance either way.
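
For reference, the index definition can be confirmed from the data dictionary (a sketch; the owner and table names are taken from the block above, and the DBA may need to run it depending on view privileges):

Code:

select index_name, column_name, column_position
  from all_ind_columns
 where table_owner = 'UTDBA'
   and table_name = 'NR_STG_STUDENT_CLASSES'
 order by index_name, column_position;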

Posted: Mon Nov 18, 2013 2:00 pm
by chulett
Have your DBA trace the session and see if the index is being used for the updates. If you don't use one of the fields in the composite index (unless it is on the 'trailing' edge), that would preclude index use, but using all five of them should fare much better.

I'm assuming your issue is that the index isn't being used and you are doing a full table scan for each update. Your DBA should be able to confirm/deny.
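
If it's easier than a full trace, an EXPLAIN PLAN on the same statement will show whether the index is being picked up (a sketch; with bind variables the optimizer has to assume datatypes, so treat the result as indicative rather than definitive):

Code:

explain plan for
update UTDBA.NR_STG_STUDENT_CLASSES
   set CLASS_UNIQUE = :CLASS_UNIQUE,
       DEPT_ABBR = :DEPT_ABBR,
       COURSE_NUMBER = :COURSE_NUMBER,
       UIN = :UIN
 where CCYYS = :CCYYS
   and SNAPSHOT = :SNAPSHOT
   and SRI = :SRI
   and PE_OCCURRENCE = :PE_OCCURRENCE;

select * from table(dbms_xplan.display);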

Posted: Mon Nov 18, 2013 6:36 pm
by iq_etl
Thanks. I've asked our DBA to look at this again, and will post the results.

Posted: Wed Aug 10, 2016 3:57 pm
by koti9
I know it's an old post, but I thought I'd share a solution to the issue. I was facing the same problem with a slow-running job doing inserts/updates/deletes. The obvious choice was PL/SQL to do all three in one shot, but instead of the CASE statement shown above I tried a MERGE for the update/delete/insert, and the MERGE was also far too slow. Then I split the updates/inserts and the deletes into separate streams, and the job finished much faster: process the updates/inserts with 'Update then insert', and the deletes with a delete-only link.
We just need to make sure the inserts, updates, and deletes aren't touching the same rows, though.
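
Roughly, the split looks like this (a sketch of the statements the two separate target links end up issuing; the table and key columns are borrowed from the earlier posts, and the exact SQL the connector generates depends on its settings):

Code:

-- Link 1: updates/inserts (e.g. an "Update then insert" style write mode)
update UTDBA.NR_STG_STUDENT_CLASSES
   set CLASS_UNIQUE = :CLASS_UNIQUE,
       DEPT_ABBR = :DEPT_ABBR,
       COURSE_NUMBER = :COURSE_NUMBER,
       UIN = :UIN
 where CCYYS = :CCYYS
   and SNAPSHOT = :SNAPSHOT
   and SRI = :SRI
   and PE_OCCURRENCE = :PE_OCCURRENCE;

insert into UTDBA.NR_STG_STUDENT_CLASSES
       (CCYYS, SNAPSHOT, SRI, CLASS_UNIQUE, PE_OCCURRENCE, DEPT_ABBR, COURSE_NUMBER, UIN)
values (:CCYYS, :SNAPSHOT, :SRI, :CLASS_UNIQUE, :PE_OCCURRENCE, :DEPT_ABBR, :COURSE_NUMBER, :UIN);

-- Link 2: deletes only
delete from UTDBA.NR_STG_STUDENT_CLASSES
 where CCYYS = :CCYYS
   and SNAPSHOT = :SNAPSHOT
   and SRI = :SRI
   and PE_OCCURRENCE = :PE_OCCURRENCE;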

Thanks & Regards
Koti