Double Resource Consumption - Insert/Update Job

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

arunverma
Participant
Posts: 90
Joined: Tue Apr 20, 2004 8:20 am
Location: MUMBAI

Double Resource Consumption - Insert/Update Job

Post by arunverma »

We have developed a Billing Delta application, which extracts incremental data from the source system based on last_modified_date and loads it into the DSS server. In the data loading job we have chosen the update action "Update Existing Rows or Insert New Rows", and two SQL statements are generated on the SQL tab - one for the insert and one for the update (a rough sketch of their shape is below). Our requirement is: if the data already exists, update it; otherwise insert it. But our DBA and system administrator analysed the server log and the Oracle database log during a data load and found that if we get 100 records from the source system, the job executes the insert for 100 records and the update for 100 records as well.

So for each load it consumes double the resources and executes twice.

Please help me - because of this my server is running very slow.
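For reference, the two generated statements look roughly like this - the table and column names here are placeholders, not our real schema:

    UPDATE billing_table
    SET    amount = ?, last_modified_date = ?
    WHERE  account_id = ?;

    INSERT INTO billing_table (account_id, amount, last_modified_date)
    VALUES (?, ?, ?);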

Thanks and Regards

Arun Verma.
Arun Verma
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Double Resource Consumption - Insert/Update Job

Post by ogmios »

Are you using insert/update or update/insert?

Check whether there's a primary key or a unique constraint on the table you're loading to.

Check whether the key fields are set correctly in the stage writing to the database.

Ogmios
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Double Resource Consumption - Insert/Update Job

Post by ogmios »

Assuming you use update/insert, this is what probably happens (for 1 row):

DataStage tries to update the row.
This fails because all of your rows are new rows.
It then inserts the row.

So it executes "twice". If you have insert/update it will happen the other way around: the insert will be tried and if that fails the update will be tried.
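In pseudo-SQL the per-row logic is effectively this (table and column names invented):

    UPDATE target_table
    SET    some_col = ?
    WHERE  key_col = ?;

    -- only if the update reported zero rows affected:
    INSERT INTO target_table (key_col, some_col)
    VALUES (?, ?);

Either way, every row pays for at least one statement, and new rows pay for two.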

Ogmios
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa

Post by denzilsyb »

Are you using ODBC to update/insert? In my experience I avoid ODBC update/insert precisely because of this slow processing.

Is it possible for you to extract the existing data from Oracle and then:

1. In a transform, determine which records are new (to insert) and which are for update, by comparing what is in the DSS db to what you are pulling from the source (see the sketch after this list).
2. Have the two outputs write to different targets - one doing the insert (in the current job) and the other doing the update, possibly in the next job.
3. You could even use a bulk load stage for the insert and an ODBC stage for the update.
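A rough idea of the existence check, expressed as SQL (all names invented):

    SELECT s.key_col,
           CASE WHEN t.key_col IS NULL THEN 'INSERT' ELSE 'UPDATE' END AS action
    FROM   source_extract s
    LEFT OUTER JOIN dss_target t ON t.key_col = s.key_col;

In a Server job you would do the same split with a lookup in the transform rather than a database join.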

Perhaps the Oracle users can confirm what I am suggesting.

Then I might suggest Sybase IQ for your DSS db :twisted:
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You need to re-think the design.

Have one link doing insert only, and another link doing update only.
Make the decision by reference lookup against the target table or - ideally - a copy of the relevant rows and columns from it in a hashed file.
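The hashed file only needs enough to answer "does this key exist?". For example (table, column and link names are placeholders):

    SELECT key_col
    FROM   target_table;

Load that into the hashed file keyed on key_col, then constrain the insert link with RefLink.NOTFOUND and the update link with NOT(RefLink.NOTFOUND).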

You will be amazed at the difference this makes!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
arunverma
Participant
Posts: 90
Joined: Tue Apr 20, 2004 8:20 am
Location: MUMBAI

Post by arunverma »

Hi All,

Both our source and destination are Oracle databases. Each table has a primary key, and at the time of data loading we mark that field as the key.

As suggested by Mr. Ray and denzilsyb, we need to redesign our job: create a hashed file for each table, first look up the source data against this hashed file, and then filter it into two streams - one for insert and one for update. This will solve my problem.

But my question is: why can DataStage not identify both kinds of record itself, when the option "Update Existing Rows or Insert New Rows" is available? :roll:


Thanks

Arun
Arun Verma
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The only way that DataStage (or you, or anything else for that matter) could do it is to make reference to the target Oracle table, to determine whether the key exists already.

If it does, do an update. If it doesn't, do an insert.

Given that there is a key, this is all that you need to load into the hashed file to make that existence decision. So your hashed file is like a key-only table, and extremely fast to access.

The "update existing row or insert new row" method performs its "reference" against the target table by attempting an update, and checking on whether the update failed because the row does not exist. It's a much more laborious process than a key-only lookup to a hashed file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.