Conditional INSERT/UPDATE

ririr · Post by **ririr** » Mon May 24, 2004 4:52 pm

Guys,

I have the following scenario :

Target Table (TABLE A) has the following 4 columns, the KEY column and few other columns

Column_Name TYPE
CREATED_DATE DATE
CREATED_BY CHAR
UPDATED_DATE DATE
UPDATED_BY CHAR

Now, for above columns I am trying to comeup with an CONDITIONAL INSERT/UPDATE logic. ONLY UPDATE THE UPDATED_DATE, UPDATED_BY Columns for the data existing in the target table and ONLY POPULATE THE CREATED_DATE AND CREATED_BY columns if it is a NEW DATA ROW.

ANy help is appreciated.

Thanks

chulett · Post by **chulett** » Mon May 24, 2004 5:11 pm

This is a pretty straight-forward thing to accomplish.

First, pull existing key column(s) from the target table into a Hash file. Then, when preparing data for the target table, check the hash file for the current key and split off two links from the transformer based on the result of the lookup. One for updates (key exists) to handle the UPDATED columns and another for inserts (key does not exist) to handle the CREATED columns. Run both links into the same Stage/Table with the appropriate Update Action set on each.

vmcburney · Post by **vmcburney** » Mon May 24, 2004 9:14 pm

Craig's approach is probably the best way to tackle it, you can also do this without a hash file lookup. If you look at your database stage you'll see it is set to insert/update or update/insert, there will be a generated SQL tab which shows you the insert and update statements. You can set this up to insert into CREATED_BY and CREATED_DATE field, switch it across to Generated SQL and modify the UPDATE statement to replaced CREATED with UPDATED.

In my book this approach is less robust, I hate using custom SQL on database output stages as they can be problematic and difficult to debug, but it is a simple way to make your change. If you do it this way put down a great big notation box around your database stage with a message indicating what you have done.

acool · Post by **acool** » Tue May 25, 2004 2:58 pm

I have the same situation. and I am following what you suggested. I am using multiple lookup, but I don't know how I need to define the constraints for insert and update.

Do I use the statement " IsNull(Lookup.xxx@TRUE)"

chulett wrote:This is a pretty straight-forward thing to accomplish.

First, pull existing key column(s) from the target table into a Hash file. Then, when preparing data for the target table, check the hash file for the current key and split off two links from the transformer based on the result of the lookup. One for updates (key exists) to handle the UPDATED columns and another for inserts (key does not exist) to handle the CREATED columns. Run both links into the same Stage/Table with the appropriate Update Action set on each.

chulett · Post by **chulett** » Tue May 25, 2004 3:07 pm

No. Simply check one of the key fields in the lookup you are using for null. If you put data in and it comes back null, the lookup has failed:

Code: Select all

IsNull(Lookup.xxxxx)              - for Inserts
Not(IsNull(Lookup.xxxxx))         - for Updates

ririr · Post by **ririr** » Sat May 29, 2004 12:42 pm

Vincent

Can you please be more specific. Do you mean to create two target transofrmers, one for inserts and one for updates.

Currently, I have one transformer with the target columns include CREATED_DATE, UPDATED_DATE, CREATED_BY AND UPDATED_BY. I have set the Action type to insert new rows or update existing rows. The SQL is generated to INSERT in ALL COLUMNS AND UPDATE ALL COLUMNS.

THE REQUIREMENT I HAVE IS SOMETHING LIKE THIS:

DO NOT INSERT INTO THE UPDATED_DATE & UPDATED_BY COLUMNS WHEN IT IS AN INSERT ACTION

AND

ONLY UPDATE THE UPDATED_DATE & UPDATED_BY COLUMNS, IF IT IS AN UPDATE ACTION AND NOT UPDATE THE CREATED_DATE & CREATED_BY COLUMNS.

I appreciate all the help!

Thanks

chulett · Post by **chulett** » Sat May 29, 2004 1:46 pm

ririr wrote:Can you please be more specific. Do you mean to create two target transofrmers, one for inserts and one for updates.

Yes and No. The only way to use different columns when inserting or updating is to use two seperate links, one for Inserts and one for Updates. However, you only need one transformer to feed them and one Target database stage to do the actual work.

In the insert link, use all of the columns but set the 'updated' columns to NULL. Either that or do not include them in the link and they will automatically default to null. (This could depend on your database, which you haven't mentioned)

In the update link, do not include the 'created' columns, only the actual columns you want to update.

ririr · Post by **ririr** » Sun May 30, 2004 10:08 pm

When you say two seperate links, are those on the target side. The target table that is being inserted/updated has to laoded twice for the two stage types(DRS or OCI9) individually.

The database platform that I am on is ORACLE.

Thanks

ray.wurlod · Post by **ray.wurlod** » Mon May 31, 2004 1:02 am

Yes, they're on the target side. You can use one stage or two to connect to Oracle. The links are handling disjoint sets of data; rows to be inserted (because they do not yet exist in the target database) on one link, and rows to be updated (because they do already exist in the target database) on the other link.

chulett · Post by **chulett** » Mon May 31, 2004 5:31 am

It is best (IMHO) to use a single OCI stage whenever possible when doing things like this. I've seen some people who, when needing to write to 6 different tables in the same instance, will use 6 OCI stages. This works, but keep in mind the fact that this will cause 6 connections to the database just from this one job, rather than all work being done via one connection if they had used a single OCI stage. This can easily overwhelm a server if several jobs do something like this in concert.

From my experience, I haven't seen any performance disadvantanges from doing it like this - minimizing connections or 'connection sharing'. I'm curious if anyone else has any Words of Wisdom on this particular subject?

ray.wurlod · Post by **ray.wurlod** » Mon May 31, 2004 1:31 pm

If I were selling Oracle licences I'd be encouraging folks to use as many connections as possible!

My normal practice is to use as few connections as possible, mainly to preserve licence seats rather than any fear of overwhelming the database server; any serious updating is done via a bulk loader.

chulett · Post by **chulett** » Mon May 31, 2004 2:02 pm

The 'overwhelming' I meant was more on the order of the number of connections or threads per user or session rather than the amount of work being done. I've seen jobs hit some of these kind of limits on busy (or under configured) systems.

vmcburney · Post by **vmcburney** » Mon May 31, 2004 4:56 pm

Each database connection uses up database resources so connection pooling is advantageous. I really can't think of any drawbacks.

ririr · Post by **ririr** » Tue Jun 01, 2004 9:34 am

Vincent

You mentioned that you were able handle the update action by switching it across to Generated SQL and modify the UPDATE statement to replaced CREATED with UPDATED:

When I try to update the column (UPDATE_DATE)

ORA-00957: duplicate column name

It is throwing the above oracle error.

Below is the generated sql

UPDATE TEST SET EFFECTIVE_DATE=TO_DATE(:3, 'YYYY-MM-DD HH24:MI:SS'),EFF_FLAG=:4,CREATION_DATE=TO_DATE(:5, 'YYYY-MM-DD HH24:MI:SS'),CREATED_BY=:6,UPDATE_DATE=TO_DATE(:7, 'YYYY-MM-DD HH24:MI:SS'),LAST_UPDATED_BY=:8 WHERE KEY1=:1 AND KEY2=:2

I have updated the above SQL in the userdefinedsql tab

UPDATE TEST SET UPDATE_DATE=TO_DATE(:3, 'YYYY-MM-DD HH24:MI:SS'),EFF_FLAG=:4,UPDATE_DATE=TO_DATE(:5, 'YYYY-MM-DD HH24:MI:SS'),CREATED_BY=:6,UPDATE_DATE=TO_DATE(:7, 'YYYY-MM-DD HH24:MI:SS'),LAST_UPDATED_BY=:8 WHERE KEY1=:1 AND KEY2=:2

ORA-00957: duplicate column name

It is throwing the above oracle error.

Please help!