Conditional INSERT/UPDATE

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ririr
Participant
Posts: 84
Joined: Sun Apr 04, 2004 8:59 pm

Conditional INSERT/UPDATE

Post by ririr »

Guys,

I have the following scenario :

Target Table (TABLE A) has the following 4 columns, the KEY column and few other columns

Column_Name TYPE
CREATED_DATE DATE
CREATED_BY CHAR
UPDATED_DATE DATE
UPDATED_BY CHAR

Now, for above columns I am trying to comeup with an CONDITIONAL INSERT/UPDATE logic. ONLY UPDATE THE UPDATED_DATE, UPDATED_BY Columns for the data existing in the target table and ONLY POPULATE THE CREATED_DATE AND CREATED_BY columns if it is a NEW DATA ROW.

ANy help is appreciated.

Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

This is a pretty straight-forward thing to accomplish.

First, pull existing key column(s) from the target table into a Hash file. Then, when preparing data for the target table, check the hash file for the current key and split off two links from the transformer based on the result of the lookup. One for updates (key exists) to handle the UPDATED columns and another for inserts (key does not exist) to handle the CREATED columns. Run both links into the same Stage/Table with the appropriate Update Action set on each.
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Craig's approach is probably the best way to tackle it, you can also do this without a hash file lookup. If you look at your database stage you'll see it is set to insert/update or update/insert, there will be a generated SQL tab which shows you the insert and update statements. You can set this up to insert into CREATED_BY and CREATED_DATE field, switch it across to Generated SQL and modify the UPDATE statement to replaced CREATED with UPDATED.

In my book this approach is less robust, I hate using custom SQL on database output stages as they can be problematic and difficult to debug, but it is a simple way to make your change. If you do it this way put down a great big notation box around your database stage with a message indicating what you have done.
acool
Participant
Posts: 29
Joined: Tue Feb 17, 2004 4:31 pm

I have the same situation.

Post by acool »

I have the same situation. and I am following what you suggested. I am using multiple lookup, but I don't know how I need to define the constraints for insert and update.

Do I use the statement " IsNull(Lookup.xxx@TRUE)"


chulett wrote:This is a pretty straight-forward thing to accomplish.

First, pull existing key column(s) from the target table into a Hash file. Then, when preparing data for the target table, check the hash file for the current key and split off two links from the transformer based on the result of the lookup. One for updates (key exists) to handle the UPDATED columns and another for inserts (key does not exist) to handle the CREATED columns. Run both links into the same Stage/Table with the appropriate Update Action set on each.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No. Simply check one of the key fields in the lookup you are using for null. If you put data in and it comes back null, the lookup has failed:

Code: Select all

IsNull(Lookup.xxxxx)              - for Inserts
Not(IsNull(Lookup.xxxxx))         - for Updates
-craig

"You can never have too many knives" -- Logan Nine Fingers
ririr
Participant
Posts: 84
Joined: Sun Apr 04, 2004 8:59 pm

Post by ririr »

Vincent

Can you please be more specific. Do you mean to create two target transofrmers, one for inserts and one for updates.

Currently, I have one transformer with the target columns include CREATED_DATE, UPDATED_DATE, CREATED_BY AND UPDATED_BY. I have set the Action type to insert new rows or update existing rows. The SQL is generated to INSERT in ALL COLUMNS AND UPDATE ALL COLUMNS.

THE REQUIREMENT I HAVE IS SOMETHING LIKE THIS:

DO NOT INSERT INTO THE UPDATED_DATE & UPDATED_BY COLUMNS WHEN IT IS AN INSERT ACTION

AND

ONLY UPDATE THE UPDATED_DATE & UPDATED_BY COLUMNS, IF IT IS AN UPDATE ACTION AND NOT UPDATE THE CREATED_DATE & CREATED_BY COLUMNS.

I appreciate all the help!

Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ririr wrote:Can you please be more specific. Do you mean to create two target transofrmers, one for inserts and one for updates.
Yes and No. The only way to use different columns when inserting or updating is to use two seperate links, one for Inserts and one for Updates. However, you only need one transformer to feed them and one Target database stage to do the actual work.

In the insert link, use all of the columns but set the 'updated' columns to NULL. Either that or do not include them in the link and they will automatically default to null. (This could depend on your database, which you haven't mentioned)

In the update link, do not include the 'created' columns, only the actual columns you want to update.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ririr
Participant
Posts: 84
Joined: Sun Apr 04, 2004 8:59 pm

Post by ririr »

When you say two seperate links, are those on the target side. The target table that is being inserted/updated has to laoded twice for the two stage types(DRS or OCI9) individually.

The database platform that I am on is ORACLE.

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes, they're on the target side. You can use one stage or two to connect to Oracle. The links are handling disjoint sets of data; rows to be inserted (because they do not yet exist in the target database) on one link, and rows to be updated (because they do already exist in the target database) on the other link.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It is best (IMHO) to use a single OCI stage whenever possible when doing things like this. I've seen some people who, when needing to write to 6 different tables in the same instance, will use 6 OCI stages. This works, but keep in mind the fact that this will cause 6 connections to the database just from this one job, rather than all work being done via one connection if they had used a single OCI stage. This can easily overwhelm a server if several jobs do something like this in concert.

From my experience, I haven't seen any performance disadvantanges from doing it like this - minimizing connections or 'connection sharing'. I'm curious if anyone else has any Words of Wisdom on this particular subject?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If I were selling Oracle licences I'd be encouraging folks to use as many connections as possible! :lol:
My normal practice is to use as few connections as possible, mainly to preserve licence seats rather than any fear of overwhelming the database server; any serious updating is done via a bulk loader.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The 'overwhelming' I meant was more on the order of the number of connections or threads per user or session rather than the amount of work being done. I've seen jobs hit some of these kind of limits on busy (or under configured) systems.
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Each database connection uses up database resources so connection pooling is advantageous. I really can't think of any drawbacks.
ririr
Participant
Posts: 84
Joined: Sun Apr 04, 2004 8:59 pm

Post by ririr »

Vincent

You mentioned that you were able handle the update action by switching it across to Generated SQL and modify the UPDATE statement to replaced CREATED with UPDATED:

When I try to update the column (UPDATE_DATE)

ORA-00957: duplicate column name

It is throwing the above oracle error.


Below is the generated sql

UPDATE TEST SET EFFECTIVE_DATE=TO_DATE(:3, 'YYYY-MM-DD HH24:MI:SS'),EFF_FLAG=:4,CREATION_DATE=TO_DATE(:5, 'YYYY-MM-DD HH24:MI:SS'),CREATED_BY=:6,UPDATE_DATE=TO_DATE(:7, 'YYYY-MM-DD HH24:MI:SS'),LAST_UPDATED_BY=:8 WHERE KEY1=:1 AND KEY2=:2

I have updated the above SQL in the userdefinedsql tab

UPDATE TEST SET UPDATE_DATE=TO_DATE(:3, 'YYYY-MM-DD HH24:MI:SS'),EFF_FLAG=:4,UPDATE_DATE=TO_DATE(:5, 'YYYY-MM-DD HH24:MI:SS'),CREATED_BY=:6,UPDATE_DATE=TO_DATE(:7, 'YYYY-MM-DD HH24:MI:SS'),LAST_UPDATED_BY=:8 WHERE KEY1=:1 AND KEY2=:2

ORA-00957: duplicate column name

It is throwing the above oracle error.


Please help!
Post Reply