Page 1 of 1

Trigger type functionality in DATASTAGE

Posted: Thu May 20, 2004 8:13 am
by ririr
I have a situation where I am loading data to TABLE A (Parent table). The Job has the UPDATE ACTION AS 'UPDATE EXISTING ROWS AND INSERT NEW ROWS'. If the record is being updated I need to load the MATCHING ROW (KEY=KEY) before UPDATE into a HISTORY TABLE (TABLE B) and then UPDATE THE DATA IN TABLE A.

Is this BEFORE UPDATE TRIGGER TYPE FUNTIONALITY IS POSSIBLE IN DATASTAGE?

Any Help is apprecited

Posted: Thu May 20, 2004 8:46 am
by kcbland
NO. You have an ETL tool, not a database. It's up to you to figure out how a row of source data needs to affect the target. If you put triggers into a database, you're removing T logic from your ETL tool and putting it into the database, where you cannot control performance, processing, or keep track of what is going on.

You should consider preparing all rows for insert/update into all target tables and then simply loading the final results into all tables, rather than propagate and stretch transformation logic all over the place.

Posted: Thu May 20, 2004 11:55 am
by ketfos
Hi,
You are having one source table(TBL A).
Your job will have two outputs from a transformer.
LinK A will have table with output UPDATE ACTION AS 'UPDATE EXISTING ROWS AND INSERT NEW ROWS'
Link B will have INSERT ROWS WITHOUT CLEARING. (History Table)

So Link A will have updated record.
Link B will have all history record.
The table structure for link A and B will be same. You can add a timestamp column to it for audit purpose.

Posted: Thu May 20, 2004 1:24 pm
by ogmios
Just use real database triggers for what you're trying to do... a little bit less nice but much faster.

Ogmios

no

Posted: Thu May 20, 2004 1:43 pm
by 1stpoint
Just use real database triggers for what you're trying to do... a little bit less nice but much faster.
I would not recommend this approach. The DataStage engine is designed to be truly multi-threaded and having 2 outputs from the transformer are much easier to control than an INSERT/UPDATE trigger.

Plus you can set the commit/transaction size to 20,000 rows per thread this way.

Re: no

Posted: Thu May 20, 2004 2:17 pm
by ogmios
LOL :D

How do you get 20.000 as commit size, I can only go to 9999.

DataStage will probably even limit your transaction size to 1 using 2 links.

The only way to be sure, test it for yourself. For small amounts of data the speed probably doesn't matter. For bigger volumes go for the fastest approach.

In my experience the database is more "truly multi-threaded" than DataStage: use the database Luke.

Ogmios

Posted: Thu May 20, 2004 2:32 pm
by ririr
Let me try to explain with an example on what I am trying to accomplish.

I am using DRS stage type(similar to OCI STAGETYPE) FOR READING/WRITING DATA FROM/TO SOURCE AND TARGET TABLES.

I am reading data from the SOURCE TABLE (TABLE X) and trying to LOAD into JUST ONE TABLE (TABLE A). The UPDATE ACTION ACTION ON THE TARGET IS SET to 'UPDATE EXISTING ROWS AND INSERT NEW ROWS'.

IF THE RECORD EXISTS IN THE TARGET TABLE, IT IS UPDATING THE RECORD WITH THE NEW RECORD, UNDERSTANDABLE..

BUT, I need to write the ROW IN THE (TABLE A) to another TABLE (TABLE B) before UPDATING THE TABLE (TABLE A)

Any help is appreciated!

Posted: Thu May 20, 2004 2:40 pm
by ketfos
Hi Raj,
This is what I mentioned earlier
------------------
Hi,
You are having one source table(TBL A).
Your job will have two outputs from a transformer.
LinK A will have table with output UPDATE ACTION AS 'UPDATE EXISTING ROWS AND INSERT NEW ROWS'
Link B will have INSERT ROWS into another table WITHOUT CLEARING. (History Table)

So Link A will have updated record.
Link B will have all history record.
The table structure for link A and B will be same. You can add a timestamp column to it for audit purpose.
--------------------------
This should work for you and is simple and neat solution/

Posted: Thu May 20, 2004 2:51 pm
by chulett
Not really... not if the record should go to Table B only on an Update action. :? If that's the case...

You really should move away from the "Insert or Update" actions as they don't give you a way to differentiate between the two. Try this - Preload the existing keys into a Hash and do a lookup against it to determine if you need to do an Insert or an Update. Use separate output links for each. When the record doesn't exist in the hash, only do the Insert. If it does exist in the hash, first send the information to Table B and then do the Update action against Table A. All in all, there will be three output links from the Transformer going to two Oracle tables.

If there is a possibility of an Insert and an Update coming in the same load, spin off a link to write any non-hits to the current hash file so that the next time the key comes through it will know it is now an update. This would mean you'd have four links out of the Transformer.

Posted: Thu May 20, 2004 3:52 pm
by ririr
Thank you for the valuable input.
I appreciate it!