trigger performance (server and parallel)

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
tostay2003
Participant
Posts: 97
Joined: Tue Feb 21, 2006 6:45 am

trigger performance (server and parallel)

Post by tostay2003 »

Hi all,


I have two tables namely a) emp b) emp_chg (emp contains the history data and emp_chg contains new data). Now i need to compare both these tables and load data back in 'emp', with a flag being set if at all there has been update to any column or any new record has been inserted.

I have done this (in server jobs) by writing the condition for flag in variable. Eg.

if( (emp.empno=emp_chg.empno) and
(emp.ename<>emp_chg.empno or
emp.job <> emp_chg.job or
.....
.....
......
)
) then
"y"
......
.....
.......

This worked well just for practicing. But how about in real case where there are millions? IS IT EFFICIENT TO WRITE A TRIGGER OR USE THESE CONDITIONS????

How about in ParallelExtender? will the usage of Change Capture and Change Apply be more efficient than trigger.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Two input to a join, emp and emp_chg. Check the same in a transformer.
if( (emp.empno=emp_chg.empno) and
(emp.ename<>emp_chg.empno or
emp.job <> emp_chg.job or
.....
.....
......
)
) then
"y"
......
.....
.......

Load it in to emp.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
mirja
Participant
Posts: 6
Joined: Sat Mar 11, 2006 7:45 am
Location: chennai
Contact:

Post by mirja »

trigger wont be good.....as u have to write a row lavel trigger
and for each row it will be evoked........
So for the above problem u can compare both the table in transformer and in the constraint u write ur whole condition to filter the required data and then in the coresponding column u can write the required value u need........
othewise if u r having oracle table u can use merge statement.....it is effective for comparing two table........
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Observation: the time spent typing those dots could have been better spent typing "you" and "are" appropriately.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mirja
Participant
Posts: 6
Joined: Sat Mar 11, 2006 7:45 am
Location: chennai
Contact:

Post by mirja »

Ray,
rather than obseving my reply if you could reply on the problem it would be better.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Your original question regards comparing two rows and if they are different acting on that change. A trigger is anathema to ETL, in that you are embedding transformation and loading logic within a database, removing it from the visibility of the ETL framework.

Consider the various techniques found on this forum: CRC32 value storage, column by column comparison, or all columns as a string comparison.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Mirja: You wasted your last post unfortunately, doing the same as you said.

tostay: You can how two output links, each for insert and the other for update. This will be after doing the delta on the dataset. So while inserting or updating you can pass the required field and have it pass as a string.


----XFM ---- Insert
Delta -----|
-----XFM ----- Update
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

mirja wrote:Ray,
rather than obseving my reply if you could reply on the problem it would be better.
Maybe after you've posted 1000 responses you might be qualified to judge the appropriateness of Ray's commentaries on this forum. In the meantime, remember who Ray is and the obvious time he devotes to making this forum the rich resource it is.

Part of that involves "teaching" posters how to properly format questions and posts so that we can give meaningful answers more readily and have a legible resource for future members.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
mirja
Participant
Posts: 6
Joined: Sat Mar 11, 2006 7:45 am
Location: chennai
Contact:

Post by mirja »

sorry Ray,
Extremely sorry Roy.Actually i am using Dsxchange first time and honestly dont have the feel of it till now.so when i posted my reply ,i am waiting anxiously for reply.so when i got the reply from u,i repled blindly.
so sorry once again.[/quote]
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

OK, you're forgiven. It's all part of the learning experience. Remember that probably more than 50% of this world-wide resource do not have English as their first language, so abbreviations of this kind are even more confusing. Contemplate if j started using French abbreviations for v.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
tostay2003
Participant
Posts: 97
Joined: Tue Feb 21, 2006 6:45 am

Post by tostay2003 »

Hi All,

DSGuru, to be honest, didnt understand what you have suggested.
As per Kcbland, i have to get premium content soon.

My basic question is,

Is trigger good or writing that logic in state variable and adding to the derivation column of the target. Why and How, pls compare with how the logic deals as well? Easy description will be really appreciated.

Next question is, how about this in parallel extender. using change capture and change apply will be helpful or trigger has to be used.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Check out the Difference and Compare stages. Change Capture stage is really only preparing a structure that can be used by a Change Apply stage to reconstruct the "after" input from the "before" input.
Compare stage generates a result (enumerated) and two subrecords containing the original rows.
Difference stage generates the "after" row plus the result (enumerated).
Either of the last two would allow you to compare source with target. You would take the insert route if the "after" row (target) was found not to exist in the "before" row (your new source), and the update route otherwise.
You can also determine whether to take the update route or not depending on whether there's actually been a change.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Sorry for the miscommunication.
What I as trying to say is that, if you are using server jobs to develop this logic, the what I would suggest is that have two outputs after you have done the Delta( the change in records) and each of them going to the target table. Write a user defined SQL so that one link updates the records and the other inserts it. If for instance the records go in DSLinkInsert then the update records will be rejected. Meanwhile you will have a trasformer before the target stage in which a flag will be defaulted "I"( insert). Similarly the second link will work. You need not worry about the rejects.
Or you can have one link as an output only inserting and capture the rejected records, and update them in an another job.

If you have parrallel, I would personally prefer that. There you can use Change Capture stage. The change capture stage compares the two data sets, before and after, and flags them as 0,1,2,3, each of them having a different meaning. After the Change capture you can filter the records and have the desired updates or inserts you want to do. You may refer the EE manual for operating the stage.
I hope this helps.
You might not be able to see Rays, post so he has also adviced using Change Capture.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Post Reply