Page 1 of 1

Easy way to do an update only when needed

Posted: Fri May 20, 2016 11:32 am
by simc
so I`m a newbie with datastage, learning the ropes, so this might be a dumb question.

Is there and easy way/stage to do this?

So I want to insert new, and only update rows that have changed. Something like the SCD stage, but without the surrogate key.

For simplicity sake, three fields: Key, Data, Mod_Date

- new Key.....insert with sysdate as Mod_Date
- existing Key with existing Data.... do nothing
- Existing Key with new data... update row with sysdate as Mod_Date

It`s a simple/common task, I'm hoping there is a simple/common solution that I just don't know yet.

Thanks!
Charles

Posted: Fri May 20, 2016 12:16 pm
by rkashyap
Depending on the size of source and target data, this requirement can be met with multiple designs. For a "reasonable size" of target, see Change Capture and Change Apply stages in Parallel Job Developer's Guide.

Posted: Fri May 20, 2016 2:08 pm
by simc
Awesome, Thank you, that's worked great.

Also, that book is gonna be my bible. I almost want a hard copy!

Posted: Mon May 23, 2016 4:32 am
by ray.wurlod
Maybe you would also like to know that there's a manual called the Parallel Job Advanced Developer's Guide, plus a whole heap of good technical documentation on developerWorks, plus a number of RedBooks dealing with various DataStage topics.

Posted: Tue May 24, 2016 7:52 am
by simc
Thanks Ray! I'll look those up too.