'Unit Of Work' commits?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
mczern
Premium Member
Premium Member
Posts: 80
Joined: Thu Jan 20, 2005 8:38 am
Location: Atlanta, GA
Contact:

'Unit Of Work' commits?

Post by mczern »

I'm working with a client that is relatively new to DataStage and has mainly dealt with COBOL on the mainframe. They are enquiring about the ability of DataStage to mimic the concept of 'Unit of Work'....

Is there a way to package a DataStage job to commit related records to multiple tables?
Is there a way to force upserts of a table to rollback if one of the table upserts, within that 'Unit of Work,' fails?
And finally, if both of the above are possible, implement the concept of continuing to process 'Units Of Work' after a failure (ie: If during the 3rd of 5 units of work, a failure occurs, restart the process from the 3rd 'Unit of Work' instead of from the very beginning of the cycle)?

This smells like one or more sequences of jobs, but I'm really not sure. I've never had to implement this concept before within DataStage.
Mike Czerniawski
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Some of this you can do with server. I think version 8 has transaction groupings in PX as well. The hard part is starting from the point of failure. You need to figure that out on your own. Transaction groups will allow you to group records across links and commit them all as one transaction. It is rather crude or simple but does work.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The Distributed Transaction stage (version 8.1.2 and later) allows for transaction grouping, two-phase commit and lots of related things.

(I have to check which of this functionality is currently available and which is in the next release. But I am not currently able to do so.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The ODBC stage can be used to enforce 'Unit of Work' commits, I've done it in the past (with Ernie's help) in webservice jobs. You have to manage the 'commit' and 'rollback' statements at the proper times / boundaries, but it can be done.

As Kim noted, restarts are completely on you. Typically, it is a matter of leaving breadcrumbs, knowing where the last (failed) run ended and skipping incoming data (usually via a parameter driven constraint on row counts) until you get past that point. And, of course, this really is only possible with a static source.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Yes...it can be done...it's not "easy", but it is definitely do-able. A Server Transformer can send multiple links into an ODBC target.... one link (say) for an insert to Parent table, another link for (say) the insert to a child.... depending on what is happening in each row upstream, there could be many children for each parent..... What happens if the 72nd child row gets rolled back because of a problem in the rdbms? ....or if your own logic decides that the 72nd child row is no good? You use custom SQL with two additional links....one that is a "rollback" link and another that is a "commit" link. You decide explicitly when, how and where to fire those links, via a complex set of constraints that checks your own transaction validity rules, and also rigorously checks the return codes that come back from each of the parent and child links.
This is very explicit control....I have found that it is most consistent with the ODBC stage, due to flucutations in return codes from the rdbms. It is a complex technique, but it works.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
Aquilis
Participant
Posts: 204
Joined: Thu Apr 05, 2007 4:54 am
Location: Bangalore
Contact:

Post by Aquilis »

In Server canvas, 'transaction grouping' comes into picture when the ORACLE stage(ODBC/OCI) receives multiple links from the same transformer stage. The problem with the PX job is that multiple processes are started, and there is no way for one process to rollback another process.
For PX jobs you will need to look at the Unit of Work operator. For PX Jobs
I raised an IBM ticket on this long back and IBM has came up to me saying it will included into 8.1 & next versions. now I am using 8.1 but it's not having this functionality yet.

Hopefully needs to veify with 8.1.x.
Aquilis
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Transaction Grouping is another beast - similar, perhaps, but not the same as what the OP needs.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply