Transaction Grouping
Moderators: chulett, rschirm, roy
I am using Transaction Grouping to group 2 input streams to the same OCI Stage. One stream is an update to set a flag in a table and the second is an insert of a new record in the same table. What I want to happen is that when either stream fails, both transactions need to roll back.
However, when I force the job to abort during the insert and look at the underlying table, the previous update has been committed.
I have done the following:
Enabled transaction grouping
Stream Name: Update OnSkip: Continue OnFail: Rollback
Stream Name: Insert OnSkip: Continue OnFail: Rollback
Is there another option I need to set up?
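What EmmaP is after is standard atomic-transaction behaviour at the database level: the update and the insert live in one transaction, and a failure in either undoes both. A minimal sketch of that behaviour, using Python's sqlite3 standing in for Oracle/OCI (the table and column names here are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, flag INTEGER)")
conn.execute("INSERT INTO t (id, flag) VALUES (1, 0)")
conn.commit()

try:
    # Stream 1: the update that sets the flag.
    conn.execute("UPDATE t SET flag = 1 WHERE id = 1")
    # Stream 2: the insert -- fails here with a duplicate-key error.
    conn.execute("INSERT INTO t (id, flag) VALUES (1, 0)")
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()  # the earlier update is undone along with the insert

flag = conn.execute("SELECT flag FROM t WHERE id = 1").fetchone()[0]
print(flag)  # 0 -- the update was rolled back
```

The point is that nothing is committed until both statements succeed; a commit anywhere between the two statements would leave the update behind, which is the symptom described above.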
It was a generic reference, so I should have ditched the caps. You are correct that the field actually called 'Transaction Size' next to 'Array Size' is there for backwards compatibility only and is ignored. The 'Rows Per Transaction' tab is the only one that applies anymore.
That's what happens when you do stuff from memory.![Confused :?](./images/smilies/icon_confused.gif)
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thanks for the responses. Was using a colleague's logon when I originally posted whilst getting my login sorted out.
I don't want to commit after each record, I want to commit after the entire process has completed. Therefore if either of the streams fails, roll back everything. You are right, array size is always 0 for transaction grouping.
Now that I've re-read the Help... since 'Rows Per Transaction' is always forced to 1 when 'Transaction Grouping' is enabled, all you can roll back is the current 'transaction' - in your case the pair of rows you're working with at any given time. That doesn't sound like what you want.
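To illustrate the consequence of that forced per-group commit: each pair of rows is committed as its own transaction, so a failure only rolls back the current pair, and earlier pairs stay in the table. A hypothetical sketch (Python with sqlite3, invented table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

rows = [1, 2, 3, 2]  # the second pair will fail (duplicate id 2)
for i in range(0, len(rows), 2):
    try:
        for rid in rows[i:i + 2]:
            conn.execute("INSERT INTO t (id) VALUES (?)", (rid,))
        conn.commit()       # commit after every pair, as grouping forces
    except sqlite3.IntegrityError:
        conn.rollback()     # only the current pair is undone

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 2 -- the first pair survived the later failure
```

That is exactly why the earlier update is still in the table after the aborted insert: it belonged to a pair that had already been committed.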
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
I assume Omigos is talking about the practice of splitting your Inserts and Updates into separate jobs / streams to maximize the throughput of both, perhaps even bulk-loading your Inserts. Please correct me if I'm wrong.
However, since this thread is about Transaction Grouping, that may not be an option for you. For the record, how important is this concept to what you are doing or need to do? Or is it just something you are trying out for fun?![Laughing :lol:](./images/smilies/icon_lol.gif)
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
You are correct, bulk loading or splitting into separate stages is not an option for us (unless we can somehow ensure Transaction Grouping). That is, if we perform an update and then an insert, and the insert fails, the earlier update must not take effect. We do not have failures often, but when we do, we want the table to remain in the same state as before the job started. Performance is also a concern, as we need these jobs to run as quickly as possible.
Can you use a temporary table for this, and once the job has finished successfully, load the data from the temp table into the original table?
That way you will not be altering the data of the original table. This should be possible since you are updating the same records you are inserting into.
As far as I know, transaction grouping definitely commits one record at a time, and you can't set the number lower than one, so yes, there is a performance issue here.
All this should be fixed if you use a temp table.
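The staging-table idea can be sketched like this (Python with sqlite3; the staging and target table names are invented for illustration): the job writes everything to a scratch table first, and the real table is only touched in one final transaction, so a mid-job failure leaves it unchanged:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, flag INTEGER)")
conn.execute("INSERT INTO target VALUES (1, 0)")
conn.execute("CREATE TEMP TABLE staging (id INTEGER, flag INTEGER)")
conn.commit()

# The job writes all its updates and inserts into staging first;
# a failure here never touches the real table.
conn.execute("INSERT INTO staging VALUES (1, 1)")  # the 'update' row
conn.execute("INSERT INTO staging VALUES (2, 0)")  # the 'insert' row

# Only after the whole job succeeds is target modified, atomically.
try:
    conn.execute("""UPDATE target
                    SET flag = (SELECT flag FROM staging
                                WHERE staging.id = target.id)
                    WHERE id IN (SELECT id FROM staging)""")
    conn.execute("""INSERT INTO target
                    SELECT id, flag FROM staging
                    WHERE id NOT IN (SELECT id FROM target)""")
    conn.commit()
except sqlite3.Error:
    conn.rollback()  # target reverts to its pre-job state

rows = conn.execute("SELECT id, flag FROM target ORDER BY id").fetchall()
print(rows)  # [(1, 1), (2, 0)]
```

On Oracle the final apply step could be a single MERGE statement, which would also sidestep the separate update/insert streams entirely; the sketch above just shows the transactional shape of the approach.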
EmmaP wrote: You are correct, bulk loading or splitting into separate stages is not an option for us (unless we can somehow ensure Transaction Grouping). That is, if we perform an update and then an insert, and the insert fails, the earlier update must not take effect. We do not have failures often, but when we do, we want the table to remain in the same state as before the job started. Performance is also a concern, as we need these jobs to run as quickly as possible.

Performance and transaction grouping work at cross-purposes. Loading multiple tables simultaneously with referential integrity and parent-child relationships is not performant. You must commit the transaction across all grouped tables as a set; there is no whole-session commit capability, only the set level.
If you are doing a real-time solution using Datastage Server (unknown since you hijacked this post, friendly advice: I suggest you start new ones next time so we get your system/release information) then maybe you should look at the new RTI release.
Performant batch data warehousing is easy: high-performance loading is done with bulk loaders, and row-by-row DML is hundreds of times slower than bulk loading. You have eliminated parallel DML as an option, as well as other performance options (I don't know your database, so I can't speak directly to the point). Furthermore, you should be validating all constraints in your ETL solution, both data characteristics (type, length, precision, etc.) and RI. Any load failures should be 100% restartable: if you run out of space, simply extend the tablespaces and restart the loads.
I have avoided this post and others because the issue is not the tool. It's the design concept that is the problem; everything descends from this. Unless you're doing real-time, tried-and-true batch data warehouse concepts should apply. If you're struggling with RI, foreign key constraints, and a lack of ability to use bulk loaders, you need to read up on Kimball and every post on TDWI's website about do's and don'ts in ETL architecture.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland wrote: If you are doing a real-time solution using Datastage Server (unknown since you hijacked this post, friendly advice: I suggest you start new ones next time so we get your system/release information) then maybe you should look at the new RTI release.

Actually Ken, you must have missed this a few posts up:

EmmaP wrote: Thanks for the responses. Was using a colleague's logon when I originally posted whilst getting my login sorted out.

So you can call off the Air Marshals, there's no hijacking here.
![Wink :wink:](./images/smilies/icon_wink.gif)
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers