
How to roll back a transaction?

Posted: Sun Oct 09, 2005 4:36 am
by gordonxiao
Hi,

I am testing transaction handling feature in DS. The database is SQL Server.

My test is very simple: I import 2 rows from a text file into SQL Server. The 2 rows are duplicates of each other, and I have defined a unique constraint on the SQL Server side. I assumed SQL Server would roll back the transaction when it found the duplicate. However, after the test there is still one row inserted into SQL Server. Although SQL Server rejected the second row, I want the whole transaction to be rolled back, that is, no rows should be inserted.

I have tried the ODBC stage and the MSOLEDB stage, and I set the transaction size to different values, but still with no success.
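To make the desired behaviour concrete, here is a minimal sketch of all-or-nothing transaction handling, using Python's sqlite3 module purely as a stand-in for SQL Server (an illustration only; DataStage is not involved). The duplicate violates the unique constraint, and the whole batch rolls back, leaving zero rows:

```python
import sqlite3

# In-memory database stands in for SQL Server; the rollback
# semantics being illustrated are the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER UNIQUE)")

rows = [(1,), (1,)]  # duplicate rows, as in the text file

try:
    # One transaction for the whole batch ("transaction size 0"):
    # either every row commits or none does.
    with conn:  # commits on success, rolls back on exception
        conn.executemany("INSERT INTO target (id) VALUES (?)", rows)
except sqlite3.IntegrityError:
    pass  # duplicate detected; the whole batch was rolled back

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # 0 -- neither row was inserted
```

The key is that the error must abort the whole unit of work, not just skip the offending row; that is the behaviour the thread below explains how to get out of DataStage.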

Thanks.

Posted: Sun Oct 09, 2005 7:32 am
by kcbland
Try zero. It means all or nothing.

Posted: Sun Oct 09, 2005 7:55 am
by chulett
There's a little bit more to it than that. :wink:

Yes, zero means 'all or nothing' and is needed in order to roll back an entire transaction when there are problems. However, in order to trigger the rollback, the job must abort - not finish in any fashion, including 'with warnings'.

You can accomplish this in various ways. Set a very low Warnings limit when the job runs, like 1. Some stages (like OCI) have a 'Treat warnings as Fatal' option which aborts the job on the first warning, much like the previous method. Then there is the Reject link with 'Abort after X rows' set to something other than zero - again a 1 in your case - to trigger the rollback.

I'm sure there are more, but those are three off the top of my head.

Posted: Sun Oct 09, 2005 8:55 am
by gordonxiao
You are right - setting rows per transaction to zero alone did not resolve the problem.

I used your method of aborting the job after 1 warning and it worked. However, I don't think it is a good method, because some warnings are not related to database issues. :?

Do you have any better idea? Can we leverage the constraint setting in the transformer stage?

Thanks.

Posted: Sun Oct 09, 2005 9:16 am
by chulett
gordonxiao wrote:Do you have any better idea? Can we leverage the constraint setting in the transformer stage?
Yes. As noted, you can set up a specific reject row link and constrain it so that only 'database issues' are sent down it, using the output link variable REJECTEDCODE. Set that link to 'Abort after 1 row'.
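As a rough sketch of what the Transformer setup might look like (the link names here are invented, and the exact comparison value is an assumption - check the REJECTEDCODE link-variable documentation for your DataStage version):

```
Output link DBOut (to the ODBC/MSOLEDB stage):
    Constraint: <none - all rows are attempted>

Reject link DBRejects (reject row = yes):
    Constraint: DBOut.REJECTEDCODE <> 0
    Abort after rows: 1
```

With this layout, only rows the database actually rejected travel down the reject link, and the first such row aborts the job, triggering the rollback without tripping on unrelated warnings.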

Posted: Sun Oct 09, 2005 5:42 pm
by kcbland
Okay, okay - I completed my thought but failed to type it all in. A commit size of 0 means commit at the end; any database rejects require a reject link, with an 'abort after' setting to blow up the job after N rows. As standard practice I always require absolutely perfect database work, so setting 'abort after' to 1 is standard for me. I didn't articulate that.

Craig suggests differentiating rejects into degrees of severity. A row rejected for a constraint violation can be handled differently from a row rejected for truncation or data-type reasons. However, I don't believe warning messages send their row down the reject link, because those rows weren't rejected.

Posted: Sun Oct 09, 2005 7:45 pm
by gordonxiao
I think the key point here is that if we want to roll back the entire transaction, we should make the job abort. However, doing this will sometimes make things confusing, because we need to take job reset and job restart into account. :(

Posted: Sun Oct 09, 2005 7:55 pm
by kcbland
True, you can't pick and choose some groups of rows out of a data set to commit and others to roll back. If you're using Server to do something like that, you're probably using the wrong flavour of the tool. The TX side of the house probably handles that type of processing better.

Server and PX are aimed at batch-style processing. Usually the load data set is meant to be loaded in its entirety. Batch ETL processing typically separates inserts from updates, dedupes, and generally avoids all database issues (surrogate key assignment during the T in ETL, then bulk loading), so that when the L in ETL comes along it's usually pretty straightforward.
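The pre-load cleanup described above - dedupe before the L in ETL so the database never sees a violation - can be sketched as follows (the row layout and key function are made up for illustration):

```python
# Hypothetical pre-load dedupe: keep the first occurrence of each
# business key so the target's unique constraint is never violated.
def dedupe(rows, key):
    seen = set()
    out = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            out.append(row)
    return out

rows = [("A001", "first"), ("A001", "dup"), ("B002", "ok")]
clean = dedupe(rows, key=lambda r: r[0])
print(clean)  # [('A001', 'first'), ('B002', 'ok')]
```

If the duplicates in the original poster's text file had been removed in the T step like this, the unique-constraint problem at load time would never arise.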

Posted: Sun Oct 09, 2005 8:51 pm
by chulett
No, the key point here is that if you want to rollback the entire transaction, you must make the job abort.

Job reset should be handled automatically by your job control, and restart is as easy as it gets in an all-or-nothing load - just run the job again. If there's something in your job design preventing that, redesign the job. :wink:

Posted: Mon Oct 10, 2005 12:41 am
by WoMaWil
Working with rollback and transactions is a concept originally created for operational databases, where many tables are involved but only a few rows.

What worked well for us was adding a column to each table in which we set an ID for the loading process, so that we are able (even after a long time) to delete a bad load.

Wolfgang

Posted: Wed Oct 19, 2005 10:42 am
by dzdiver
I know this is not what you are planning, but sometimes different approaches are helpful.

Could you write to a staging table and then, if the job completes successfully, move the data from staging into production? Otherwise, truncate the staging table.
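The staging-table idea can be sketched like this, again using Python's sqlite3 as a stand-in for SQL Server (the table names are invented for illustration):

```python
import sqlite3

# Illustrative staging-table pattern: the ETL job only ever writes to
# staging; production is touched in a single promote transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER UNIQUE)")

rows = [(1,), (2,)]

try:
    # Step 1: the job loads the staging table.
    conn.executemany("INSERT INTO stg_orders (id) VALUES (?)", rows)
    conn.commit()
    # Step 2: promote staging to production in one transaction.
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders SELECT id FROM stg_orders")
        conn.execute("DELETE FROM stg_orders")
except sqlite3.IntegrityError:
    # Promotion failed: production is untouched; clear staging and rerun.
    conn.execute("DELETE FROM stg_orders")
    conn.commit()

loaded = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(loaded)  # 2
```

This sidesteps the abort-to-rollback question entirely: a failed run leaves production unchanged, and restart is just a matter of clearing staging and running again.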

B.