Insert New Rows Only using ODBC Connector to MSSQL
Posted: Wed Feb 11, 2015 8:15 am
Hi All
I am having a bit of an issue when executing a process to update a target MSSQL DB using an ODBC Connector stage with the Write mode set to 'Insert new rows only'.
The issue is that it works for some target tables and not others. We have checked for differences in the table setup between tables that did work and tables that didn't, and can see no obvious difference.
Further Detail:
Job Design: The job design is very simple.
1. It reads from Oracle using an ODBC Connector with RCP on, i.e. no columns are specified. The source tables do not have key values specified.
2. The data passes through a Transformer stage (this was added to capture rejects, but it too uses RCP with no columns specified).
3. The target is an ODBC Connector, with parameters for table name and Table Action. The Write mode is set to 'Insert new rows only'. The target DB tables have primary keys. The array size and record count are both set to 2000, and the isolation level is set to 'Read Uncommitted'.
The idea behind the job is that it can be run once for a given table with the table action set to 'Truncate'. This will load an initial view of the data.
A second pass of the table will then happen (the source table having since been updated). The second pass will have the target table action set to 'Append'. This, in conjunction with the Write mode 'Insert new rows only', should in theory check the target DB metadata for the key values and insert new rows only. (Or at the very least, fail on the PK violation and not insert existing records.)
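For anyone unfamiliar with the option: my understanding is that 'Insert new rows only' behaves roughly like a key-filtered insert on the target. A hand-written T-SQL sketch of the intended semantics (table, column, and staging names here are made up for illustration — this is not what the connector literally generates):

```sql
-- Hypothetical table MY_TABLE with primary key MY_KEY.
-- Per batch, 'Insert new rows only' should be roughly equivalent to:
INSERT INTO dbo.MY_TABLE (MY_KEY, COL_A, COL_B)
SELECT s.MY_KEY, s.COL_A, s.COL_B
FROM #staged_rows AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.MY_TABLE AS t
    WHERE t.MY_KEY = s.MY_KEY   -- skip rows whose key already exists
);
```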
Currently this job is used to move some 350-odd tables. However, when running the second pass, sometimes the job will work and other times it will just hang. The behaviour is consistent per table: the same target tables always hang, and the same ones always work.
I have talked to a DBA (of sorts) and we have found the following:
* There is no obvious difference between the tables' set up
* There are no foreign key relationships
* There are no other processes running against the table.
We have also tried the following:
* Various array and record count sizes, and changing the isolation level between 'Serializable' and 'Read Committed' - no effect - the jobs still hang for the offending tables. In some instances this caused tables that once worked to no longer work.
* Drop and recreate the table - no effect - job will still hang.
* Create a copy of the table with the same structure - job will work as it is supposed to.
* Create the table on a new DB (same server) - job still hangs.
* I have found that it works for tables with 160,000 rows but can fail on a table with 13 rows, so the number of rows does not seem to be a factor.
* It fails on the offending tables whether there are new records to insert or not. At first I thought it was failing only when there were new records, but testing has shown this to be incorrect.
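For reference, this is the kind of catalog query we used to compare a working table against a hanging one (the catalog views are standard SQL Server; the two table names are placeholders):

```sql
-- Compare primary key constraint and index details for two tables.
SELECT t.name        AS table_name,
       kc.name       AS pk_constraint_name,
       i.name        AS index_name,
       i.type_desc   AS index_type,    -- CLUSTERED vs NONCLUSTERED
       c.name        AS key_column
FROM sys.tables AS t
JOIN sys.key_constraints AS kc
  ON kc.parent_object_id = t.object_id AND kc.type = 'PK'
JOIN sys.indexes AS i
  ON i.object_id = t.object_id AND i.index_id = kc.unique_index_id
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE t.name IN ('WORKING_TABLE', 'HANGING_TABLE')  -- made-up names
ORDER BY t.name, ic.key_ordinal;
```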
These tests suggest to me that either there must be something attached to the offending tables that is preventing the update, or that the way MSSQL determines its record locks is for some reason causing a deadlock.
Has anybody seen such behaviour before? Is there a setting that I have potentially missed, either on the ODBC Connector stage or in the odbc.ini file? Can anybody share how MSSQL determines its record locks (i.e. can it be different at the table level as opposed to the DB level)?
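On the locking question: one table-level difference that does exist in SQL Server (2008 onwards) is the lock escalation setting, which can be changed per table with ALTER TABLE ... SET (LOCK_ESCALATION = ...). Also, while a job is hanging, the standard dynamic management views will show exactly what the session is blocked on; something along these lines:

```sql
-- Show sessions that are blocked and what they are waiting on.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       l.resource_type,
       l.request_mode,          -- e.g. IX, X, S
       l.request_status,
       OBJECT_NAME(p.object_id) AS object_name
FROM sys.dm_exec_requests AS r
JOIN sys.dm_tran_locks AS l
  ON l.request_session_id = r.session_id
LEFT JOIN sys.partitions AS p
  ON p.hobt_id = l.resource_associated_entity_id
WHERE r.blocking_session_id <> 0;
```

If the hang is a genuine block or deadlock rather than a connector problem, this should name the resource and the competing session.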
EDIT:
We have done further investigations and tests, and have found that the issue is being caused by the primary key name on the tables. If we change the primary key name, the process works. No idea as to why this would be the case; I am thinking it might be something in the ODBC Connector itself.
I did update the job to use a target DRS stage, but that too has issues. Its 'insert new rows' option does not seem to work: the job finishes OK but doesn't actually update anything in the target DB.
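In case it helps anyone hitting the same thing: the primary key constraint can be renamed in place with sp_rename, without dropping and recreating it (the constraint names below are made up):

```sql
-- Rename a primary key constraint in place.
EXEC sp_rename N'dbo.PK_MY_TABLE_OLD', N'PK_MY_TABLE_NEW', N'OBJECT';
```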
Any comments are welcome at this point.
Regards
Shane