Insert into SQL Server failing same table FK constraint

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

sbass1
Premium Member
Posts: 211
Joined: Wed Jan 28, 2009 9:00 pm
Location: Sydney, Australia

Insert into SQL Server failing same table FK constraint

Post by sbass1 »

Hi,

I have a SQL Server target table (DRS stage) containing CustomerNumberSK and ParentCustomerNumberSK columns.

The ParentCustomerNumberSK is derived earlier in the job stream via a table self-join. ParentCustomerNumberSK can repeat and can be null. In other words, ParentCustomerNumber (the natural key) is optional; if specified, it should match an existing CustomerNumber, which may be (and usually is) on another row.

The target table has a "same table" FK constraint: a row's ParentCustomerNumberSK must already exist as a CustomerNumberSK for the row to be written out. The ParentCustomerNumberSK will always be present in the incoming data; it just may be "further down" in the table. This is causing row rejects due to the FK constraint.
For example:

Code:

CNSK  PCNSK  Rejected?
1     null   N
2     3      Y
3     3      N
4     3      N
5     7      Y
6     7      Y
7     7      N
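
The rejection pattern above can be reproduced outside DataStage. Here is a minimal sketch using Python's sqlite3 as a stand-in for SQL Server (table and column names are made up); SQLite's `DEFERRABLE INITIALLY DEFERRED` plays roughly the role of SQL Server's end-of-statement constraint check:

```python
import sqlite3

# The seven rows from the table above: (CNSK, PCNSK).
rows = [(1, None), (2, 3), (3, 3), (4, 3), (5, 7), (6, 7), (7, 7)]

def try_singleton_inserts(deferred):
    """Insert rows one at a time; return the list of rejected CNSK values."""
    con = sqlite3.connect(":memory:")
    con.isolation_level = None                  # manage transactions manually
    con.execute("PRAGMA foreign_keys = ON")
    fk = "REFERENCES t(cnsk)"
    if deferred:
        fk += " DEFERRABLE INITIALLY DEFERRED"  # check at COMMIT, not per row
    con.execute(f"CREATE TABLE t (cnsk INTEGER PRIMARY KEY, pcnsk INTEGER {fk})")
    rejected = []
    con.execute("BEGIN")
    for cnsk, pcnsk in rows:
        try:
            con.execute("INSERT INTO t VALUES (?, ?)", (cnsk, pcnsk))
        except sqlite3.IntegrityError:          # FK violation: parent not there yet
            rejected.append(cnsk)
    con.execute("COMMIT")
    return rejected

# Immediate FK + row-by-row inserts: same rejects as the table above.
print(try_singleton_inserts(deferred=False))    # [2, 5, 6]
# Deferred FK: the check waits until COMMIT, so every row loads.
print(try_singleton_inserts(deferred=True))     # []
```

Note that rows 3 and 7 survive even in the immediate case, because a self-referencing row satisfies the FK once it is in the table.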
I've set the transaction size on the DRS target table to 0. The DRS Guide says:
Transaction size. Specifies the number of rows written before a
commit is executed in the database. A value of 0 causes all rows in
the job to be written as a single transaction.
If I submit "similar" code from the SQL Server client, it works:

Code:

INSERT INTO TargetTable
    (CustomerNumberSK, ParentCustomerNumberSK)
SELECT
    CustomerNumberSK, ParentCustomerNumberSK
FROM StagingTable
This is the same staging table that is feeding the DRS stage.

Questions:

1. Any ideas on how to get DS to commit the rows in one transaction? I assume this is what the SQL Server client is doing.

2. I find the doc woefully lacking in defining Array Size, Transaction Size, and Transaction Isolation. Are there better docs explaining what these do? Or is this a "read your database documentation" issue?

Thanks,
Scott
Some people are the Michael Jordans of DataStage. I'm more like Muggsy Bogues :-)
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

I don't think transaction size has anything to do with it. I assume your job is doing singleton inserts, in which case parents must be inserted before children.

The SQL that you posted inserts an entire set of rows in a single statement, so the RI check must be happening at the end of the statement.

Array size, transaction size, and transaction isolation documentation is best gleaned from your database documentation.

Mike
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Scott, as Mike noted, that SQL is only "similar" because the same tables are involved. The actual mechanisms are totally different.

All that setting the Transaction Size to 0 means is that all records are written to the database before the 'commit' is executed. Yes, that means you get everything in 'one transaction' so to speak, but it doesn't mean you can violate RI constraints in the process. To leave your processing unchanged, you'd need to disable that constraint before the job runs and then (attempt to) re-enable it afterwards. Or perhaps SQL Server has some kind of "alter session" equivalent that could help, I have no idea. Otherwise, to the best of my knowledge, you'd need to sort your incoming data so that any records that are 'parent' records to others in the stream are processed first; that way they're visible to the database when the child records arrive and don't blow the FK constraint.

Really don't think there's a way around that in this 'singleton delivery' environment.
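
A sketch of that pre-sort idea, assuming a hypothetical helper run upstream of the load (Python, Kahn's topological sort; rows are (CNSK, PCNSK) pairs, with NULL and self-referencing parents treated as roots):

```python
from collections import defaultdict, deque

def parents_first(rows):
    """Order (cnsk, pcnsk) rows so every parent row precedes its children.
    Kahn's topological sort; NULL parents and self-references are roots."""
    by_key = {cnsk: pcnsk for cnsk, pcnsk in rows}
    children = defaultdict(list)
    ready = deque()
    for cnsk, pcnsk in rows:
        if pcnsk is None or pcnsk == cnsk:
            ready.append(cnsk)               # no dependency on another row
        else:
            children[pcnsk].append(cnsk)     # must wait for its parent
    ordered = []
    while ready:
        cnsk = ready.popleft()
        ordered.append((cnsk, by_key[cnsk]))
        ready.extend(children.pop(cnsk, ())) # parent delivered; release children
    return ordered

rows = [(1, None), (2, 3), (3, 3), (4, 3), (5, 7), (6, 7), (7, 7)]
print(parents_first(rows))
```

One caveat: rows caught in a genuine cycle (A parents B, B parents A) never become ready and would be silently dropped, so a production version would want to report any leftovers.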
-craig

"You can never have too many knives" -- Logan Nine Fingers
sbass1
Premium Member
Posts: 211
Joined: Wed Jan 28, 2009 9:00 pm
Location: Sydney, Australia

Post by sbass1 »

Hi Mike and Craig,

From your two replies, I assume there's no way I can get DS to mimic the SQL Server client.

An approach Ray suggested is to do two passes: as Mike said, parents then children. I'll insert all the data except ParentCustomerNumberSK, and in that transformation write a hashed file of CustomerNumberSK (the PK) and ParentCustomerNumberSK. After the insert, a second transformer will do a lookup against the hashed file and apply the parent values as an update via a second DRS stage.
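
For what it's worth, the two-pass idea can be sketched outside DataStage. Here is a minimal illustration with Python's sqlite3 (a plain dict stands in for the hashed file, and the `customer` table and column names are made up):

```python
import sqlite3

rows = [(1, None), (2, 3), (3, 3), (4, 3), (5, 7), (6, 7), (7, 7)]

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("""CREATE TABLE customer (
    cnsk  INTEGER PRIMARY KEY,
    pcnsk INTEGER REFERENCES customer(cnsk))""")

# Pass 1: insert every row WITHOUT the parent column, and capture the
# cnsk -> pcnsk pairs (the role the hashed file plays in the job).
lookup = {}
for cnsk, pcnsk in rows:
    con.execute("INSERT INTO customer (cnsk) VALUES (?)", (cnsk,))
    if pcnsk is not None:
        lookup[cnsk] = pcnsk

# Pass 2: every key now exists, so the updates can't violate the FK.
for cnsk, pcnsk in lookup.items():
    con.execute("UPDATE customer SET pcnsk = ? WHERE cnsk = ?", (pcnsk, cnsk))
con.commit()

print(con.execute("SELECT cnsk, pcnsk FROM customer ORDER BY cnsk").fetchall())
```

No row is ever rejected, because by the time any ParentCustomerNumberSK is written, every CustomerNumberSK is already in the table.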

I do wonder why it works in the SQL Server client but not in DS. It would be good to get a better understanding of what is happening "under the covers". To that end, a colleague has suggested turning on SQL Profiler to watch how DS is writing the data to the table.

Thanks again,
Scott
Some people are the Michael Jordans of DataStage. I'm more like Muggsy Bogues :-)
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Again, to me this really isn't an issue of "mimicking the SQL Server client", since it has nothing to do with the client. It "works" there because you are doing more of a "bulk" select and insert, nothing at all like the singleton inserts that DataStage is doing via the DRS stage. Apples and oranges in this context.

If you don't want to reorder your input data to support the constraint, then sure - two operations with the second adding the constraint data is certainly another way to skin this cat.
-craig

"You can never have too many knives" -- Logan Nine Fingers