Page 1 of 1

Referential Integrity

Posted: Fri Jun 11, 2004 1:19 pm
by von
Hi,
Can anybody tell me how Referential integrity is handled in
DataStage,

Say we have two Oracle tables
i want to check if The column data present in table 1 has the corresponding value in table 2 this check has to be performed on Fk on the table2 PK is on the Tabe1

Thanks
von

Posted: Fri Jun 11, 2004 1:52 pm
by mhester
Von,

If you will not be relying on the DB to ensure RI then it is up to you as a developer to ensure RI is enforced. One way this can be accomplished would be to use Hash tables as lookups and reject rows where FK's are not found. I'm sure others will have more to say, but I believe the answer to be - it is a manual process if the DB is not enforcing RI.

Regards,

Michael Hester

Posted: Fri Jun 11, 2004 2:05 pm
by kduke
Von,

Mike is right it is a manual process. You also need to load the tables in the correct order too. Usually means loading fact tables last because they have keys to all the dimensions. I doubt if this is different in any methodology or programming tool.

Posted: Fri Jun 11, 2004 3:48 pm
by von
Thanks mike and Duke

I was also thinking of the same way if anything else please let me know


Thanks again
Von

Posted: Mon Jun 14, 2004 1:26 am
by rsrikant
I think Mike has answered your question completely.

Load the data from table 2 into a hashed file.

Use a ODBC stage for table 1. Use a transformer which uses both ODBC & Hashed file (used as a lookup) stages as inputs and map the foreign key from ODBC stage to the primary key hashed file stage. Have a output link to write the records for which the foreign key from ODBC is found in hashed file. Have a rejects link from transformer to write the records for which the foreign key from ODBC is not found in hashed file.

I am not sure if there is some other way to do this. Only our experts can tell us about their experiences.

Regards,
Srikanth Reddy

Posted: Mon Jun 14, 2004 4:19 pm
by ray.wurlod
Handling referential integrity in DataStage - like in any other realm - is as simple as making sure that referential integrity is not violated.

To do this in DataStage, you need to have accessible a reference lookup to the referenced column values in the referenced table when loading the referring table, and not attempt to load the referring table with any row that would violate foreign key constraints.

Easiest is to pre-load the referred columns into a hashed file. This might be a key-only table, or might contain additional columns if you're using the reference lookup to obtain additional information for the load, such as performing de-normalization or maintenance of slowly-changing dimensions.

Some databases allow you to guarantee referential integrity by, instead of taking a violation, generating a new row in the referenced table, or changing the foreign key value to some predetermined "default". There is no reason DataStage could not take advantage of this; it would mean that there would be nothing to do in DataStage to ensure that referential integrity is not violated. In general, however, this approach appears not to be in favour.

Posted: Tue Jun 15, 2004 7:18 am
by von
Thank you Srikanth & Ray

That solves my problem

Von