Page 1 of 1

SCD Purpose Codes

Posted: Wed Apr 07, 2010 4:30 pm
by vdr123
As per the IBM documentation:
Updating the dimension table
Purpose codes are part of the column metadata that the SCD stage propagates to the dimension update link. You can send this column metadata to a database stage in the same job, or you can save the metadata on the Columns tab and load it into a database stage in a different job. When the database stage uses the auto-generated SQL option to perform inserts and updates, it uses the purpose codes to generate the correct SQL statements.

It says the it writes the purpose codes on the output link so that DB stage can figure out what sql to generate?

I dont see these purpose codes any where on the output link? Any thoughts where its storing them on the output link?

Re: SCD Purpose Codes

Posted: Thu Apr 08, 2010 12:17 pm
by jgajardo
vdr123 wrote: I dont see these purpose codes any where on the output link? Any thoughts where its storing them on the output link?
I haven't seen anything being stored anywhere. The Upsert that gets generated will try to do an update based on the surrogate key, if it fails because of a new surrogate key then it will insert. It's sort of "brute force" ...

In the case of a pure SCD1 table (all the columns are SCD1) it doesn't matter since you will have the same number of surrogate keys as there are business keys (ie: one to one). The update being based on the surrogate key is correctly applied.

If you have a mixture of SCD1 and SCD2 columns then this doesn't work anymore because the update will only be applied on the current record based on the surrogate key.

You want to be able to do the update on the business key, so the history of the SCD1 column is correctly maintained. For this you have to roll you own.

see this thread: viewtopic.php?p=357102&highlight=#357102

Posted: Thu Apr 08, 2010 1:27 pm
by vdr123
PS: I have looked at the post already.

Want to know what the documentation means that the links to db will use Purpose Codes.

I would want to do my Updates first and then my inserts on the target.

The business key is always null for updates, so i am using this to split inserts VS updates, as I am not loading to the target table in the same job.