Page 1 of 1

Doubt about Surrogate Key Generation

Posted: Tue Feb 19, 2008 4:02 pm
by manishk
I have a table called "Test".

Data in the Table is :

Test
--------------------------------
Col 1 | Col 2 | Col 3
-------------------------------
123 | abc | test0
223 | dcf | test2
123 | abc | test3

I want to generate the surrogate key based on the values in Col 1 and Col 2.

When i am generating the key using the surrogate stage and defining Col1 and Col 2 as key in Metadata, i am getting the result as :

TestTarget
--------------------------------
Col | Col 1 | Col 2 | Col 3
-------------------------------
1 |123 | abc | test0
2 |223 | dcf | test2
3 |123 | abc | test3

I was thinking that i will get 2 rows in output and the Datastage will drop a row as the record seems to be duplicate based on the keys

If the above result is correct then this can be achieved using a Column Generator and increment the value by one.


Just wondering what difference it make having a surrogate key.

Thanks
Manish

Posted: Tue Feb 19, 2008 5:23 pm
by ray.wurlod
The point about a surrogate key is that it allows history to be preserved. What you have illustrated above is a Type 2 slowly changing dimension, when the actual key remains the same, a non-key column changes value, and you need to preserve the history. In a Type 1 slowly changing dimension you would typically simply update Col3. In either case it is wise to have extra column(s) in the target table to identify when the record was created, last upated, and by what process, and, possibly, a simple identifier of which is the currently active version of the row for any particular key value. All this gives you the flexibility to re-run or to unwind changes.