
Performance concern for SCD stage for very large dimension

Posted: Thu Mar 12, 2009 12:02 pm
by longma98
Our company is getting version 8 very soon and is considering implementing type 2 SCDs using the SCD stage. I have read that the SCD stage uses an in-memory lookup. If we have a monster of a type 2 dimension (I'm talking about potentially hundreds of millions of rows with at least dozens of type-2 columns), is the SCD stage still a valid choice?

Has anyone used the SCD stage on a large-volume dimension table? What happens when the total dataset is larger than physically available memory?
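For readers less familiar with the mechanics in question: the core of a type-2 SCD load is to look up each incoming row against the current dimension rows, expire the old version on change, and insert a new version with a fresh surrogate key. A minimal Python sketch of that logic (column names like `cust_id`, `name`, `address` are hypothetical, and this is only an illustration of why the lookup set must fit in memory, not how the SCD stage itself is implemented):

```python
# Type-2 SCD sketch: current dimension rows are held in a dict keyed by
# business key -- this in-memory lookup grows with the dimension, which
# is exactly the scaling concern raised above.

def apply_type2(dimension, updates, next_sk):
    """dimension: {business_key: current row dict}; updates: list of row dicts.
    Returns (expired history rows, next unused surrogate key)."""
    history = []
    for upd in updates:
        cur = dimension.get(upd["cust_id"])
        # New key, or any tracked type-2 column changed?
        if cur is None or any(cur[c] != upd[c] for c in ("name", "address")):
            if cur is not None:
                cur["current_flag"] = "N"          # expire the old version
                history.append(cur)
            new_row = dict(upd, sk=next_sk, current_flag="Y")
            next_sk += 1
            dimension[upd["cust_id"]] = new_row    # becomes the current row
    return history, next_sk
```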

Thanks

LM

Posted: Thu Mar 12, 2009 12:28 pm
by Raftsman
Adding to your question: we have found the internal surrogate key generator in the SCD stage very slow for large volumes on the initial load. Is there a way to incorporate the Surrogate Key Generator stage into this mechanism? It is much more efficient.
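The efficiency difference being described usually comes down to block allocation: instead of coordinating with a shared key source per row, each consumer reserves a block of keys up front and hands them out locally. A rough Python sketch of the idea (class names and the simulated key source are hypothetical, not DataStage internals):

```python
# Block key allocation sketch: one round-trip to the shared source per
# block instead of per row, which is why it scales for large loads.

class BlockKeySource:
    """Simulates a shared key source (e.g. a state file or DB sequence)."""
    def __init__(self, start=1):
        self.next_key = start
        self.fetches = 0                   # round-trips to the shared source

    def reserve(self, block_size):
        self.fetches += 1
        first = self.next_key
        self.next_key += block_size
        return first

class BlockAllocator:
    """Hands out keys locally, refilling from the source a block at a time."""
    def __init__(self, source, block_size=1000):
        self.source = source
        self.block_size = block_size
        self.cur = self.end = 0            # empty local block to start

    def next(self):
        if self.cur >= self.end:           # block exhausted: one round-trip
            self.cur = self.source.reserve(self.block_size)
            self.end = self.cur + self.block_size
        key, self.cur = self.cur, self.cur + 1
        return key
```

With a block size of 1000, generating 2,500 keys touches the shared source only three times.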

Posted: Mon Mar 16, 2009 1:07 am
by richdhan
Hi,

Search the forum for CDC topics. That should give you more information on how to handle SCDs as well as how to generate SKs.

HTH
--Rich

Posted: Tue Mar 17, 2009 9:02 am
by longma98
richdhan wrote:Hi,

Search the forum for CDC topics. That should give you more information on how to handle SCDs as well as how to generate SKs.

HTH
--Rich
I don't have a problem using CDC or SKs. My question is whether the new SCD stage is robust enough for us to throw some heavy stuff at it. The new SCD stage certainly looks very interesting, and it would make coding and maintenance much easier.
If we don't have high confidence that it won't choke on large volumes, then we have to maintain two code bases until we can test it out during a later stage of the development cycle. That will make life a little more interesting for us.

LM