Type of Slowly Changing Dimension used in DW

Post by **daignault** » Wed Apr 21, 2010 9:14 am

How many sites out there are using the SCD stage. How do you find performance of SCD compared to other options since it looks like the SCD loads data from the Dimension table as a Lookup stage would be loaded (increasing memory usage).

On a side note, are you building Type1 Dimension tables (Replace) or Type2 (Insert New/collective)?

A client asked me a question about using Type2 SCD so I wanted to research it with people in the field.

Ray D

chowdhury99 · Post by **chowdhury99** » Wed Apr 21, 2010 2:34 pm

Your question is very vague. If you want to know either SCD stage supports both type 1 and type 2, then the answer is yes.

Thanks

debrujr · Post by **debrujr** » Fri Apr 23, 2010 9:26 am

Are you asking about the differences between using a SCD in DS (Whether it be a 1 or 2) as compared to performing it outsude DS? Basicly the performance difference between DS and non DS?

Another option if you wanted, you could "stage" the data for a delta table to be used in an external SCD with DS and let your DB do the rest if you have a lot of data to manipulate before it is added/changed in your SCD..... I would love to hear input on this as well though.

chulett · Post by **chulett** » Fri Apr 23, 2010 9:40 am

The question isn't vague, it seems rather simple. He's asking if people who are building dimensions with DataStage are using the SCD stage or if they are rolling their own solution. And for those who have actually used the SCD stage to do the dirty work, how have they found the performance to be?

I've yet to use it, being old school and not having a problem building jobs to do the needful, so can't really comment on the stage. I've seen others comment that it doesn't quite fit their requirements so they can't use it or aren't willing to make changes to their processes / structures to use it.

So... who is actually using it or specifically decided not to use it? Care to chime in with your experiences? And to the last question, was it primarily with Type 1 or Type 2 dimensions?

vmcburney · Post by **vmcburney** » Thu May 20, 2010 7:14 pm

Used it to build both Type 1 and Type 2 dimensions. Performance was good - as long as you set the Surrogate Key tab to brnig back 1,000 at a time for large dimensions rather than the default 1 key at a time. We used it for conformed dimensions which meant we discarded the fact row output and just used the dimension insert/update output. It saved us development time by handling the date and key fields and making it easy to switch between type 1 and type 2 attributes. Worked well on small dimensions or million row dimensions.