Type of Slowly Changing Dimension used in DW

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
daignault
Premium Member
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm
Contact:

Type of Slowly Changing Dimension used in DW

Post by daignault »

How many sites out there are using the SCD stage. How do you find performance of SCD compared to other options since it looks like the SCD loads data from the Dimension table as a Lookup stage would be loaded (increasing memory usage).

On a side note, are you building Type1 Dimension tables (Replace) or Type2 (Insert New/collective)?

A client asked me a question about using Type2 SCD so I wanted to research it with people in the field.

Ray D
chowdhury99
Participant
Posts: 43
Joined: Thu May 29, 2008 8:41 pm

Post by chowdhury99 »

Your question is very vague. If you want to know either SCD stage supports both type 1 and type 2, then the answer is yes.

Thanks
debrujr
Participant
Posts: 56
Joined: Fri Jul 31, 2009 1:05 pm
Location: South

Post by debrujr »

Are you asking about the differences between using a SCD in DS (Whether it be a 1 or 2) as compared to performing it outsude DS? Basicly the performance difference between DS and non DS?

Another option if you wanted, you could "stage" the data for a delta table to be used in an external SCD with DS and let your DB do the rest if you have a lot of data to manipulate before it is added/changed in your SCD..... I would love to hear input on this as well though.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The question isn't vague, it seems rather simple. He's asking if people who are building dimensions with DataStage are using the SCD stage or if they are rolling their own solution. And for those who have actually used the SCD stage to do the dirty work, how have they found the performance to be?

I've yet to use it, being old school and not having a problem building jobs to do the needful, so can't really comment on the stage. I've seen others comment that it doesn't quite fit their requirements so they can't use it or aren't willing to make changes to their processes / structures to use it.

So... who is actually using it or specifically decided not to use it? Care to chime in with your experiences? And to the last question, was it primarily with Type 1 or Type 2 dimensions?
-craig

"You can never have too many knives" -- Logan Nine Fingers
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Used it to build both Type 1 and Type 2 dimensions. Performance was good - as long as you set the Surrogate Key tab to brnig back 1,000 at a time for large dimensions rather than the default 1 key at a time. We used it for conformed dimensions which meant we discarded the fact row output and just used the dimension insert/update output. It saved us development time by handling the date and key fields and making it easy to switch between type 1 and type 2 attributes. Worked well on small dimensions or million row dimensions.
Post Reply