SCD - Architecture

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Raftsman
Premium Member
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

SCD - Architecture

Post by Raftsman »

What is the design architecture of the SCD? Is it the SCD stage a lookup with extra functionality or it is a join stage with extra functionality. We are not if we need to sort and repartition keys prior to entering the stage. Is the lookup done in memory?

We are not sorting and repartitioning key and for now, most of our jobs have returned correct value. Another collegue of mine ran into an issue where records did not much and created new surrogates.

What is the standard rule for using this stage.

Thanks
Jim Stewart
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I thought all that was explained fairly well in the Parallel Job Developer Guide pdf entry for that stage.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Raftsman
Premium Member
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

So did I until I ran into serious issues with the stage. We have implemented this stage in all our EDW and now we are second guessing the way it works. I am just looking for a simple answer, does the stage do it lookup in memory or is it like a join. If it's like a join, then we have serious issues. We will need to go through all our module and insert partitioning and sorting.

BTW, I always refer to the manual prior to asking a question. The manuals are not the greatest.
Jim Stewart
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

From the manual:
Each SCD stage processes a single dimension and performs lookups by using an equality matching technique. If the dimension is a database table, the stage reads the database to build a lookup table in memory.
There's also this:
Input data to SCD stages must accurately represent the order in which events occurred. You might need to presort your input data by a sequence number or a date field. If a job has multiple SCD stages, you must ensure that the sort order of the input data is correct for each stage.

If the SCD stage is running in parallel, the input data must be hash partitioned by key. Hash partitioning allows all records with the same business key to be handled by the same process. The SCD stage divides the dimension table across processes by building a separate lookup table for each process.
Does that help?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Raftsman
Premium Member
Premium Member
Posts: 335
Joined: Thu May 26, 2005 8:56 am
Location: Ottawa, Canada

Post by Raftsman »

So what does AUTO do. Would it not take care of this. We do not have to partition and presort the lookup stage. I would of assumed the same.
Jim Stewart
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

While you don't "have to" partition / sort for the lookup stage, there are times when it is best to explicitly do that yourself rather than hoping the "auto" partitioning will get it right. Just went through that here.

Ray has a post here somewhere where he explains what "auto" partitioning does (and does not) do for you and from what I recall, when it's best to switch it off. I'll see if I can dig it up... unless Ray or Arnd beats me to it or answers it here. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Here's one but don't think it's the one I was actually remembering, will poke around again later.

viewtopic.php?t=128688
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply