Query on SCD Stage

balu536
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Hi,
I have a question about the Slowly Changing Dimension (SCD) stage.

Do we need to sort the data on the input links by the business keys used in the stage?
Is sorting mandatory for the SCD stage to function properly, as it is for the Join and Merge stages?


Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Data do not need to be sorted on business key. When the dimension table is loaded into memory the business key column is identified. That is sufficient.
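
To illustrate why sort order does not matter for a keyed in-memory lookup, here is a minimal conceptual sketch in Python. It is not how the SCD stage is implemented internally; the function names, column names (cust_id, city) and sample rows are all hypothetical.

```python
# Conceptual sketch only: a keyed in-memory lookup, not DataStage internals.
# All names and sample data below are hypothetical.

def build_dimension_index(dimension_rows, key):
    """Load the dimension rows into a dict keyed on the business key."""
    return {row[key]: row for row in dimension_rows}

def classify(source_rows, index, key, tracked_columns):
    """Label each source row as insert, change or unchanged.

    Each row is resolved by a direct key lookup, so the order in which
    source rows arrive has no effect on the label a row receives.
    """
    results = []
    for row in source_rows:
        current = index.get(row[key])
        if current is None:
            results.append((row[key], "insert"))
        elif any(row[col] != current[col] for col in tracked_columns):
            results.append((row[key], "change"))
        else:
            results.append((row[key], "unchanged"))
    return results

dimension = [
    {"cust_id": 102, "city": "Perth"},
    {"cust_id": 101, "city": "Sydney"},
]
# Deliberately unsorted on the business key.
source = [
    {"cust_id": 103, "city": "Darwin"},
    {"cust_id": 101, "city": "Melbourne"},
    {"cust_id": 102, "city": "Perth"},
]

index = build_dimension_index(dimension, "cust_id")
labels = classify(source, index, "cust_id", ["city"])
print(labels)
```

Because every source row is matched through the key index, feeding the rows in any other order yields the same label for each business key.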
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
balu536
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Thanks Ray.

Also, is it fine to hash partition the source link alone, or is hash partitioning required on both the source and reference links?

Please clarify.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Yes, the partitioning should match on both links. However, if the other link is set to "Auto", DataStage will determine that it needs to match the other input stream and set it to "Hash" on the right keys without telling you. I believe you can verify that by looking at the generated OSH.

However, I always recommend setting stages explicitly so it is more evident where the partitioning is occurring.
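
As a rough illustration of why the partitioning has to match, here is a small Python sketch of hash partitioning on a business key. It only models the idea; the CRC32 hash, the four partitions, the column name cust_id and the sample rows are assumptions made for the example and are not what DataStage uses internally.

```python
import zlib

# Conceptual sketch only: models hash partitioning, not the DataStage engine.
# The hash function, partition count and sample data are hypothetical.

def hash_partition(rows, key, num_partitions):
    """Assign each row to a partition based on a hash of its key value."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[digest % num_partitions].append(row)
    return partitions

def co_located(source_parts, reference_parts, key):
    """Return True if every source row shares a partition with its
    matching reference row (when a matching reference row exists)."""
    all_ref_keys = {row[key] for part in reference_parts for row in part}
    for src_part, ref_part in zip(source_parts, reference_parts):
        local_ref_keys = {row[key] for row in ref_part}
        for row in src_part:
            if row[key] in all_ref_keys and row[key] not in local_ref_keys:
                return False
    return True

source_rows = [{"cust_id": n} for n in (101, 102, 103, 104, 105)]
reference_rows = [{"cust_id": n} for n in (101, 102, 104)]

src_parts = hash_partition(source_rows, "cust_id", 4)
ref_parts = hash_partition(reference_rows, "cust_id", 4)
print(co_located(src_parts, ref_parts, "cust_id"))
```

When both links use the same hash on the same key, a source row and its matching reference row always land in the same partition. If only one link were hash partitioned and the other distributed some other way, a source row could be compared against a partition that does not hold its reference row.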
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
balu536
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Thanks Andy. :)