SCD Type 2 in Datastage 8.0

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
srini.dw
Premium Member
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

SCD Type 2 in Datastage 8.0

Post by srini.dw »

Hi guys,

Iam need to implement a SCD Type2 in DataStage 8.0,
Need to know what are the important fields i should take to find out how it works.

I just need to know how this SCD works in 8.0.

DEPT_K (Surrogate Key)
DEPT_C (Business_Key)

From where should be the below fields value come from.

Type2
Current Indicator
Effective Date
Expiration Date
SK Chain

2. do we have any forum for DataStage 8.0?

Thanks,
Srini
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

#1 - These fields depend on your type 2 dimension table design (there are some slightly different, but common flavors to type 2 dimension designs). The SCD stage supports all of the common flavors that I've used in the past. From this set of fields, you use the ones that are needed and ignore the ones that are not needed.

#2 - You're in the correct forum for DataStage 8.0 parallel jobs now.

Mike
srini.dw
Premium Member
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

Hi,

Thanks for the reply, (i have 1 source and 1 tgt table, comparing)

SRC Columns
EMP_ID
EMP_NAME
EMP_ADD
SALARY

TARGET Columns
S_KEY (surrogate_key)
EMP_ID (Business_Key)
EMP_NAME
EMP_ADD
SALARY
CURR_IND (Current Indicator)
EFF_Y (Effective Date)
EXP_Y (Expiration Date)

Job Description
TGT
|
|
SRC --> SCD--> Insert_ds
|
|
Update_ds

SCD Stage
Stage --> General --> Select output link

1. Wat's the important of Select output link.
In Select output link, should it be Insert_ds or Update_ds.

2. In this first time, all records should go to Insert_ds, but in my case 7 records flow to Insert_ds and 7 records goes to Update_ds.

Can you please let me know why this is the case.

Thank You
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

1(a) A wat is a Buddhist place of worship. Please strive for a professional standard of written English on DSXchange, for reasons elucidated elsewhere.

1(b) The SCD stage requires two output links. One goes to the dimension table, the other (eventually) to the fact table. The Select Output Link allows you to specify which of the named links is the latter.

2. This indicates that you don't understand the SCD stage. See prior answer. The output link that goes to the dimension table handles both inserts and updates.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
srini.dw
Premium Member
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

Hi,

My design is as below
TargetTable(Reference)
|
|
V
SCD ----->Dimension Table (Upsert)
|
|
V
Dataset (Insert)

In the Dim update tab
Empno Business Key
Ename Type2
EAdd Type2
Salary Type2
Curr_Ind Current Indicator (Derivation Y, Expiration N)

Empno Ename EAdd Salary Curr_Ind
555 RELIC GERMANY 1212 Y

for the first time record will be as above.

What changes should i do in job to make that Curr_Ind column as N for that record if i update the same record in the source.

Thank you,[list][/list][img][/img][img][/img][img][/img][img][/img][img][/img]
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Nothing. It will happen automatically using the derivation and expiration expressions that you've defined. Y will be used on the new type 2 record (an insert) and N will be used on the expired record (an update).

Mike
srini.dw
Premium Member
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

Thanks for the reply,

First time my record is as below,
EMPNO ENAME ADD SALARY CURR_IND
10001 John Doe Singapore 4000 Y

I changed SALARY from 4000 to 500, in my source.

When i run my job iam getting 2 inserted record i.,e

SKEY EMPNO ENAME ADD SALARY CURR_IND EFF_DATE
85 10001 John Doe Singapore 4000 Y 10/31/2008
85 0 0 N NULL
91 10001 John Doe Singapore 500 Y 10/31/2008

it should update 1 record and insert 1 record, in my case which is not happening.

What must have gone wrong.

Thank you
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Hard to tell... bug in DataStage or bug in your job? As far as I can tell without actually seeing your job, you seem to have things set up correctly.

Have you contacted your official support provider?

I note that you have a numeric business key and a numeric type 2 column. What are these data types? In my limited testing, I think I've only tryed out char and varchar business keys. I think I remember reading something here about a potential issue involving decimal keys and the lookup stage. The SCD stage has to include lookup functionality, so it wouldn't be surprising if that same issue existed.

Try experimenting with other scenarios to try and isolate a bug.

Mike
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

SKEY EMPNO ENAME ADD SALARY CURR_IND EFF_DATE
85 10001 John Doe Singapore 4000 Y 10/31/2008
85 0 0 N NULL
91 10001 John Doe Singapore 500 Y 10/31/2008
These rows result from a database query? Yes?
85 0 0 N NULL
would be what I might expect in a sequential file for an expiring row since only the surrogate key and current indicator are needed for the actual UPDATE statement.

Mike
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

If your surrogate key is defined as a primary key, then you can't have two rows with 85. I'm guessing that you are testing with a sequential file as output and not understanding that what you see is correct.

Mike
srini.dw
Premium Member
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

My design,

TargetTable(Reference)
|
|
V
SCD ----->ODBC Enterprise Stage
|
|
V
Dataset (Insert)

My apologize, I was using Upsert Mode --> Insert than Update

It should have been Update than insert.
My target was Terdadata database

Its working now, thank you
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Please mark thread as Resolved using the green button at top.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply