type 1 and type 2 dimensions

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
bobby
Participant
Posts: 86
Joined: Mon Jul 19, 2004 8:31 pm

type 1 and type 2 dimensions

Post by bobby »

hi there
I have this DIM_Customer table aginst the columns the requirment is written tht this column is type 1 and this column is type 2 dimension
like County_City_Code 1(type1)
Nomenclature_Code 2(type2)
and so on

1) they want effdate and enddate aginst all type 2 dimensions

2) is tht possible to have type 1 and type 2 on all columns

i am bit confused throw some light
thanks
bobby



DIM_CUSTOMER
Customer_No (pk)
Cust_Key (pk)
Cust_Name 1
Cust_Address_1 1
Cust_Address_2 1
Cust_Cityprovince 1
Cust_Postal_Code 1
County_City_Code 1
Nomenclature_Code 2
Cust_Status 1
Store_No 1
Store_Short_Name 1
Sales_Group 2
Sales_Subgroup 2
Cust_Type 1
Sold_To_Cust_No 1
Invoice_To_Cust_No 1
Bus_Class 2
Credit_Limit 1
Company_Code 1
Gl_Group_Code 2
Division_Code 1
Cust_Salesman_Code 2
Class_Code 2
Cust_Territory 2
Cust_Price_Column_Code 1
Nomenclature_Mkt_Seg_Descr_Eng 2
Nomenclature_Mkt_Seg_Descr_Fr 2
Nomenclature_Cust_typ_Desc_Eng 2
Nomenclature_Cust_type_Desc_Fr 2
Nomenclature_Company_Size_Desc 2
Class_Descr_Eng 2
Class_Descr_Fr 2
Cust_Country_Eng 2
Cust_Territory_Eng 2
Cust_Territory_Fr 2
Cust_Salesman_Name 2
Corp_Cust_No 2
Mkt_Segment_Descr 2
Mkt_Segment 2
Bus_Class_Descr 2
Gl_Group_Code_Descr 2
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A dimension is of one type, not some odd combination of types. So, to keep history you would need it to be a Type 2 and thus keep history on all columns.

If you really need a mixture, then the "type 2 columns" would need to come out into a child table.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Please make Craig's point to whomever gave you this "requirement", and suggest that they read the Data Warehouse Lifecycle Toolkit by Ralph Kimball and others in order to learn about what the various types of slowly changing dimensions actually are. For they seem not to understand the concept.
Let us know what they say!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

This is type 3 isn't it?
Mamu Kim
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not if it includes effective date ranges for the changes, that's indicative of a type 2. A type 3 is, in a way, just a stack of tracked fields of a fixed number with values pushed down the line as a change is detected. Eventually, changes 'fall off' the end and are lost.

I suppose you could do some combination of the above - but it would be odd, atypical and a pain in the patootie to use.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

There are 2 versions of ech field. Agreed it is ugly but type 3.
Mamu Kim
Gokul
Participant
Posts: 74
Joined: Wed Feb 23, 2005 10:58 pm
Location: Mumbai

Post by Gokul »

Hi there,

In case of RCD(rapidly changing dimensions), where the frequency of change is very high. It would be difficult to keep track of all the Columns due to disk space and other constraints.

In such case,Normally the dimension is partitioned into 2 tables,
1. One containing the columns,whose history is not maintained
2. The other contains the columns whose history is maintained(Frequently changing Columns).

The Tables are related using foreign key relationship.

The Table 1 will be using Type 1 SCD and Table 2 will be using Type2 or Tyep 3.

In such case , combinations of SCD types can be handled.

Thanks,
Gokul
elavenil
Premium Member
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

The SCD is handled in IWS (Industry warehouse studio by Sybase) as mentioned by Gokul. Frequently changing dimension attributes are stored in a separate dimension tables. All of them are maintained as type 1 dimensions and the relationship of these dimension in the separate table and that table is considered as type 2 dimension (Profile table as IWS terminology). This approach would reduce the no of records in all the dimension tables and query performance would be better.

Regards
Saravanan
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

chulett wrote:If you really need a mixture, then the "type 2 columns" would need to come out into a child table.
As mentioned in my first post. Just a little shorter. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Sorry I asked. This gets uglier as it goes. I understand the concept. Is this RCD type 42 or SCD type 42. When it gets to 42 it is the answer to all our problems. Right?
Mamu Kim
bobby
Participant
Posts: 86
Joined: Mon Jul 19, 2004 8:31 pm

type 1 and type 2 dimensions the requirment

Post by bobby »

hi
to add more on tht wht the requirment is
FIRST TIME RECORD COME
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -

SECOND TIME RECORD COME Nomenclature_Code changes

custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
12 101 10052 2001 ABC 06/28/2005

This time the Nomenclature_Code changes and it should look like

custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC 06/28/2005


THIRD TIME RECORD COME code CHANGES
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005

FOURTH TIME RECORD COMES CUTNAME CHANGES NOW
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
19 101 10052 2004 CDE NOCHANGE NOCHANGE
LIKE FOR NOMENCULATURE_CODE AND CODE EFFDATE AND END DATE CHANGES BUT NOT FOR CUSTNAME
ITS JUST A SAMPLE THTS THEY WANT LIKE SO WHTS BEST APPROACH WHT U SUGGEST
THANKS
BOBBY
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's not our decision - it's theirs.

They must tell you what information they want stored; for which fields it is essential to keep full history, for which fields partial history (maybe the most recent three changes) would be OK, and for which fields they are only interested in the most recent value (for example telephone number).

The good news is that, whatever they decide, it can be done with DataStage. In most cases, change detection can be achieved via a lookup against the target table (or a copy of relevant rows and columns from it in a hashed file).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply