Page 1 of 1

type 1 and type 2 dimensions

Posted: Mon Jun 27, 2005 7:03 pm
by bobby
hi there
I have this DIM_Customer table aginst the columns the requirment is written tht this column is type 1 and this column is type 2 dimension
like County_City_Code 1(type1)
Nomenclature_Code 2(type2)
and so on

1) they want effdate and enddate aginst all type 2 dimensions

2) is tht possible to have type 1 and type 2 on all columns

i am bit confused throw some light
thanks
bobby



DIM_CUSTOMER
Customer_No (pk)
Cust_Key (pk)
Cust_Name 1
Cust_Address_1 1
Cust_Address_2 1
Cust_Cityprovince 1
Cust_Postal_Code 1
County_City_Code 1
Nomenclature_Code 2
Cust_Status 1
Store_No 1
Store_Short_Name 1
Sales_Group 2
Sales_Subgroup 2
Cust_Type 1
Sold_To_Cust_No 1
Invoice_To_Cust_No 1
Bus_Class 2
Credit_Limit 1
Company_Code 1
Gl_Group_Code 2
Division_Code 1
Cust_Salesman_Code 2
Class_Code 2
Cust_Territory 2
Cust_Price_Column_Code 1
Nomenclature_Mkt_Seg_Descr_Eng 2
Nomenclature_Mkt_Seg_Descr_Fr 2
Nomenclature_Cust_typ_Desc_Eng 2
Nomenclature_Cust_type_Desc_Fr 2
Nomenclature_Company_Size_Desc 2
Class_Descr_Eng 2
Class_Descr_Fr 2
Cust_Country_Eng 2
Cust_Territory_Eng 2
Cust_Territory_Fr 2
Cust_Salesman_Name 2
Corp_Cust_No 2
Mkt_Segment_Descr 2
Mkt_Segment 2
Bus_Class_Descr 2
Gl_Group_Code_Descr 2

Posted: Mon Jun 27, 2005 7:08 pm
by chulett
A dimension is of one type, not some odd combination of types. So, to keep history you would need it to be a Type 2 and thus keep history on all columns.

If you really need a mixture, then the "type 2 columns" would need to come out into a child table.

Posted: Mon Jun 27, 2005 7:30 pm
by ray.wurlod
Please make Craig's point to whomever gave you this "requirement", and suggest that they read the Data Warehouse Lifecycle Toolkit by Ralph Kimball and others in order to learn about what the various types of slowly changing dimensions actually are. For they seem not to understand the concept.
Let us know what they say!

Posted: Mon Jun 27, 2005 8:53 pm
by kduke
This is type 3 isn't it?

Posted: Mon Jun 27, 2005 9:17 pm
by chulett
Not if it includes effective date ranges for the changes, that's indicative of a type 2. A type 3 is, in a way, just a stack of tracked fields of a fixed number with values pushed down the line as a change is detected. Eventually, changes 'fall off' the end and are lost.

I suppose you could do some combination of the above - but it would be odd, atypical and a pain in the patootie to use.

Posted: Mon Jun 27, 2005 10:54 pm
by kduke
There are 2 versions of ech field. Agreed it is ugly but type 3.

Posted: Mon Jun 27, 2005 11:00 pm
by Gokul
Hi there,

In case of RCD(rapidly changing dimensions), where the frequency of change is very high. It would be difficult to keep track of all the Columns due to disk space and other constraints.

In such case,Normally the dimension is partitioned into 2 tables,
1. One containing the columns,whose history is not maintained
2. The other contains the columns whose history is maintained(Frequently changing Columns).

The Tables are related using foreign key relationship.

The Table 1 will be using Type 1 SCD and Table 2 will be using Type2 or Tyep 3.

In such case , combinations of SCD types can be handled.

Thanks,
Gokul

Posted: Tue Jun 28, 2005 12:44 am
by elavenil
The SCD is handled in IWS (Industry warehouse studio by Sybase) as mentioned by Gokul. Frequently changing dimension attributes are stored in a separate dimension tables. All of them are maintained as type 1 dimensions and the relationship of these dimension in the separate table and that table is considered as type 2 dimension (Profile table as IWS terminology). This approach would reduce the no of records in all the dimension tables and query performance would be better.

Regards
Saravanan

Posted: Tue Jun 28, 2005 6:37 am
by chulett
chulett wrote:If you really need a mixture, then the "type 2 columns" would need to come out into a child table.
As mentioned in my first post. Just a little shorter. :wink:

Posted: Tue Jun 28, 2005 8:12 am
by kduke
Sorry I asked. This gets uglier as it goes. I understand the concept. Is this RCD type 42 or SCD type 42. When it gets to 42 it is the answer to all our problems. Right?

type 1 and type 2 dimensions the requirment

Posted: Tue Jun 28, 2005 5:44 pm
by bobby
hi
to add more on tht wht the requirment is
FIRST TIME RECORD COME
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -

SECOND TIME RECORD COME Nomenclature_Code changes

custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
12 101 10052 2001 ABC 06/28/2005

This time the Nomenclature_Code changes and it should look like

custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC 06/28/2005


THIRD TIME RECORD COME code CHANGES
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005

FOURTH TIME RECORD COMES CUTNAME CHANGES NOW
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
19 101 10052 2004 CDE NOCHANGE NOCHANGE
LIKE FOR NOMENCULATURE_CODE AND CODE EFFDATE AND END DATE CHANGES BUT NOT FOR CUSTNAME
ITS JUST A SAMPLE THTS THEY WANT LIKE SO WHTS BEST APPROACH WHT U SUGGEST
THANKS
BOBBY

Posted: Tue Jun 28, 2005 6:44 pm
by ray.wurlod
It's not our decision - it's theirs.

They must tell you what information they want stored; for which fields it is essential to keep full history, for which fields partial history (maybe the most recent three changes) would be OK, and for which fields they are only interested in the most recent value (for example telephone number).

The good news is that, whatever they decide, it can be done with DataStage. In most cases, change detection can be achieved via a lookup against the target table (or a copy of relevant rows and columns from it in a hashed file).