type 1 and type 2 dimensions
Moderators: chulett, rschirm, roy
type 1 and type 2 dimensions
hi there
I have this DIM_Customer table aginst the columns the requirment is written tht this column is type 1 and this column is type 2 dimension
like County_City_Code 1(type1)
Nomenclature_Code 2(type2)
and so on
1) they want effdate and enddate aginst all type 2 dimensions
2) is tht possible to have type 1 and type 2 on all columns
i am bit confused throw some light
thanks
bobby
DIM_CUSTOMER
Customer_No (pk)
Cust_Key (pk)
Cust_Name 1
Cust_Address_1 1
Cust_Address_2 1
Cust_Cityprovince 1
Cust_Postal_Code 1
County_City_Code 1
Nomenclature_Code 2
Cust_Status 1
Store_No 1
Store_Short_Name 1
Sales_Group 2
Sales_Subgroup 2
Cust_Type 1
Sold_To_Cust_No 1
Invoice_To_Cust_No 1
Bus_Class 2
Credit_Limit 1
Company_Code 1
Gl_Group_Code 2
Division_Code 1
Cust_Salesman_Code 2
Class_Code 2
Cust_Territory 2
Cust_Price_Column_Code 1
Nomenclature_Mkt_Seg_Descr_Eng 2
Nomenclature_Mkt_Seg_Descr_Fr 2
Nomenclature_Cust_typ_Desc_Eng 2
Nomenclature_Cust_type_Desc_Fr 2
Nomenclature_Company_Size_Desc 2
Class_Descr_Eng 2
Class_Descr_Fr 2
Cust_Country_Eng 2
Cust_Territory_Eng 2
Cust_Territory_Fr 2
Cust_Salesman_Name 2
Corp_Cust_No 2
Mkt_Segment_Descr 2
Mkt_Segment 2
Bus_Class_Descr 2
Gl_Group_Code_Descr 2
I have this DIM_Customer table aginst the columns the requirment is written tht this column is type 1 and this column is type 2 dimension
like County_City_Code 1(type1)
Nomenclature_Code 2(type2)
and so on
1) they want effdate and enddate aginst all type 2 dimensions
2) is tht possible to have type 1 and type 2 on all columns
i am bit confused throw some light
thanks
bobby
DIM_CUSTOMER
Customer_No (pk)
Cust_Key (pk)
Cust_Name 1
Cust_Address_1 1
Cust_Address_2 1
Cust_Cityprovince 1
Cust_Postal_Code 1
County_City_Code 1
Nomenclature_Code 2
Cust_Status 1
Store_No 1
Store_Short_Name 1
Sales_Group 2
Sales_Subgroup 2
Cust_Type 1
Sold_To_Cust_No 1
Invoice_To_Cust_No 1
Bus_Class 2
Credit_Limit 1
Company_Code 1
Gl_Group_Code 2
Division_Code 1
Cust_Salesman_Code 2
Class_Code 2
Cust_Territory 2
Cust_Price_Column_Code 1
Nomenclature_Mkt_Seg_Descr_Eng 2
Nomenclature_Mkt_Seg_Descr_Fr 2
Nomenclature_Cust_typ_Desc_Eng 2
Nomenclature_Cust_type_Desc_Fr 2
Nomenclature_Company_Size_Desc 2
Class_Descr_Eng 2
Class_Descr_Fr 2
Cust_Country_Eng 2
Cust_Territory_Eng 2
Cust_Territory_Fr 2
Cust_Salesman_Name 2
Corp_Cust_No 2
Mkt_Segment_Descr 2
Mkt_Segment 2
Bus_Class_Descr 2
Gl_Group_Code_Descr 2
A dimension is of one type, not some odd combination of types. So, to keep history you would need it to be a Type 2 and thus keep history on all columns.
If you really need a mixture, then the "type 2 columns" would need to come out into a child table.
If you really need a mixture, then the "type 2 columns" would need to come out into a child table.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Please make Craig's point to whomever gave you this "requirement", and suggest that they read the Data Warehouse Lifecycle Toolkit by Ralph Kimball and others in order to learn about what the various types of slowly changing dimensions actually are. For they seem not to understand the concept.
Let us know what they say!
Let us know what they say!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Not if it includes effective date ranges for the changes, that's indicative of a type 2. A type 3 is, in a way, just a stack of tracked fields of a fixed number with values pushed down the line as a change is detected. Eventually, changes 'fall off' the end and are lost.
I suppose you could do some combination of the above - but it would be odd, atypical and a pain in the patootie to use.
I suppose you could do some combination of the above - but it would be odd, atypical and a pain in the patootie to use.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Hi there,
In case of RCD(rapidly changing dimensions), where the frequency of change is very high. It would be difficult to keep track of all the Columns due to disk space and other constraints.
In such case,Normally the dimension is partitioned into 2 tables,
1. One containing the columns,whose history is not maintained
2. The other contains the columns whose history is maintained(Frequently changing Columns).
The Tables are related using foreign key relationship.
The Table 1 will be using Type 1 SCD and Table 2 will be using Type2 or Tyep 3.
In such case , combinations of SCD types can be handled.
Thanks,
Gokul
In case of RCD(rapidly changing dimensions), where the frequency of change is very high. It would be difficult to keep track of all the Columns due to disk space and other constraints.
In such case,Normally the dimension is partitioned into 2 tables,
1. One containing the columns,whose history is not maintained
2. The other contains the columns whose history is maintained(Frequently changing Columns).
The Tables are related using foreign key relationship.
The Table 1 will be using Type 1 SCD and Table 2 will be using Type2 or Tyep 3.
In such case , combinations of SCD types can be handled.
Thanks,
Gokul
The SCD is handled in IWS (Industry warehouse studio by Sybase) as mentioned by Gokul. Frequently changing dimension attributes are stored in a separate dimension tables. All of them are maintained as type 1 dimensions and the relationship of these dimension in the separate table and that table is considered as type 2 dimension (Profile table as IWS terminology). This approach would reduce the no of records in all the dimension tables and query performance would be better.
Regards
Saravanan
Regards
Saravanan
type 1 and type 2 dimensions the requirment
hi
to add more on tht wht the requirment is
FIRST TIME RECORD COME
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
SECOND TIME RECORD COME Nomenclature_Code changes
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
12 101 10052 2001 ABC 06/28/2005
This time the Nomenclature_Code changes and it should look like
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC 06/28/2005
THIRD TIME RECORD COME code CHANGES
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
FOURTH TIME RECORD COMES CUTNAME CHANGES NOW
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
19 101 10052 2004 CDE NOCHANGE NOCHANGE
LIKE FOR NOMENCULATURE_CODE AND CODE EFFDATE AND END DATE CHANGES BUT NOT FOR CUSTNAME
ITS JUST A SAMPLE THTS THEY WANT LIKE SO WHTS BEST APPROACH WHT U SUGGEST
THANKS
BOBBY
to add more on tht wht the requirment is
FIRST TIME RECORD COME
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
SECOND TIME RECORD COME Nomenclature_Code changes
custkey customer_no Nomenclature_Code code custname effdate endate
1 101 10050 2001 ABC 06/27/2005 -
12 101 10052 2001 ABC 06/28/2005
This time the Nomenclature_Code changes and it should look like
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC 06/28/2005
THIRD TIME RECORD COME code CHANGES
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
FOURTH TIME RECORD COMES CUTNAME CHANGES NOW
custkey customer_no Nomenclature_Code code CUSTNAME effdate endate
1 101 10050 2001 ABC - 06/27/2005(effdate-1 for Nomenclature_Code=10052)
12 101 10052 2001 ABC - 06/30/2005(effdate-1 for code=2004)
17 101 10052 2004 ABC 06/30/2005
19 101 10052 2004 CDE NOCHANGE NOCHANGE
LIKE FOR NOMENCULATURE_CODE AND CODE EFFDATE AND END DATE CHANGES BUT NOT FOR CUSTNAME
ITS JUST A SAMPLE THTS THEY WANT LIKE SO WHTS BEST APPROACH WHT U SUGGEST
THANKS
BOBBY
Code: Select all
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It's not our decision - it's theirs.
They must tell you what information they want stored; for which fields it is essential to keep full history, for which fields partial history (maybe the most recent three changes) would be OK, and for which fields they are only interested in the most recent value (for example telephone number).
The good news is that, whatever they decide, it can be done with DataStage. In most cases, change detection can be achieved via a lookup against the target table (or a copy of relevant rows and columns from it in a hashed file).
They must tell you what information they want stored; for which fields it is essential to keep full history, for which fields partial history (maybe the most recent three changes) would be OK, and for which fields they are only interested in the most recent value (for example telephone number).
The good news is that, whatever they decide, it can be done with DataStage. In most cases, change detection can be achieved via a lookup against the target table (or a copy of relevant rows and columns from it in a hashed file).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.