Fact to Fact
Moderators: chulett, rschirm, roy
-
- Charter Member
- Posts: 35
- Joined: Sun Jan 16, 2005 8:39 am
- Location: US
Fact to Fact
Hi Again,
I have a mindset where I am tempted to join two facts.
A fact to fact join.
Am I commiting suicide !!
Thanks.
I have a mindset where I am tempted to join two facts.
A fact to fact join.
Am I commiting suicide !!
Thanks.
Santosh
Re: Fact to Fact
Not if your fact tables are in tera bytessantoshkumar wrote: Am I commiting suicide !!
![Wink :wink:](./images/smilies/icon_wink.gif)
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
I assume your design is in the classic Kimball model and the compound key to the fact table is, in fact (hah!), the foreign keys to the dimensions. Therefore joining fact tables is really joining by dimension across fact tables - A federated Data Warehouse!
Naturally all facts are at the same level of granularity...
Naturally all facts are at the same level of granularity...
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Not if your database is Red Brick, which supports efficient fact-to-fact joins, constrained on as many dimensions as you wish, via a multi-table index called a star index.
It is worth adding that this technique - indeed the entire database now known as Red Brick - was originally designed by Ralph Kimball and his colleagues.
Red Brick is now owned by IBM, but I don't think anyone in IBM realizes what a jewel they've got.
It is worth adding that this technique - indeed the entire database now known as Red Brick - was originally designed by Ralph Kimball and his colleagues.
Red Brick is now owned by IBM, but I don't think anyone in IBM realizes what a jewel they've got.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Heresy! Fact to fact joins, arrrrgghh. Curse ye data modelers. Something when horribly wrong somewhere. If I were to get a tattoo, it just might say STAR SCHEMA.
Often folks mistake and usa a join for a 'merge' or 'union' operation. For example, say you have a bunch of sales order line items in one table, and a bunch of payment line items in another. You're asked to build a result set that tracks order metrics and payment metrics by item. The first thing folks do with SQL is join the two tables.
Consider instead an operation where you "map" sales order line item rows into a row of columns that looks like your final result column set, putting nulls or zeroes in the columns that pertain to payment line items. Do the same for the payment line item rows, and for the sales order related columns you put zeroes and nulls. Now, the union of those two virtual sets of data can be aggregated and grouped, using a MAX derivation for all of the attributes.
You've avoided join operations and all of that nastiness, simply scanned your fact tables with one pass, aggregated your results which deftly handles grain disparaties, and arrived at a result set much faster. You can count by source table rows that contributed to the final row and determine if you have an order with payments, or just payments, and thus filter out that row, which simulates an inner join.
This technique is a great way to handle grain shifting different fact tables and merging into aggregates that move to datamarts.
Often folks mistake and usa a join for a 'merge' or 'union' operation. For example, say you have a bunch of sales order line items in one table, and a bunch of payment line items in another. You're asked to build a result set that tracks order metrics and payment metrics by item. The first thing folks do with SQL is join the two tables.
Consider instead an operation where you "map" sales order line item rows into a row of columns that looks like your final result column set, putting nulls or zeroes in the columns that pertain to payment line items. Do the same for the payment line item rows, and for the sales order related columns you put zeroes and nulls. Now, the union of those two virtual sets of data can be aggregated and grouped, using a MAX derivation for all of the attributes.
You've avoided join operations and all of that nastiness, simply scanned your fact tables with one pass, aggregated your results which deftly handles grain disparaties, and arrived at a result set much faster. You can count by source table rows that contributed to the final row and determine if you have an order with payments, or just payments, and thus filter out that row, which simulates an inner join.
This technique is a great way to handle grain shifting different fact tables and merging into aggregates that move to datamarts.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: