Fact to Fact

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
santoshkumar
Charter Member
Charter Member
Posts: 35
Joined: Sun Jan 16, 2005 8:39 am
Location: US

Fact to Fact

Post by santoshkumar »

Hi Again,

I have a mindset where I am tempted to join two facts.

A fact to fact join.

Am I commiting suicide !!

Thanks.
Santosh
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

When you say fact, you mean a fact table right?
I don't see any harm in joining two fact tables, if that is what is your requirement is.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Re: Fact to Fact

Post by DSguru2B »

santoshkumar wrote: Am I commiting suicide !!
Not if your fact tables are in tera bytes :wink:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
trobinson
Participant
Posts: 208
Joined: Thu Apr 11, 2002 6:02 am
Location: Saint Louis
Contact:

Post by trobinson »

I assume your design is in the classic Kimball model and the compound key to the fact table is, in fact (hah!), the foreign keys to the dimensions. Therefore joining fact tables is really joining by dimension across fact tables - A federated Data Warehouse!
Naturally all facts are at the same level of granularity...
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Not if your database is Red Brick, which supports efficient fact-to-fact joins, constrained on as many dimensions as you wish, via a multi-table index called a star index.

It is worth adding that this technique - indeed the entire database now known as Red Brick - was originally designed by Ralph Kimball and his colleagues.

Red Brick is now owned by IBM, but I don't think anyone in IBM realizes what a jewel they've got.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Heresy! Fact to fact joins, arrrrgghh. Curse ye data modelers. Something when horribly wrong somewhere. If I were to get a tattoo, it just might say STAR SCHEMA.

Often folks mistake and usa a join for a 'merge' or 'union' operation. For example, say you have a bunch of sales order line items in one table, and a bunch of payment line items in another. You're asked to build a result set that tracks order metrics and payment metrics by item. The first thing folks do with SQL is join the two tables.

Consider instead an operation where you "map" sales order line item rows into a row of columns that looks like your final result column set, putting nulls or zeroes in the columns that pertain to payment line items. Do the same for the payment line item rows, and for the sales order related columns you put zeroes and nulls. Now, the union of those two virtual sets of data can be aggregated and grouped, using a MAX derivation for all of the attributes.

You've avoided join operations and all of that nastiness, simply scanned your fact tables with one pass, aggregated your results which deftly handles grain disparaties, and arrived at a result set much faster. You can count by source table rows that contributed to the final row and determine if you have an order with payments, or just payments, and thus filter out that row, which simulates an inner join.

This technique is a great way to handle grain shifting different fact tables and merging into aggregates that move to datamarts.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That, too.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
trobinson
Participant
Posts: 208
Joined: Thu Apr 11, 2002 6:02 am
Location: Saint Louis
Contact:

Post by trobinson »

"I'm intrigued by your views and would like to subscribe to your newsletter."
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Very smart analysis and suggestion by Ken. Two thumbs up, Ken :wink:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Post Reply