Page 1 of 1

DW Modeling question/concept

Posted: Fri Oct 15, 2004 12:45 pm
by marc_brown98
All,
Just looking to get a little feedback on an issue that we are trying to deal with. We have a dimensional model that we are using, quite happily, but the end-users would like to do more benchmarking analylsis, to track retail centers against each other etc. They would like to do this inside of a report. We can compute the ranking in the report, but it is complex and tends to slow the reporting process down, esp. when the ranking takes 4 queries to chug through a lot of data. I was thinking of calculating the rank of each retail center and creating a subject area that tracked ranks over time(ie fact & dimension tables). Has anybody seen something like this or have similar experience?

Thanks
Marc

Posted: Fri Oct 15, 2004 1:13 pm
by KeithM
We do this type of thing all the time. Dimensional models work great but sometimes the reporting requirements cause you to look for other options. You can have the greatest dimensional model possible but if it takes too long to generate a report, the users are not really going to care. Perception is everything.

Generally what we do is we have an enterprise data repository. The repository is kept in the dimensional model. Off of the repository we build reporting marts. These are generally designed to follow the dimensional model but when reporting requires some complex processing like what you have, we build summary tables were the values can be quickly retrieved.

Re: DW Modeling question/concept

Posted: Fri Oct 15, 2004 1:18 pm
by ogmios
Not really a DataStage question. No functionality for that is natively built into DataStage.

In Oracle you could use "Materialized Views" e.g. to speed up your queries. or you can write ETL programs to do the calculations yourself. The ranking will probably have to be done in the database.

Besides the books by Kimball I can also recommend "Data Warehouse Design Solutions" by Adamson and Venerable. There's also a small piece on what you try to do in there.

Ogmios

Posted: Fri Oct 15, 2004 2:13 pm
by kduke
Ogmios

I think they were wanting to calculate the ranking ahead of time and store in some kind of summary table so the report runs faster. DataStage can do this. You need to think about the date ranges like quarterly ranking or annual or both. In healthcare this is very common to rank doctors by how much they spend (Cost per member per month). Extremely effective reports. You are using the data to set an average or a base line cost and then compare individual doctors to this base line. These numbers require multiple passes through the data and to build summary tables are the only way to display these numbers in an OLAP environment. Joining summary fact tables can be hard to setup in the metadata of MicroStrategy or some other tool but better than calculating this at run time.

Posted: Fri Oct 15, 2004 2:17 pm
by ogmios
You can do anything with DataStage :lol: ... what I meant is that there's no stage or so to magically have such rankings created... it's manual labour which is a lot of times underestimated :wink:

The book I recommend in the previous reply has some examples of such things in it (e.g. in the chapter on OLAP reporting).

Ogmios

Posted: Fri Oct 15, 2004 2:38 pm
by marc_brown98
The rollups over time is where I see the problems for this. For instance, if they want to see the ranking on a weekly, monthly, quarterly basis. I would need to rollup revenue & cost information for each time frame, etc. good ideas...

Posted: Fri Oct 15, 2004 2:44 pm
by ogmios
marc_brown98 wrote:The rollups over time is where I see the problems for this. For instance, if they want to see the ranking on a weekly, monthly, quarterly basis. I would need to rollup revenue & cost information for each time frame, etc. good ideas...
If your database is Oracle "Materialized Views" and "Oracle Dimensions" (not to be confused with datawarehousing dimensions), and most of your roll-ups will be automatic... in the assumptions your measures are additive and you have a strict hierarchy.

Ogmios

Posted: Fri Oct 15, 2004 5:43 pm
by kduke
Rollups over time are not possible the way I explained. You need to pick specific time frames like monthly, quarterly and annually. The speed gained by summary tables usually out weighs the rollups on the fly. It is a trade off. It depends on the number of transactions.

Plug

Posted: Fri Oct 15, 2004 6:08 pm
by ray.wurlod
Red Brick Warehouse (now owned by IBM) has RANK as one of the functions built in to its SQL. Also tiling (e.g. TERTILE, percentage, N-tile), moving sums and moving averages, running totals and cumulations, and lots more.

It's a database designed specifically for implementing the dimensional model, with query priority (and very fast too).

If only IBM knew what they'd bought. :roll:

Posted: Mon Oct 18, 2004 7:56 am
by marc_brown98
Ray,
Yup. I wish I had a copy of RBW for what we are doing. We had Rank, Ntile, moving avg, weighted moving avg. functions years ago inside RISQL. Alas, we are running MS SOL Server, err, SQL Server.;) The corp standard might very well become Oracle. I think Big Blue has a case of rectal cranial inversion when it comes to DB technology that they own/control.