DW Modeling question/concept

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
marc_brown98
Premium Member
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

DW Modeling question/concept

Post by marc_brown98 »

All,
Just looking to get a little feedback on an issue that we are trying to deal with. We have a dimensional model that we are using, quite happily, but the end-users would like to do more benchmarking analylsis, to track retail centers against each other etc. They would like to do this inside of a report. We can compute the ranking in the report, but it is complex and tends to slow the reporting process down, esp. when the ranking takes 4 queries to chug through a lot of data. I was thinking of calculating the rank of each retail center and creating a subject area that tracked ranks over time(ie fact & dimension tables). Has anybody seen something like this or have similar experience?

Thanks
Marc
KeithM
Participant
Posts: 61
Joined: Thu Apr 22, 2004 11:34 am
Contact:

Post by KeithM »

We do this type of thing all the time. Dimensional models work great but sometimes the reporting requirements cause you to look for other options. You can have the greatest dimensional model possible but if it takes too long to generate a report, the users are not really going to care. Perception is everything.

Generally what we do is we have an enterprise data repository. The repository is kept in the dimensional model. Off of the repository we build reporting marts. These are generally designed to follow the dimensional model but when reporting requires some complex processing like what you have, we build summary tables were the values can be quickly retrieved.
Keith
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: DW Modeling question/concept

Post by ogmios »

Not really a DataStage question. No functionality for that is natively built into DataStage.

In Oracle you could use "Materialized Views" e.g. to speed up your queries. or you can write ETL programs to do the calculations yourself. The ranking will probably have to be done in the database.

Besides the books by Kimball I can also recommend "Data Warehouse Design Solutions" by Adamson and Venerable. There's also a small piece on what you try to do in there.

Ogmios
In theory there's no difference between theory and practice. In practice there is.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Ogmios

I think they were wanting to calculate the ranking ahead of time and store in some kind of summary table so the report runs faster. DataStage can do this. You need to think about the date ranges like quarterly ranking or annual or both. In healthcare this is very common to rank doctors by how much they spend (Cost per member per month). Extremely effective reports. You are using the data to set an average or a base line cost and then compare individual doctors to this base line. These numbers require multiple passes through the data and to build summary tables are the only way to display these numbers in an OLAP environment. Joining summary fact tables can be hard to setup in the metadata of MicroStrategy or some other tool but better than calculating this at run time.
Mamu Kim
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

You can do anything with DataStage :lol: ... what I meant is that there's no stage or so to magically have such rankings created... it's manual labour which is a lot of times underestimated :wink:

The book I recommend in the previous reply has some examples of such things in it (e.g. in the chapter on OLAP reporting).

Ogmios
In theory there's no difference between theory and practice. In practice there is.
marc_brown98
Premium Member
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Post by marc_brown98 »

The rollups over time is where I see the problems for this. For instance, if they want to see the ranking on a weekly, monthly, quarterly basis. I would need to rollup revenue & cost information for each time frame, etc. good ideas...
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

marc_brown98 wrote:The rollups over time is where I see the problems for this. For instance, if they want to see the ranking on a weekly, monthly, quarterly basis. I would need to rollup revenue & cost information for each time frame, etc. good ideas...
If your database is Oracle "Materialized Views" and "Oracle Dimensions" (not to be confused with datawarehousing dimensions), and most of your roll-ups will be automatic... in the assumptions your measures are additive and you have a strict hierarchy.

Ogmios
In theory there's no difference between theory and practice. In practice there is.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Rollups over time are not possible the way I explained. You need to pick specific time frames like monthly, quarterly and annually. The speed gained by summary tables usually out weighs the rollups on the fly. It is a trade off. It depends on the number of transactions.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Plug

Post by ray.wurlod »

Red Brick Warehouse (now owned by IBM) has RANK as one of the functions built in to its SQL. Also tiling (e.g. TERTILE, percentage, N-tile), moving sums and moving averages, running totals and cumulations, and lots more.

It's a database designed specifically for implementing the dimensional model, with query priority (and very fast too).

If only IBM knew what they'd bought. :roll:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
marc_brown98
Premium Member
Premium Member
Posts: 67
Joined: Wed Apr 14, 2004 11:33 am

Post by marc_brown98 »

Ray,
Yup. I wish I had a copy of RBW for what we are doing. We had Rank, Ntile, moving avg, weighted moving avg. functions years ago inside RISQL. Alas, we are running MS SOL Server, err, SQL Server.;) The corp standard might very well become Oracle. I think Big Blue has a case of rectal cranial inversion when it comes to DB technology that they own/control.
Post Reply