DW Modeling question/concept
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 67
- Joined: Wed Apr 14, 2004 11:33 am
DW Modeling question/concept
All,
Just looking to get a little feedback on an issue that we are trying to deal with. We have a dimensional model that we are using, quite happily, but the end-users would like to do more benchmarking analylsis, to track retail centers against each other etc. They would like to do this inside of a report. We can compute the ranking in the report, but it is complex and tends to slow the reporting process down, esp. when the ranking takes 4 queries to chug through a lot of data. I was thinking of calculating the rank of each retail center and creating a subject area that tracked ranks over time(ie fact & dimension tables). Has anybody seen something like this or have similar experience?
Thanks
Marc
Just looking to get a little feedback on an issue that we are trying to deal with. We have a dimensional model that we are using, quite happily, but the end-users would like to do more benchmarking analylsis, to track retail centers against each other etc. They would like to do this inside of a report. We can compute the ranking in the report, but it is complex and tends to slow the reporting process down, esp. when the ranking takes 4 queries to chug through a lot of data. I was thinking of calculating the rank of each retail center and creating a subject area that tracked ranks over time(ie fact & dimension tables). Has anybody seen something like this or have similar experience?
Thanks
Marc
We do this type of thing all the time. Dimensional models work great but sometimes the reporting requirements cause you to look for other options. You can have the greatest dimensional model possible but if it takes too long to generate a report, the users are not really going to care. Perception is everything.
Generally what we do is we have an enterprise data repository. The repository is kept in the dimensional model. Off of the repository we build reporting marts. These are generally designed to follow the dimensional model but when reporting requires some complex processing like what you have, we build summary tables were the values can be quickly retrieved.
Generally what we do is we have an enterprise data repository. The repository is kept in the dimensional model. Off of the repository we build reporting marts. These are generally designed to follow the dimensional model but when reporting requires some complex processing like what you have, we build summary tables were the values can be quickly retrieved.
Keith
Re: DW Modeling question/concept
Not really a DataStage question. No functionality for that is natively built into DataStage.
In Oracle you could use "Materialized Views" e.g. to speed up your queries. or you can write ETL programs to do the calculations yourself. The ranking will probably have to be done in the database.
Besides the books by Kimball I can also recommend "Data Warehouse Design Solutions" by Adamson and Venerable. There's also a small piece on what you try to do in there.
Ogmios
In Oracle you could use "Materialized Views" e.g. to speed up your queries. or you can write ETL programs to do the calculations yourself. The ranking will probably have to be done in the database.
Besides the books by Kimball I can also recommend "Data Warehouse Design Solutions" by Adamson and Venerable. There's also a small piece on what you try to do in there.
Ogmios
In theory there's no difference between theory and practice. In practice there is.
Ogmios
I think they were wanting to calculate the ranking ahead of time and store in some kind of summary table so the report runs faster. DataStage can do this. You need to think about the date ranges like quarterly ranking or annual or both. In healthcare this is very common to rank doctors by how much they spend (Cost per member per month). Extremely effective reports. You are using the data to set an average or a base line cost and then compare individual doctors to this base line. These numbers require multiple passes through the data and to build summary tables are the only way to display these numbers in an OLAP environment. Joining summary fact tables can be hard to setup in the metadata of MicroStrategy or some other tool but better than calculating this at run time.
I think they were wanting to calculate the ranking ahead of time and store in some kind of summary table so the report runs faster. DataStage can do this. You need to think about the date ranges like quarterly ranking or annual or both. In healthcare this is very common to rank doctors by how much they spend (Cost per member per month). Extremely effective reports. You are using the data to set an average or a base line cost and then compare individual doctors to this base line. These numbers require multiple passes through the data and to build summary tables are the only way to display these numbers in an OLAP environment. Joining summary fact tables can be hard to setup in the metadata of MicroStrategy or some other tool but better than calculating this at run time.
Mamu Kim
You can do anything with DataStage
... what I meant is that there's no stage or so to magically have such rankings created... it's manual labour which is a lot of times underestimated ![Wink :wink:](./images/smilies/icon_wink.gif)
The book I recommend in the previous reply has some examples of such things in it (e.g. in the chapter on OLAP reporting).
Ogmios
![Laughing :lol:](./images/smilies/icon_lol.gif)
![Wink :wink:](./images/smilies/icon_wink.gif)
The book I recommend in the previous reply has some examples of such things in it (e.g. in the chapter on OLAP reporting).
Ogmios
In theory there's no difference between theory and practice. In practice there is.
-
- Premium Member
- Posts: 67
- Joined: Wed Apr 14, 2004 11:33 am
If your database is Oracle "Materialized Views" and "Oracle Dimensions" (not to be confused with datawarehousing dimensions), and most of your roll-ups will be automatic... in the assumptions your measures are additive and you have a strict hierarchy.marc_brown98 wrote:The rollups over time is where I see the problems for this. For instance, if they want to see the ranking on a weekly, monthly, quarterly basis. I would need to rollup revenue & cost information for each time frame, etc. good ideas...
Ogmios
In theory there's no difference between theory and practice. In practice there is.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Plug
Red Brick Warehouse (now owned by IBM) has RANK as one of the functions built in to its SQL. Also tiling (e.g. TERTILE, percentage, N-tile), moving sums and moving averages, running totals and cumulations, and lots more.
It's a database designed specifically for implementing the dimensional model, with query priority (and very fast too).
If only IBM knew what they'd bought.![Rolling Eyes :roll:](./images/smilies/icon_rolleyes.gif)
It's a database designed specifically for implementing the dimensional model, with query priority (and very fast too).
If only IBM knew what they'd bought.
![Rolling Eyes :roll:](./images/smilies/icon_rolleyes.gif)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 67
- Joined: Wed Apr 14, 2004 11:33 am
Ray,
Yup. I wish I had a copy of RBW for what we are doing. We had Rank, Ntile, moving avg, weighted moving avg. functions years ago inside RISQL. Alas, we are running MS SOL Server, err, SQL Server.
The corp standard might very well become Oracle. I think Big Blue has a case of rectal cranial inversion when it comes to DB technology that they own/control.
Yup. I wish I had a copy of RBW for what we are doing. We had Rank, Ntile, moving avg, weighted moving avg. functions years ago inside RISQL. Alas, we are running MS SOL Server, err, SQL Server.
![Wink ;)](./images/smilies/icon_wink.gif)