Managing data coming out of the warehouse or ODS

RodBarnes · Post by **RodBarnes** » Fri Jan 04, 2008 8:42 am

I've a general question that I thought many here might have already answered in their own environments.

Given that there are a number of feeds generating shared dimensions and fact tables in the warehouse (and that there will eventually be an ODS) and other groups desire access to the data in order to use it within another application.

We can grant controlled access to the warehouse/ODS via one or more logins that expose the desired tables/views and allow those other groups to build whatever feed they desire to pull the data into their application.

Or, we can reduce or eliminate direct access and build the feeds to push the data into their applications, thus controlling how the data is obtained and being able to manage the number of feeds since some requests can be consolidated into the same feed. Of course, this puts us into the mode of being the provider of all feeds.

Anyone desire to weigh in on this? Is there a best practice response for this question?

Thanks.

ArndW · Post by **ArndW** » Fri Jan 04, 2008 9:08 am

I have seen both of these alternatives being used with more or less success in the past and believe that both have their place.

The German in me wants to turn the ODS into a high-data-security prison and only allow data out when it's time has come, i.e. only allowing push. This is an acceptable approach for certain organizations and data and also caters for easier adherence to standards (SOX, Data Security, ISO-9002, etc.). But it is relatively inflexible and since feeds take time and effort to create using push methods it can slow down an organization's response times.

The Anarchistic part of me would prefer the latter approach, exposing the underlying table structures and granting read access. This way, the onus of creating new "applications" off the data is transferred to those who need it and the ODS owners cannot be considered to be hampering the flow of data. This does have the downside that the system load is no longer controlled or planned by the ODS owners. Although database level throttling could be implemented, this type of access means that any system downtime would need to be planned and co-ordinated ahead of time. It is a fact of life that some of these exposed tables/queries will feed systems that will be deemed mission-critical at some point in time, and that will seriously impact the real control of the ODS by the owners.

Every organization has different mindsets (the control freaks vs. the anarchists) and either method has its advantages and weaknesses. I think, given a site with no bias towards one or the other extrem, that I would prefer the limited access and read-only open table approach.