Datawarehouse's Data Merging Considerations

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
holymac
Participant
Posts: 25
Joined: Sun Jan 08, 2006 9:40 pm

Datawarehouse's Data Merging Considerations

Post by holymac »

Dear All,

Would like to put forward this question as well:

Our company has a datawarehouse and we have merged with a new company that does not have a datawarehouse and now we are to combine/merge the new company's data into our existing datawarehouse.

I would like to know from everyone in this forum on what are the areas and issues/aspects of consideration in preparing for this merging activity.

Thanks in advance.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Are these job interview questions?
Mamu Kim
holymac
Participant
Posts: 25
Joined: Sun Jan 08, 2006 9:40 pm

Post by holymac »

No it is not Duke. This is an actual scenario i am facing right now.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

For that you need to get your data modeler/data architect involved. Lay out a rough sketch of how the data is combinable. Handle cons, create extra tables for different/new dimensions, create fact/factless fact tables, or add new attributes to the existing tables.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Look also for opportunities to use conformed dimensions. For example, typically the same time granularity will be used regardless of data source, so that the same time dimension table can be used with all fact tables.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

I would look at the 2 designs and see what is in common. If you truly need to merge dimensions then you may need to use QualityStage with some address matching. You need the cleanse and standardize first.

I would pick a subject area where both are strong.
Mamu Kim
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

You should also consider a master data management process. For example, where is your definitive list of customers, addresses and products? Are you decommissioning core systems? You could try and make your data warehouse the consolidated view of all information from each set of systems but you face a difficult matching, consolidation and deduplication task as you populate the DW.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

... a task much less difficult if you have QualityStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply