dimension and fact build

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
snt_ds
Premium Member
Premium Member
Posts: 280
Joined: Wed Oct 18, 2006 11:53 am
Location: Pune, India

dimension and fact build

Post by snt_ds »

Hi All,

I need to create 100 dimensions for one of our data mart, source data is SAP and each dimension represents one attribute.
All the 100 dimensions have the same source (one table) and have the same target structure (100 attribute dimension).

I thought of creating a multiple instance job and pass the attribute as invocation ID.
Invocation ID is attribute name which will be there in dimension name.

Can any one please help me in creating a good design?

One more problem is how we are going to build the fact job.
Fact has to look into all the 100 dimensions.

Please suggest me any good method to create the dimensions and fact.

Thanks
Suresh
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

So what your saying is, that all the 100 table definitions are identical? :?
That does not make sense. Can you be clear on your requirement.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
us1aslam1us
Charter Member
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

First it doesn't make any sense to create dimension table for each attribute. I would suggest you to check that with your data modeler. Also do you want to have any history on these dimension because the design will differ depending on your SCD.
Last edited by us1aslam1us on Sat Apr 21, 2007 4:42 pm, edited 1 time in total.
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If you really have to do this then parameterize the table name and the value in the column and associate them in each run. I would put them in a table.

ColumnValue TableName
A TableA
B TableB

Then loop thru these and feed them into the job one at a time.

select * from source.table where ColumnWhatever = '#jpColumnValue#';

#jpTableName# is in the table name for the output link.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Ralph Kimball wrote:Real dimensional models in the business world seem to contain between 4 and 15 dimensions. Models with only 2 or 3 dimensions are rare, and often lend the suspicion that more dimensions should be added to the design, if only the designer would stop and think. Models with 20 or more dimensions, for different reasons, seem unjustified. A 20-dimension design usually has superfluous dimensions that should be combined. Although the final choice of dimensions is the prerogative of the designer, there is a fundamental sense that the identity of the dimensions is rooted in the reality of the business environment, and not in the designer's taste. Few of us really believe that we live in a world in which 20 or more independent variables affect every observations.
... and certainly not 100 independent variables!!!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It sounds like the OP has to convert one row of 100 columns into 100 rows of one column and wants a simple reusable way to do it.

Ignoring the fact that this data model design makes zero obvious sense, the idea would be as follows:

1. Create a parameter to be used as the column name.
2. Create a parameter to be used as the target table name for the column.
3. Create a multi-instance job of OCI-->XFM-->SEQ
4. Use user-defined SQL of selects keycolumns, #columnname# from onehundredcolumn table.
5. Output file is #tablename#.
6. Run 100 instances of said job, passing in the parameter couplet of columnname and tablename.
7. Load the 100 files produced, one named for each dimension table.

I suggest using custom Batch logic with an array of column names and target table names. Use a loop to start each job instance, passing the unique parameter couplet inside the job control loop.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
nick.bond
Charter Member
Charter Member
Posts: 230
Joined: Thu Jan 15, 2004 12:00 pm
Location: London

Post by nick.bond »

100 dims - 1 fact - sounds lop-sided!
Regards,

Nick.
Post Reply