Page 1 of 1

Need datastage development efforts advice

Posted: Wed Oct 30, 2013 10:20 pm
by Santhinisankar
Hi,
I am working on Datastage ETL estimation. The details I have is I have 100 source tables and the ETL architecture is

Source --> staging--> EDW--> Data Mart. EDW is in 3rd normal form and datamart is in Star schema. The complexity of the jobs to be considered is complex-20% , medium-30% and simple 50%. There are specific requirement to implement data quality like standardization and de-duplication. The version of datastage we are using is DS 8.7. Do not have much details. please suggest rough development effort estimate in PDs for ETL and Data Quality ? What would be the rough percentage of estimate between ETL and Data quality estimates ?

Posted: Thu Oct 31, 2013 12:35 am
by ray.wurlod
How good are your developers?

Posted: Thu Oct 31, 2013 1:37 am
by Santhinisankar
average. Have 2-3 years of experience

Posted: Sun Nov 03, 2013 9:43 pm
by Santhinisankar
Any suggestions/advice please ?

Posted: Mon Nov 04, 2013 8:07 am
by roy
Hi,
1.This is too complex to answer on 1 foot.
2. You might want to hire a consultant for this, since it will take an effort to estimate.

Good Luck.

Posted: Mon Nov 04, 2013 9:55 am
by kduke
You need a rough estimate for each type of job. Next you need to know how many of each type you have. I would estimate each landing point separately. So do not count source to data mart as all one thing. I would also separate dim jobs from fact jobs. I always count lookups too. I bump my estimates for less experienced developers.

1. Dim: Source => Staging (3 days) (# tables)
2. Fact: Source => Staging (5 days) (# tables)
3. Dim: Source => EDW (4 days) (# tables)
4. Fact: Source => EDW (6 days) (# tables)
5. Dim: Source => Datamart (5 days) (# tables)
6. Fact: Source => Datamart (7 days) (# tables)

The days are not accurate. You need to guess or build one job to figure it out. Build a prototype. Now you can bump these numbers up for more complex jobs or add 12 more rows for complex and medium jobs.

Lookups make the jobs more complex and that is how I figure out what is more complex. You may need to change your estimates based on testing (unit,QA and UAT). This can get too complex to be useful if you don't keep it simple.

Posted: Tue Nov 05, 2013 4:49 am
by eph
Hi,

In addition you can check this blog post which is very instructive on development time estimation.

Eric

Posted: Tue Nov 05, 2013 8:25 pm
by ray.wurlod
Just been to a presentation at IOD where a quite competent team reported a multi-week development effort around a series of a few dynamic ETL jobs.

Posted: Thu Nov 07, 2013 4:29 am
by Santhinisankar
Ray could you please elaborate on this ?