
DataSet Vs Database

Posted: Fri Jul 21, 2006 7:54 am
by samba
I would like to know which is the best practice in the scenario below:

source DataSet/Database ----> transformation --> lookups ---> Database

I have several lookups against tables, and some of the tables are used twice with a different query each time. The entire job includes 10-12 lookups.

For the best performance, should the lookups use a database or a Dataset?

Also, when the job starts, do all stages become active at the same time and close at the same time? How does that work?





Thanks in Advance
Samba

Posted: Fri Jul 21, 2006 8:06 am
by kcbland
The concept of pipeline parallelism means that the data is flowing across all stages. In your diagram, I don't see why the data won't be coming out of the source database while other data is already loading into the database. The exception would be if you introduced synchronization stages that cause the data to pool before proceeding.
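To make the idea concrete, here is a rough sketch in Python rather than DataStage. It is only an analogy: each "stage" is a thread, and the small bounded queues stand in for the links between stages. The names `stage`, `run_pipeline`, and `buffer_size` are made up for this illustration and are not DataStage terms.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the data stream


def stage(func, inbox, outbox):
    """One stage: take a row, process it, pass it downstream right away."""
    while True:
        row = inbox.get()
        if row is SENTINEL:
            outbox.put(SENTINEL)  # tell the next stage we are done
            return
        outbox.put(func(row))


def run_pipeline(rows, funcs, buffer_size=2):
    """Run every stage in its own thread, linked by small bounded queues."""
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()  # all stages are active before the first row arrives

    def feed():
        # The "source" stage: blocks whenever the first queue is full,
        # so it can never run far ahead of the downstream stages.
        for r in rows:
            queues[0].put(r)
        queues[0].put(SENTINEL)

    feeder = threading.Thread(target=feed)
    feeder.start()

    # The "target" stage: collects rows as they come off the last link.
    results = []
    while True:
        row = queues[-1].get()
        if row is SENTINEL:
            break
        results.append(row)

    feeder.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(5), [lambda x: x * 2, lambda x: x + 1]))
```

Because the queues hold only a couple of rows, the source cannot finish reading before the target starts writing; rows are in flight in every stage at once. A stage that had to collect all of its input before emitting anything would be the "pooling" exception mentioned above.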

It's impossible to tell you how to design your job without knowing how much data is in each lookup. You wouldn't build a dataset of 100 million rows if you only need 1 million. Then again, constantly banging direct lookups against a database for 50K rows would be shunned in favor of a Lookup or a dataset.
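The trade-off can be sketched outside DataStage as well. In the Python below, an in-memory SQLite table is a stand-in for the reference database and a plain dict is a stand-in for a dataset loaded once; the table name `ref`, its columns, and the function names are all invented for the example. It shows the two access patterns, not their actual cost in a real job.

```python
import sqlite3


def make_reference_db():
    """Build a tiny in-memory reference table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (code TEXT PRIMARY KEY, label TEXT)")
    conn.executemany(
        "INSERT INTO ref VALUES (?, ?)",
        [("A", "Alpha"), ("B", "Beta"), ("C", "Gamma")],
    )
    conn.commit()
    return conn


def lookup_per_row(conn, codes):
    """Direct lookup: one query against the database for every input row."""
    labels = []
    for code in codes:
        hit = conn.execute(
            "SELECT label FROM ref WHERE code = ?", (code,)
        ).fetchone()
        labels.append(hit[0] if hit else None)
    return labels


def lookup_preloaded(conn, codes):
    """Preloaded lookup: read the reference rows once, then probe in memory."""
    table = dict(conn.execute("SELECT code, label FROM ref"))
    return [table.get(code) for code in codes]


if __name__ == "__main__":
    conn = make_reference_db()
    codes = ["A", "C", "Z", "A"]
    print(lookup_per_row(conn, codes))
    print(lookup_preloaded(conn, codes))
```

Both functions return the same answers. The difference is that the first issues as many queries as there are input rows, while the second pays once to read the whole reference set, which is only worthwhile when that set is reasonably small compared with what you actually need from it.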


Posted: Fri Jul 21, 2006 8:54 am
by ashwin141
Hi

If you are talking about the reference file for your lookup, then I think Datasets will be better than a Database. You can even use a Lookup Fileset as the reference file for your lookups.

Regards
Ashwin