Dealing with huge data

Posted: Mon Jul 25, 2011 6:25 am
by datastagedw
Hi All,

We have a requirement to run a query against a huge DB table (100 million records or more) to get the required information. We do not have any filtering criteria to reduce the amount of data brought into DS. We also need to join this huge data with a slightly smaller table (fewer than a million records) from a different database.

Is it advisable to run this join in DS using the DB connector and Join stages? We are running on a 4-node SMP server. Let me know if more details are required.

Re: Dealing with huge data

Posted: Tue Aug 30, 2011 8:53 pm
by shawn.k
You should be fine doing this join in DS. I suggest you use the correct partitioning, and sort the data in the database before it gets to DS if possible. Test your job with a smaller data set to see how it performs before running it against the full set.
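DataStage stages can't be shown as code here, but the principle behind that advice can be: hash-partition both inputs on the join key so matching rows land on the same node, then merge-join the sorted streams within each partition. A minimal Python sketch of that idea (all names here are illustrative, not DataStage APIs):

```python
# Sketch of a hash-partitioned, sorted merge join -- the pattern a parallel
# Join stage relies on. Partition counts and row data are made up.

def hash_partition(rows, key, n_partitions):
    """Route each row to a partition by hashing its join key."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts

def merge_join(left, right, key):
    """Inner-join two lists of dicts that are already sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            # Emit every right-side row in the matching key group.
            k = j
            while k < len(right) and right[k][key] == left[i][key]:
                out.append({**left[i], **right[k]})
                k += 1
            i += 1
    return out

big_rows = [{"id": n, "fact": f"f{n}"} for n in range(8)]
small_rows = [{"id": n, "dim": f"d{n}"} for n in (1, 3, 5)]

n_partitions = 4
big_parts = hash_partition(big_rows, "id", n_partitions)
small_parts = hash_partition(small_rows, "id", n_partitions)

# Each partition joins independently: because both sides were partitioned
# on the same key, no partition ever needs rows from another partition.
joined = []
for bp, sp in zip(big_parts, small_parts):
    bp.sort(key=lambda r: r["id"])
    sp.sort(key=lambda r: r["id"])
    joined.extend(merge_join(bp, sp, "id"))

print(sorted(r["id"] for r in joined))  # -> [1, 3, 5]
```

The point of sorting in the database first is that DS can then skip its own sort step; mis-partitioning either input is what makes a join of this size blow up.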

Re: Dealing with huge data

Posted: Tue Aug 30, 2011 9:10 pm
by chulett
datastagedw wrote:Also we need to join this huge data with little less(less than a million records) data from a different database table.
A different table... in the same database?

Re: Dealing with huge data

Posted: Wed Aug 31, 2011 7:08 am
by DSguru2B
Is that join going to reduce the data?
If yes, then you can load the data from the "other" database into a staging table in the database that houses your source, do the join in a single SQL statement, and extract only the result.