High volume of reference data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

synsog
Premium Member
Posts: 232
Joined: Sun Aug 01, 2010 11:01 pm
Location: Pune

High volume of reference data

Post by synsog »

Hi All,

We are getting a high volume of reference data in one of my jobs, and this job is taking 40 minutes to complete.

Reference data is added on a daily basis, and at this point we are getting more than 50 million records.

We have tried a few options, but the issue remains the same:

1. Tried Join and Merge instead of Lookup (see the sketch after this list)
2. Tried a dataset as the reference data instead of referring to the table directly
3. Tried creating indexes on the table
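
The usual reason Join or Merge scales better than Lookup at this volume is that a Lookup holds the entire reference set in memory, while a sorted join only streams both inputs. Below is a minimal sketch of that principle in plain Python; it is illustrative only, nothing DataStage-specific, and all names are made up.

# Sketch only: contrast between a Lookup-style in-memory reference table
# and a Join/Merge-style sorted merge. Assumes iterables of (key, value) tuples.

def hash_lookup(stream, reference):
    # Lookup-style: load ALL reference rows into memory first
    # (with 50M+ rows this is where the memory pressure comes from).
    ref_table = dict(reference)
    for key, payload in stream:
        if key in ref_table:
            yield key, payload, ref_table[key]

def merge_join(stream_sorted, reference_sorted):
    # Join/Merge-style: both inputs already sorted on the key,
    # so only one reference row needs to be held at a time.
    ref_iter = iter(reference_sorted)
    ref_key, ref_value = next(ref_iter, (None, None))
    for key, payload in stream_sorted:
        while ref_key is not None and ref_key < key:
            ref_key, ref_value = next(ref_iter, (None, None))
        if ref_key == key:
            yield key, payload, ref_value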


There is also no upper limit on this reference data; the volume is likely to keep growing in the future.


Please suggest if you have any ideas on this.


Thanks
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Questions:

1) What's the length of the reference data records (sum of length of columns)?
2) How many jobs use this reference data each day? Just this one job or multiple jobs?
3) How long does it take to extract the data from the table and write it to a dataset? (See the sketch after these questions.)
4) What database stage do you use to extract the reference data and do you extract the data in sequential or parallel mode?
5) For Join or Merge, do you include a Sort stage for the reference data or do you just use the input link sort option on the Join/Merge stage?
6) When is the reference table updated in relation to when you use the data? Hours/Minutes, Overnight update versus daytime usage?
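
Behind questions 2 and 3 is the idea that, if several jobs reference the same 50 million rows, you pay the database extract once per day and let every job reuse the saved copy. A rough sketch of that pattern in plain Python follows; the paths and the fetch_rows callable are hypothetical, and a flat file stands in for a parallel dataset.

import csv, os, time

REF_COPY = "/tmp/reference_data.csv"   # stand-in for a parallel dataset
MAX_AGE_SECONDS = 24 * 60 * 60         # refresh at most once a day

def refresh_reference_copy(fetch_rows):
    # Re-extract only if the saved copy is missing or older than a day.
    # fetch_rows is a hypothetical callable that pulls rows from the table.
    stale = (not os.path.exists(REF_COPY)
             or time.time() - os.path.getmtime(REF_COPY) > MAX_AGE_SECONDS)
    if stale:
        with open(REF_COPY, "w", newline="") as out:
            csv.writer(out).writerows(fetch_rows())
    return REF_COPY

def load_reference_copy():
    # Every downstream job reads the local copy instead of the table.
    with open(REF_COPY, newline="") as src:
        return list(csv.reader(src))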
- james wiles


All generalizations are false, including this one - Mark Twain.