We have a reference match on standardized address and area data. The reference data set has 25 million rows and loads into the job at only 1,700 rows per second. The source data set holds only 12,000 rows at a time.
Can the experts please share ways to improve the performance of a reference match job? We would like to run this match hourly, but it currently takes 3.5 hours.
Reference Match Performance
Kevin K Tashadow
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
If the source and reference data share a common key, you can load the keys from the source data into a temporary table and extract the reference data by joining that table to the actual reference table. The job then processes only the reference records it actually needs, rather than all 25 million rows.
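As a rough illustration of that idea, here is a minimal Python sketch using an in-memory SQLite database as a stand-in for the real reference database. The table names (reference_data, src_keys), the column names (area_key, address) and the sample rows are all hypothetical; in a real job the same join would go in the SQL of whatever stage extracts the reference data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for the reference table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reference_data (area_key TEXT, address TEXT)")
    # An index on the join key lets the database look up matches
    # instead of scanning the whole reference table.
    conn.execute("CREATE INDEX idx_ref_area ON reference_data (area_key)")
    conn.executemany(
        "INSERT INTO reference_data VALUES (?, ?)",
        [
            ("2000", "1 George St"),
            ("2000", "2 Pitt St"),
            ("3000", "5 Collins St"),
            ("4000", "9 Queen St"),
        ],
    )
    return conn


def extract_needed_reference(conn, source_keys):
    """Return only the reference rows whose key occurs in the source batch."""
    cur = conn.cursor()
    # Temporary table holding the distinct keys of the current source batch.
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS src_keys (area_key TEXT PRIMARY KEY)"
    )
    cur.execute("DELETE FROM src_keys")  # clear keys left from a previous run
    cur.executemany(
        "INSERT OR IGNORE INTO src_keys (area_key) VALUES (?)",
        ((key,) for key in set(source_keys)),
    )
    # The join restricts the extract to reference rows that can match.
    cur.execute(
        "SELECT r.area_key, r.address "
        "FROM reference_data AS r "
        "JOIN src_keys AS s ON s.area_key = r.area_key "
        "ORDER BY r.address"
    )
    return cur.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in extract_needed_reference(demo, ["2000", "2000", "3000"]):
        print(row)
```

How much this helps depends on how selective the common key is: if the 12,000 source rows cover most key values in the reference data, the join will still return most of the 25 million rows.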
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.