Lookups and Performance with ODBC stages

jusami25 · Post by **jusami25** » Thu Mar 02, 2006 2:45 am

Hi guys,

we're implementing several extracting jobs against DB2 databases. During the process there are several joins to be implemented. In the first attempt of creating a single lookup, the primary link was a select with a ODBC the reference link was a previously created Hash File (with the preload to memory option set to true) and the output was an insert to a ODBC stage. I read 200 rows/sec from the initial ODBC stage but the lookup was slowing the perfomance to 8 rows/sec. I think this is very slow, the average performance was 8 rows/sec, Is that normal??

In terms of performance whatg is better to do lookups or to implement the JOINS in the SQL user-defined of the ODBC stage?? Doi you guys have any guidelines for that??

Thanks in advance.
Julio

kumar_s · Post by **kumar_s** » Thu Mar 02, 2006 3:40 am

You need to search the forum. This has been discussed many times.
Ok, when you use the lookup against Hashed file, the performance will be better. Always prefere lookup when you have a comparatively less number of rows in the reference link. Though its for parallel Benchmark has been provided in a blog by vincent. Large amount of data cannot be loaded in to the lookup memory. When you perform a join, you need to sort the data prior to the join. Apart from this, there is nothing to do with the physical memory. You can come to a conclusion that if you have a large number of data to be looked up, use join. If the reference link data is less you can opt for lookup.