Lookups and Performance with ODBC stages

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
jusami25
Premium Member
Premium Member
Posts: 84
Joined: Tue Oct 26, 2004 12:49 am

Lookups and Performance with ODBC stages

Post by jusami25 »

Hi guys,

we're implementing several extracting jobs against DB2 databases. During the process there are several joins to be implemented. In the first attempt of creating a single lookup, the primary link was a select with a ODBC the reference link was a previously created Hash File (with the preload to memory option set to true) and the output was an insert to a ODBC stage. I read 200 rows/sec from the initial ODBC stage but the lookup was slowing the perfomance to 8 rows/sec. I think this is very slow, the average performance was 8 rows/sec, Is that normal??

In terms of performance whatg is better to do lookups or to implement the JOINS in the SQL user-defined of the ODBC stage?? Doi you guys have any guidelines for that??

Thanks in advance.
Julio
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

You need to search the forum. This has been discussed many times.
Ok, when you use the lookup against Hashed file, the performance will be better. Always prefere lookup when you have a comparatively less number of rows in the reference link. Though its for parallel Benchmark has been provided in a blog by vincent. Large amount of data cannot be loaded in to the lookup memory. When you perform a join, you need to sort the data prior to the join. Apart from this, there is nothing to do with the physical memory. You can come to a conclusion that if you have a large number of data to be looked up, use join. If the reference link data is less you can opt for lookup.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Post Reply