Performance of DB2/EE stage and DB2/UDB API stage

Chuah · Post by **Chuah** » Sun Oct 08, 2006 11:28 pm

Hi,

I am trying to compare the performance of the DB2/EE stage against the DB2/UDB API stage.
In a read-only job, the DB2/UDB API stage reads more than 50000 rows/sec, whereas the EE stage manages only 11000 rows/sec on the same database and table.
Incidentally, the speed of the EE stage is almost the same as that of the ODBC Enterprise stage for DB2.

Is this the normal behaviour of the EE stage, or are we missing something? The performance seems far too slow.
How should we tune the EE stage to perform better?

Rgds,
Chin

ArndW · Post by **ArndW** » Mon Oct 09, 2006 2:09 am

Chin,

My observations have been quite different. Is your DB2 a partitioned database? What sort of config file do you have? Does the read speed correlate with the write speed?

Chuah · Post by **Chuah** » Sun Oct 15, 2006 10:19 pm

ArndW wrote:
Chin,

My observations have been quite different. Is your DB2 a partitioned database? What sort of config file do you have? Does the read speed correlate with the write speed?

Hi ArndW,
Thanks for the reply.
Currently the DB2 database has not been installed with DPF; we have asked the DBA to do so. The config file is defined with 4 nodes. Given that, I imagine that to get performance at least on par with the UDB stage, the database would need at least 5 partitions, since on a single node we get only 11000 rows/sec?
What do you think?

Rgds
Chin
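
For reference, the estimate of 5 partitions follows from the two figures quoted in this thread (50000 rows/sec for the UDB API stage, 11000 rows/sec for the EE stage on one node). The sketch below only reproduces that arithmetic. It assumes read throughput scales linearly with the number of partitions, which nothing in this thread confirms, and the function name `partitions_needed` is made up for illustration.

```python
import math


def partitions_needed(target_rows_per_sec, per_partition_rows_per_sec):
    """Smallest partition count that reaches the target rate.

    Assumes perfectly linear scaling per partition (an assumption,
    not a measured property of the EE stage).
    """
    return math.ceil(target_rows_per_sec / per_partition_rows_per_sec)


# Figures quoted in the thread: UDB API stage vs. EE stage on one node.
estimate = partitions_needed(50000, 11000)
print(estimate)  # 5, because 50000 / 11000 is roughly 4.55
```

If scaling turns out to be less than linear, the real number of partitions needed would be higher than this estimate.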

ArndW · Post by **ArndW** » Mon Oct 16, 2006 12:12 am

The read speed should be measured in a simple job that discards its output, e.g. one that writes to a sequential file at /dev/null or that ends in a transformer whose constraint rejects every row. Try a 1-node configuration file in addition to your 4-node file and compare the speeds.
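
As a rough illustration of the 1-node versus 4-node comparison, the sketch below generates the text of a parallel configuration file with a given number of logical nodes. The host name `etlhost` and the two directory paths are placeholders rather than values from this thread, and `render_config` is a hypothetical helper, so adapt all of them to your own installation.

```python
def render_config(node_count, fastname="etlhost",
                  disk="/data/datasets", scratch="/data/scratch"):
    """Return configuration file text with node_count logical nodes.

    fastname, disk and scratch are placeholder values (assumptions).
    """
    blocks = []
    for i in range(1, node_count + 1):
        blocks.append(
            f'\tnode "node{i}"\n'
            "\t{\n"
            f'\t\tfastname "{fastname}"\n'
            '\t\tpools ""\n'
            f'\t\tresource disk "{disk}" {{pools ""}}\n'
            f'\t\tresource scratchdisk "{scratch}" {{pools ""}}\n'
            "\t}\n"
        )
    return "{\n" + "".join(blocks) + "}\n"


one_node = render_config(1)   # baseline for the single-node read test
four_node = render_config(4)  # same layout as a 4-way file on one host
print(one_node)
```

Saving each result to its own file and pointing APT_CONFIG_FILE at one or the other lets the same read-only job run under both configurations.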

vmcburney · Post by **vmcburney** » Mon Oct 16, 2006 12:13 am

It depends on what your EE job is doing with the data after it reads it. For example, if you are writing it to a sequential file, you are potentially partitioning the data, repartitioning it, and sending it through a file export process. A good way to judge is to send the output of both jobs to a transformer with a constraint of 1<>1, so that both terminate the data in the same way. You should also make sure they use the same array size, which I think is an option you can configure in the parallel stage.