Issue when 'Execution Mode' set to 'Parallel' for DB2-API

basav_ds
Participant
Posts: 24
Joined: Sun Nov 11, 2007 11:19 pm
Location: Mumbai

Post by basav_ds »

My job contains a DB2-API stage that reads data from a table containing 10,000 records and passes it to a Sort stage. The sorted data is then input to an Aggregator stage, and the output of the Aggregator is written to a Sequential File.
I want to run this job in parallel on multiple nodes (here I am simulating it on 1, 2, 4, 6 and 8 nodes) and test the performance.
For the Sort and Aggregator stages, the default execution mode is 'Parallel' (I am using it as it is). For the DB2-API stage, however, the default execution mode is 'Sequential', so I changed it to 'Parallel'.

Issue:

1) When I run the job on 2 nodes, the DB2-API stage actually reads 20,000 records (i.e. 10,000*2) instead of 10,000 records.
When I run the job on 4 nodes, the DB2-API stage actually reads 40,000 records (i.e. 10,000*4) instead of 10,000 records, and so on. Why?

2) For the DB2-Enterprise stage, the default execution mode is Sequential. I can't change it to Parallel because the option is disabled. Can't we run the DB2-Enterprise stage in Parallel execution mode? Why not?

3) The ODBC Connector stage, on the other hand, has Parallel as its default execution mode. Is there any reason behind this?
I never let school to interfere in my education
stefanfrost1
Premium Member
Posts: 99
Joined: Mon Sep 03, 2007 7:49 am
Location: Stockholm, Sweden

Post by stefanfrost1 »

"1) When I run the job on 2 nodes, the DB2-API stage actually reads 20,000 records (i.e. 10,000*2) instead of 10,000 records.
When I run the job on 4 nodes, the DB2-API stage actually reads 40,000 records (i.e. 10,000*4) instead of 10,000 records, and so on. Why?"
The DB2-API stage connects to the database via DB2 Connect. When you change it to parallel, it opens one connection per node and each connection reads the full result set, so your data is multiplied by the number of nodes (unless you have a patch for DPF with range partitions).
"2) For the DB2-Enterprise stage, the default execution mode is Sequential. I can't change it to Parallel because the option is disabled. Can't we run the DB2-Enterprise stage in Parallel execution mode? Why not?"
Strange that it's set to Sequential!? However, the DB2EE stage will run sequentially if you use user-defined SQL. It will also run sequentially if your database has only one node...
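
To illustrate the row multiplication described in the first answer, here is a minimal simulation in Python. It is only a sketch and not DataStage or DB2 code: it uses an in-memory SQLite table as a stand-in for the DB2 table, and the names `make_table`, `read_unpartitioned` and `read_partitioned` are hypothetical. Each simulated "node" runs its own query. When every node runs the same unrestricted SELECT, the total row count is 10,000 times the number of nodes. When each node's query is restricted to its own slice of the key (here with a modulus predicate, which is one possible way to split the rows and an assumption on my part, not something the stage does for you), every row is read exactly once.

```python
import sqlite3

TOTAL_ROWS = 10_000  # same table size as in the original question


def make_table():
    """Create an in-memory table standing in for the DB2 source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, val INTEGER)")
    conn.executemany(
        "INSERT INTO src (id, val) VALUES (?, ?)",
        ((i, i % 7) for i in range(TOTAL_ROWS)),
    )
    conn.commit()
    return conn


def read_unpartitioned(conn, nodes):
    """Every simulated node runs the same full SELECT, so rows are duplicated."""
    rows = []
    for _ in range(nodes):
        rows.extend(conn.execute("SELECT id, val FROM src").fetchall())
    return rows


def read_partitioned(conn, nodes):
    """Each simulated node reads only the ids that fall in its own slice."""
    rows = []
    for node in range(nodes):
        rows.extend(
            conn.execute(
                "SELECT id, val FROM src WHERE id % ? = ?", (nodes, node)
            ).fetchall()
        )
    return rows


if __name__ == "__main__":
    conn = make_table()
    for nodes in (1, 2, 4, 6, 8):
        dup = len(read_unpartitioned(conn, nodes))
        part = len(read_partitioned(conn, nodes))
        print(f"{nodes} node(s): unpartitioned={dup}, partitioned={part}")
```

Running the script prints one line per node count, so you can compare the unpartitioned total (which grows with the number of nodes) against the partitioned total (which stays at the table size).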
-------------------------------------
http://it.toolbox.com/blogs/bi-aj
my blog on delivering business intelligence using agile principles