Same partition with Two Nodes

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

ravij
Premium Member
Posts: 170
Joined: Mon Oct 10, 2005 7:04 am
Location: India

Post by ravij »

Hi,

When I define 2 nodes in Same Partition, the job aborts, but with a single node it works fine.

The error log I got is:

Code:

main_program: Fatal Error: There are irreconcilable constraints on the number of
partitions of an operator: parallel Sur_Key_For_Same_Partition.
The number of partitions is already constrained to 2,
but an eSame partitioned input virtual dataset produced by
 sequential Src_SeqFile has 1.
This step has 2 datasets:
ds0: {op0[1p] (sequential Src_SeqFile)
      eSame<>eCollectAny
      op1[2p] (parallel Sur_Key_For_Same_Partition)}
ds1: {op1[2p] (parallel Sur_Key_For_Same_Partition)
      >>eCollectAny
      op2[1p] (sequential APT_RealFileExportOperator in Tgt_SeqFile)}
It has 3 operators:
op0[1p] {(sequential Src_SeqFile)
    }
op1[2p] {(parallel Sur_Key_For_Same_Partition)
    }
op2[1p] {(sequential APT_RealFileExportOperator in Tgt_SeqFile)

If that is the case, how can I achieve parallelism with Same partitioning?
Any answers would be appreciated.
Thanks in advance.
Ravi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Please post the configuration file. Without this it's difficult to judge. For example, what exactly do you mean by "two nodes in the same partition"?

Further, be aware that Sequential File stages can only sustain parallel operation under very particular circumstances. This is not the same as "multiple readers per node".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ameyvaidya
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Re: Same partition with Two Nodes

Post by ameyvaidya »

Hi Ravij,
this part of the posted error message has a clue:
ravij wrote:

Code:

There are irreconcilable constraints on the number of
partitions of an operator: parallel Sur_Key_For_Same_Partition.
The number of partitions is already constrained to 2,
but an eSame partitioned input virtual dataset produced by
 sequential Src_SeqFile has 1

A Sequential File stage reading a single file runs sequentially (1 partition only). After the data is read, it has to be partitioned (to 2 partitions with your 2-node config).

Since the "Same" partitioning method is used on the link just after the Sequential File stage in this job, no repartitioning can take place: the 1 partition from the source cannot be reconciled with the 2 partitions the parallel stage needs. Hence the error.

On the input of the stage after the Sequential File stage, partition the data with a method other than "Same"; downstream of that stage, the method can be kept as "Same".
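
To illustrate the point, here is a toy Python model of the constraint. It is not DataStage code, and the function names `reconcile_partitions` and `round_robin` are made up for this sketch. It assumes that "Same" passes the producer's partitions through untouched, so both ends of the link must have the same count, while any other method repartitions to whatever the consumer needs.

```python
def reconcile_partitions(producer_parts, consumer_parts, method):
    """Return the partition count on a link, or raise if it cannot be reconciled.

    Hypothetical model: "same" keeps the producer's partitions as they are,
    so the counts on both ends must match; any other method repartitions.
    """
    if method == "same":
        if producer_parts != consumer_parts:
            raise ValueError(
                f"consumer is constrained to {consumer_parts} partitions, "
                f"but the same-partitioned input has {producer_parts}"
            )
        return producer_parts
    return consumer_parts


def round_robin(rows, n_parts):
    """Deal rows out to n_parts partitions in turn."""
    parts = [[] for _ in range(n_parts)]
    for i, row in enumerate(rows):
        parts[i % n_parts].append(row)
    return parts


# Sequential source (1 partition) feeding a parallel stage on a 2-node config.
try:
    reconcile_partitions(1, 2, "same")
    same_ok = True
except ValueError:
    same_ok = False  # mirrors the abort in the job log

# A real partitioning method spreads the rows over both partitions instead.
repartitioned = round_robin(["r1", "r2", "r3", "r4", "r5"], 2)
```

In this model the 1-to-2 link is rejected with "same" but accepted with round robin, and the single-node case, `reconcile_partitions(1, 1, "same")`, passes, which matches the job working on one node.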
Amey Vaidya
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.
- Douglas Adams
ravij
Premium Member
Posts: 170
Joined: Mon Oct 10, 2005 7:04 am
Location: India

Post by ravij »

Hi Ray,

This is my configuration file, which contains 2 nodes.

Code:

{
	node "node2"
	{
		fastname "PROXY-1"
		pools "DefPool"
		resource disk "C:/Ascential/DataStage/Datasets" { pools "" "pool1" }
		resource scratchdisk "C:/Ascential/DataStage/Scratch" { pools "" "pool2" }
	}
	node "node3"
	{
		fastname "PROXY-1"
		pools ""
		resource disk "C:/Ascential/DataStage/Datasets" { pools "" "pool1" }
		resource scratchdisk "C:/Ascential/DataStage/Scratch" { pools "" "pool1" "pool2" }
	}
}

Any answers would be appreciated.
Thanks in advance.
Ravi
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
Have you changed the "Number of Readers Per Node" option in the Sequential File stage?

-Kumar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

OK, I now see that you mean "two identically defined processing nodes". Though unusual, and not optimally efficient, this is legal and not the source of your problem.
It appears that the virtual Data Set on the link between the Src_SeqFile stage and the Surrogate Key Generator stage contains some kind of incompatibility with what the latter expects.
Can you post the generated OSH? This will help to confirm this theory or otherwise.
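
The "two processing nodes on one machine" reading can be checked mechanically, since both node entries in the posted file carry the same fastname. A small Python sketch of that check follows; the regexes are a simplification that works for this one file, not a real configuration-file parser, and `CONFIG` is the posted file reduced to its node and fastname lines.

```python
import re

# Node names and fastnames copied from the configuration file posted above;
# the pools and resource lines are omitted because this check does not need them.
CONFIG = '''
{
    node "node2"
    {
        fastname "PROXY-1"
    }
    node "node3"
    {
        fastname "PROXY-1"
    }
}
'''


def logical_nodes(config_text):
    """Return the names of the node entries (simple regex, not a full parser)."""
    return re.findall(r'^\s*node\s+"([^"]+)"', config_text, flags=re.MULTILINE)


def physical_hosts(config_text):
    """Return the distinct fastname values, i.e. the machines behind the nodes."""
    return set(re.findall(r'^\s*fastname\s+"([^"]+)"', config_text, flags=re.MULTILINE))


nodes = logical_nodes(CONFIG)
hosts = physical_hosts(CONFIG)
print(f"{len(nodes)} logical nodes on {len(hosts)} host(s)")
```

For the posted file this reports two logical nodes backed by a single host, PROXY-1.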
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,

Please let me know what the point/benefit is of maintaining a single node as two virtual nodes.
Ultimately, the single node is going to do the processing for both partitions :roll:

-Kumar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It obliges two processes to be created for each stage operating in parallel mode, presumably to force the use of two CPUs, even though they share not only memory but also disk and scratch disk resources.
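
The partition counts in the score from the original error log show this directly. A minimal Python sketch, assuming each partition of an operator runs as its own process and ignoring any coordinating processes or operator combination; `SCORE` holds the three operator lines copied from the log, and `partitions_per_operator` is a made-up helper name.

```python
import re

# The three operator lines from the score in the posted error log.
SCORE = """\
op0[1p] {(sequential Src_SeqFile)
op1[2p] {(parallel Sur_Key_For_Same_Partition)
op2[1p] {(sequential APT_RealFileExportOperator in Tgt_SeqFile)
"""


def partitions_per_operator(score_text):
    """Map each operator id (op0, op1, ...) to its partition count."""
    pattern = re.compile(r"^(op\d+)\[(\d+)p\]", re.MULTILINE)
    return {m.group(1): int(m.group(2)) for m in pattern.finditer(score_text)}


counts = partitions_per_operator(SCORE)
total = sum(counts.values())
print(counts, total)
```

Only the parallel stage (op1) is doubled by the second node; the two sequential stages stay at one partition each, giving four partitions in total under these assumptions.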