Transformer BlockSize and node_node1: Player 1 terminated
Posted: Thu Feb 12, 2009 2:17 am
Hi all,
Here is the situation: we have a series of simple jobs which extract from Unidata and then load into Db2. This is more of a replication project than true complex ETL.
Anyway, we had a problem with the following errors appearing in jobs at random. Or so I thought.
Message:: SEFJobVariationTypes,0: Internal Error: (shbuf): iomgr\iomgr.C: 1880
Message:: node_node2: Player 1 terminated unexpectedly.
Message:: node_node1: Player 1 terminated unexpectedly.
Message:: main_program: APT_PMsectionLeader(2, node2), player 1 - Unexpected termination by Unix signal 9(SIGKILL).
Message:: main_program: APT_PMsectionLeader(1, node1), player 1 - Unexpected exit status 1.
A search of the site turned up threads where people referred to patches that need to be applied. That is not the case here, as we are on 8.01 with FixPatch1A applied.
IBM support couldn't shed any light ... change this do that ... nothing.
Job structure for Parallel Load
SEF --> Transformer --> Db2Load
In the Transformer we use a sequence (SurrogateKeyGen) to generate the number, and we have set the block size to something reasonable for the number of rows to be processed, say 200. Otherwise, with system block size checked, the performance hit is marked.
With the block size set to 200 we experience the errors above and jobs abort all over the place.
Changing back to the system block size on these SurrogateKeyGen settings in the Transformer removed the problem.
Can someone please give me a rationale for why this may occur? I can understand that allocating the block size means that this number of sequence values needs to be stored in memory for the duration of the job, but we are only talking about 200.
Why would this small, obscure change in the Transformer have such a major effect on the DS server and cause the jobs to abort with memory issues?
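To make clear what I mean by allocating a block, here is a minimal Python sketch of how I understand block-based key allocation. It is only my mental model, not how DataStage actually implements it, and every name in it (KeySource, BlockAllocator, reserve, next_key) is hypothetical.

```python
import threading


class KeySource:
    """Hypothetical shared counter standing in for the surrogate key state."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, block_size):
        # Hand out an inclusive range [first, last] of block_size keys.
        with self._lock:
            first = self._next
            self._next += block_size
        return first, first + block_size - 1


class BlockAllocator:
    """Hypothetical per-player allocator: reserves one block, serves keys locally."""

    def __init__(self, source, block_size):
        self.source = source
        self.block_size = block_size
        self._current = 0
        self._last = -1  # empty range, forces a reservation on first use

    def next_key(self):
        # Go back to the shared source only when the local block is used up.
        if self._current > self._last:
            self._current, self._last = self.source.reserve(self.block_size)
        key = self._current
        self._current += 1
        return key


source = KeySource()
node1 = BlockAllocator(source, 200)  # assumed block size of 200, as in our jobs
node2 = BlockAllocator(source, 200)

keys1 = [node1.next_key() for _ in range(3)]
keys2 = [node2.next_key() for _ in range(3)]
print(keys1, keys2)
```

In this model each player only holds the bounds of its current block, so on the face of it a block size of 200 should cost next to nothing in memory, which is why the aborts puzzle me.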
Any knowledge you could impart that would make sense of this would be appreciated.
Regards
Nick