Basic Transformer within Parallel Job

Posted: Fri May 15, 2009 5:25 am
by palmeal
We have a parallel job that incorporates a Basic Transformer. During one run, the job failed at the Basic Transformer because of a resource error on the UNIX server. When we restarted the job it kept failing, because the Basic Transformer job had been left in an invalid state. This happened even though we reset the job at the start of every run.
Is there a problem with using Basic Transformers in a parallel job?
If so, is there a workaround or patch that we can introduce?

Posted: Fri May 15, 2009 5:33 am
by nagarjuna
When we used a Basic Transformer in a parallel job, it caused a mutation error while processing large volumes of data.

Posted: Fri May 15, 2009 6:18 am
by chulett
Perhaps if you posted the actual errors you are seeing it would help. There's no fundamental problem, per se, other than all the normal reasons to not use one.

Posted: Fri May 15, 2009 6:45 am
by palmeal
The initial error on the Basic Transformer was:

1885 FATAL Thu May 14 22:45:56 2009
BASIC_Transformer_18,0: dspipe_wait(1597): Writer timed out waiting for Reader to connect.

On trying to re-run the job after this failure (job is reset) we got the following on every other attempt:

Event Id: 1995
Time : Fri May 15 03:21:26 2009
Type : FATAL
User : dsadm
Message :
BASIC_Transformer_18,0: Unable to run job - -2.
Event Id: 1996
Time : Fri May 15 03:21:26 2009
Type : FATAL
User : dsadm
Message :
BASIC_Transformer_18,0: the runLocally() of operator [DSJobRun in BASIC_Transformer_18], partition 0 of 2, processID 1096 on node1 failed.

The only way to get around this was to rebuild the code from source, resetting the status of all jobs.

Posted: Fri May 15, 2009 11:40 am
by priyadarshikunal
What's the value of the environment variable DSIPC_OPEN_TIMEOUT? If it is at the default of 30, try increasing it to 600.

Also check DS_TDM_PIPE_OPEN_TIMEOUT; by default it should be 720.
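
As a rough way to see what these two variables are set to, here is a minimal Python sketch. It assumes the variables are visible in the environment of the shell where it is run, which may not match project-level or job-level settings made in the DataStage Administrator. The thresholds are only the values suggested in this post, and the `check_timeouts` helper is hypothetical, not part of any DataStage tooling.

```python
import os

# Values suggested in this thread; treat them as assumptions,
# not as verified vendor defaults.
SUGGESTED = {
    "DSIPC_OPEN_TIMEOUT": 600,
    "DS_TDM_PIPE_OPEN_TIMEOUT": 720,
}


def check_timeouts(env=None):
    """Return {name: (current_or_None, suggested, ok)} for each variable."""
    env = os.environ if env is None else env
    report = {}
    for name, suggested in SUGGESTED.items():
        raw = env.get(name)
        try:
            current = int(raw) if raw is not None else None
        except ValueError:
            # A non-numeric value is treated the same as unset.
            current = None
        ok = current is not None and current >= suggested
        report[name] = (current, suggested, ok)
    return report


if __name__ == "__main__":
    for name, (current, suggested, ok) in check_timeouts().items():
        status = "ok" if ok else "unset or below suggestion"
        print(f"{name}: current={current} suggested>={suggested} ({status})")
```

Run it as the same user that runs the jobs (dsadm in the log above), after sourcing whatever environment file your installation uses, so that the output reflects what the engine is likely to see.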

Posted: Fri May 15, 2009 4:45 pm
by ray.wurlod
Is node1 a different machine from the conductor node?

Posted: Mon May 18, 2009 8:52 am
by palmeal
priyadarshikunal wrote: What's the value of the environment variable DSIPC_OPEN_TIMEOUT? If it is at the default of 30, try increasing it to 600.

Also check DS_TDM_PIPE_OPEN_TIMEOUT; by default it should be 720.

Our DSIPC_OPEN_TIMEOUT is set to 30 - this is something that our Admin team will have to tinker with.

Posted: Mon May 18, 2009 9:00 am
by palmeal
ray.wurlod wrote: Is node1 a different machine from the conductor node? ...
We only have one server available to us.