Automatic reset of Parallel Job


dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

Hi,
I need to know whether an aborted job can be reset automatically via a script or a reset job (if any).

I have a sequence scheduled to run daily, but certain jobs abort due to lack of resources. Once the job is reset and the sequence is restarted, it runs fine.

Since the abort is frequent, I need to know if I can reset the job automatically via a script. Also let me know if I can access the status of the job automatically.

Thanks in advance,

Divya
Klaus Schaefer
Participant
Posts: 94
Joined: Wed May 08, 2002 8:44 am
Location: Germany

Post by Klaus Schaefer »

If you're using a sequence anyhow, you don't need a script to achieve this. In the job-activity simply set the "Execution action" to "Reset if required, then run" from the drop down list...

Klaus
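
If a script is still wanted outside the sequence, the dsjob command-line client can report a job's status and run it in reset mode. Below is a minimal Python sketch, assuming dsjob is on the PATH and that `dsjob -jobinfo` prints a line such as "Job Status : RUN FAILED (3)". The helper names, the set of status codes treated as needing a reset, and any project/job names passed in are illustrative assumptions, not something confirmed in this thread.

```python
import re
import subprocess

# Assumption: numeric status codes that call for a reset before the next run
# (3 = run failed/aborted, 96 = crashed, 97 = stopped). Check these against
# your own installation.
NEEDS_RESET = {3, 96, 97}

# Assumed shape of the status line in `dsjob -jobinfo` output.
STATUS_LINE = re.compile(r"^Job Status\s*:\s*(.+?)\s*\((\d+)\)\s*$", re.MULTILINE)


def jobinfo_cmd(project, job):
    """Command line that asks dsjob for the job's current status."""
    return ["dsjob", "-jobinfo", project, job]


def reset_cmd(project, job):
    """Command line that runs the job in RESET mode and waits for it to finish."""
    return ["dsjob", "-run", "-mode", "RESET", "-wait", project, job]


def parse_status(jobinfo_output):
    """Return (status text, numeric code) from dsjob -jobinfo output."""
    match = STATUS_LINE.search(jobinfo_output)
    if match is None:
        raise ValueError("no 'Job Status' line found")
    return match.group(1), int(match.group(2))


def reset_if_aborted(project, job, run=subprocess.run):
    """Reset the job only when its last status calls for it.

    Returns True if a reset was issued, False otherwise. The `run`
    parameter exists so the function can be exercised without dsjob.
    """
    info = run(jobinfo_cmd(project, job), capture_output=True, text=True)
    _, code = parse_status(info.stdout)
    if code in NEEDS_RESET:
        run(reset_cmd(project, job), check=False)
        return True
    return False
```

A scheduler wrapper could call `reset_if_aborted("MyProject", "MyJob")` (hypothetical names) before starting the sequence. This only clears the aborted state; it does nothing about the resource shortage that caused the abort.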
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

Where is the job-activity option??

Thanks in advance,

Divya
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

It's a sequence stage.
Mark Winter
Nothing appeases a troubled mind more than good music
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

Thanks for the prompt response :)

I understand that this option will reset the job (if it is already in an aborted state) and then run it. But if the job aborts while running in the sequence, will the sequence try to reset it and continue with a rerun of this job?

Thanks in advance,

Divya
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You may have to break the job into smaller units or increase your "resource" availability.
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

Breaking it into smaller jobs is not possible, as the job itself is already a modular one. Increasing the resource availability is not possible either: we do not have access, and the server-side team is not willing to increase the resource size. :(

Regards,

Divya
miwinter
Participant
Posts: 396
Joined: Thu Jun 22, 2006 7:00 am
Location: England, UK

Post by miwinter »

Coming at it from another angle, which resources is it running short of when it fails?
Mark Winter
Nothing appeases a troubled mind more than good music
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

We get errors related to "SIGINT" or "SIGKILL". Occasionally we get errors such as "output file full" or "scratch fill full".

Regards,
Divya
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You need to provide
a.) job design
b.) resource available
c.) error messages received - unedited

for anyone to assist.

Scratch or Resource disk being full may be due to several reasons.
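
One way to narrow this down is to check free space on the scratch and resource disks just before the sequence starts. Below is a rough Python sketch. The function names and the 20% threshold are assumptions for illustration; the directories to check would be whatever your parallel configuration file lists as scratchdisk and resource disk locations.

```python
import shutil


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def low_space_dirs(paths, min_free=0.20):
    """Return the directories whose filesystem has less than `min_free` free space.

    The 0.20 default is an arbitrary illustrative threshold.
    """
    return [p for p in paths if free_fraction(p) < min_free]
```

Calling `low_space_dirs(["/path/to/scratch", "/path/to/resource"])` with your real directories (the paths shown are placeholders) returns the ones that are nearly full, which helps tell a disk that is already full from one that fills up only while several jobs run together.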
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

"Since the abort is frequent", I too would suggest that simply automating the restart is not the answer. Rather, you need to rethink the Sequence job so that it runs fewer jobs at the same time and doesn't abort.
-craig

"You can never have too many knives" -- Logan Nine Fingers
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

Here are some of the frequent errors which we face:

a.Fatal Error: waitForWriteSignal(): Premature EOF on node etlprd3 No such file or directory

b.main_program: ORCHESTRATE step execution terminating due to SIGINT

c.Fatal Error: Tsort merger aborting: Scratch space full

d.Fatal Error: Unable to allocate communication resources

All of the above errors abort the job. But once the job is reset and rerun, it runs fine.

Regards,
Divya
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

dxk9 wrote: Here are some of the frequent errors which we face:

c.Fatal Error: Tsort merger aborting: Scratch space full

Did you try pre-sorting your sources?
dxk9
Participant
Posts: 105
Joined: Wed Aug 19, 2009 12:46 am
Location: Chennai, Tamil Nadu

Post by dxk9 »

No, I don't do any pre-sorting. The job is something like this:

sequential file 1 ---> Oracle Enterprise 1
Change_capture--->Filter--->
sequential file 2 ---> Oracle Enterprise 2

I don't know exactly where the error occurs. Mostly it happens even before the import from the input files is done. :(

Regards,
Divya