DataStage Process Control

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

DataStage Process Control

Post by jerome_rajan »

Hi,
We are in the process of defining a framework through which each of our DataStage jobs should flow before finally executing. What are the things to look out for while defining such a framework?

For starters, I am planning to check for other instances of the same job with the same invocation id running at the time the job is triggered (sketch below). Can you please help me with the other things to check for?
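
A minimal version of that first check, assuming dsenv has been sourced and dsjob is on the PATH; the project, job and invocation id are placeholders for our actual values:

Code:

#!/bin/ksh
# Pre-flight check: refuse to start if this job/invocation is already running.
# PROJECT, JOB and INVID are placeholder values.
PROJECT=MyProject
JOB=MyJob
INVID=Daily

STATUS=$(dsjob -jobinfo $PROJECT $JOB.$INVID 2>/dev/null | grep 'Job Status')
case "$STATUS" in
  *"NOT RUNNING"*)
    ;;   # finished or never run - OK to start
  *RUNNING*)
    echo "$JOB.$INVID is already running - not starting a second instance" >&2
    exit 1
    ;;
esac
echo "$JOB.$INVID is clear to start"
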
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
dsetlteam
Premium Member
Posts: 35
Joined: Mon Feb 10, 2014 10:14 pm
Location: USA

Post by dsetlteam »

Check how many jobs are running (if too many jobs execute at the same time, they might run out of resources)
Check whether the job is in an aborted/crashed/not compiled state (rough sketch below)
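
A rough sketch of that state check; the status strings are what dsjob -jobinfo prints on our installs, so verify them against yours:

Code:

#!/bin/ksh
# Sketch: reset the job if a previous run left it aborted/crashed/stopped,
# and refuse to run it if it was never compiled.
PROJECT=MyProject   # placeholder
JOB=MyJob           # placeholder

STATUS=$(dsjob -jobinfo $PROJECT $JOB 2>/dev/null | grep 'Job Status')
case "$STATUS" in
  *FAILED*|*CRASHED*|*STOPPED*)
    echo "Resetting $JOB after a failed run" >&2
    dsjob -run -mode RESET -wait $PROJECT $JOB
    ;;
  *"NOT COMPILED"*)
    echo "$JOB is not compiled - fix that first" >&2
    exit 1
    ;;
esac
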
jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

Code:

1. Check if the same job is already running
2. Check how many jobs are simultaneously running. Set a threshold value and do not allow the number of jobs to cross the threshold
What else?
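
For check no. 2, I am thinking of something along these lines. Counting DSD.RUN phantom processes is only a rough proxy for "jobs running on the engine", so treat it as an approximation:

Code:

#!/bin/ksh
# Hold the new job until the count of running jobs drops below a threshold.
MAXJOBS=10   # placeholder threshold

running_jobs() {
  ps -ef | grep 'DSD.RUN' | grep -v grep | wc -l
}

while [ $(running_jobs) -ge $MAXJOBS ]
do
  echo "$(running_jobs) jobs active, waiting for a slot..." >&2
  sleep 30
done
echo "Below threshold - safe to start"
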
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm assuming you're talking about a generic "start a DataStage job" script / process, yes? We kept ours simple... pretty much just check to make sure the job exists and is in a runnable state, then do whatever was needed "post" to determine what happened or to capture whatever you wanted to always capture from any job. And then we leveraged it with our Enterprise Scheduler. Trying to decide how many jobs are "too many" seems far too error-prone to me, as it's more about the resources the jobs are using than the sheer number of jobs. You'd probably need to look at the load on the box and... other stuffs. :wink:
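
A bare-bones sketch of that kind of wrapper, with the argument handling trimmed way down; the exit code mapping relies on -jobstatus behaviour (1 is ran OK, 2 is ran with warnings):

Code:

#!/bin/ksh
# Minimal "start a DataStage job" wrapper: verify, run, capture.
PROJECT=$1
JOB=$2

# Does the job exist? dsjob -jobinfo fails for an unknown job.
dsjob -jobinfo $PROJECT $JOB >/dev/null 2>&1 || {
  echo "Job $JOB not found in project $PROJECT" >&2
  exit 1
}

# Run and wait; with -jobstatus the dsjob exit code reflects the
# finishing status of the job itself.
dsjob -run -jobstatus $PROJECT $JOB
RC=$?

# Post-run capture: a detail report plus the newest log entries.
dsjob -report $PROJECT $JOB DETAIL
dsjob -logsum -max 20 $PROJECT $JOB

# Treat "ran OK" (1) and "ran with warnings" (2) as success.
if [ $RC -eq 1 -o $RC -eq 2 ]
then
  exit 0
else
  exit $RC
fi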

Extending this approach, we leveraged a highly modified version of Ken Bland's Job Control Utility, something that would (amongst many other things) drive a series of jobs from a list, enforce dependencies and only allow a configurable number of jobs to run at any given time. So rather than failing to start a singleton job because of resources, it would hold jobs in the queue until enough had finished to allow those waiting to start. Still not based on "load", but a nice feature nonetheless.
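
The queueing part, boiled down to a toy; no dependency handling here, which is where the real utility earns its keep:

Code:

#!/bin/ksh
# Run every job named in JOBLIST, at most MAXJOBS at a time.
PROJECT=MyProject    # placeholder
JOBLIST=joblist.txt  # one job name per line
MAXJOBS=4

while read JOB
do
  # Throttle: wait until a slot frees up before launching the next job.
  while [ $(jobs -p | wc -l) -ge $MAXJOBS ]
  do
    sleep 10
  done
  dsjob -run -jobstatus $PROJECT $JOB < /dev/null &
done < $JOBLIST
wait   # let the final batch finish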

Wasn't sure how far-reaching your Process Control plans were, so I just thought I'd throw that out there.
-craig

"You can never have too many knives" -- Logan Nine Fingers