Hi,
We are defining a framework that each of our DataStage jobs must pass through before it finally executes. What should we look out for while defining such a framework?
For starters, I am planning to check whether another instance of the same job with the same invocation ID is already running when the job is triggered. Can you please help me with the other things to check for?
DataStage Process Control
-
- Premium Member
- Posts: 376
- Joined: Sat Jan 07, 2012 12:25 pm
- Location: Piscataway
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn
Life is really simple, but we insist on making it complicated.
-
- Premium Member
- Posts: 376
- Joined: Sat Jan 07, 2012 12:25 pm
- Location: Piscataway
1. Check whether the same job is already running.
2. Check how many jobs are running simultaneously. Set a threshold value and do not start a new job if that would push the number of running jobs past the threshold.
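The two checks above could be sketched roughly as follows. This is only a minimal illustration in Python, not anything DataStage ships: `RunningJob`, `preflight` and `max_concurrent` are hypothetical names, and the list of currently running jobs is assumed to come from whatever your framework already uses to query the engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningJob:
    # Hypothetical record of one running job instance.
    job_name: str
    invocation_id: str


def preflight(job_name, invocation_id, running, max_concurrent):
    """Return (ok, reason) for a job that is about to be triggered.

    `running` is a list of RunningJob; how it is populated is left to
    the framework (assumption: some external status query).
    """
    # Check 1: same job with the same invocation id already running.
    for r in running:
        if r.job_name == job_name and r.invocation_id == invocation_id:
            return False, f"{job_name}.{invocation_id} is already running"

    # Check 2: starting one more job must not exceed the threshold.
    if len(running) >= max_concurrent:
        return False, f"{len(running)} jobs running, threshold is {max_concurrent}"

    return True, "ok to start"
```

A different invocation ID of the same job passes the first check, so multi-instance jobs are still allowed to run side by side.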
Jerome
I'm assuming you're talking about a generic "start a DataStage job" script / process, yes? We kept ours simple... pretty much just check that the job exists and is in a runnable state, then do whatever is needed "post" to determine what happened or to capture whatever you always want to capture from any job. We then leveraged it with our Enterprise Scheduler. Trying to decide how many jobs are "too many" seems far too error-prone to me, as it's more about the resources the jobs are using than the sheer number of jobs. You'd probably need to look at the load on the box and... other stuff.
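For what it's worth, the "exists and is in a runnable state" check might look something like the sketch below. It assumes you already have the text that `dsjob -jobinfo <project> <job>` prints and that it contains a line of the form `Job Status : RUN OK (1)`; verify that against your own version. The status codes are the commonly documented DSJS values as I recall them, and the grouping into "runnable" versus "needs reset" is my own assumption, as are the function names.

```python
import re

# Assumed DSJS status codes (check against your DataStage version):
# 0 running, 1 run OK, 2 run with warnings, 3 run failed,
# 11 validated OK, 12 validated with warnings, 13 validation failed,
# 21 reset, 96 crashed, 97 stopped, 98 not runnable, 99 not running.
RUNNABLE = {1, 2, 11, 12, 21, 99}
NEEDS_RESET = {3, 13, 96, 97}  # assumption: treat these as "reset first"


def parse_status_code(jobinfo_text):
    """Pull the numeric code out of a 'Job Status : ... (N)' line."""
    m = re.search(r"^Job Status\s*:.*\((\d+)\)\s*$", jobinfo_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def classify(code):
    """Map a status code to what the start script should do with it."""
    if code is None:
        return "unknown"       # no status line, e.g. the job was not found
    if code == 0:
        return "running"
    if code in RUNNABLE:
        return "runnable"
    if code in NEEDS_RESET:
        return "needs_reset"
    return "not_runnable"      # e.g. 98, job not compiled
```

Anything other than "runnable" would stop the start script before it ever calls the run command, which keeps the wrapper as simple as described above.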
Extending this metaphor, we leveraged a highly modified version of Ken Bland's Job Control Utility, something that would (amongst many other things) drive a series of jobs from a list, enforce dependencies and only allow a configurable number of jobs to run at any given time. So rather than fail to start a singleton job because of resources, it would hold jobs in the queue until enough finished to allow those waiting to be started. Still not based on "load" however, but a nice feature nonetheless.
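To illustrate the hold-in-queue idea (this is not Ken Bland's utility, just a minimal Python sketch with made-up names), the core decision is: given the job list, what has finished, what is running and the configured limit, which jobs may start now? Everything else stays queued until a slot frees up.

```python
def next_to_start(jobs, finished, running, max_running):
    """Pick the jobs that may be started right now.

    jobs        -- ordered list of (name, [dependency names])
    finished    -- set of job names that have completed
    running     -- set of job names currently running
    max_running -- configurable cap on simultaneous jobs
    """
    free_slots = max_running - len(running)
    picked = []
    for name, deps in jobs:
        if free_slots <= 0:
            break  # cap reached: remaining jobs wait in the queue
        if name in finished or name in running:
            continue
        # A job is eligible only once all of its dependencies finished.
        if all(d in finished for d in deps):
            picked.append(name)
            free_slots -= 1
    return picked
```

A controller would call this each time a job finishes, start whatever it returns, and loop until every job in the list is in `finished`.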
Wasn't sure how far-reaching your Process Control plans were, so I just thought I'd throw that out there.
-craig
"You can never have too many knives" -- Logan Nine Fingers