
kicking off DStage jobs with external script?

Posted: Mon Mar 31, 2008 8:01 am
by jshurak
Our Data Stage ETL processes support a large data warehouse. To minimize the lag between data entering our AS400 production box and that data entering our data warehouse, my boss has asked me to investigate ways in which Data Stage jobs can be executed by external scripts. The workflow would be something like:

  Data enters AS400 --> once fully processed, file created or script gets kicked off --> file or script kicks off Data Stage job
My initial thought was to schedule the Batch job according to a rough estimate of when the data will be processed. Have the AS400 job create a 'notification' file on a network drive somewhere. Back in Data Stage, use the DSWaitForFile before-job subroutine to look for that file and loop until the file is present.
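For readers who would rather drive this from an external script than from a before-job subroutine, here is a minimal Python sketch of the same idea: poll for the notification file, then start the job. It is only an illustration under assumptions. The flag path, project name, and job name are hypothetical, and it assumes the `dsjob` command-line client is on the PATH and that `dsjob -run -jobstatus <project> <job>` waits for the job and reports its status through the exit code, which is worth checking against your DataStage release.

```python
import os
import subprocess
import time


def wait_for_file(path, timeout_s, poll_s=30.0):
    """Poll until `path` exists or `timeout_s` seconds pass. Returns True if found."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep for one poll interval, but never past the deadline.
        time.sleep(min(poll_s, max(0.0, deadline - time.monotonic())))


def build_dsjob_command(project, job):
    """Build the dsjob invocation (assumed syntax; project and job are placeholders)."""
    return ["dsjob", "-run", "-jobstatus", project, job]


def run_when_ready(flag_path, project, job, timeout_s=4 * 3600):
    """Wait for the AS400 notification file, consume it, then start the job."""
    if not wait_for_file(flag_path, timeout_s):
        raise TimeoutError(f"notification file never appeared: {flag_path}")
    # Remove the flag so the next cycle requires a fresh notification.
    os.remove(flag_path)
    return subprocess.call(build_dsjob_command(project, job))
```

A hypothetical call would be `run_when_ready("/mnt/share/as400_done.flag", "DW_PROJECT", "LoadBatch")`. One existence check every 30 seconds is a negligible load in itself; the resource question is what happens when several such watchers fire their jobs at the same moment.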


Does this sound efficient? One thing I'm worried about is overworking our Data Stage server. Is there a better way? Maybe I can use this as an excuse to beef up the server! :lol:

Posted: Mon Mar 31, 2008 8:54 am
by kcbland
Ad-hoc or dynamic "scheduling" means you're not controlling what runs in competition with other processes. If two jobstreams are eligible to run (notify files present), do you really want them to compete for resources? Your method is fine; I only question why an enterprise scheduler wouldn't be used.

Posted: Mon Mar 31, 2008 9:24 am
by jshurak
Kenneth, thanks. That was my initial concern. By enterprise scheduler, you mean a scheduler that incorporates both systems, not just the Data Stage server, right? One major complication is cost (of course). Knowing my organization, justifying the purchase of such an application will be difficult. Again, thank you.

Posted: Mon Mar 31, 2008 9:30 am
by kcbland
With an enterprise scheduler you can fit tasks into "classes" and then prevent too many tasks in the same "class" from executing at once. You can also build in "mutual exclusivity" so that if you have three tasks (jobstreams maybe) that are system killers, only one is allowed to execute at any given time. Most enterprise schedulers have file watching built in, so it's really easy to do what you want.
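To make the "class" idea concrete, here is a rough Python sketch of per-class concurrency limits using semaphores. It is not how any particular scheduler implements the feature. The class names and limits are made up, and it only coordinates threads inside a single process, whereas a real scheduler enforces this across jobs and machines.

```python
import threading
from contextlib import contextmanager

# Hypothetical limits: at most one "heavy" jobstream at a time, up to three "normal" ones.
CLASS_LIMITS = {"heavy": 1, "normal": 3}
_slots = {name: threading.BoundedSemaphore(limit) for name, limit in CLASS_LIMITS.items()}


@contextmanager
def class_slot(task_class, blocking=True):
    """Hold one slot of `task_class` for the duration of the `with` block.

    Yields True if a slot was obtained. With blocking=False it yields False
    instead of waiting when the class is already at its limit.
    """
    semaphore = _slots[task_class]
    acquired = semaphore.acquire(blocking=blocking)
    try:
        yield acquired
    finally:
        if acquired:
            semaphore.release()
```

A jobstream wrapper would do its work inside `with class_slot("heavy"):`, so a second "system killer" waits until the first one finishes instead of competing with it.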

Posted: Mon Mar 31, 2008 9:48 am
by shawn_ramsey
I agree with Ken: the best approach is a scheduler. I have been in the same boat, where we had to go through the process of convincing management to get a scheduler. I was finally able to locate a fairly good one that was within their price tolerance. We ended up with ActiveBatch http://www.activebatch.com/ and have been pretty happy with it.


BTW, the other tremendous benefit of an enterprise scheduler is the visibility it gives into the process flow. This is something that our existing SQL Scheduler based scheduling did not provide.