Experiences with DataStage High Availability?

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Experiences with DataStage High Availability?

Post by ogmios »

Does anyone have experiences with DataStage High Availability on Solaris UNIX Machines?

Has anyone already implemented the failover scripts that are shown in the "DataStage Operator Tips Tricks Final Version Customers.ppt" in the files section of http://developernet.ascential.com ? And do they work with the Standard version of DataStage or only with the PE edition?

Ogmios
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
do you have a specific need?
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

As highly available as possible. We already have "high availability" at the hardware level; every component is duplicated: when a disk crashes, its mirror is automatically used, and when a backplane breaks, the machine reboots and automatically switches to a backup backplane.

Now customers want application-level high availability: when, for example, a backplane breaks, they want DataStage to be restarted automatically upon reboot and to somehow pick up the ETL cycle where it left off. Yeah, right.

Ogmios
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What hardware are you running on? The same basic tasks as listed in the Tips & Tricks presentation should work for Server as well as PX, I would think; they just differ in some of the gory details. In a former life we ran DataStage on a multi-node Compaq Tru64 cluster with failover scripted in. It wasn't that hard to set up from what I remember; the biggest issue was failing over the crontab with the server for any affected users.

As to restarting the jobs where they left off, in my mind that wouldn't be any different in that particular environment than in a crashed stand-alone server implementation. I'd be curious what their script looks like and how it decides to restart job streams 'appropriately'. :?
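For what it's worth, one way such a script could decide where to restart: have the controlling script record each job's completion in a small state file as the stream runs, then on startup after a crash rerun everything from the first job that never finished. A minimal sketch of that decision logic (the job names and the state-file format are my own illustration, not from the presentation):

```python
# Sketch: decide where a linear jobstream should resume after a crash.
# Assumption: a controlling script appends "jobname OK" to a state file
# as each job finishes; after a reboot we rerun from the first job that
# has no OK record, in the original order.

def jobs_to_restart(stream, state_lines):
    """stream: ordered job names; state_lines: lines like 'load_stage OK'."""
    finished = {line.split()[0] for line in state_lines
                if line.strip().endswith("OK")}
    for i, job in enumerate(stream):
        if job not in finished:
            return stream[i:]          # resume here, in order
    return []                          # whole stream completed

stream = ["extract_src", "transform_dim", "transform_fact", "bulk_load"]
state = ["extract_src OK", "transform_dim OK"]   # crashed mid fact transform
print(jobs_to_restart(stream, state))
# -> ['transform_fact', 'bulk_load']
```

The real script would of course also have to fail over the state file itself with the server, or the decision is made against stale data.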
-craig

"You can never have too many knives" -- Logan Nine Fingers
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

Solaris 8/Sunfire (with a big number :) ). The pictures in the presentation don't look like a representation of the standard DataStage edition: e.g. I haven't seen a "player" mentioned anywhere else in the documentation I have (I will check with Ascential).

Right now I'm thinking of rolling our own, and we're evaluating a third-party "smart" scheduler to use instead of crontab, one that starts the jobs and can be programmed to restart any jobs that hadn't yet finished after a crash.

Ogmios
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I think you'll find that 'Player' is specific to the Orchestrate underpinnings that power PX.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

chulett wrote:I think you'll find that 'Player' is specific to the Orchestrate underpinnings that power PX.
True. Orchestrate gives you a Conductor and a number of Players. Wonder where they got their terminology? :lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

ogmios wrote:Now customers want to implement application high availability, when e.g. a backplane is broken they want DataStage to be restarted automatically upon reboot and somehow pick up the ETL cycle where it left off.
This is possible only if you write an ETL application built to restart. There are frameworks in which your ETL application has staging milestones (see all of my history on this forum supporting this architecture) where data is landed in a physical form. These milestones mark logical and convenient points within the processing stream that provide a fallback in the event of catastrophic failure. Since a particular segment of processing may be manipulating staged datasets, and a failure during that segment will leave those datasets in an unrecoverable state, you have to be able to fall back to a recent milestone and pick up from there with a minimal amount of reprocessing.

I can expound on this framework further, but I want to say that this works. The critical elements are that your staging databases and work dataset file systems have to be mirrored for failover. Furthermore, you need an enterprise scheduler (or at least intelligent job control) smart enough either to resurrect a jobstream exactly where it last stopped, or to restart the jobstream from the last completed milestone if it failed in an unrecoverable segment of processing. The basic milestones are: sourcing from source systems, lookup preparation, dimensional transformation, fact transformation, aggregate transformation, target database load preparation, bulk inserting, and bulk updating. This is part of a Kimball'ian data warehouse/bus architecture, which is highly successful.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply