What does it take to implement HA with IIS?

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

What does it take to implement HA with IIS?

Post by asorrell »

This is in response to another PM where someone wanted to know more about implementing a High Availability configuration with DataStage.

I have configured and worked on both "Active/Active" and "Active/Passive" configurations. In "Active/Active", both systems are up and running jobs, and a failover switches the workload to the one unaffected system (side note: you must have licensing to support this!). In "Active/Passive", though both boxes are "up", only one is actively running jobs. If there's a problem, the workload is shifted to the other box.

Though I've worked on several HA configurations, I've always done them in partnership with the vendor supplying the HA software (like Veritas VCS or IBM HACMP). They carry the larger part of the burden, in that their software has to detect the problem, shut down the problematic system and reconfigure / restart processes on the new system.

Especially with Active/Active systems, a lot of re-configuration can be required within DataStage to make the fail-over transparent. For instance, system names, resource paths and configuration files must all be "re-aligned" for the new system. This is especially problematic for parallel datasets, where improper configurations might result in only half the data getting read!
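
As a rough illustration of the kind of "re-alignment" described above, here is a minimal Python sketch that rewrites the `fastname` entries in a parallel configuration file so that every logical node points at the surviving host. The host names (`etl-a`, `etl-b`), the shared paths and the function name `realign_fastname` are all hypothetical, and this is not how any particular site's scripts worked. It only shows the idea of keeping both logical nodes defined while changing the machine they run on; whether that alone is enough for existing datasets is not something this thread establishes.

```python
import re


def realign_fastname(config_text: str, failed_host: str, surviving_host: str) -> str:
    """Point every node that ran on failed_host at surviving_host.

    Only the quoted value after the `fastname` keyword is changed; node
    names, pools and resource paths are left exactly as they were.
    """
    pattern = re.compile(r'(fastname\s+")' + re.escape(failed_host) + r'(")')
    return pattern.sub(
        lambda m: m.group(1) + surviving_host + m.group(2), config_text
    )


# Hypothetical two-node configuration; hosts and paths are made up.
SAMPLE_CONFIG = '''{
  node "node1" {
    fastname "etl-a"
    pools ""
    resource disk "/shared/ds/data" {pools ""}
    resource scratchdisk "/shared/ds/scratch" {pools ""}
  }
  node "node2" {
    fastname "etl-b"
    pools ""
    resource disk "/shared/ds/data" {pools ""}
    resource scratchdisk "/shared/ds/scratch" {pools ""}
  }
}'''

# Simulate losing etl-a: both logical nodes now run on etl-b.
realigned = realign_fastname(SAMPLE_CONFIG, "etl-a", "etl-b")
print(realigned)
```

The sketch assumes the resource paths live on storage that both machines can reach; if they do not, the paths would need the same treatment as the host names.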

Prior to release 8.5 much of my time was spent creating scripts that would detect "the failover" and do the required reconfiguration automatically. My understanding is that 8.5 has new features that eliminate the need for much of that custom scripting (I'll be able to confirm in a few weeks!).
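
To give a feel for what "detecting the failover" could look like in a script, here is a small Python sketch. It assumes a marker file on storage visible to both machines that records which host was active last; if the host running now differs from the recorded one, a failover has happened and the reconfiguration step should run. The marker file name, the host names and the function `detect_failover` are hypothetical, and in a real cluster the HA software would normally call such a script itself.

```python
import tempfile
from pathlib import Path


def detect_failover(marker: Path, current_host: str) -> bool:
    """Return True when the active host differs from the one recorded last time.

    The marker file is updated on every call, so the next call compares
    against the host that is active now.
    """
    previous = marker.read_text().strip() if marker.exists() else None
    marker.write_text(current_host + "\n")
    return previous is not None and previous != current_host


# Demonstration in a temporary directory standing in for shared storage.
with tempfile.TemporaryDirectory() as tmp:
    marker_file = Path(tmp) / "last_active_host"
    first = detect_failover(marker_file, "etl-a")  # nothing recorded yet
    same = detect_failover(marker_file, "etl-a")   # same host as last time
    moved = detect_failover(marker_file, "etl-b")  # workload is on the other box

print(first, same, moved)  # prints: False False True
```

In practice the current host could come from `socket.gethostname()` rather than being passed in by hand.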

A working HA system is a beautiful thing. I remember one site where we were wondering why the system seemed to be "a bit slow". We checked and found out that we'd had a hardware failure the day before and were only running on one box instead of two! It had happened so smoothly (when no-one was on) that nobody had noticed the failure!
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The essence of the new functionality in 8.5 is that the WebSphere Application Server and the common metadata repository support clustering with failover.

There's still not a whole lot that can be done to fail over running jobs, because of the extensive use that is made of shared memory.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
egsalayon
Participant
Posts: 27
Joined: Sun Sep 23, 2007 9:21 pm

Post by egsalayon »

Hi,

Can we implement Active/Active HA on Linux, particularly RHEL, with DS version 8.1?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not really. You need version 8.5.
egsalayon
Participant
Posts: 27
Joined: Sun Sep 23, 2007 9:21 pm

Post by egsalayon »

ray.wurlod wrote: Not really. You need version 8.5. ...
I thought asorrell had implemented HA on DataStage even before version 8.5.

asorrell, can we have your insight on this?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Isn't there a license issue with active-active HA? If you have an active Information Server that is just for HA, do you need to pay for it? One new method available in 8.5 is the combination of CDC and DataStage to create a restartable job. If the job fails halfway through, it can be restarted from where it left off, using the CDC pipe to identify data not yet applied.
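
The "restart from where it left off" idea can be sketched in a few lines of Python. This is not the CDC product's API; it is a generic bookmark pattern, with every name (`FlakyTarget`, `apply_pending`, the `bookmark` key) made up for illustration. Each change carries a sequence number, the job remembers the last one it applied, and a rerun skips everything at or below that bookmark.

```python
from typing import Dict, Iterable, Tuple

Change = Tuple[int, str]  # (sequence number, payload)


class FlakyTarget:
    """Stand-in for a target table that crashes once on a chosen payload."""

    def __init__(self, fail_on: str) -> None:
        self.rows = []
        self.fail_on = fail_on

    def write(self, payload: str) -> None:
        if payload == self.fail_on:
            self.fail_on = None  # fail only the first time
            raise RuntimeError("simulated crash")
        self.rows.append(payload)


def apply_pending(changes: Iterable[Change], state: Dict[str, int],
                  target: FlakyTarget) -> None:
    """Apply every change newer than the bookmark, advancing it as we go."""
    for seq, payload in changes:
        if seq <= state["bookmark"]:
            continue  # already applied by an earlier run
        target.write(payload)
        state["bookmark"] = seq  # a real job would persist this with the write


changes = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
state = {"bookmark": 0}
target = FlakyTarget(fail_on="c")

try:
    apply_pending(changes, state, target)  # crashes while applying "c"
except RuntimeError:
    pass

bookmark_after_crash = state["bookmark"]
apply_pending(changes, state, target)      # rerun resumes after the bookmark

print(bookmark_after_crash, target.rows, state["bookmark"])
```

The sketch only behaves correctly if the bookmark and the data write succeed or fail together; in a real job that usually means committing them in the same transaction.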
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

1) It can be done on Linux at 8.1 - just needs a lot more scripting work since the product doesn't support it at that release. This isn't a trivial task at any release and I wouldn't recommend trying to get it to work without the help of IBM Global Services or someone else who has done it before.

2) Yes - Active / Active means you have to double the licensing since both are "running" at the same time. IBM is VERY specific about that.

3) I've not had to set up any job recovery in the past, since almost all of the jobs were either truncate & load jobs or inserts / updates with low volumes. I'll have to read up on the CDC solution - post a link if you have it!
Last edited by asorrell on Thu Jan 13, 2011 5:22 pm, edited 1 time in total.
egsalayon
Participant
Posts: 27
Joined: Sun Sep 23, 2007 9:21 pm

Post by egsalayon »

I want to clarify the behaviour of DS v8.5 HA given the scenario below:

1. Main Sequencer1 (MS1) contains jobs JOB1, JOB2 and JOB3 and has the restartability option enabled. MS1 was scheduled on DataStage1 (DS1), and DS1 broke down with MS1 aborting on JOB2. Am I right to say that when I run the same MS1 on DS2, it will start at JOB2?
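
To make the scenario concrete, here is a toy Python model of a checkpointed sequence. It does not claim to show what DataStage 8.5 actually does. It only illustrates why the answer hinges on whether the second machine can see the checkpoint state left by the first. The function `run_sequence` and the two stand-in job runners are hypothetical.

```python
from typing import Callable, List, Set


def run_sequence(jobs: List[str], checkpoint: Set[str],
                 run_job: Callable[[str], None]) -> List[str]:
    """Run jobs in order, skipping any already recorded in checkpoint.

    Returns the list of jobs that were attempted in this run. The sequence
    stops at the first job that raises RuntimeError.
    """
    attempted = []
    for name in jobs:
        if name in checkpoint:
            continue  # finished in an earlier run
        attempted.append(name)
        try:
            run_job(name)
        except RuntimeError:
            return attempted  # abort; the failed job is not checkpointed
        checkpoint.add(name)
    return attempted


JOBS = ["JOB1", "JOB2", "JOB3"]


def failing_on_job2(name: str) -> None:
    """Stand-in for DS1 going down while JOB2 is running."""
    if name == "JOB2":
        raise RuntimeError("DS1 went down")


def healthy(name: str) -> None:
    """Stand-in for a machine where every job succeeds."""


shared_checkpoint: Set[str] = set()
first_run = run_sequence(JOBS, shared_checkpoint, failing_on_job2)

# Rerun with the checkpoint carried over versus with no checkpoint at all.
rerun_with_checkpoint = run_sequence(JOBS, shared_checkpoint, healthy)
rerun_without_checkpoint = run_sequence(JOBS, set(), healthy)

print(first_run, rerun_with_checkpoint, rerun_without_checkpoint)
```

In this model the rerun starts at JOB2 only when the checkpoint survives the move to the second machine; with an empty checkpoint it starts again from JOB1.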