Clustering and High Availability

seanc217 · Post by **seanc217** » Tue Feb 28, 2006 3:50 pm

We are investigating setting up an MPP system on HP-UX boxes. I understand that DataStage supports MPP systems through a configuration file that defines all the nodes available to it.

My question is: if one of the servers in the cluster were to fail, would DataStage be able to continue running?

I know a lot of assumptions need to be made here. Assume that all files and projects are located on a SAN.

Does this product support this kind of high availability?

Thanks for any insight.

vmcburney · Post by **vmcburney** » Tue Feb 28, 2006 4:54 pm

There was a session on this at AscentialWorld a couple of years ago. From memory, they were able to provide high availability by running the jobs through the RTI pack. The config files are hard-coded to nodes, whereas the RTI pack is more flexible, letting jobs run on servers that are free or underutilised. You could manually cope with a server that is down by changing the default config file: keep several config files available, one for each type of server configuration, and switch to one that will work.
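The manual switch between config files could be scripted along these lines. This is only a rough sketch, assuming parallel jobs pick up their configuration from the `APT_CONFIG_FILE` environment variable; the file paths, host names, and the `is_up` health check are hypothetical placeholders, not anything DataStage provides.

```python
import os

# Hypothetical list, in order of preference: (config file path, hosts it depends on).
CONFIGS = [
    ("/san/ds/configs/four_node.apt", ["server1", "server2", "server3", "server4"]),
    ("/san/ds/configs/three_node.apt", ["server2", "server3", "server4"]),
    ("/san/ds/configs/one_node.apt", ["server4"]),
]


def pick_config(configs, is_up):
    """Return the first config file whose hosts are all reported up, else None.

    `is_up` is a caller-supplied function (hypothetical here) that takes a
    host name and returns True if that host is reachable.
    """
    for path, hosts in configs:
        if all(is_up(host) for host in hosts):
            return path
    return None


def build_env(config_path, base_env=None):
    """Return a copy of the environment with APT_CONFIG_FILE pointing at config_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["APT_CONFIG_FILE"] = config_path
    return env
```

The resulting environment would be handed to whatever launches the job. Note that this only chooses a config for the next run; it does nothing for a job that was already partway through when the server went down.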

It is hard to cope automatically with a server that goes down, since you may be midway through an extract and some manual cleanup may be required.

Similar functionality would be available if you added grid management software.

kommven · Post by **kommven** » Thu Mar 02, 2006 4:19 pm

How about Veritas Cluster Software?

My question:

If a job is initiated and running on Server 1, and Server 1 eventually becomes overloaded, is there scope for the load to shift AUTOMATICALLY to Server 2?

vmcburney · Post by **vmcburney** » Thu Mar 02, 2006 6:15 pm

I've got a book from IBM here called "An Overview of the IBM WebSphere Data Integration Suite". The section on SOA states the following benefits:

* Scalable - They provide real-time scalability by distributing request processing and stopping/starting job instances across multiple WebSphere servers. This enables large unpredictable volumes of requests to be handled without performance degradation.
* Reliable and highly available - They provide service transparency across the complete spectrum of WebSphere DataStage servers. If any one server becomes unavailable, it routes requests to a different server in the pool.
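To illustrate the "routes requests to a different server in the pool" idea in general terms, here is a minimal sketch of pool failover. It is not how the IBM product implements it; `route_request`, the pool list, and the `send` callable are all hypothetical names used for illustration.

```python
def route_request(pool, send):
    """Try each server in the pool in order and return (server, result)
    from the first one that accepts the request.

    `send` is a caller-supplied function (hypothetical) that submits the
    request to one server and raises ConnectionError if it is unavailable.
    """
    errors = []
    for server in pool:
        try:
            return server, send(server)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next server.
            errors.append((server, str(exc)))
    raise RuntimeError(f"no server in the pool accepted the request: {errors}")
```

A sketch like this only reroutes new requests. It does not move a job that is already running on an overloaded or failed server.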