Clustering and High Availability

seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

Post by seanc217 »

We are investigating setting up an MPP system on HP-UX boxes. I understand that DataStage supports MPP systems through a configuration file that defines all the nodes available to it.

My question is: if one of the servers in the cluster were to fail, would DataStage be able to continue running?

I know a lot of assumptions need to be made here. Assume that all files and projects are located on a SAN.

Does this product support this kind of high availability?

Thanks for any insight.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

There was a session on this at AscentialWorld a couple of years ago. From memory, they were able to provide high availability by running the jobs through the RTI pack. The config files are hard-coded to nodes, whereas the RTI pack is more flexible, letting jobs run on servers that are free or underutilised. You could cope manually with a server that is down by changing the default config file: keep several config files available, one for each server configuration, and switch to one that will work.
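The idea of switching between prepared config files can be sketched roughly as follows. This is only an illustration in Python, not anything shipped with DataStage: the file paths, the host names and the `is_up` check are all made up, and you would have to supply your own way of testing whether a server is reachable.

```python
# Hypothetical list of prepared config files, most preferred first.
# Each entry pairs a config file path with the hosts that file refers to.
# All paths and host names here are invented for illustration.
CANDIDATES = [
    ("/opt/ds/configs/four_node.apt", ["hpux1", "hpux2", "hpux3", "hpux4"]),
    ("/opt/ds/configs/no_hpux2.apt", ["hpux1", "hpux3", "hpux4"]),
    ("/opt/ds/configs/single_node.apt", ["hpux1"]),
]


def choose_config(candidates, is_up):
    """Return the first config file whose hosts are all reachable.

    `is_up` is a caller-supplied function (an assumption of this sketch)
    that takes a host name and returns True if the host can be used.
    Returns None when no candidate is usable.
    """
    for path, hosts in candidates:
        if all(is_up(host) for host in hosts):
            return path
    return None
```

The chosen path could then be exported as APT_CONFIG_FILE before the job is started. Note that this only helps for jobs that have not started yet; it does nothing for a job that was already running on the failed server.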

It is hard to cope automatically with a server that goes down, since you may be midway through an extract and some manual cleanup may be required.

Similar functionality would be available if you added grid management software.
kommven
Charter Member
Posts: 125
Joined: Mon Jul 12, 2004 12:37 pm

Post by kommven »

How about Veritas Cluster Software?

My question:

If a job is initiated and running on Server 1, and Server 1 eventually becomes overloaded, is there scope for the load to shift AUTOMATICALLY to Server 2?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I've got a book from IBM here called "An Overview of the IBM WebSphere Data Integration Suite". The section on SOA states the following benefits:
* Scalable - They provide real-time scalability by distributing request processing and stopping/starting job instances across multiple WebSphere servers. This enables large unpredictable volumes of requests to be handled without performance degradation.
* Reliable and highly available - They provide service transparency across the complete spectrum of WebSphere DataStage servers. If any one server becomes unavailable, it routes requests to a different server in the pool.
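To make the second point concrete, routing a request to another server in the pool when one is unavailable might look something like the sketch below. This is a generic illustration in Python of the failover idea only, not how the IBM product implements it; the function name, the pool contents and the `is_available` check are all assumptions.

```python
def route(pool, is_available, start=0):
    """Pick a server for the next request.

    Walks the pool once, beginning at index `start`, and returns the
    first server for which the caller-supplied `is_available` check
    (hypothetical) returns True. Raises RuntimeError if every server
    in the pool is unavailable.
    """
    count = len(pool)
    for offset in range(count):
        server = pool[(start + offset) % count]
        if is_available(server):
            return server
    raise RuntimeError("no server available in pool")
```

For example, with a pool of ["server1", "server2"] and server1 down, `route(pool, lambda s: s != "server1")` returns "server2". As with the config file approach, this covers new requests only, so a job already in progress on the failed server would still need cleanup.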