High availability is important for many applications of Isode's products. In many cases, multiple servers will be operated to achieve high availability. For example:
- A message relaying service uses multiple copies of M-Switch, so that the service is not dependent on a single copy of M-Switch.
- A directory service has replicated copies of the directory stored in multiple servers, so that any of the servers may resolve an incoming directory query.
It is good practice to structure a service so that it is not dependent on the availability of a single server.
Failover clustering is a high-availability feature common to all of Isode's server products. This page describes how Isode's failover clustering support works.
High Server Availability
It is always desirable to achieve high server availability. For example, if a Message Switch fails while it is transferring messages, those messages will remain "stuck" on the Message Switch until it is repaired. In some service environments, such a delay would be unacceptable. Failover clustering is designed to provide very high server availability for environments with this type of service requirement.
In a failover cluster, there are two computers (or occasionally several computers). One (the primary) provides the service in normal situations. A second (the failover) computer is present in order to run the service when the primary system fails. The primary system is monitored with active checks every few seconds to confirm that it is operating correctly. The system performing the monitoring may be either the failover computer or an independent system (called the cluster controller). If the primary system fails, or a component associated with it such as network hardware fails, the monitoring system detects the failure and the failover system takes over operation of the service.
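The monitoring behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how any particular cluster manager is implemented: the class `FailoverMonitor`, the `check_primary` callable, and the `failure_threshold` of three consecutive failed checks are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FailoverMonitor:
    """Hypothetical monitor: hands the service to the failover node
    after several consecutive failed health checks of the primary."""
    check_primary: Callable[[], bool]   # returns True when the primary is healthy
    failure_threshold: int = 3          # assumed value, not taken from any product
    active: str = "primary"
    consecutive_failures: int = 0
    log: List[str] = field(default_factory=list)

    def poll(self) -> str:
        """Run one health check and return the node now providing the service."""
        if self.active != "primary":
            return self.active          # already failed over; no further checks here
        if self.check_primary():
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            self.log.append(f"check failed ({self.consecutive_failures})")
            if self.consecutive_failures >= self.failure_threshold:
                self.active = "failover"
                self.log.append("failover node took over the service")
        return self.active


def simulate(results, threshold=3):
    """Feed a fixed sequence of health-check results to the monitor."""
    it = iter(results)
    monitor = FailoverMonitor(check_primary=lambda: next(it),
                              failure_threshold=threshold)
    return [monitor.poll() for _ in results]
```

In a live setting, `poll()` would be called in a loop with a pause of a few seconds between checks. Requiring several consecutive failures before switching is a design choice assumed here to avoid failing over on a single transient error.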
A key element of the failover clustering approach is that both computers share a common file system. One approach is to provide this by using a dual-ported RAID (Redundant Array of Independent Disks), so that the disk subsystem is not dependent on any single disk drive. An alternative approach is to use a SAN (Storage Area Network).
Isode's failover clustering uses cluster support from the operating system vendor. The supported cluster software is listed under Platform Support. The primary failover functions are provided by this cluster support. Isode provides components that integrate with these cluster managers, enabling the cluster manager to monitor, stop and start Isode servers.
The primary goal of failover clustering is to provide resilience to hardware failure. Isode's application support enables clean switching between servers. By monitoring the Isode applications, the cluster system can detect hardware failures that manifest as application failures. The cluster manager can also be used to restart failed servers locally.
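The relationship between local restart and failover can be sketched as a simple recovery policy. The ordering shown here (try a local restart first, fail over only if that does not help) and the names `handle_failure`, `restart_locally`, `fail_over` and `max_local_restarts` are assumptions made for illustration. The actual policy is configured in the operating system vendor's cluster manager.

```python
from typing import Callable


def handle_failure(restart_locally: Callable[[], bool],
                   fail_over: Callable[[], None],
                   max_local_restarts: int = 2) -> str:
    """Try to restart the failed server on the same node; fail over if
    that does not bring it back.

    Returns "restarted" or "failed_over".
    """
    for _ in range(max_local_restarts):
        if restart_locally():       # True means the server came back up
            return "restarted"
    fail_over()                     # local restarts exhausted: assume a node-level fault
    return "failed_over"
```

For example, `handle_failure(lambda: True, lambda: None)` returns `"restarted"` without calling `fail_over`, which corresponds to a software fault that a local restart clears.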