This page describes the M-Link features designed to minimise the impact of server failures and link failures, in order to provide a reliable XMPP service. M-Link offers LAN and WAN clustering to protect against server failures, and reliable messaging features to address the consequences of link failure.
The core XMPP model is one server per domain. A single M-Link Server can support multiple domains, with delegated administration of users within each supported domain. XMPP Clustering is a technique to enable a single domain to be supported by multiple servers.
Some XMPP servers provide clustering using vendor-specific techniques. The capabilities offered under the heading of 'clustering' vary widely between products, so the actual features need to be reviewed with care.
XMPP clustering needs to synchronize 'state' between servers to ensure that messages are routed to correct destinations and that presence information is correct.
It is also important that information from the various services (Presence, Multi-User Chat (MUC), and Publish-Subscribe (PubSub)) is held on the local server where possible. For example, where MUC participants are spread across multiple servers, each server should manage its own local participants, and messages should be sent directly to other local users without first going to another server. A related characteristic is that MUC and PubSub continue to operate if any cluster node fails.
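The local-delivery idea can be illustrated with a minimal sketch. This is not M-Link's implementation: the `ClusterNode` class, its fields and the JIDs are all hypothetical, and each node reads its peers' occupant lists directly, standing in for the state synchronization a real cluster would perform.

```python
from collections import defaultdict


class ClusterNode:
    """Hypothetical cluster node that tracks only its own local room occupants."""

    def __init__(self, name):
        self.name = name
        self.peers = []                           # other ClusterNode objects
        self.local_occupants = defaultdict(set)   # room -> local user JIDs
        self.delivered = []                       # (user, room, body) delivered here
        self.inter_node_sends = 0                 # copies sent to other servers

    def join(self, room, user):
        self.local_occupants[room].add(user)

    def send_from_local_user(self, room, body):
        # Local occupants are served directly, with no detour via another server.
        self._deliver_locally(room, body)
        # Assumption: one copy per peer that has occupants in the room.
        for peer in self.peers:
            if peer.local_occupants.get(room):
                self.inter_node_sends += 1
                peer._deliver_locally(room, body)

    def _deliver_locally(self, room, body):
        for user in sorted(self.local_occupants[room]):
            self.delivered.append((user, room, body))


hq, field = ClusterNode("hq"), ClusterNode("field")
hq.peers, field.peers = [field], [hq]
hq.join("ops@chat.example", "alice@example")
hq.join("ops@chat.example", "bob@example")
field.join("ops@chat.example", "carol@example")
hq.send_from_local_user("ops@chat.example", "status?")
```

In this sketch the two occupants on `hq` receive the message without any cross-server traffic, and only a single copy crosses to `field`, however many occupants that node has.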
Isode's XMPP Clustering implementation is designed to work well for both LAN Clustering and Wide Area Clustering environments.
Local Area Network (LAN) Clustering
In LAN Clustering, multiple clustered XMPP servers operate on a common, fast, highly reliable local network.
Clustering in this environment is important for large deployments, as it enables servers to be added to support load levels greater than can be handled by a single server. This horizontal scaling is important for service providers and large enterprises. It also provides reliability, so that service can continue in the event of failure (accidental or planned) of a server.
Wide Area (WAN) Clustering
In Wide Area Clustering the XMPP servers are interconnected by links that may be slower and less reliable than a LAN.
There are various scenarios where this is important:
- Off-site operation of a server, so that service can continue in the event of site failure (Disaster Recovery).
- Support of organizations with multiple sites, so that a server can be run at each site.
- Support of a distributed military deployment with, for example, one server at HQ and another in the field.
Supporting Wide Area Clustering requires protocols and algorithms that will deal with wide area network throughput/latency and periods where connectivity is lost. Servers need to be kept in sync, but operations should continue as well as possible when there are network failures.
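One generic way to keep operating through a loss of connectivity is to queue updates for a peer while the link is down and replay them in order when it returns. The sketch below shows that pattern only. It is not a description of the protocols M-Link uses, and the `PeerLink` class and its methods are hypothetical.

```python
from collections import deque


class PeerLink:
    """Hypothetical replication link to one remote cluster node over a WAN."""

    def __init__(self):
        self.connected = True
        self.pending = deque()   # updates not yet delivered to the peer
        self.received = []       # stands in for the remote node's view of state

    def publish(self, update):
        # The local operation always succeeds; replication happens when possible.
        self.pending.append(update)
        self.flush()

    def flush(self):
        # Send queued updates in their original order while the link is usable.
        while self.connected and self.pending:
            self.received.append(self.pending.popleft())

    def link_down(self):
        self.connected = False

    def link_up(self):
        self.connected = True
        self.flush()


link = PeerLink()
link.publish("alice: available")
link.link_down()
link.publish("bob: away")        # accepted locally, queued for the peer
link.link_up()                   # queued update is replayed on reconnect
```

A real implementation would also need to reconcile conflicting changes made on both sides during the outage, which this sketch does not attempt.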
Having a server close to a client with good connectivity will give a fast and robust client experience. It is important that local traffic is optimized, and does not switch between servers except where needed. Handling traffic locally to a server without unnecessary switching is particularly important for Wide Area Clustering.
There are a number of ways in which an XMPP service can become unreliable, usually involving a failure in one or more components of the service. For constrained network deployments, where link failures can be common, Isode's XMPP products (both the M-Link server and the Swift XMPP client) include capabilities to alert users to link failures and protect them from the consequences.
Users of messaging systems (email or instant messaging) operating over internet-quality links often make the usually justified assumption that a message has been delivered. Users in constrained networking environments, where link failures are common, cannot afford to make that assumption.
M-Link and Swift both support XEP-0198: Stream Management for message acknowledgements, clearly showing the status of each message and allowing the user to decide on remedial action in the event of non-delivery.
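The acknowledgement mechanism in XEP-0198 relies on the peer reporting, in an `<a h='N'/>` element, how many stanzas it has handled so far. A sender can use that count to work out which messages are confirmed and which are still outstanding. The following is a minimal sketch of that bookkeeping, not code from M-Link or Swift; the `OutboundStream` class and its method names are hypothetical.

```python
from collections import deque


class OutboundStream:
    """Hypothetical sender-side tracker for XEP-0198 acknowledgements."""

    def __init__(self):
        self.acked = 0           # last 'h' value received from the peer
        self.unacked = deque()   # stanzas sent but not yet acknowledged

    def send(self, stanza):
        # Keep a copy until the peer confirms it has handled the stanza.
        self.unacked.append(stanza)

    def handle_ack(self, h):
        # 'h' counts stanzas handled by the peer; it wraps at 2**32.
        newly_acked = (h - self.acked) % 2**32
        if newly_acked > len(self.unacked):
            raise ValueError("peer acknowledged more stanzas than were sent")
        confirmed = [self.unacked.popleft() for _ in range(newly_acked)]
        self.acked = h
        return confirmed

    def undelivered(self):
        # Messages a client could flag to the user if the link fails now.
        return list(self.unacked)


stream = OutboundStream()
for body in ("m1", "m2", "m3"):
    stream.send(body)
confirmed = stream.handle_ack(2)   # peer reports it has handled two stanzas
```

After the acknowledgement, `m1` and `m2` are confirmed and `m3` remains in `undelivered()`, which is the information a client needs in order to show message status and let the user decide whether to resend.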
Federated Multi-User Chat
In standard Multi-User Chat (MUC), a room is hosted on one server, and participants joining the room may be local to that server or may join via another server using standard XMPP server federation. A link failure between servers will prevent users on a federated server from participating in the room.
M-Link supports XEP-0289: Federated MUC for Constrained Environments, which federates the provision of MUC, just as the distribution of XMPP servers federates the provision of 1:1 chat. More information on Federated Multi-User Chat can be found on the page that discusses M-Link's multi-user chat capabilities and in the whitepaper [Federated Multi-User Chat: Efficient and Reliable Operation over Slow and Unreliable Networks].