September 2012


This paper looks at how M-Link, Isode's XMPP server, is optimized for operation over constrained networks, including Satcom, HF Radio, and other radio links.

The paper starts by looking at the benefits of using XMPP over constrained networks, and the key problems faced. Then it describes the M-Link architecture and how it addresses the various problems, both for networks where IP will be used, and for HF Radio.

Creative Commons License

Why XMPP over Constrained Networks?

XMPP, the Internet Standard eXtensible Messaging and Presence Protocol, is being widely adopted by military organizations and others who make use of constrained network communications. XMPP is an important building block for handling 1:1 instant messaging, multi-user chat, and presence. It provides this in an open standards framework, which supports security, extensibility and distributed deployment. It is highly desirable to deploy one communication system that can support any environment.

Although military communication increasingly uses high speed networks, there are many deployed situations where slower networks, including Satcom, VHF, UHF and HF radio, are important. HF Radio is of particular importance, as it is the only generally viable Beyond Line of Sight (BLOS) alternative to Satcom, and is necessary where Satcom cannot be used or as a backup to Satcom.


Problems to be Addressed

XMPP works efficiently over medium and high speed networks, and so the issues discussed here do not affect most XMPP deployments. Constrained networks have three characteristics that can lead to application level performance implications:

  1. Low speed. Networks can commonly operate at 9600 bits/second or less.
  2. High latency. Constrained networks will often have high latency.
  3. Poor reliability.

XMPP protocol characteristics that interact with this include:

  • Large number of handshakes (typically 9) and data exchange on startup. This is the most significant problem.
  • Exchange of information that provides value (and so is desirable for faster nets) but that has an overhead greater than the value provided for slower networks.
  • The encoding is not very compact, although compression does help, especially for a long lived connection.

The rest of this paper looks at these problems in more detail, and how M-Link addresses them.

Server to Server Architecture

Isode's core approach to low bandwidth networks is to use specialized server to server protocols over the constrained link that are optimized for the link, rather than operating a client/server protocol over the constrained link. There are a number of advantages to the server to server approach:

  • Only one implementation of the protocol is needed.
  • The standard Client/Server protocols can still be used, and so any (XMPP) client product can be used. There is no requirement to use a special client.
  • The XMPP client is isolated from poor network performance. Most XMPP clients will give poor user experience if the connection to the server is slow. By always using a local server, the client/server connection will be fast and responsive.
  • Information can be held in the local server and data transfer and requests over the constrained network can be minimized.

Zero Handshake Protocol

The communication between a pair of XMPP servers is essentially a flow of stanzas in each direction. A stanza is an element of data on an XMPP stream, which can be "message", "presence", or "iq" (information query).

XMPP uses two TCP connections to support this (one for each direction of flow). This use of two connections is a historical consequence of dial-back, and it is expected that a future version of the specifications will allow a single connection.

Setup of each connection involves a number of handshakes and data exchange. These include:

  • TCP handshakes.
  • TLS handshakes – use of TLS for server to server connection is recommended.
  • SASL handshakes for authentication.
  • XMPP stream binding.

This can lead to nine or more end to end handshakes and exchange of several kilobytes of data. For a high speed network, this overhead is minimal, particularly given that server to server connections will generally be very long lived (days or weeks).

For a constrained network, this overhead is a big problem, particularly where network reliability means that long lived connections will generally not be possible. For some networks, the number of handshakes is a particularly severe problem.
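The effect of handshakes can be illustrated with a rough calculation. The sketch below adds the time spent waiting on round trips to the time needed to push the setup data through the link. The nine handshakes and the "several kilobytes" (taken here as 4000 bytes) follow the figures above; the round trip times and link speeds are illustrative assumptions, not measurements.

```python
def setup_time_seconds(handshakes, round_trip_s, setup_bytes, link_bps):
    """Estimate setup time: one round trip per handshake plus data transfer time."""
    transfer_s = setup_bytes * 8 / link_bps
    return handshakes * round_trip_s + transfer_s


# Illustrative link parameters only (assumed values, not measured ones).
links = {
    "fast LAN": dict(round_trip_s=0.001, link_bps=100_000_000),
    "satcom": dict(round_trip_s=0.6, link_bps=64_000),
    "slow radio": dict(round_trip_s=10.0, link_bps=2_400),
}

for name, params in links.items():
    t = setup_time_seconds(handshakes=9, setup_bytes=4000, **params)
    print(f"{name}: about {t:.1f} s to set up one connection")
```

Under these assumptions the round trip term dominates on the slow links, which is why removing handshakes matters more than trimming bytes alone.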

Isode's approach for M-Link is to use "XEP-0361: Zero Handshake Server to Server Protocol" to reduce data volumes and remove all handshakes at the XMPP level. Three approaches are used to achieve this:

  1. Use of a single (bi-directional) stream.
  2. To configure options at both ends of the connection (using peering controls) to avoid the need for negotiation at run time. This saves both data volume and handshakes.
  3. To have full pipelining of the remaining stanzas. This means that when a connection is initiated there will be a sequence of initializing stanzas followed by messages. There is no requirement for any returned data to start sending. This achieves "zero handshake" at the stanza level.
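The pipelining idea in the third point can be sketched as follows. This is only an illustration of "send everything without waiting for a reply": the function name and the placeholder stanza strings are hypothetical and are not the XEP-0361 wire format.

```python
def pipelined_open(init_stanzas, queued_messages):
    """Build the bytes written when a connection opens.

    Initializing stanzas are followed immediately by queued messages.
    Nothing here waits for data from the peer, so no round trip is
    spent at the stanza level.
    """
    return b"".join(list(init_stanzas) + list(queued_messages))


# Placeholder content (hypothetical, for illustration only).
init = [b"<init-placeholder/>"]
queued = [b"<message to='bob@b.example'><body>hello</body></message>"]

first_write = pipelined_open(init, queued)
print(first_write.decode())
```

Because both ends are configured in advance, the sender already knows which options apply and does not need an answer before the first user message goes out.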

This stanza level exchange is abstracted, so that it can be mapped onto multiple transports.

The base transport is TCP, which is a good choice for Satcom links. There will be a single TCP handshake to establish the TCP connection, and then data can flow without further handshakes. TCP will optimize use of the available link bandwidth.

XEP-0361 may be operated over TLS. M-Link supports this to provide data confidentiality with an option to use peer authentication using X.509. With many constrained networks these services will be provided at the data link or network layer, and there is no functional requirement to provide them at the application layer. TLS adds some protocol overhead, and with current versions of TLS the handshaking will add significant latency. The upcoming TLS 1.3 will significantly improve this.


Minimizing data transferred is important, and so use of compression is desirable.

Standard XMPP compression is used with Isode's optimized server to server protocol. This compression is stream based and uses the ZLIB format which in turn uses the DEFLATE algorithm. A key benefit of this compression approach is that it is adaptive to both the protocol used (e.g., the XMPP protocol options and XML namespaces used) and to user messages and addresses exchanged. This means that a very high level of compression can be achieved in many situations.

DEFLATE references previous data in the stream, and it becomes more effective for larger data sets (or longer use in the case of a stream). This means that for compression to be effective connections between servers must be reasonably long lived. M-Link operates to achieve this. Where TCP is used, it is important that the underlying network is sufficiently reliable to hold the connection open for extended periods and that the overhead of TCP keepalives is not an issue. TCP will generally be a good choice for Satcom.
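The benefit of a long lived compressed stream can be seen with Python's standard zlib module. The sketch below compresses a series of similar stanzas on one shared stream and records how many bytes each stanza costs on the wire. The example stanzas and addresses are made up for illustration, and the sync flush after each stanza is an assumption about how a sender would make each stanza deliverable immediately.

```python
import zlib


def compressed_sizes(stanzas):
    """Compress stanzas on one shared zlib stream; return bytes emitted per stanza."""
    comp = zlib.compressobj()
    sizes = []
    for stanza in stanzas:
        # Z_SYNC_FLUSH emits everything so far while keeping the history window.
        out = comp.compress(stanza) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(out))
    return sizes


# Made-up stanzas that share addresses and structure, as real traffic often does.
stanzas = [
    (
        "<message from='alice@a.example' to='bob@b.example' type='chat'>"
        f"<body>Status report {i}: all stations nominal</body></message>"
    ).encode()
    for i in range(5)
]

sizes = compressed_sizes(stanzas)
print("uncompressed:", [len(s) for s in stanzas])
print("compressed:  ", sizes)
```

Later stanzas compress much better than the first because DEFLATE can refer back to the addresses and XML structure already sent. If the connection drops, that history is lost and the next stanza pays the full cost again.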

Presence Caching

A key benefit of XMPP is to provide up to date information on a user's presence status. To support this, XMPP servers exchange presence information. An important optimization in support of a low bandwidth link is for a server to cache presence values, so that a request for them (e.g., from a client logging on) can be handled locally rather than by querying the remote server. Presence updates are pushed (not polled), so if caching is done correctly clients will still be given correct information.
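A minimal sketch of such a cache is shown below. The class and method names are hypothetical and this is not M-Link's implementation; it assumes that a user with no recorded presence is treated as unavailable.

```python
class PresenceCache:
    """Keeps the latest pushed presence for each remote user."""

    def __init__(self):
        self._latest = {}

    def on_push(self, jid, status):
        """Record a presence update pushed by the remote server."""
        self._latest[jid] = status

    def lookup(self, jid):
        """Answer a local request without sending anything over the link."""
        return self._latest.get(jid, "unavailable")


cache = PresenceCache()
cache.on_push("alice@remote.example", "available")
cache.on_push("alice@remote.example", "away")

# A client logging on locally is served from the cache.
print(cache.lookup("alice@remote.example"))
print(cache.lookup("carol@remote.example"))
```

The cache stays correct as long as every pushed update reaches it, which is why the push model matters here.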

Traffic Filtering

All of the changes described so far optimize performance without impacting the XMPP service provided to the client. Traffic filtering removes data, and so will modify the service provided to the end user. With traffic filtering there will be a trade-off between service and performance. Removing traffic and information of low value to the user will improve performance for high value data. The details of the filtering and the trade-off will vary: traffic filtering is likely to be used aggressively for very slow networks, and less so for somewhat faster networks.

Filtering options available are:

  • Removal of selected types of message (or other stanza).
  • Removal of selected elements from messages (message folding).
  • Removal of selected elements from presence stanzas (presence folding).

Removal of messages seems a drastic measure, but can be helpful. A class of message that it generally makes sense to remove is "chat state notifications". These give real time notification as to user "state" and in particular if the user is typing. Chat state notifications lead to client indications such as "Joe is typing". It will often be desirable to save the network overhead of these messages for a constrained network.
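A filter for chat state notifications could look like the sketch below. It uses the chat states namespace from XEP-0085 and drops only messages that carry a chat state and no body. The function name is hypothetical, and M-Link's own filtering rules are configured differently; this only illustrates the decision being made.

```python
import xml.etree.ElementTree as ET

CHATSTATES_NS = "http://jabber.org/protocol/chatstates"  # XEP-0085 namespace


def is_chat_state_only(stanza_xml):
    """Return True for a message that carries a chat state and no body."""
    elem = ET.fromstring(stanza_xml)
    if elem.tag.rsplit("}", 1)[-1] != "message":
        return False
    children = list(elem)
    has_state = any(c.tag.startswith("{" + CHATSTATES_NS + "}") for c in children)
    has_body = any(c.tag.rsplit("}", 1)[-1] == "body" for c in children)
    return has_state and not has_body


typing = (
    "<message to='bob@b.example' type='chat'>"
    "<composing xmlns='http://jabber.org/protocol/chatstates'/></message>"
)
text = (
    "<message to='bob@b.example' type='chat'><body>On my way</body>"
    "<active xmlns='http://jabber.org/protocol/chatstates'/></message>"
)

# Only stanzas for which this returns False would be sent over the link.
print(is_chat_state_only(typing), is_chat_state_only(text))
```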

A more extreme filtering that M-Link offers is to remove all presence messages. This would reduce the communication to an exchange of user messages, and there would be no setting or update of user presence. Clients would need to be chosen that are appropriate for this type of deployment, as some will expect presence updates.

Another option would be filtering of IQ (information query) stanzas, which clients use to gather information and negotiate protocol features and extensions. There are a number of protocol features and extensions which are unsuitable for use in constrained networks, such as features that allow Audio/Video streams to be established over XMPP. Many of these features and extensions can be disabled via IQ filtering. M-Link provides flexible controls to filter traffic.

XMPP is an extensible message protocol, and a wide range of XMPP applications and services use this extension mechanism. Extensions and additions to a message are clearly identifiable in the XML of an XMPP stanza. M-Link allows extensions and message elements in general to be removed. This is called "message folding". Message folding can be specified either as a list of elements allowed (i.e., everything else will be stripped) or as an explicit list of elements to strip. Possible uses of this:

  • Work out the list of fields that are operation critical, and then strip out everything else.
  • Remove specific fields that are known to be not required, for example security labels.
  • Remove the HTML variant of a message (which some clients insert) and leave only the simple text version.
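The two ways of specifying message folding, an allow-list or a strip-list, can be sketched as below. The function and parameter names are hypothetical, and for brevity child elements are matched by local name only, whereas a real configuration would normally also take the XML namespace into account.

```python
import xml.etree.ElementTree as ET


def fold(stanza_xml, allow=None, strip=None):
    """Remove child elements of a stanza by allow-list or strip-list.

    If allow is given, every child not named in it is removed.
    Otherwise, children named in strip are removed.
    """
    elem = ET.fromstring(stanza_xml)
    for child in list(elem):
        name = child.tag.rsplit("}", 1)[-1]  # local name without namespace
        if allow is not None:
            drop = name not in allow
        else:
            drop = strip is not None and name in strip
        if drop:
            elem.remove(child)
    return ET.tostring(elem, encoding="unicode")


message = (
    "<message to='bob@b.example' type='chat'>"
    "<body>Convoy departs at 0600</body>"
    "<html xmlns='http://jabber.org/protocol/xhtml-im'>"
    "<body xmlns='http://www.w3.org/1999/xhtml'><p>Convoy departs at <b>0600</b></p></body>"
    "</html></message>"
)

folded_allow = fold(message, allow={"body"})   # keep only the plain text body
folded_strip = fold(message, strip={"html"})   # remove only the HTML variant
print(folded_allow)
```

For this example both calls keep the plain text body and drop the HTML variant. An allow-list is the stricter choice, since extensions nobody anticipated are stripped by default.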

M-Link provides an equivalent "presence folding" mechanism for presence stanzas. Presence can be used to convey additional information, which may or may not be needed. Presence folding allows presence stanzas to be reduced to a simple online/offline status, with no additional information. Things that might be stripped include:

  • Information on Avatars.
  • Additional presence information such as "extended away".
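Reducing presence to a simple online/offline status can be sketched as removing every child element while keeping the stanza's attributes, so that only the sender and any "unavailable" type remain. The function name is hypothetical and this is an illustration rather than M-Link's implementation.

```python
import xml.etree.ElementTree as ET


def fold_presence(stanza_xml):
    """Strip all child elements from a presence stanza, keeping its attributes."""
    elem = ET.fromstring(stanza_xml)
    for child in list(elem):
        elem.remove(child)
    return ET.tostring(elem, encoding="unicode")


presence = (
    "<presence from='alice@a.example/desk'>"
    "<show>xa</show>"
    "<status>In a briefing</status>"
    "<x xmlns='vcard-temp:x:update'><photo>abc123</photo></x>"
    "</presence>"
)

folded = fold_presence(presence)
print(len(presence), "characters before,", len(folded), "after")
```

Here the extended away indication, the status text and the avatar hash are all removed, leaving a stanza that only says the user is online.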

Multi-User Chat

Multi-User Chat (MUC) is often an important service in a constrained bandwidth environment, and it introduces a number of performance problems. These problems and Isode's approach to solving them are described in the whitepaper [Federated Multi-User Chat: Efficient and Resilient Operation over Slow and Unreliable Networks].

Optimised Support for HF Radio (STANAG 5066), VHF and UHF

XEP-0361 operates over TCP/IP, which will work well for Satcom and (relatively) fast radio links. However, it will not work so well for slower radio links and will be particularly bad for HF radio. The reasons for this are explained in the Isode white paper "Performance Measurements of Applications using IP over HF Radio". The most significant problem is that the TCP windowing mechanism for flow control interacts badly with the very long HF turnaround times.

The solution is to use STANAG 5066 which provides a standard approach for integrating applications to run over HF Radio. Use of STANAG 5066 directly by the application is key to getting good performance over HF Radio, and this is what has been done in M-Link. STANAG 5066 is often used with VHF and UHF radio, and will give useful performance increases here too.

The mapping of XEP-0361 onto STANAG 5066 is straightforward and is standardized in "XEP-0365: Server to Server communication over STANAG 5066 ARQ". Key capabilities:

  • The application level protocol is a sequence of stanzas. The same mapping needs to be done for each direction, noting that there is no application level handshaking.
  • Stanzas that are ready to transmit are grouped together. This will optimize throughput and maximise compression, possibly at the expense of latency. Experience with HF suggests that it is generally sensible to optimize for throughput.
  • The grouped stanzas are transferred as a single block using STANAG 5066 RCOP (Reliable Connection Oriented Protocol), which as the name suggests reliably transfers the data to the peer XMPP server.

This is a natural mapping that will lead to optimal HF usage.
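The grouping step can be sketched as below. The function name and the max_block_bytes limit are assumptions made for illustration; the actual block size and the hand-off to RCOP are defined by XEP-0365 and the STANAG 5066 implementation, and are not shown here.

```python
def build_block(pending, max_block_bytes):
    """Group queued stanzas into one block, leaving the remainder queued.

    At least one stanza is always taken, so a single oversized stanza
    cannot stall the queue.
    """
    block = []
    size = 0
    while pending and (not block or size + len(pending[0]) <= max_block_bytes):
        stanza = pending.pop(0)
        block.append(stanza)
        size += len(stanza)
    return b"".join(block)


# Stanzas waiting for the next transmit opportunity (made-up content).
pending = [
    b"<message to='a@x.example'><body>one</body></message>",
    b"<message to='b@x.example'><body>two</body></message>",
    b"<message to='c@x.example'><body>three</body></message>",
]

block = build_block(pending, max_block_bytes=120)
print(len(block), "bytes in block;", len(pending), "stanza(s) still queued")
```

Sending fewer, larger blocks reduces the number of link turnarounds, at the cost of the first stanza in each block waiting for the others.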


The features described in this whitepaper are available in M-Link.


This paper describes how Isode’s M-Link server is optimized for use over constrained links, for 1:1 and MUC traffic using XEP-0361 and XEP-0365. This can operate over TCP for Satcom, and over STANAG 5066 for HF Radio.

Isode plans to measure performance of these protocols, and to report results in a future white paper. This may lead to enhancements and updates to the architecture described here.