XMPP has a Publish-Subscribe capability, generally referred to as PubSub, which many XMPP experts see as very important. This white paper seeks to explain PubSub and its significance to non-experts.
The paper looks at the problems addressed by publish-subscribe systems, shows how XMPP PubSub has been used by two services (Collecta and buddycloud), discusses PubSub capabilities & potential applications and outlines M-Link's PubSub support.
PubSub is a building block for other applications and not directly used by human users so, to show its capabilities and benefits, this paper focuses on applications using PubSub rather than the technical details. It also looks briefly at the PubSub implementation provided by Isode's M-Link server.
This paper assumes basic familiarity with XMPP concepts.
The Publish Subscribe Problem
Publish-Subscribe is a well-known technique that is often used in large-scale distributed systems. It is used when there are lots of events happening and entities are interested in differing subsets of those events. A mechanism where each entity independently tried to find out about the events of interest would be complex and inefficient: it would require polling, which is resource intensive and does not scale to very large deployments.
With Publish-Subscribe, publishers and subscribers are decoupled. Publishers classify events into topics and each event is sent to the Publish-Subscribe service. Subscribers indicate to the Publish-Subscribe service which topics they are interested in and receive events associated with each topic as they are sent to the Publish-Subscribe service, giving access to information in real time.
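The decoupling described above can be illustrated with a minimal in-memory sketch. This is not XMPP and not how any particular PubSub server is built; the `Broker` class, its method names and the topic names are hypothetical, chosen only to show that publishers and subscribers talk to the service and never to each other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber is modelled as a callback taking (topic, event).
Handler = Callable[[str, str], None]


class Broker:
    """Hypothetical in-memory Publish-Subscribe service (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The subscriber tells the service which topic it is interested in.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: str) -> int:
        # The publisher only names a topic; it never learns who the subscribers are.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, event)
        return len(handlers)  # number of deliveries made


received = []
broker = Broker()
broker.subscribe("weather", lambda topic, event: received.append((topic, event)))
broker.publish("weather", "storm warning")   # pushed to the subscriber immediately
unheard = broker.publish("sport", "match postponed")  # no subscribers, no deliveries
```

Because events are pushed as they are published, no subscriber ever has to poll for changes.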
Publish/Subscribe systems are useful for a wide range of applications. In the next sections we describe two applications that use XMPP PubSub in order to show how Publish/Subscribe works and why it is so useful.
Collecta: Delivering real-time search with PubSub
The Collecta service allowed real-time tracking of information on selected topics, with data feeds from a wide range of sources. Collecta used an XMPP PubSub interface to deliver data and receive subscription requests. The key features of the Web interface to the service are shown in the screenshot below.
In the first column you enter the search term(s) you're interested in monitoring and select the types of sources you wish to consider results from. Items matching your selected search term appear in the second column, updated as new results come in. Highlighting any of the results shows details in the third column.
We now consider how Collecta made use of XMPP PubSub, noting that this is simplified to help understand how the core elements of PubSub work.
XMPP PubSub was central to the Collecta service. The various feeds used by Collecta were processed so that each item was sent to the Collecta PubSub server as an XMPP (PubSub) Publish message.
Users of the Collecta service could subscribe to events in a variety of ways. The earlier screenshot shows event delivery via the Collecta website. An XMPP client ran in the browser window, using BOSH (Bidirectional-streams Over Synchronous HTTP) to communicate between the browser and the PubSub service. The subscriber's selection of a topic initiated a PubSub subscribe request to Collecta's PubSub server. Matching historical events were returned to the user's browser as XMPP (PubSub) events and displayed in reverse chronological order.
New events that matched the request(s) were displayed as they were received by the Collecta PubSub service, each being sent to the browser as an XMPP PubSub event. The Collecta API allowed developers to build results from Collecta's PubSub service into their own applications or widgets.
buddycloud: Social location using PubSub
In the same way that Collecta gathered news sources for its PubSub service, buddycloud gathers user-submitted data which is re-purposed for a "social location" network accessed through mobile device applications and web clients.
Buddycloud users are both publishers and subscribers. They publish data (including geolocation information) to the PubSub service, which other users can receive by subscribing to information 'channels' that may be based on a topic, person or place. Users can be notified when other users visit a location (perhaps one near their own) and/or share information about a topic or place of interest. All of this is achieved by use of XMPP PubSub.
Why Standardise Publish-Subscribe?
Publish-Subscribe is a general approach used in many systems that do not use XMPP PubSub. In order to understand the benefits of XMPP PubSub, it is useful to first consider why a standardized approach is desirable and then why XMPP is a sensible base for this.
A Publish-Subscribe subsystem is a significant component and, if you are building a product or service that needs this, it will often make sense to obtain it separately. An Open Standard makes this separation straightforward, and also makes it easy to replace the Publish-Subscribe component if this becomes necessary or desirable.
Use of an open protocol is helpful in complex systems, as it may well be desirable to implement publishers and subscribers in very different environments that will need different technologies (e.g., Web Browsers; Mobile Devices; Servers). Building PubSub on the standardized XMPP protocols is particularly helpful here, as there are many open source and commercial client libraries available.
Use of an Open Standard for Publish-Subscribe can reduce development cost and time to market for a service or product needing this capability.
Overview of XMPP PubSub Capabilities
In this section you'll find a high-level overview of XMPP PubSub features (XEP-0060: Publish-Subscribe), without going into detailed explanations.
Use of XMPP Stanzas
PubSub works by exchanging XMPP Stanzas (messages) in and out of a PubSub server. There are four basic PubSub message types:
- Subscription: A subscriber sends an XMPP iq (Info/Query) stanza to register a subscription. The subscriber will then receive events from the PubSub server while the subscription is active.
- Publishing: The Publisher sends an XMPP iq stanza to publish an event, typically (but not always) including a payload.
- Events: Events are sent to subscribers as XMPP message stanzas.
- Management: The PubSub system is itself managed over XMPP (information discovery and management of the PubSub system are all handled with XMPP stanzas).
M-Link's PubSub is fully integrated with M-Link Clustering, with clustering of the PubSub nodes, so that PubSub operation will continue without interruption in the event of server failure.
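The stanza types listed above can be sketched by building the XML with Python's standard library. This is a rough illustration rather than a complete client: the element names and namespaces follow XEP-0060 as we understand it, while the service address, node name, JIDs, stanza ids and the Atom payload are made-up example values.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
EVENT_NS = "http://jabber.org/protocol/pubsub#event"


def subscribe_iq(service: str, node: str, subscriber: str) -> ET.Element:
    """Subscription: an iq stanza asking the service to register a subscription."""
    iq = ET.Element("iq", {"type": "set", "from": subscriber, "to": service, "id": "sub1"})
    # xmlns is written as a plain attribute so the serialised XML has no prefixes.
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    ET.SubElement(pubsub, "subscribe", {"node": node, "jid": subscriber})
    return iq


def publish_iq(service: str, node: str, publisher: str,
               item_id: str, payload: ET.Element) -> ET.Element:
    """Publishing: an iq stanza carrying one item, here with a payload."""
    iq = ET.Element("iq", {"type": "set", "from": publisher, "to": service, "id": "pub1"})
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    publish = ET.SubElement(pubsub, "publish", {"node": node})
    ET.SubElement(publish, "item", {"id": item_id}).append(payload)
    return iq


def event_message(service: str, node: str, subscriber: str,
                  item_id: str, payload: ET.Element) -> ET.Element:
    """Event: a message stanza the service sends to each subscriber."""
    message = ET.Element("message", {"from": service, "to": subscriber})
    event = ET.SubElement(message, "event", {"xmlns": EVENT_NS})
    items = ET.SubElement(event, "items", {"node": node})
    ET.SubElement(items, "item", {"id": item_id}).append(payload)
    return message


# Example payload (an assumption; PubSub payloads can be any XML).
entry = ET.Element("entry", {"xmlns": "http://www.w3.org/2005/Atom"})
ET.SubElement(entry, "title").text = "Example headline"

subscribe_xml = ET.tostring(
    subscribe_iq("pubsub.example.com", "news", "reader@example.com"), encoding="unicode")
publish_xml = ET.tostring(
    publish_iq("pubsub.example.com", "news", "feed@example.com", "item1", entry),
    encoding="unicode")
event_xml = ET.tostring(
    event_message("pubsub.example.com", "news", "reader@example.com", "item1", entry),
    encoding="unicode")
print(subscribe_xml)
print(publish_xml)
print(event_xml)
```

Note the asymmetry: requests to the service (subscribe, publish) are iq stanzas, which are answered with a result or an error, whereas events flow out to subscribers as message stanzas.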
Nodes are logical entities within a PubSub server used to group information. They are the entities to which events are published and to which content consumers subscribe. In other Publish-Subscribe systems they are called 'topics', and this name makes the concept clearer for content syndication services like Collecta and buddycloud.
Nodes are a framework for dispatching new events to subscribers. A node can also hold old events, so that a new subscriber to that node can review historical events. Collecta showed the benefits of this capability.
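As a rough sketch of a node that both dispatches new events and retains old ones, consider the following. The `Node` class, the `max_items` limit and the replay-on-subscribe behaviour are assumptions made for illustration; a real PubSub service exposes item persistence and retrieval through its own configuration and requests.

```python
from collections import deque
from typing import Callable, Deque, List


class Node:
    """Hypothetical node that dispatches events and keeps recent history."""

    def __init__(self, name: str, max_items: int = 10) -> None:
        self.name = name
        # Only the most recent max_items events are retained.
        self._history: Deque[str] = deque(maxlen=max_items)
        self._subscribers: List[Callable[[str], None]] = []

    def publish(self, item: str) -> None:
        self._history.append(item)
        for deliver in self._subscribers:
            deliver(item)

    def subscribe(self, deliver: Callable[[str], None], replay: bool = True) -> None:
        # A new subscriber can first review the retained historical events...
        if replay:
            for item in self._history:
                deliver(item)
        # ...and then receives new events as they are published.
        self._subscribers.append(deliver)


node = Node("news", max_items=2)
for headline in ["a", "b", "c"]:
    node.publish(headline)        # published before anyone subscribed

inbox: List[str] = []
node.subscribe(inbox.append)      # replays the two retained items: "b", "c"
node.publish("d")                 # delivered live
```

After this runs, `inbox` holds the two retained historical items followed by the live one.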
Nodes can be grouped into flexible hierarchies (technically a directed acyclic graph) using collection nodes. Events are always published to leaf nodes and then flow up the hierarchy. The hierarchy gives a convenient way to group nodes in order to help subscribers easily subscribe to a set of nodes or to all nodes.
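The way events flow up from a leaf node through its collection nodes can be sketched as a walk over the graph. The `Hierarchy` class and the node and subscriber names below are hypothetical; the point is only that a leaf may sit under several collections (a directed acyclic graph, not a tree) and that each subscriber should still receive an event once.

```python
from typing import Dict, Iterable, List, Set


class Hierarchy:
    """Hypothetical node hierarchy: leaves point at their collection nodes."""

    def __init__(self) -> None:
        self._parents: Dict[str, Set[str]] = {}
        self._subscribers: Dict[str, Set[str]] = {}

    def add_node(self, node: str, parents: Iterable[str] = ()) -> None:
        parents = list(parents)
        self._parents.setdefault(node, set()).update(parents)
        for parent in parents:
            self._parents.setdefault(parent, set())

    def subscribe(self, subscriber: str, node: str) -> None:
        self._subscribers.setdefault(node, set()).add(subscriber)

    def recipients(self, leaf: str) -> List[str]:
        """Everyone subscribed to the leaf or to any collection above it."""
        seen: Set[str] = set()
        found: Set[str] = set()
        stack = [leaf]
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # a node reachable by two paths is visited only once
            seen.add(node)
            found |= self._subscribers.get(node, set())
            stack.extend(self._parents.get(node, ()))
        return sorted(found)


tree = Hierarchy()
tree.add_node("all")
tree.add_node("sport", ["all"])
tree.add_node("uk", ["all"])
tree.add_node("uk-football", ["sport", "uk"])   # one leaf, two collections
tree.add_node("uk-weather", ["uk"])

tree.subscribe("alice", "sport")
tree.subscribe("bob", "all")
tree.subscribe("carol", "uk-football")
tree.subscribe("dave", "uk-weather")

football_recipients = tree.recipients("uk-football")
```

Here an event published to the `uk-football` leaf reaches alice (via `sport`), bob (via `all`) and carol (directly), but not dave.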
Use of the node model works well for applications where the data model for matching publication and subscription can be set up separately from the data contained in the events. However, in both of our examples, Collecta and buddycloud, users subscribe to events based on data in those events. XMPP PubSub supports this by use of a content filter in the subscription.
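A content filter can be thought of as a predicate attached to each subscription and evaluated against every event. The sketch below assumes a simple "all keywords must appear" rule; that rule, the `Subscription` class and the example data are illustrative assumptions, not the filter syntax of any particular service.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Subscription:
    """Hypothetical content-based subscription: a subscriber plus a filter."""
    subscriber: str
    keywords: FrozenSet[str]

    def matches(self, text: str) -> bool:
        # Assumed rule: every keyword must occur as a word in the event text.
        words = set(text.lower().split())
        return self.keywords <= words


def deliver_to(subscriptions: List[Subscription], text: str) -> List[str]:
    """Return the subscribers whose filter matches this event."""
    return [s.subscriber for s in subscriptions if s.matches(text)]


subscriptions = [
    Subscription("alice", frozenset({"xmpp"})),
    Subscription("bob", frozenset({"xmpp", "pubsub"})),
    Subscription("carol", frozenset({"rss"})),
]

first = deliver_to(subscriptions, "New XMPP PubSub release announced")
second = deliver_to(subscriptions, "XMPP server update")
```

With this approach all subscribers can share a single node, and the filtering decides who sees which event.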
PubSub provides flexible access control to support a number of broad approaches:
- Open (anyone can publish or subscribe)
- Managed (the PubSub service provider decides who can do what)
- User controlled (publishers and/or subscribers control access)
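The three broad approaches above can be sketched as a single access check. The `Policy` names mirror the list in this paper rather than the access model names used in the PubSub specification, and the `NodeAccess` class and JIDs are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Policy(Enum):
    OPEN = "open"
    MANAGED = "managed"
    USER_CONTROLLED = "user-controlled"


@dataclass
class NodeAccess:
    """Hypothetical access settings for one node."""
    policy: Policy
    owner: str
    provider_allowed: Set[str] = field(default_factory=set)  # set by the service provider
    owner_allowed: Set[str] = field(default_factory=set)     # set by the publishing user

    def may_subscribe(self, jid: str) -> bool:
        if self.policy is Policy.OPEN:
            return True                          # anyone can subscribe
        if self.policy is Policy.MANAGED:
            return jid in self.provider_allowed  # the provider decides
        # User controlled: the owner decides who is allowed.
        return jid == self.owner or jid in self.owner_allowed


open_node = NodeAccess(Policy.OPEN, owner="feed@example.com")
managed_node = NodeAccess(Policy.MANAGED, owner="feed@example.com",
                          provider_allowed={"staff@example.com"})
personal_node = NodeAccess(Policy.USER_CONTROLLED, owner="juliet@example.com",
                           owner_allowed={"romeo@example.com"})
```

The same shape of check would apply to publishing; only subscription is shown to keep the sketch short.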
Personal Eventing Protocol (PEP): An extended XMPP Service built on PubSub
Personal Eventing Protocol (PEP) is an example of an extended XMPP service built on PubSub. PEP is a mechanism to support extended Presence, so that a user can publish information in addition to the core online/offline status, and control how that information is shared. For example, a user can publish location information (GeoLocation) and music being listened to. This information can then be shared with selected members of the user’s roster. This is all built with PubSub, which gives a number of advantages:
- By using a core XMPP mechanism, duplication of low level XMPP functionality is avoided.
- The user can use PubSub access control, to give flexible control of who this information is shared with.
- Roster members will only subscribe to this information if they choose to, and so resource is saved by only sending the data to those who want it.
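To make the location example concrete, the sketch below builds the kind of publish request a client might send to share its position through PEP. The geolocation namespace and the `lat`/`lon` elements follow the User Location extension (XEP-0080) as we understand it; the JID, stanza id and coordinates are made-up values, and real clients would normally rely on an XMPP library rather than hand-built XML.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
GEOLOC_NS = "http://jabber.org/protocol/geoloc"


def geoloc_publish(user_jid: str, lat: float, lon: float) -> ET.Element:
    """Build an iq stanza publishing the user's location to their own PEP node."""
    # No "to" address is given: the request is assumed to go to the user's own account.
    iq = ET.Element("iq", {"type": "set", "from": user_jid, "id": "geo1"})
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    # The node is named after the payload namespace.
    publish = ET.SubElement(pubsub, "publish", {"node": GEOLOC_NS})
    item = ET.SubElement(publish, "item")
    geoloc = ET.SubElement(item, "geoloc", {"xmlns": GEOLOC_NS})
    ET.SubElement(geoloc, "lat").text = f"{lat:.4f}"
    ET.SubElement(geoloc, "lon").text = f"{lon:.4f}"
    return iq


geoloc_xml = ET.tostring(
    geoloc_publish("juliet@example.com/phone", 51.5074, -0.1278), encoding="unicode")
print(geoloc_xml)
```

Apart from the payload, this is an ordinary PubSub publish, which is why PubSub features such as access control carry over to PEP.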
Multi-User Chat: a service that has similarities with PubSub
Multi-User Chat (MUC) is a core XMPP service that bears some resemblance to PubSub. It is described here to clarify the relationship, and also to give those familiar with MUC a different way to think about PubSub. You could view sending a message to a MUC group as 'publication', joining a MUC group as 'subscription', and MUC groups as 'nodes' in a PubSub server. Although MUC is not built on PubSub (it was specified before PubSub), it is likely that, were XMPP to be built again from scratch, MUC would be built on PubSub.
Other Services that could use XMPP PubSub
XMPP PubSub has potential to provide support for a wide range of applications. This section looks briefly at some things that might be done, to illustrate the potential offered by PubSub.
The essence of social networking is that individuals make information available to be shared with others. Distribution needs to be controlled based both on controls specified by the person publishing the information and on the interests of potential recipients of the information (subscribers). Most users want this information to be available immediately, so an underlying publish/subscribe mechanism is ideal.
Use of an open protocol (XMPP PubSub) would enable sharing across boundaries, rather than constraining social networking to closed systems to which all participants must belong. Social networks such as Facebook and identi.ca are providing XMPP access, and use of PubSub would be a natural extension of this.
Content Syndication and replacing RSS
Content syndication such as Collecta and Google Alerts is an important capability. Use of an open standards infrastructure would benefit sharing.
An interesting observation is that RSS, a polling technology widely used for content syndication, is inefficient. It would be highly desirable to replace this with an open publish/subscribe mechanism. XMPP PubSub could be used for this, and it has been suggested that the next generation of Web browsers should provide integrated XMPP support in order to enable this. This is quite a radical idea.
Military Situational Awareness in Constrained Networking Environments
Situational awareness is a vital military application for sharing data between users. There will often be significant volumes of data to share. There may be particular interest in changes of location of certain entities, or the availability of photographs. In many deployed environments, there is insufficient bandwidth to broadcast all data, so a publish/subscribe system, where a user can subscribe to 'changes in this area', is ideal.
In an era of partner interoperability with joint operations, use of an Open Standard publish/subscribe base would be ideal.
Event Recording/Alerting Systems
Isode server products generate 'events', which can range from severe operational errors to warnings and diagnostic information. These are specified by comprehensive XML catalogs, and available in log files, syslog and Windows events.
These could be provided using XMPP PubSub, with events delivered to general purpose or special XMPP clients. PubSub could then be used to give user control over which events are received. For example "All severe events on all operational servers and directory replication errors and warnings on test directory servers".
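The example selection quoted above can be expressed as a simple predicate over event attributes. Everything in this sketch is hypothetical: the `ServerEvent` fields and their values are invented for illustration and do not describe the actual Isode event catalogs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerEvent:
    """Hypothetical view of a server event (field names are assumptions)."""
    server_role: str   # e.g. "operational" or "test"
    server_type: str   # e.g. "directory" or "xmpp"
    severity: str      # e.g. "severe", "error", "warning", "info"
    category: str      # e.g. "replication", "startup"


def wanted(event: ServerEvent) -> bool:
    """All severe events on operational servers, plus replication errors
    and warnings on test directory servers."""
    severe_on_operational = (
        event.server_role == "operational" and event.severity == "severe"
    )
    replication_on_test_directory = (
        event.server_role == "test"
        and event.server_type == "directory"
        and event.category == "replication"
        and event.severity in {"error", "warning"}
    )
    return severe_on_operational or replication_on_test_directory


events = [
    ServerEvent("operational", "xmpp", "severe", "startup"),
    ServerEvent("operational", "directory", "warning", "replication"),
    ServerEvent("test", "directory", "warning", "replication"),
    ServerEvent("test", "xmpp", "error", "replication"),
]
selected = [event for event in events if wanted(event)]
```

In a PubSub deployment, a rule like this could be expressed either through the choice of nodes subscribed to or through a content filter on the subscription.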
Feedback on the desirability of such a capability, which we are considering implementing, is welcome.
M-Link XMPP PubSub Support
Once a decision has been made to use XMPP PubSub to support an application, a choice will need to be made between various XMPP Servers that support PubSub. M-Link supports PubSub, and has the following benefits:
- M-Link was designed from the start to support PubSub, and so provides excellent performance that does not impact other XMPP functions.
- PubSub is a substantial specification, and M-Link provides a very full implementation, with all major features included.
- M-Link’s PEP implementation is based on PubSub (this is not true of many PEP implementations) and so capabilities of PubSub such as access control are available to PEP users.
- We plan to provide useful management tools as a part of our M-Link Console tool, which will enable:
  - Easy setup of PubSub networks.
  - Visualization of PubSub distribution.
  - Flexible authentication that will allow appropriate manager control.
This paper has given an overview of Publish-Subscribe and shown examples of where this technology can help in providing a range of solutions. It has explained the benefits and capabilities of XMPP PubSub as an open standard for Publish-Subscribe, and outlined the market-leading features of the M-Link XMPP PubSub implementation.