Summary

This whitepaper looks at XMPP (the Internet Standard eXtensible Messaging and Presence Protocol) and its relationship to the Web. It considers situations where Web access to XMPP is appropriate and describes BOSH, the standard way of integrating XMPP with the Web. It then explains why BOSH is important for specialized XMPP applications and how Web applications can be built over XMPP.

Creative Commons License

XMPP: User Applications vs Building Block

XMPP is the Internet Standard for real-time messaging and presence. XMPP's basic model of communication is Client -> Server -> Server -> Client, and in support of this it defines a Client to Server protocol and a Server to Server protocol. Both protocols exchange XML directly over TCP. Isode’s XMPP server product is M-Link.

XMPP is often used for IM (Instant Messaging), both 1:1 chat and MUC (multi-user chat). However, XMPP is not limited to IM: it is a general purpose infrastructure for real-time messaging and presence that can support a variety of applications and systems. Where specialized systems interact with humans, a Web interface is often sensible, as it allows use without deploying a desktop application.

Client Initiated vs Server Initiated

When a user interacts with other users and applications on the network (through a computer), there are two basic types of interaction:

  1. Client Initiated. Here the user makes a decision to do something. The client (the software on the computer that the user is using) will send protocol to a server on the network. Typically this forms a request/response pattern, with the client making a request and the server providing a response, although XMPP allows for other patterns.
  2. Server Initiated. Here the user is passive. Something happens remotely, a server sends something to a client, and the user is alerted or informed in some way. This may form either a request/response pattern or a pure server push where no response is expected.
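
The two patterns can be illustrated with XMPP's own stanza types. The sketch below is illustrative only: stanzas are modelled as plain JavaScript objects rather than parsed XML, and the function name, object fields and addresses are assumptions made for this example. It relies on the XMPP core rule that an iq stanza of type "get" or "set" must be answered, whereas message and presence stanzas can simply be pushed.

```javascript
// Illustrative sketch: stanzas are plain objects, not real XML.
// In XMPP, an <iq/> of type "get" or "set" must be answered with an
// <iq/> of type "result" or "error". <message/> and <presence/> stanzas
// are delivered without a required reply.
function expectsResponse(stanza) {
  return stanza.kind === "iq" &&
    (stanza.type === "get" || stanza.type === "set");
}

// Client initiated: the client asks the server for its contact list.
const rosterRequest = { kind: "iq", type: "get", id: "roster-1" };

// Server initiated: a chat message arrives without the client asking.
// The address is a placeholder.
const incomingChat = { kind: "message", type: "chat", from: "friend@example.com" };

console.log(expectsResponse(rosterRequest)); // true
console.log(expectsResponse(incomingChat));  // false
```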

Web Integration of Client Initiated Protocol

The Web was originally designed to support client initiated interactions, and many client initiated services interact naturally with the Web. Consider Isode's DSI (Directory Services Interface) phone browser, illustrated below. This is a Web front end to Isode’s M-Vault directory server.

 

Figure: DSI Web Application

 

This DSI application is a Java program that runs in an application server (Tomcat) and talks LDAP to the M-Vault directory server. Any Web browser can interact with DSI using standard HTTP. When the user clicks on the DSI screen, the browser sends an HTTP request (usually a GET request) to Tomcat. DSI then makes one or more LDAP queries and returns a Web page in the response for the browser to render.

The key point to understand about this client initiated service is that the entire DSI application resides on the server. Many Web applications work in this way.

Why a Different Approach is needed for XMPP

Server initiated interactions are central to the way XMPP operates. In particular:

  • Messages may arrive, either from one of your buddies or in a MUC room.
  • The online status of one of your buddies may change.

Unless the user has an appropriate browser window open, a standard Web browser has no mechanism to alert the user. This is one reason why many XMPP users (and IM users in general) prefer a desktop client. A desktop application can give useful alerts (e.g., a "pop up" alert) even when its primary window is minimized or obscured.

A more general problem is that the core Web protocol (HTTP) is client (browser) initiated. There is therefore no basic mechanism for a server-side Web application to display information on the user’s screen in response to a server initiated event, so the simple application server architecture used for client initiated protocol will not work for XMPP.

BOSH

BOSH (Bidirectional-streams Over Synchronous HTTP), specified in XEP-0124, is the standardized way to do XMPP over HTTP; the XMPP-specific details are given in the companion specification XEP-0206. The diagram below illustrates how BOSH works with a server such as Isode's M-Link, which supports BOSH directly. It is also possible to support BOSH with a distinct service that converts between BOSH and standard XMPP.

 

Figure: BOSH

 

BOSH is simply a means of carrying the XMPP protocol over HTTP. The XML XMPP packets (stanzas) used in BOSH are the same as those used when standard XMPP is run over TCP. BOSH defines a simple framing that carries these packets in HTTP requests and responses. For client initiated protocol, the client simply sends packets over HTTP (using HTTP POST). Server initiated protocol is handled by a technique called 'long polling'. In long polling, the Web browser issues a standard request but does not expect a response immediately. When the server has data to send, it sends it as the response to that request, and the client immediately issues another request, so that there is always a request pending for the server to respond to.
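
To make the framing concrete, the following minimal sketch builds the payload of a BOSH HTTP POST. XEP-0124 wraps stanzas in a body element in the "http://jabber.org/protocol/httpbind" namespace, carrying a request ID (rid) and session ID (sid). The helper name wrapInBody, the rid and sid values and the address are assumptions made for illustration. Session creation and attribute escaping are omitted, so this is not a complete client.

```javascript
// Namespace of the BOSH <body/> wrapper element (from XEP-0124).
const BOSH_NS = "http://jabber.org/protocol/httpbind";

// Wrap zero or more already-serialized XMPP stanzas in a BOSH <body/>.
// wrapInBody is a hypothetical helper; rid/sid values below are made up.
// No escaping is done, so rid and sid must not contain quote characters.
function wrapInBody(stanzas, rid, sid) {
  const open = `<body rid='${rid}' sid='${sid}' xmlns='${BOSH_NS}'`;
  return stanzas.length === 0
    ? `${open}/>`
    : `${open}>${stanzas.join("")}</body>`;
}

// A normal XMPP chat stanza, identical to what would be sent over TCP.
// It declares the jabber:client namespace because the wrapper's default
// namespace is the BOSH one.
const chat =
  "<message to='friend@example.com' type='chat' xmlns='jabber:client'>" +
  "<body>Hello</body></message>";

// Client initiated: the stanza travels inside the POST payload.
console.log(wrapInBody([chat], 1001, "example-session"));

// Nothing to send: an empty <body/> still gives the server a request
// that it can hold and answer later.
console.log(wrapInBody([], 1002, "example-session"));
```

The same stanza text can be used unchanged over a TCP connection; only the outer wrapper is specific to BOSH.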

BOSH is a mechanism to exchange XMPP protocol between a Web Browser and a BOSH server over HTTP. So, a Web XMPP client will run in the browser and use BOSH to communicate with an XMPP server.
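
The long-polling pattern used for server initiated traffic can be sketched without any network code. The example below is a simulation only: FakeLongPollServer is a hypothetical in-memory stand-in for a BOSH server, and the strings it delivers are placeholders rather than real stanzas. It shows the essential behaviour: the client always has one request outstanding, and the server answers it only when it has something to deliver.

```javascript
// Hypothetical in-memory stand-in for a BOSH server. It holds at most one
// pending request and answers it when there is data to deliver.
class FakeLongPollServer {
  constructor() {
    this.queue = [];     // data waiting for a request to carry it back
    this.pending = null; // resolver for the request currently being held
  }

  // Called by the client. Resolves at once if data is queued,
  // otherwise the request is held until push() is called.
  request() {
    if (this.queue.length > 0) {
      return Promise.resolve(this.queue.shift());
    }
    return new Promise((resolve) => {
      this.pending = resolve;
    });
  }

  // Called when a server initiated event occurs.
  push(data) {
    if (this.pending) {
      const resolve = this.pending;
      this.pending = null;
      resolve(data); // answer the held request
    } else {
      this.queue.push(data); // no request held: wait for the next one
    }
  }
}

// Client loop: issue a request, wait for the response, then immediately
// issue another. It stops after `count` responses so the demo terminates.
async function receive(server, count, onData) {
  for (let i = 0; i < count; i++) {
    const data = await server.request();
    onData(data);
  }
}

async function demo() {
  const server = new FakeLongPollServer();
  const received = [];
  const done = receive(server, 2, (d) => received.push(d));
  server.push("presence update"); // answers the request already held
  server.push("chat message");    // queued until the client re-issues
  await done;
  return received;
}

demo().then((received) => console.log(received));
```

A real client would replace request() with an HTTP POST to the BOSH server and would also handle timeouts and errors, which this sketch leaves out.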

Why JavaScript is Key

In principle, a BOSH client could be written in any language supported by a browser. JavaScript is supported by all Web Browsers, and the major browsers focus heavily on optimizing JavaScript performance. In practice, this means that a Web XMPP client needs to be written in JavaScript. A number of JavaScript libraries are available for those wishing to develop BOSH applications.

What Applications can you build with BOSH?

As noted above, users of general purpose chat applications generally prefer desktop clients. BOSH is typically used for more specialized requirements, such as:

  • Specialized MUC clients, such as speeqe, which provides an easy Web interface to chat rooms.
  • Social networks. Jappix is an open social network based on XMPP using a BOSH interface.
  • Multi-user games. XMPP is an ideal infrastructure for large scale gaming. ChessPark was an interesting example of a game system built on XMPP.

For those wishing to investigate in detail, the book Professional XMPP Programming with JavaScript and jQuery by Jack Moffitt is highly recommended. It works through a number of examples, all of which are available on the Web. The example applications include a chat client, a service browser, a group chat client, a shared whiteboard, a collaborative document editor, and a real-time game. They use the Strophe.js BOSH library, which Isode recommends.

Why M-Link's BOSH

Isode provides BOSH support as a part of its M-Link server. The diagram below shows a typical deployment and interaction between Web Browser, M-Link and an associated general purpose Web server. BOSH support is built into M-Link, which gives high performance, and avoids the complexity of configuring a separate process. BOSH can be provided by M-Link over both HTTP and HTTPS.

M-Link provides general HTTP and HTTPS support, and can act as a basic Web server. This is important because it allows M-Link to serve a few special files itself.

 

Figure: M-Link BOSH support

 

M-Link will generally be deployed in conjunction with a general purpose Web server that provides most of the information needed by the user. In particular, the general purpose Web server provides the JavaScript BOSH applications for download to the client (if the client does not already have a copy), stylesheets, and general linking of the BOSH application to a wider Web site. M-Link itself will generally serve only a small number of files over HTTP or HTTPS, in particular "crossdomain.xml", which provides cross-origin control so that a Web client accessing M-Link BOSH knows to trust associated files (e.g., style sheets) on the associated Web server.
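
As a rough illustration of what such a file may contain, the sketch below generates a policy document. It assumes that "crossdomain.xml" follows the widely used Adobe cross-domain policy format; the contents of the file actually served by M-Link are not shown in this paper. The function name and the domain are placeholders.

```javascript
// Build a cross-domain policy document that allows the listed domains.
// Assumption: the Adobe cross-domain policy format. buildCrossDomainPolicy
// is a hypothetical helper and the domain used below is a placeholder.
function buildCrossDomainPolicy(allowedDomains) {
  const rules = allowedDomains.map(
    (domain) => `  <allow-access-from domain="${domain}"/>`
  );
  return [
    '<?xml version="1.0"?>',
    "<cross-domain-policy>",
    ...rules,
    "</cross-domain-policy>",
  ].join("\n");
}

// Allow pages served by the associated general purpose Web server.
console.log(buildCrossDomainPolicy(["www.example.com"]));
```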