Summary

This paper gives performance benchmarks for Isode's M-Switch X.400, a high-performance X.400 Message Transfer Agent. M-Switch X.400 is deployed by Isode customers in a number of solution areas:

  • As a special purpose gateway, to integrate with other services, for example AFTN in aviation markets and ACP127 in military markets, using one of two X.400 Gateway APIs.
  • As a MIXER gateway to convert between X.400 and Internet Mail.
  • As a gateway to the military ACP127 protocol, as described in the whitepaper [M-Switch ACP127 Gateway to STANAG 4406 and MMHS over SMTP].
  • As a "Backbone MTA", whose role is primarily to connect together other X.400 MTAs, acting as a P1 switch, providing high performance switching and robust message routing.
  • As a "Local MTA" (or "Departmental MTA") used to provide X.400 support to end users by use of User Agents. In this situation, M-Switch X.400 will often be used in conjunction with M-Store X.400 to provide mailbox storage for X.400 P7 User Agents.
  • As a "Border MTA", to provide connection between different X.400 domains using capabilities such as authorization and anti-virus.

The benchmarks reinforce our belief that M-Switch X.400 is substantially faster than any other X.400 MTA.


Usage Models

M-Switch X.400 supports the two core MTA (Message Transfer Agent) protocols:

  • X.400 P3, for message submission and delivery.
  • X.400 P1, for message transfer.

Performance of P3 for message submission and delivery was measured as a part of benchmarking M-Store X.400, described in the whitepaper [M-Store X.400 Benchmarks]. These benchmarks focus on X.400 P1.

A typical X.400 MTA configuration has a relatively small number of X.400 P1 connections configured (typically a few tens or hundreds). A heavily loaded system will typically switch high numbers of messages over each connection. This contrasts with many SMTP configurations, which handle much larger numbers of peer MTAs and lower volumes of messages for each peer MTA. Performance for this type of configuration will be examined in a future SMTP benchmark paper.

Given this top level model, tests are focused on driving a large number of messages down a small number of connections. Two specific scenarios are examined:

  1. Steady state. Messages are transferred in at a rate that keeps the message queue non-empty but small. The overall throughput is measured. This is done for messages of various sizes.
  2. Large queue. This test builds a large message queue, to show how performance is affected by queue size.

Test Configuration

Testing was done with M-Switch X.400 Release R14.1 on the following hardware:

  • Dual 2.2 GHz Opteron.
  • 4 GByte Memory
  • 135 GByte SCSI RAID (0+1 configuration) with write-back cache
  • Red Hat Linux
  • 1Gbit LAN Connection

Transfer in and out is done by special-purpose processes running on separate machines, which provide the load for the test. Specific notes on the M-Switch configuration:

  • Directory based configuration is used, with a local M-Vault server.
  • Audit logging is set at default levels, which is appropriate for most operational configurations.
  • Event logging is set to log errors and critical events (but not informational logging), which is appropriate for a busy operational system.
  • Queue fan-out is set to 100. This is not set by default, but must be set in order to hold more than 32,000 messages in the queue (see the sketch below).
  • Non-default QMGR parameters were used. The values chosen would be appropriate for an operational system switching high message volumes.
  • Archiving (when used) was configured to create a new sub-directory each minute.
The intent of the setup is to use a configuration that is realistic, and not one specially designed to achieve good benchmark numbers.
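To illustrate what queue fan-out means in practice, the following is a minimal sketch of the general idea: queue entries are hashed across a fixed number of subdirectories so that no single directory accumulates too many files. The hashing scheme and directory names here are invented for illustration only and are not M-Switch's actual queue layout.

    import hashlib

    FANOUT = 100  # the queue fan-out value used in this configuration

    def queue_subdir(message_id: str) -> str:
        """Map a message ID to one of FANOUT queue subdirectories."""
        digest = hashlib.md5(message_id.encode()).hexdigest()
        return f"queue/{int(digest, 16) % FANOUT:02d}"

    # With a fan-out of 100, a queue of 1,000,000 messages averages only
    # about 10,000 files per subdirectory.
    print(queue_subdir("msg-000001"))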

Steady State Tests

Messages of the following sizes are sent to a single recipient: 1 kByte; 10 kByte; 100 kByte; 1 MByte; 10 MByte. Tests were done to one and two sink MTAs; the numbers achieved were almost identical, and so are not recorded separately here. Messages were submitted by one source for five minutes, then by two sources, increasing at five-minute intervals up to ten sources.

A challenge with steady state testing is to keep messages in the queue (so that there are always messages to send) while not allowing the queue to grow too large. This is achieved by having the 'transfer in' tool monitor queue size: it stops sending when the queue reaches a high water mark and restarts when the queue dips below a low water mark. These values were set to 5,000 and 500 messages respectively, which ensured that the queue never emptied.
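The following is a minimal sketch of this throttling logic, assuming hypothetical queue_depth() and send_message() functions that stand in for whatever mechanism the 'transfer in' tool actually uses; the water mark values are those quoted above.

    import time

    HIGH_WATER = 5000  # stop submitting when the queue reaches this size
    LOW_WATER = 500    # restart submitting when the queue dips below this size

    def run_source(queue_depth, send_message, poll_interval=1.0):
        """Submit messages continuously, pausing while the queue is too deep."""
        sending = True
        while True:
            depth = queue_depth()
            if sending and depth >= HIGH_WATER:
                sending = False              # high water mark reached: pause
            elif not sending and depth < LOW_WATER:
                sending = True               # queue has drained: resume
            if sending:
                send_message()
            else:
                time.sleep(poll_interval)    # wait for the MTA to drain the queue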

The results from each sequence of tests were processed into a graph to show throughput. An example for transfer of 10 MByte messages is shown below. Note that the scale on the left hand side of the graph (messages every 10 secs) is different from the values in the table below (messages per second).

Results for different message sizes, giving the maximum steady state throughput:

Message Size   Sources to reach max capacity   Throughput (messages/sec)   Bandwidth used (MBits/sec)
1 kByte        8                               560                         4.5
10 kByte       5                               460                         37
100 kByte      7                               300                         240
1 MByte        3                               35                          280
10 MByte       2                               3.5                         280

The throughput shown here is for a message in and a message out (i.e., performance is measured for full message transfer, not as "half messages", as is sometimes done). Messages are transferred both in and out, so the bandwidth shown is used in each direction.

It can be seen that for small messages, performance is primarily limited by the message rate (560 per second for very small messages), and this drops only slowly as message size increases.

For large messages, throughput is limited by the volume of data transferred. We believe that this figure is primarily hardware limited. For a Linux operating system with a standard networking configuration, the data throughput here is close to the limit of what can be achieved over TCP on a 1Gbit network interface. These figures are also close to the performance limits of the disk subsystem.
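As a rough cross-check of the figures in the table above, the bandwidth column is approximately the message rate multiplied by the message size in bits. The short calculation below reproduces the table values to within rounding; it ignores protocol overhead.

    # (message size in kByte, steady state throughput in messages/sec)
    results = [(1, 560), (10, 460), (100, 300), (1000, 35), (10000, 3.5)]

    for size_kbyte, msgs_per_sec in results:
        mbits_per_sec = msgs_per_sec * size_kbyte * 8 / 1000
        print(f"{size_kbyte:>6} kByte: ~{mbits_per_sec:.1f} MBit/s in each direction")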

The throughput figures above were obtained with archiving disabled. Because message archiving saves a copy of each message, it has a performance impact. This is shown in the numbers below:

Message Size   Throughput without archiving (messages/sec)   Throughput with archiving (messages/sec)
1 kByte        560                                            340
10 kByte       460                                            230
100 kByte      300                                            230
1 MByte        35                                             33
10 MByte       3.5                                            3.3

It can be seen that for large messages, archiving has minimal impact on throughput. This ties in with the earlier analysis, suggesting that throughput for large messages is limited by the available network bandwidth. For small messages, however, the effect is dramatic. The reason is that where messages are relayed without archiving, a message will generally be held in the write-back disk cache and deleted before it is ever written to the physical disk. Where a message is archived, it is written to disk: a file is created for each message, which is a relatively expensive disk operation. This is why performance is reduced when archiving is enabled.
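A rough calculation from the two tables gives a sense of this cost. Attributing the whole throughput difference to archiving (an approximation) suggests an extra one to two milliseconds per message, which is plausible for creating and writing one additional file; the large message sizes are omitted because they are network limited.

    # (message size, msgs/sec without archiving, msgs/sec with archiving)
    rates = [("1 kByte", 560, 340), ("10 kByte", 460, 230), ("100 kByte", 300, 230)]

    for size, without, with_archive in rates:
        extra_ms = (1.0 / with_archive - 1.0 / without) * 1000
        print(f"{size}: ~{extra_ms:.1f} ms extra per message with archiving")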

Large Queue Tests

The test for a large queue is quite simple:

  1. Disable outbound traffic.
  2. Submit 1,000,000 messages (10 kByte) using multiple connections.
  3. Pause.
  4. Enable outbound traffic.

The results of this test are shown in the graph below:

The first segment of this graph shows the message transfer-in rate. The initial submission rate rises to about 550 messages per second, is sustained for a while, and then drops to about 300 messages per second. This drop in submission rate is due to the write-back cache on the disk becoming full; after this, message submission is limited by the speed of writing messages to disk. A key point is that the submission rate is sustained after this drop.

The second segment of the graph shows the message transfer-out rate. This starts at around 450 messages per second. It drops off slightly during the run, but maintains good performance for all of the messages.
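As a rough indication of what this means in practice, at approximately 450 messages per second a queue of 1,000,000 messages drains in well under an hour:

    queue_size = 1_000_000
    out_rate = 450  # approximate transfer-out rate, messages per second

    print(f"~{queue_size / out_rate / 60:.0f} minutes to drain the queue")  # ~37 minutes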

Conclusions

The key conclusions from these tests are:

  1. For transfer of small messages, throughput is constrained by the number of messages (around 560 per second for this hardware).
  2. For transfer of large messages, throughput is constrained by network performance.
  3. M-Switch can robustly handle large queues, of at least 1,000,000 messages.
  4. A write-back cache on the disk is important to achieve best performance.
  5. The number of inbound and outbound connections does not significantly affect performance, for a moderate number of connections. We believe that this will extend up to at least a few hundred connections.

In summary, we believe that M-Switch X.400 is substantially faster than any other X.400 product.