Introduction
This paper gives performance benchmarks for Isode's M-Switch
X.400, a high-performance X.400 Message Transfer Agent. M-Switch
X.400 is deployed by Isode customers in a number of solution areas:
- As a special purpose gateway, to integrate with other services,
for example AFTN in aviation markets and ACP 127 in military markets,
using one of two X.400 Gateway APIs.
- As a MIXER gateway to convert between X.400 and Internet Mail.
- As a "Backbone MTA", whose role is primarily to connect
together other X.400 MTAs, acting as a P1 switch, providing high performance
switching and robust message routing.
- As a "Local MTA" (or "Departmental MTA") used
to provide X.400 support to end users by use of User Agents. In this
situation, M-Switch X.400 will often be used in conjunction with M-Store
X.400 to provide mailbox storage for X.400 P7 User Agents.
- As a "Border MTA", to provide connection between different
X.400 domains using capabilities such as authorization and anti-virus.
The benchmarks reinforce our belief that M-Switch X.400 is substantially
faster than any other X.400 MTA.
Usage Models
M-Switch X.400 supports the two core MTA (Message Transfer Agent) protocols:
- X.400 P3, for message submission and delivery.
- X.400 P1, for message transfer.
Performance of P3 for message submission and delivery was measured
as part of benchmarking M-Store X.400, described in the whitepaper
M-Store X.400 Benchmarks. The benchmarks in this paper focus on X.400 P1.
A typical X.400 MTA configuration has a relatively small number of
X.400 P1 connections configured (typically a few tens or hundreds).
A heavily loaded system will typically switch high numbers of messages
over each connection. This contrasts with many SMTP configurations,
which handle much larger numbers of peer MTAs and lower volumes of messages
for each peer MTA. Performance for this type of configuration will be
examined in a future SMTP benchmark paper.
Given this top level model, tests are focused on driving a large number
of messages down a small number of connections. Two specific scenarios
are examined:
- Steady state. Messages are transferred in at a rate that keeps the
message queue non-empty but small. The overall throughput is measured.
This is done for messages of various sizes.
- Large queue. This test builds a large message queue, to show how
performance is affected by queue size.
Test Configuration
Testing was done with M-Switch X.400 Release R14.1 on the following
hardware:
- Dual 2.2 GHz Opteron.
- 4 GByte Memory
- 135 GByte SCSI RAID (0+1 configuration) with write-back cache
- Red Hat Linux
- 1Gbit LAN Connection
Transfer in and out is done by use of special processes on separate
machines, which provide loading for the test. Specific notes on the
M-Switch Configuration:
- Directory based configuration is used, with a local M-Vault server.
- Audit logging is set at default levels, which is appropriate for
most operational configurations.
- Event logging is set to log errors and critical events (but not
informational logging), which is appropriate for a busy operational
system.
- Queue fan-out is set to 100. This is not set by default, but must
be set in order to hold more than 32,000 messages in the queue.
- Non-default QMGR parameters were used. The values chosen would
be appropriate for an operational system switching high message volumes.
- Archiving (when used) was configured to create a new sub-directory
each minute.
The intent of the setup is to use a configuration that is realistic,
and not one specially designed to achieve good benchmark numbers.
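The queue fan-out setting above can be sanity-checked with simple arithmetic. The sketch below assumes that the 32,000 figure is a limit on entries in a single queue directory and that fan-out spreads messages evenly across sub-directories; the paper states neither explicitly, and the function names are illustrative only.

```python
import math


def entries_per_bucket(total_messages: int, fan_out: int) -> int:
    """Worst-case entries per sub-directory, assuming an even spread."""
    return math.ceil(total_messages / fan_out)


def min_fan_out(total_messages: int, per_dir_limit: int) -> int:
    """Smallest fan-out that keeps every sub-directory within the limit."""
    return math.ceil(total_messages / per_dir_limit)


# Figures taken from this paper: fan-out of 100, 1,000,000 queued messages,
# and the 32,000 message threshold (assumed here to be a per-directory limit).
per_bucket = entries_per_bucket(1_000_000, 100)
needed = min_fan_out(1_000_000, 32_000)
print(per_bucket)  # 10000
print(needed)      # 32
```

Under these assumptions, a fan-out of 100 leaves each sub-directory well below the threshold for the 1,000,000 message test described later.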
Steady State Tests
Messages of the following sizes are sent to single recipients: 1 kByte;
10 kByte; 100 kByte; 1 MByte; 10 MByte. Tests were done with one and two
sink MTAs. The numbers achieved were almost identical, and so are not
recorded separately here. Messages were submitted by one source for five
minutes, then by two sources, with the number of sources increasing at
five minute intervals up to ten.
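The load ramp can be written down as a simple schedule. This is a sketch only: `ramp_schedule` is a hypothetical helper, and it assumes that the final step (ten sources) also runs for five minutes.

```python
def ramp_schedule(max_sources: int = 10, step_minutes: int = 5):
    """Return (start_minute, active_sources) pairs for the load ramp."""
    return [(step * step_minutes, step + 1) for step in range(max_sources)]


schedule = ramp_schedule()
# Total run length, assuming the last step lasts as long as the others.
total_minutes = len(schedule) * 5

for start, sources in schedule:
    print(f"minute {start:2d}: {sources} source(s)")
```

With these assumptions a full run for one message size takes 50 minutes.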
The difficulty with steady state testing is keeping messages in the queue
(so that there are always messages to be sent) without allowing the
queue to grow too large. This is achieved by having the 'transfer in'
tool monitor queue size: it stops sending when the queue reaches a high
water mark and restarts when the queue dips below a low water mark.
These marks were set to 5,000 messages and 500 messages respectively,
which ensured that the queue did not empty out.
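The high and low water mark control described above is a form of hysteresis. The following is a minimal sketch of that logic, not the actual 'transfer in' tool: the class name and the `update` method are hypothetical, and only the two thresholds come from the test description.

```python
class QueueThrottle:
    """Stop sending at the high water mark, resume below the low water mark."""

    def __init__(self, high_water: int = 5000, low_water: int = 500):
        if low_water >= high_water:
            raise ValueError("low_water must be smaller than high_water")
        self.high_water = high_water
        self.low_water = low_water
        self.sending = True  # start by sending, since the queue begins empty

    def update(self, queue_size: int) -> bool:
        """Record the latest queue size and return True if sending may continue."""
        if self.sending and queue_size >= self.high_water:
            self.sending = False  # queue has reached the high water mark
        elif not self.sending and queue_size < self.low_water:
            self.sending = True   # queue has dipped below the low water mark
        return self.sending


throttle = QueueThrottle()
states = [throttle.update(n) for n in (100, 4999, 5000, 3000, 500, 499, 2000)]
print(states)  # [True, True, False, False, False, True, True]
```

Using two thresholds rather than one avoids rapid on/off switching around a single queue size, and the gap between them keeps the queue non-empty while it drains.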
The results from each sequence of tests were processed into a graph
to show throughput. An example for 10 MByte messages is shown below.

This is for transfer of 10 MByte messages. Note that the scale on the
left hand side (messages every 10 secs) is different from the unit used
in the table below (messages per second).
Results for different sizes of messages, giving steady state max throughput:
| Message Size | Sources to reach max capacity | Throughput (messages/sec) | Bandwidth used (MBits/sec) |
| 1 kByte | 8 | 560 | 4.5 |
| 10 kByte | 5 | 460 | 37 |
| 100 kByte | 7 | 300 | 240 |
| 1 MByte | 3 | 35 | 280 |
| 10 MByte | 2 | 3.5 | 280 |
The throughput shown here counts a message in and a message out as one
message (i.e., performance is measured for full message transfer, and not
in half messages, a measure that is sometimes used). Messages are
transferred in and out, and so the bandwidth shown is being used in both
directions.
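The bandwidth column can be reproduced from the message size and throughput figures. The sketch below assumes decimal units (1 kByte = 1,000 bytes) and counts message payload only, ignoring protocol overhead; the function name is illustrative.

```python
# (label, message size in bytes, throughput in messages/sec) from the table above
RESULTS = [
    ("1 kByte", 1_000, 560),
    ("10 kByte", 10_000, 460),
    ("100 kByte", 100_000, 300),
    ("1 MByte", 1_000_000, 35),
    ("10 MByte", 10_000_000, 3.5),
]


def bandwidth_mbits(size_bytes: int, msgs_per_sec: float) -> float:
    """Payload bandwidth in one direction, in MBits/sec."""
    return size_bytes * 8 * msgs_per_sec / 1_000_000


for label, size, rate in RESULTS:
    print(f"{label:>9}: {bandwidth_mbits(size, rate):6.1f} MBits/sec")
```

The computed values (about 4.5, 37, 240, 280 and 280 MBits/sec) match the table, and the plateau at 280 MBits/sec for the two largest sizes is what indicates a limit on data volume rather than on message count.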
It can be seen that for small messages, performance is primarily limited
by the number of messages (560 per second for very small messages) and
this drops only slowly as message size increases.
For large messages, throughput is limited by the volume of data transferred.
We believe that this figure is primarily hardware limited. For a Linux
operating system with standard networking configuration, the data throughput
here is close to the limit of what can be achieved over TCP for a 1Gbit
network interface. These figures are also close to the performance limits
of the disk subsystem.
The numbers above were obtained without archiving. Because message
archiving saves a copy of each message, it has a performance impact.
This is shown in the numbers below:
| Message Size | Throughput without archiving (messages/sec) | Throughput with archiving (messages/sec) |
| 1 kByte | 560 | 340 |
| 10 kByte | 460 | 230 |
| 100 kByte | 300 | 230 |
| 1 MByte | 35 | 33 |
| 10 MByte | 3.5 | 3.3 |
It can be seen that for large messages, archiving has minimal
impact on throughput. This ties in with the earlier analysis, suggesting
that throughput for large messages is limited by the available network
bandwidth. There is a dramatic effect for small messages. The reason
is that when messages are relayed without archiving, the message will
generally be held in the write-back disk cache and deleted before it is
actually written to the physical disk. When a message is archived, it
is written to the disk. A file is created for each message, which is a
relatively expensive disk operation. This is why performance is reduced
when archiving is enabled.
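The size of the archiving effect can be expressed as a percentage reduction in throughput for each message size. The sketch below uses only the figures from the table above; the function name is illustrative.

```python
# (label, messages/sec without archiving, messages/sec with archiving)
ARCHIVE_RESULTS = [
    ("1 kByte", 560, 340),
    ("10 kByte", 460, 230),
    ("100 kByte", 300, 230),
    ("1 MByte", 35, 33),
    ("10 MByte", 3.5, 3.3),
]


def reduction_percent(without_archive: float, with_archive: float) -> float:
    """Throughput lost when archiving is enabled, as a percentage."""
    return (without_archive - with_archive) / without_archive * 100


for label, base, archived in ARCHIVE_RESULTS:
    print(f"{label:>9}: {reduction_percent(base, archived):4.1f}% slower")
```

This gives a reduction of roughly 39%, 50% and 23% for the three smaller sizes, against about 6% for the 1 MByte and 10 MByte messages.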
Large Queue Tests
The test for a large queue is quite simple:
- Disable outbound traffic.
- Submit 1,000,000 messages (10 kByte) using multiple connections.
- Pause.
- Enable outbound traffic.
The results of this test are shown in the graph below:

The first segment of this graph shows message transfer in rate. Initial
submission rises to about 550 messages per second, which it sustains
for a while and then drops to about 300 messages per second. This drop
in submission rate is due to the write-back cache on the disk becoming
full. After this, message submission is limited by the speed of writing
messages to disk. A key point is that message submission rate is sustained
after this drop.
The second graph segment shows message transfer out rate. This starts
at around 450 messages per second. It drops off slightly during the
run, but maintains good performance for all of the messages.
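The rates quoted above give a rough sense of the scale of this test. The sketch below is back-of-envelope arithmetic only: it assumes decimal units (10 kByte = 10,000 bytes), counts message content without per-message overhead, and treats each quoted rate as constant, which the graph shows is only approximately true.

```python
MESSAGES = 1_000_000
MESSAGE_BYTES = 10_000  # 10 kByte, assuming decimal units


def minutes_to_move(messages: int, msgs_per_sec: float) -> float:
    """Time to transfer the given number of messages at a constant rate."""
    return messages / msgs_per_sec / 60


queue_gbytes = MESSAGES * MESSAGE_BYTES / 1e9   # message content held in the queue
fill_fast = minutes_to_move(MESSAGES, 550)      # if the initial rate were sustained
fill_slow = minutes_to_move(MESSAGES, 300)      # if the whole fill ran at the lower rate
drain = minutes_to_move(MESSAGES, 450)          # at the initial transfer out rate

print(f"queue content: {queue_gbytes:.0f} GByte")
print(f"fill time: between {fill_fast:.1f} and {fill_slow:.1f} minutes")
print(f"drain time: at least {drain:.1f} minutes")
```

On these assumptions the queue holds about 10 GByte of message content, filling it takes between about 30 and 56 minutes, and draining it takes at least about 37 minutes.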
Conclusions
The key conclusions from these tests are:
- For transfer of small messages, throughput is constrained by the
number of messages (around 560 per second for this hardware).
- For transfer of large messages, throughput is constrained by network
performance.
- M-Switch can robustly handle large queues of at least 1,000,000
messages.
- A write-back cache on the disk is important to achieve best performance.
- The number of inbound and outbound connections does not significantly
affect performance, for a moderate number of connections. We believe
that this will extend up to at least a few hundred connections.
In summary, we believe that M-Switch X.400 is substantially faster
than any other X.400 product.