An M-Box server is based around a message database, and has three multi-threaded servers that access this database: imapd (supports access using IMAP), pop3d (supports access using POP) and lmtpd (supports access using LMTP).
The most important part of the M-Box architecture is the database design. M-Box uses a database abstraction layer, so that it is possible to use different database back-ends. This has been prototyped with several back-ends. Two database back-ends are currently provided, one for standard usage and one for archive access.
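As an illustration of what a database abstraction layer means in practice, the following minimal sketch defines an interface and one toy back-end. The names `MessageStore`, `InMemoryStore`, `append`, `fetch` and `deliver` are hypothetical and chosen for this example only; they are not the M-Box API.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """Hypothetical back-end interface; not the actual M-Box API."""

    @abstractmethod
    def append(self, mailbox: str, message: bytes) -> int:
        """Store a message and return its identifier within the mailbox."""

    @abstractmethod
    def fetch(self, mailbox: str, uid: int) -> bytes:
        """Return the message stored under the given identifier."""


class InMemoryStore(MessageStore):
    """Toy back-end, used only to show that callers depend on the interface."""

    def __init__(self) -> None:
        self._mailboxes: dict[str, list[bytes]] = {}

    def append(self, mailbox: str, message: bytes) -> int:
        messages = self._mailboxes.setdefault(mailbox, [])
        messages.append(message)
        return len(messages)  # identifiers start at 1

    def fetch(self, mailbox: str, uid: int) -> bytes:
        return self._mailboxes[mailbox][uid - 1]


def deliver(store: MessageStore, mailbox: str, message: bytes) -> int:
    """Protocol-server code talks to the interface, never to a concrete back-end."""
    return store.append(mailbox, message)
```

A second back-end (for example one intended for archive access) would implement the same two methods, and `deliver` would work with it unchanged.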
The underlying database is mapped onto the underlying file store using a format which we call “mdir”. This uses the basic approach of the popular maildir format, which enables mailbox sharing without use of file system locks. Mdir differs from maildir primarily to enable performance improvements, particularly in support of IMAP. Key features of the mdir format:
- No locks. This is critical for high performance, and robust horizontal scaling.
- Each message in a single file. This simple approach gives inherent high robustness, as any data corruption will be constrained to a single message. It also allows standard file system backup mechanisms to be used for online backup.
- Use of file store hierarchy. Folders and mailboxes are structured using the file store hierarchy. This approach is inherently robust and allows use of standard OS tools on the database. A key benefit is that each account is represented by a single directory hierarchy, which simplifies provisioning support and in particular account deletion.
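The features above can be sketched using the well-known maildir delivery technique: write each message to its own uniquely named temporary file, then rename it into place. This is a sketch under stated assumptions, not the mdir implementation. The `tmp` and `new` directory names are maildir's, and the `<root>/<account>/<mailbox>` layout and the function names are hypothetical.

```python
import os
import shutil
import time
import uuid
from pathlib import Path


def deliver_message(mailbox: Path, message: bytes) -> Path:
    """Write one message to its own file without taking any lock."""
    tmp_dir = mailbox / "tmp"
    new_dir = mailbox / "new"
    tmp_dir.mkdir(parents=True, exist_ok=True)
    new_dir.mkdir(parents=True, exist_ok=True)

    # A unique name means concurrent writers never target the same file.
    name = f"{int(time.time())}.{uuid.uuid4().hex}"
    tmp_path = tmp_dir / name
    with open(tmp_path, "xb") as handle:  # "x" fails if the name already exists
        handle.write(message)
        handle.flush()
        os.fsync(handle.fileno())

    # A rename within one file system is atomic on POSIX systems, so a
    # reader sees either the complete message or no message at all.
    final_path = new_dir / name
    os.rename(tmp_path, final_path)
    return final_path


def delete_account(root: Path, account: str) -> None:
    """Remove an account by removing its single directory hierarchy."""
    shutil.rmtree(root / account)
```

Because a damaged file affects only the message it holds, and an account is one directory tree, both backup and account deletion reduce to ordinary file system operations.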
A quick analysis might suggest that this simple approach, while very robust, would not give good performance. In fact, exactly the opposite is true. Modern file systems are highly optimized for exactly the sort of operations that are performed by M-Box. In particular:
- File system directories are indexed using b-trees, and have optimized layout on disk. Use of large directories with key indexing information in the filename is therefore very efficient.
- Files are intelligently laid out on the disk, which means that use of a relatively small block size (good for high disk utilization, as messages are generally small) does not impact performance for large files.
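To illustrate why key indexing information in the filename is efficient, the sketch below builds a mailbox listing from a directory scan alone, without opening any message file. The filename layout `<uid>.<size>.<flags>` and all names in the code are assumptions made for this example; the actual mdir filename format is not described here.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Entry:
    uid: int
    size: int
    flags: frozenset


def encode_name(uid: int, size: int, flags: set) -> str:
    # Hypothetical layout "<uid>.<size>.<flags>", one character per flag.
    # Zero-padding the uid keeps names in uid order when sorted as text.
    return f"{uid:010d}.{size}.{''.join(sorted(flags))}"


def decode_name(name: str) -> Entry:
    uid, size, flags = name.split(".", 2)
    return Entry(int(uid), int(size), frozenset(flags))


def list_mailbox(directory: Path) -> list:
    """Build a mailbox listing from file names alone; no file is opened."""
    with os.scandir(directory) as scan:
        entries = [decode_name(item.name) for item in scan if item.is_file()]
    return sorted(entries, key=lambda entry: entry.uid)
```

With this kind of naming, answering a request such as "list message sizes and flags" costs one directory read, which is the operation the file system has already optimized.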
In essence, M-Box is working with the tuning of the underlying file system, which is one of the reasons for its excellent performance. Our tests have shown that M-Box is faster than servers that use complex approaches to optimize performance, and that the approach taken gives the benefits of simplicity, robustness and high performance.
A key benefit that mdir provides is support for horizontal scaling: multiple M-Box servers can access a single file store, so additional M-Box servers can be added to support more users. Load balancing between the M-Box servers will usually be achieved by use of a network load balancer. The file store will be chosen with appropriate size, performance and expansion options for the deployment. It may be a file system on one of the M-Box servers, but in most cases it is likely to be a separate network file server appliance. The file access protocol will depend on performance and scaling targets for the total system. Options include:
- NFS (Network File System). The traditional protocol for this sort of functionality.
- iSCSI. A mapping of the widely used SCSI (Small Computer System Interface) protocol onto IP. iSCSI can be used with a clustering file system to provide a low-cost, higher-performance alternative to NFS.
- SAN (Storage Area Network). A more expensive, higher performance option.
Although this approach gives an elegant way to scale an M-Box deployment by adding more servers, the major advantage of this architecture is in simplified management. All user mailboxes are held on a single file store, with each account having a separate directory, which makes it easy to add and remove accounts. Increasing storage is simply a question of increasing the size of the file server.
Many mailbox servers, particularly those with complex databases, lock mailboxes to specific servers. This leads to two problems:
- In order to make mailbox location transparent to the end user, a tier of application servers will generally be introduced, so that the user gets redirected to the right server.
- When servers are added, or usage grows, there will be the requirement to migrate users between servers. This adds product and operational complexity.
M-Box's architecture avoids these problems.
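The contrast between the two models can be sketched as follows. Both classes and the server names are hypothetical and purely illustrative. Round-robin selection stands in for whatever policy a network load balancer would apply.

```python
import itertools


class SharedStoreBalancer:
    """Shared file store: any server can serve any account.

    Routing therefore needs no per-user state at all.
    """

    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)

    def route(self, account: str) -> str:
        return next(self._cycle)  # the account name plays no part

    def add_server(self, server: str) -> None:
        # Adding capacity changes only the server list; no mailbox moves.
        self._servers.append(server)
        self._cycle = itertools.cycle(self._servers)


class PinnedMailboxRouter:
    """Mailboxes locked to servers: a location table must be kept."""

    def __init__(self, location):
        self._location = dict(location)

    def route(self, account: str) -> str:
        return self._location[account]

    def migrate(self, account: str, target: str) -> None:
        # In a real deployment this step also means copying the mailbox
        # data to the target server; only the bookkeeping is shown here.
        self._location[account] = target
```

In the pinned model, both the location table and the `migrate` step are extra components to build and operate. In the shared-store model neither exists.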