May 2008


A key feature of any anti-spam solution is how effective it is at removing spam. A perfect anti-spam system would have a zero false positive rate and a zero false negative rate. In practice, this is not usually achieved, and systems will invariably trade off the two measurements. This paper describes how false negatives can be measured and looks at false negative rates for Isode's M-Switch Anti-Spam.

Creative Commons License

False Negatives and False Positives

Anti-spam technologists use two important terms for looking at how well an anti-spam system works:

  • "False Negative" is where a spam message is not detected and gets through to the target recipient. False Negative Rate is the percentage of false negatives measured against the total volume of spam (including the false negatives).
  • "False Positive" is where a real message (not spam) is identified by the system as being spam. False Positive Rate is the percentage of false positives measured against all real messages (including the false positives).
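The two definitions above can be written out as a small calculation. The sketch below is illustrative only: the function names and the counts in the usage lines are invented for the example and are not taken from M-Switch or from the measurements in this paper.

```python
def false_negative_rate(missed_spam: int, caught_spam: int) -> float:
    """Percentage of all spam (caught plus missed) that was not detected."""
    total_spam = missed_spam + caught_spam
    if total_spam == 0:
        raise ValueError("no spam messages in the sample")
    return 100.0 * missed_spam / total_spam


def false_positive_rate(flagged_real: int, delivered_real: int) -> float:
    """Percentage of all real messages (flagged plus delivered) marked as spam."""
    total_real = flagged_real + delivered_real
    if total_real == 0:
        raise ValueError("no real messages in the sample")
    return 100.0 * flagged_real / total_real


# Hypothetical counts: 20 spam messages missed out of 1,000 spam in total,
# and 1 real message wrongly flagged out of 500 real messages.
print(false_negative_rate(missed_spam=20, caught_spam=980))     # 2.0
print(false_positive_rate(flagged_real=1, delivered_real=499))  # 0.2
```

Note that each rate has its own denominator: false negatives are measured against spam only, and false positives against real messages only, so the two figures cannot be added or compared directly.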

A perfect anti-spam system would have a zero false positive rate and a zero false negative rate. In practice, this is not usually achieved, and systems will invariably trade off the two measurements (i.e., you can tune the system to reduce the false negative rate, which will lead to an increase in the false positive rate).
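The trade-off can be illustrated with a toy example. The sketch below assumes a hypothetical filter that gives each message a numeric score and treats anything at or above a threshold as spam; this paper does not describe how M-Switch Anti-Spam makes its decision, and the scores are invented purely to show the effect of moving the threshold.

```python
# Hypothetical (score, is_spam) pairs; a higher score means "more spam-like".
SAMPLE = [
    (0.5, False), (1.0, False), (2.5, False), (4.5, False),
    (3.0, True), (5.0, True), (6.5, True), (9.0, True),
]


def rates(threshold: float) -> tuple:
    """Return (false negative %, false positive %) at the given threshold."""
    spam = [score for score, is_spam in SAMPLE if is_spam]
    real = [score for score, is_spam in SAMPLE if not is_spam]
    missed = sum(1 for score in spam if score < threshold)    # spam let through
    flagged = sum(1 for score in real if score >= threshold)  # real mail marked as spam
    return 100.0 * missed / len(spam), 100.0 * flagged / len(real)


for threshold in (2.0, 4.0, 6.0):
    fn, fp = rates(threshold)
    print(f"threshold {threshold}: FN {fn:.0f}%  FP {fp:.0f}%")
```

With these made-up scores, lowering the threshold drives the false negative rate to zero at the cost of a higher false positive rate, and raising it does the reverse.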

Why Measuring False Positives in a Meaningful Way is Hard

This paper is about measuring false negatives. It is useful to explain why we are not attempting to apply similar systematic measurements to false positives.

The most important issue is that each user has his or her own pattern of (real) messages. An anti-spam system will look at each incoming message and seek to determine whether or not it is spam. A consequence of this is that the false positive rate will vary between users. Some users will only receive messages of a type that the anti-spam system will always handle correctly, and will see a zero false positive rate. Other users will receive varying proportions of "spammy" messages and as a consequence will see varying false positive rates. A biochemist working on Viagra is likely to see quite a high false positive rate. There is no easy "right" value for the false positive rate of a product.

A second problem is that false positive measurements are hard to automate. They will generally rely on users inspecting message quarantines that contain a lot of real spam. This introduces experimental error, as users may not notice false positives. There is also a subjective issue, as different users will judge differently what counts as a false positive.

For these reasons, this paper is not measuring false positives.

Measuring False Negatives

Spam, by contrast, is broadly similar from one recipient to another. Individuals, particularly those who receive low volumes of spam, will often see quite a bit of variation depending on exactly which lists they have the misfortune to be on. However, spammers send out vast numbers of messages, and by using a number of "honey pot" accounts it is straightforward to get a useful and realistic sample of the spam that is flowing around the world at any point.

The false negative rate of an anti-spam system can be measured simply by applying the system to this stream as it arrives, and noting how many messages are caught and how many get through.
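As a rough sketch of this bookkeeping, the code below groups honey pot messages by day and reports the percentage that got through. The record layout, the function name and the sample data are assumptions made for the example; they do not reflect the format of the M-Switch logs.

```python
from collections import defaultdict
from datetime import date


def daily_false_negative_rates(records):
    """Map each day to its false negative rate (%).

    `records` is an iterable of (day, was_caught) pairs, one per message
    delivered to a honey pot account. Every such message is assumed to be
    spam, so any message that was not caught counts as a false negative.
    """
    totals = defaultdict(int)
    missed = defaultdict(int)
    for day, was_caught in records:
        totals[day] += 1
        if not was_caught:
            missed[day] += 1
    return {day: 100.0 * missed[day] / totals[day] for day in sorted(totals)}


# Hypothetical honey pot traffic for two days (not real measurements).
records = (
    [(date(2000, 1, 1), True)] * 49 + [(date(2000, 1, 1), False)] * 1
    + [(date(2000, 1, 2), True)] * 45 + [(date(2000, 1, 2), False)] * 5
)
for day, rate in daily_false_negative_rates(records).items():
    print(day, f"{rate:.1f}%")
```

Because every message sent to a honey pot account is taken to be spam, no manual classification is needed, which is what makes this measurement easy to automate.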


The following measurements were made using the Isode M-Switch Anti-Spam product. Measurements are derived from daily logs. Specific notes:

  • Default (recommended) settings of the product are used.
  • No white lists or black lists are set. While users will often use these to fine tune anti-spam performance, they are not used here as they will simply obscure the underlying performance.
  • These measurements are taken as messages arrive (i.e., they show performance under real conditions, and do not use "hindsight" knowledge).

The first graph shows daily volumes of spam over the period of measurement.

Total Spam Messages

This next graph shows the false negative rate.

False Negative Rate

It can be seen that the false negative rate is broadly flat and low (around 2% for the recent period). At times there are spikes. These correspond to spammers introducing new techniques, followed by Isode's response to those techniques. They clearly reflect the war that is going on between spammers and the systems working to reduce spam.


This paper has described how the false negative rate can be measured, and has presented results for Isode's M-Switch Anti-Spam product.