
M-Switch uses a number of different, complementary techniques to determine
the "Spam Score" of a message. You can use any or all of these techniques.
Isode have been adding regularly to the techniques in order to maintain
high levels of accuracy in spam detection, and will continue to add
more techniques as they are developed.
Interception Techniques
Content Filtering
Looking at the content of a message to match words, phrases and other
information is one of the most effective ways of eliminating spam.
Our content filtering measures work by using a technique known as Support
Vector Machines, a significant improvement over the Bayesian logic used
by most other anti-spam vendors.
In Bayesian analysis, a sample set of legitimate messages and spam is
analyzed: each input (word or spam feature) being checked is counted, and
a weight is assigned to each input. A Support Vector Machine generates weights
by looking at the inputs in combination, taking into account the relationship
between the inputs and how they occur in spam, rather than treating each
input in isolation.
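To make the contrast concrete, the per-input (Bayesian-style) weighting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Isode's implementation: the function names, the whitespace tokenizer and the add-one smoothing are all choices made for this example. Each token receives its own weight from counts alone, which is exactly the "in isolation" treatment that a Support Vector Machine avoids.

```python
import math
from collections import Counter


def token_weights(spam_docs, ham_docs):
    """Return a per-token log-odds weight; positive values lean towards spam.

    Hypothetical helper: each token is weighted independently of all others.
    """
    # Count in how many documents of each class a token appears.
    spam_counts = Counter(t for d in spam_docs for t in set(d.lower().split()))
    ham_counts = Counter(t for d in ham_docs for t in set(d.lower().split()))
    weights = {}
    for tok in set(spam_counts) | set(ham_counts):
        # Add-one smoothing (an assumption) avoids division by zero and log(0).
        p_spam = (spam_counts[tok] + 1) / (len(spam_docs) + 2)
        p_ham = (ham_counts[tok] + 1) / (len(ham_docs) + 2)
        weights[tok] = math.log(p_spam / p_ham)
    return weights


def bayes_score(message, weights):
    """Sum the independent weights of the tokens present in the message."""
    return sum(weights.get(t, 0.0) for t in set(message.lower().split()))


if __name__ == "__main__":
    spam = ["cheap pills online", "cheap loans now"]
    ham = ["meeting agenda attached", "lunch now"]
    w = token_weights(spam, ham)
    print(bayes_score("cheap pills", w))      # positive: leans towards spam
    print(bayes_score("meeting agenda", w))   # negative: leans towards legitimate
```

Because every weight here depends only on how often its own token occurs, a token that is only suspicious in combination with another cannot be expressed; that is the limitation the combined-input approach addresses.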
Operationally, this has lowered the rates of both False Positives
(legitimate messages incorrectly marked as spam) and False Negatives
(spam messages incorrectly passed).
Grey Listing
Grey listing works by recording the sender, recipient and source IP address
of each message, letting through known tuples, temporarily failing all other
messages (forcing legitimate sending systems to retry sending the message)
and adding the tuple to the known list when a retry is received.
As most spam is sent by scripts which do not retry failed deliveries,
properly implemented grey listing can remove 90% of spam before it gets
to the Message Transfer Agent (MTA).
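The grey listing decision described above can be sketched as a small state machine. This is an illustrative sketch only: the class name, the in-memory storage and the minimum retry delay are assumptions made for the example and are not taken from M-Switch.

```python
import time


class Greylist:
    """Hypothetical in-memory grey list keyed on (sender, recipient, source IP)."""

    def __init__(self, min_retry_delay=300, clock=time.time):
        # min_retry_delay (seconds) is an assumed parameter: retries that
        # arrive sooner than this are still temporarily failed.
        self.min_retry_delay = min_retry_delay
        self.clock = clock
        self.pending = {}   # tuple -> time the tuple was first seen
        self.known = set()  # tuples that have retried successfully

    def check(self, sender, recipient, source_ip):
        """Return "accept" or "tempfail" for one delivery attempt."""
        key = (sender.lower(), recipient.lower(), source_ip)
        if key in self.known:
            return "accept"
        now = self.clock()
        first_seen = self.pending.get(key)
        if first_seen is None:
            # First attempt: remember the tuple and ask the sender to retry.
            self.pending[key] = now
            return "tempfail"
        if now - first_seen >= self.min_retry_delay:
            # A genuine retry: promote the tuple to the known list.
            del self.pending[key]
            self.known.add(key)
            return "accept"
        return "tempfail"


if __name__ == "__main__":
    fake_now = [0]
    gl = Greylist(min_retry_delay=60, clock=lambda: fake_now[0])
    attempt = ("alice@sender.example", "bob@rcpt.example", "192.0.2.1")
    print(gl.check(*attempt))   # first attempt is temporarily failed
    fake_now[0] = 120
    print(gl.check(*attempt))   # retry after the delay is accepted
```

A sending script that never retries stays in the pending table and its message is never accepted, while a standards-following mail server is delayed only once per tuple.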
Phone & URL Blacklists
Whilst most spammers will fake return addresses, they nearly always
include in the body of the message at least one contact method (a phone
number or website URL) so the recipient can respond to the spam's
advertising. M-Switch Anti-Spam maintains both phone and URL blacklists.
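As an illustration of how a message body might be checked against such lists, the sketch below extracts URL host names and phone numbers and compares them with two sets. The regular expressions, function names and blacklist contents are all assumptions for this example; the actual matching rules used by M-Switch are not described here.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple patterns, chosen for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def contact_points(body):
    """Return (hosts, phones) found in the body, both normalized."""
    hosts = set()
    for url in URL_RE.findall(body):
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.lower())
    # Normalize phone numbers to digits only so formatting tricks do not matter.
    phones = {re.sub(r"\D", "", m) for m in PHONE_RE.findall(body)}
    return hosts, phones


def blacklist_hits(body, url_blacklist, phone_blacklist):
    """Return the blacklisted hosts and phone numbers present in the body."""
    hosts, phones = contact_points(body)
    return sorted(hosts & url_blacklist), sorted(phones & phone_blacklist)


if __name__ == "__main__":
    body = "Call +1 (555) 010-9999 or visit http://pills.example/buy?id=1 today"
    print(blacklist_hits(body, {"pills.example"}, {"15550109999"}))
```

Matching on the host name rather than the full URL means that changing the path or query string does not evade the list.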
Subject Line Matching
Matching the subject line against a list of topics that should always
be treated as spam.
Originator Matching
Matching the originator of the message against an email blacklist.
Host Matching
Matching the sending host against a host blacklist.
Message Characteristic Checking
Checking the technical characteristics of a message, such as the way
in which returned messages are handled.
Network Address Checking
Checking the originating network address.
Obfuscation Techniques
Checking for spam obfuscation techniques such as HTML comments or messages
that are composed entirely of URLs.
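The two obfuscation signals named above can be sketched as simple checks. This is a minimal example under assumptions: the patterns and function names are hypothetical, and a production filter would need a proper HTML parser rather than regular expressions.

```python
import re

HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Comment placed directly between two letters, as in "vi<!-- x -->agra".
SPLIT_WORD_RE = re.compile(r"[A-Za-z]<!--.*?-->[A-Za-z]", re.DOTALL)
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def count_html_comments(body):
    """Number of HTML comments in the body."""
    return len(HTML_COMMENT_RE.findall(body))


def words_split_by_comments(body):
    """Number of places where a comment breaks up a run of letters."""
    return len(SPLIT_WORD_RE.findall(body))


def is_url_only(body):
    """True when the body contains at least one URL and nothing else."""
    if not URL_RE.search(body):
        return False
    return URL_RE.sub("", body).strip() == ""


if __name__ == "__main__":
    print(words_split_by_comments("vi<!-- x -->ag<!--y-->ra"))
    print(is_url_only("http://a.example/x\nhttp://b.example/y\n"))
```

Comments inside words are invisible to the reader but split the text seen by a naive word matcher, which is why they are worth scoring separately from ordinary HTML comments.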
Real Time Black Hole Lists
Up to date lists of message servers which are known to act as relays
for spam messages (either deliberately or through poor security).
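Such lists are commonly published over DNS, where a server's IPv4 address is looked up by reversing its octets and appending the list's zone name. Assuming that convention, a lookup can be sketched as below. The zone name `rbl.example` is a placeholder, the function names are hypothetical, and the resolver is passed in as a parameter so that the example runs without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True when the list answers for the address, False when it does not."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on the list.
        return False
    return True


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99", "rbl.example"))
```

Because the check is a single DNS query, it can be made while the connection is still open, before any message content has been accepted.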
English Trigraph Checking
Looks for and scores the frequency of text strings which contain three-letter
combinations that do not exist in the English language.
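A trigraph check of this kind can be sketched as follows. The example is illustrative only: a real checker would use a comprehensive table of English trigraphs, whereas here the table is built from a single sample sentence, and the function names are hypothetical.

```python
import re


def trigraphs(text):
    """Yield every three-letter sequence inside each alphabetic word."""
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - 2):
            yield word[i:i + 3]


def build_known_trigraphs(corpus):
    """Collect the set of trigraphs seen in a reference corpus."""
    return set(trigraphs(corpus))


def unknown_trigraph_ratio(text, known):
    """Fraction of the text's trigraphs that are absent from the known set."""
    found = list(trigraphs(text))
    if not found:
        return 0.0
    unknown = sum(1 for t in found if t not in known)
    return unknown / len(found)


if __name__ == "__main__":
    # Tiny stand-in for a real table of English trigraphs.
    known = build_known_trigraphs("the quick brown fox jumps over the lazy dog")
    print(unknown_trigraph_ratio("the lazy fox", known))
    print(unknown_trigraph_ratio("xqzvk", known))
```

Random strings inserted by spammers to defeat other filters tend to score high on this ratio, while ordinary English text scores low.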
Other Spam Techniques
M-Switch offers a range of techniques for eliminating spam. When looking
at the range of products and services available to eliminate spam, it can
be quite difficult to determine which is most appropriate or most effective.
Many products offer one (often quite limited) technique for eliminating
spam, and then promote it as a total solution. Isode offers a range
of techniques for eliminating spam. Isode is also actively tracking
spam developments, and will introduce new tools and techniques as appropriate
to deal with the evolving nature of spam.
There are two other recognized approaches to eliminating spam. These
'solutions' are used by some other vendors but not by Isode because,
whilst they have some advantages, we believe those are outweighed by
their disadvantages.
Desktop Solutions
Isode's approach works at the boundary, eliminating spam before it
gets to the end user.
An alternative class of solutions works to remove spam after it has been
delivered. We believe that this can be very effective for some (technically
minded) individuals. It does not appear to be a particularly useful
solution for a typical ISP customer, or for enterprise deployment. In
general, we see a server-based approach, broadly transparent to
the end user, as the most desirable.
Spam Signature Solutions
Isode's content filtering approach looks for generic patterns and information
in messages, to identify them as spam.
The other approach is to match individual spam messages, by matching
signatures of the messages (typically some sort of unique checksum calculated
across the entire message). These signatures are generated by humans
looking at messages to determine if they are spam, and adding them to
a list of "known spam". These lists are then distributed widely, to
prevent delivery of the spam. Problems with this approach:
- It is expensive, because of the need to have people reading spam.
- As levels of spam grow, it will become increasingly difficult to
maintain effective lists, because of the volume of signatures needed and
the need to catch each spam message early in its life.
- It is trivial for spammers to circumvent, by making changes to
messages that defeat the signatures or by reducing the distribution
list size for specific spam variants.
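The last point can be shown with a short sketch of whole-message signature matching. It assumes a SHA-256 checksum over the message text, which is a choice made for this example; signature schemes used by real products may differ, and the function names are hypothetical.

```python
import hashlib


def signature(message):
    """Checksum calculated across the entire message (SHA-256 assumed here)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_known_spam(message, known_signatures):
    """True only when the message matches a listed signature exactly."""
    return signature(message) in known_signatures


if __name__ == "__main__":
    known = {signature("Buy cheap pills now")}
    print(is_known_spam("Buy cheap pills now", known))    # exact copy matches
    print(is_known_spam("Buy cheap pills now!", known))   # one added character does not
```

A single changed character produces a completely different checksum, so each variant of a spam message needs its own entry in the list.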