VPOP3 Spamfilter False Positives

As with any spam filter solution, the VPOP3 spam filter can, and probably will, generate false positives. These are messages which VPOP3's spam filter thinks are spam, but which actually aren't.

On the mail we receive at PSCS, the false positive rate is around 2 or 3 in every 5000 detected spam messages. That's a rate of about 0.04% to 0.06%. However, different users may see different false positive rates. For instance:

  • you may receive mail which looks like some types of spam message (e.g. stockbrokers may receive a lot of mail which looks like the 'stocks & shares' spam messages that other people receive). In this case, you may need to turn off, or reduce the weighting on, the spam filter rules which are being triggered by these messages.

  • you may receive mail from companies which send out spam themselves, so those companies' mail servers may have been blacklisted. In this case, you may need to turn off or reduce the weighting on the relevant DNS blacklist rules.

  • you may receive mail from people who use 'leetspeak' or text messaging language. Lots of spammers use leetspeak-style language, so some of the spam filter definitions will detect this and may treat these messages as spam. In this case, you may need to turn off, or reduce the weighting on, the 'Misspelling' and 'Misspelling2' spam filter rules.

The VPOP3 spam filter also uses two types of spam detection which are not directly controlled by the spam definitions:

  • Bayesian filtering - this looks at the previous messages you have sent and received and whether those messages were classed as spam or not spam, and compares the words in those messages with the words in any new messages to decide the probability of the new messages being spam or not. If the filter has not been particularly well trained in the past, then the Bayesian filter may get things wrong until it is retrained.

  • Real-time blacklists - these are DNS-based blacklists of either mail server IP addresses or URL links contained inside messages, which indicate whether these are used by spammers or not. For most people these are a very accurate way of detecting spam messages. However, sometimes a mail server used by legitimate users will get put onto a mail server blacklist, so their mail may be affected.
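As a rough illustration of how a DNS-based blacklist lookup is usually formed: by general convention, the octets of the IPv4 address are reversed and the blacklist zone is appended, and that name is then looked up in DNS (an answer means the address is listed). The sketch below only builds the query name and performs no network lookup. It shows the general convention rather than VPOP3's internal implementation, and the function name `dnsbl_query_name` and the example address are illustrative assumptions.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that would be queried for an IPv4 address in a DNSBL zone."""
    # Validate the address first; malformed input raises ValueError.
    addr = ipaddress.IPv4Address(ip)
    # Reverse the octets: 192.0.2.1 becomes 1.2.0.192
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.1 is a documentation-only address, used here purely as an example.
print(dnsbl_query_name("192.0.2.1", "xbl.spamhaus.org"))
```

If a sender's mail server is listed in a zone such as this, every message it sends will pick up that blacklist rule's score, however legitimate the content is.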

If you get false positives from the VPOP3 spam filter, the first thing to do is to look for the X-VPOP3-Spam: line in the message header. This lists all the rules which were triggered for that message, and the 'score' which each rule contributed to the overall score for that message. By default, any message with a score of 100 or more is treated as spam by VPOP3.

Some examples of X-VPOP3-Spam: header lines are below, with brief descriptions:

X-VPOP3-SpamBayes: 0
X-VPOP3-Spam: 132 - Misspelling(50) SuspiciousURL(10) xbl.spamhaus.org(72)

This means that the message had one or more words which look to have been misspelt (VPOP3 only checks certain words which spammers regularly misspell to try to avoid filters, e.g. 'V@lium'), which contributed 50 points to the score. It had a URL link in the message which looked like it may have been made by a spammer, but this wasn't certain, so it only contributed 10 points to the score. The main “problem” with this email was that the sending mail server was listed on the 'xbl.spamhaus.org' blacklist, which contributed 72 points to the score. The total score was thus 132 points, which meant that the message would be treated as spam.

X-VPOP3-SpamBayes: 100
X-VPOP3-Spam: 125 - AllCaps(50) bayes99(75)

This message had long sequences of capital letters, which matches the 'AllCaps' rule. On its own this wouldn't be a problem, as that only contributed 50 points to the score, but the message also had words previously seen mainly in spam, so the Bayesian statistical filter gave the message a “100%” probability of being spam (the 'X-VPOP3-SpamBayes' header line). This matches the 'bayes99' rule (Bayesian score of 99 or higher), which adds 75 points to the score. The total score was thus 125 points, so this message would also be treated as spam.
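If you have many messages to check, the header lines can be read mechanically. The following is a minimal Python sketch which assumes the header value always has the form "total - Rule(score) Rule(score) ...", as in the two examples above; the exact format is inferred from those examples only, and the function name `parse_spam_header` is illustrative.

```python
import re

SPAM_THRESHOLD = 100  # default spam threshold described in this article


def parse_spam_header(value: str) -> tuple[int, dict[str, int]]:
    """Split a value such as '132 - Misspelling(50) ...' into (total, {rule: score})."""
    total_text, _, rules_text = value.partition(" - ")
    # Each entry is assumed to look like RuleName(score). Allowing a minus sign
    # is an assumption; neither example shows a negative score.
    pairs = re.findall(r"(\S+?)\((-?\d+)\)", rules_text)
    return int(total_text), {name: int(score) for name, score in pairs}


headers = [
    "132 - Misspelling(50) SuspiciousURL(10) xbl.spamhaus.org(72)",
    "125 - AllCaps(50) bayes99(75)",
]

for value in headers:
    total, rules = parse_spam_header(value)
    # List the rules with the biggest contribution first.
    ranked = sorted(rules.items(), key=lambda item: item[1], reverse=True)
    verdict = "spam" if total >= SPAM_THRESHOLD else "not spam"
    print(total, verdict, ranked)
```

Sorting the rules by score makes it easy to see which rule is doing most of the damage in a false positive.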

In both these cases, the fixed rules in the VPOP3 spam filter definitions contributed less than half of the overall spam score of the message; the majority of the score was contributed by something outside the control of the fixed definitions (the DNS blacklist in the first example, the Bayesian filter in the second). In such cases, if the message is reported to us as a false positive, we may not be able to alter the filter definitions to let the message through without reducing the filter's effectiveness at detecting actual spam.
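The "less than half" observation can be checked with a little arithmetic. The sketch below uses the scores from the two example headers. The way it tells fixed rules apart from the others (names starting with "bayes", or names containing a dot like a blacklist hostname) is a heuristic assumed for illustration, not something VPOP3 itself defines.

```python
def is_outside_fixed_rules(rule: str) -> bool:
    """Heuristic (an assumption): Bayesian rules start with 'bayes', blacklists look like hostnames."""
    return rule.lower().startswith("bayes") or "." in rule


def fixed_share(rules: dict[str, int]) -> float:
    """Fraction of the total score contributed by the fixed filter definitions."""
    total = sum(rules.values())
    fixed = sum(score for rule, score in rules.items()
                if not is_outside_fixed_rules(rule))
    return fixed / total


example_1 = {"Misspelling": 50, "SuspiciousURL": 10, "xbl.spamhaus.org": 72}
example_2 = {"AllCaps": 50, "bayes99": 75}

print(f"{fixed_share(example_1):.0%}")  # 60 of 132 points -> 45%
print(f"{fixed_share(example_2):.0%}")  # 50 of 125 points -> 40%
```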

If you regularly get these types of false positives, the solution may be to decrease the weighting of the appropriate DNS blacklist test or the Bayesian filter tests. To do this, in the VPOP3 settings, go to Settings → Spam Filter → General → Rule Weights, find the rules in question, and decrease their weighting (the rules are named as they appear in the X-VPOP3-Spam header line).
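Before changing a weighting, it can help to estimate what the message would have scored with the new value. The sketch below assumes that a rule's contribution equals its weighting and that the total is a simple sum of the contributions; this matches both example headers above but is not confirmed for every rule. The replacement weightings (30 and 40) are hypothetical values chosen for illustration, not recommendations.

```python
SPAM_THRESHOLD = 100  # default spam threshold described in this article


def rescore(rules: dict[str, int], new_weights: dict[str, int]) -> int:
    """Recompute a message's total score with some rule weightings replaced."""
    adjusted = {rule: new_weights.get(rule, score) for rule, score in rules.items()}
    return sum(adjusted.values())


example_1 = {"Misspelling": 50, "SuspiciousURL": 10, "xbl.spamhaus.org": 72}
example_2 = {"AllCaps": 50, "bayes99": 75}

# Hypothetical reduced weightings for the blacklist and Bayesian rules.
for rules, change in [(example_1, {"xbl.spamhaus.org": 30}),
                      (example_2, {"bayes99": 40})]:
    new_total = rescore(rules, change)
    verdict = "spam" if new_total >= SPAM_THRESHOLD else "not spam"
    print(sum(rules.values()), "->", new_total, verdict)
```

Bear in mind that a lower weighting applies to all mail, so it will also let through some real spam which previously only just crossed the threshold.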