Raggedstaff Internet
The friendly ISP

Tips on scores

This page offers some tips on picking scores for use in customised envelope blocking policies. The scores in pre-set policies have been carefully selected and tested to give reasonable results in most common situations. However, each person or company has different ways of working and communicates with different types of organisations in different places. Scores that work well for us may not work well for you.

How scores work

Simple. Each test is assigned a score. The scores of all the tests that a message fails are added together, and if the total exceeds the reject level for the policy, the message is rejected (or a special header is added instead, if you have selected 'tag only' for that recipient).
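As a rough sketch of that arithmetic (the test names, scores and reject level below are made-up examples, not values from any of our pre-set policies):

    # Hypothetical illustration of totalling scores against a reject level.
    REJECT_LEVEL = 7.0  # example threshold, not a real policy value

    # Tests this message failed, and the score assigned to each test.
    failed_tests = {
        "sending host listed as dynamic/dial-up": 4.6,
        "HELO name does not resolve": 2.3,
    }

    total = sum(failed_tests.values())

    if total > REJECT_LEVEL:
        action = "reject"   # or add the special header if 'tag only' is selected
    else:
        action = "accept"

    print(f"total score {total:.1f} -> {action}")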

The basis for scores

Scores are actually based on the rate of 'false positives' expected from a test: that is, the proportion of legitimate emails that you expect to fail the test. Tests that only spam ever fails (a low false positive rate) get a high score. Tests that a significant amount of legitimate mail also fails get a low score. For those of you with a mathematical bent, the relationship between the false positive rate and the score is logarithmic.

We suggest that you consider what proportion of false positives you expect from a particular test, and then use the following table to pick scores based on that rate.

FP rate (1 per ...)   Score
10,000                9.2
5,000                 8.5
2,000                 7.6
1,000                 6.9
750                   6.6
500                   6.2
250                   5.5
100                   4.6
75                    4.3
50                    3.9
25                    3.2
10                    2.3
5                     1.6
3                     1.1
2                     0.7
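If your expected false positive rate falls between two rows, you can work the score out directly: the scores in the table are simply the natural logarithm of the 'one per N' figure, rounded to one decimal place. A small sketch:

    import math

    def score_for_fp_rate(one_per_n):
        # Score for a test expected to wrongly catch one legitimate
        # message in every 'one_per_n'.
        return math.log(one_per_n)

    # These reproduce rows of the table above.
    print(round(score_for_fp_rate(10000), 1))  # 9.2
    print(round(score_for_fp_rate(100), 1))    # 4.6
    print(round(score_for_fp_rate(2), 1))      # 0.7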

Things to consider

Your mileage may vary

When estimating a false positive rate consider your own particular circumstances. Not everybody sees the same false positive rate.

For example, we at Raggedstaff Internet assign a high score to mail arriving from servers in Korea as we have never seen legitimate mail originating from there. But if you have family, friends or business contacts in Korea then clearly you will expect to get legitimate mail from them. Your score for mail from Korea should be much lower - in fact you probably shouldn't be testing for this at all.

It's false positives that matter

There are many things that all SPAM has in common. For example, all email SPAM is email. This doesn't really help us distinguish it from legitimate email though. Useful tests are ones that a significant amount of SPAM will fail, but relatively little legitimate mail. That is, a high false positive rate makes a poor test.

Returning to country blocks, according to SOPHOS the country from which the greatest amount of SPAM comes is the USA. Over a third of all SPAM originates there. But for many people, so does a vast amount of legitimate mail. So testing if mail is being sent from the USA is not generally useful for identifying SPAM.

By contrast, about one quarter of all SPAM originates in Korea and just under 10% in China. Most people in Western Europe get very little legitimate mail from these countries, so testing if mail originates in them is valuable.

Consider how tests interact

The model we use is rather simplistic. It assumes that all the tests are statistically independent. This is not the case in the real world though, so having picked your scores based on the rate of false positives, you need to tweak them to allow for this.

For example, the combined DNS blocklist combined.njabl.org returns 127.0.0.3 for dynamic or dial-up hosts. The DNSBL dnsbl.sorbs.net returns 127.0.0.10 for dynamic or dial-up hosts. If a server is in one list, it is probably in the other, as both have very similar criteria. If you think that 1 in 100 messages from a dial-up host is legitimate, you would pick a score of 4.6. But if you set a score of 4.6 for both these lists, you're effectively assigning a score of 9.2 to many dial-up hosts.
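To put that in numbers (using the same 1-in-100 estimate as above):

    import math

    # One legitimate message per 100 from dynamic/dial-up hosts (estimate above).
    score_per_list = math.log(100)        # ~4.6, matching the table

    # A typical dynamic host is listed in both overlapping DNSBLs,
    # so it collects the score twice.
    effective = 2 * score_per_list        # ~9.2

    print(round(score_per_list, 1), round(effective, 1))

A score of 9.2 corresponds to a claimed false positive rate of one in 10,000, which is far more confidence than the 1-in-100 estimate justifies.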

Similarly, everything in l1.spews.dnsbl.sorbs.net is also in l2.spews.dnsbl.sorbs.net. If you think that l1.spews.dnsbl.sorbs.net has a 1 in 50 false positive rate, you would assign a score of 3.9. But if you have a score of 1.0 for l2.spews.dnsbl.sorbs.net already, you would effectively be scoring l1.spews.dnsbl.sorbs.net at 4.9. A more sensible score for l1.spews.dnsbl.sorbs.net would be 2.9.
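Put another way, because the level-2 score is always collected alongside the level-1 score, you can subtract it from the score your false positive estimate would otherwise suggest:

    import math

    target = math.log(50)       # ~3.9, the score for a 1-in-50 estimate
    l2_score = 1.0              # already assigned to l2.spews.dnsbl.sorbs.net
    l1_score = target - l2_score

    print(round(l1_score, 1))   # ~2.9, the adjusted score for l1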

Research your DNSBLs

Every DNSBL has a different policy on listing hosts. Some, like list.dsbl.org, are relatively cautious and have a low false positive rate. Others, like spews, are aggressive, often listing entire networks and producing high rates of false positives.

To have any idea of the false positive rate of a particular DNSBL, you need to acquaint yourself with its policies and its reputation, and probably use the 'tag only' option for a while whilst you assess how well it meets your needs.
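One way to research a list while you are assessing it is to look up hosts in it by hand. A DNSBL query is just an ordinary DNS A-record lookup, made by reversing the octets of the IP address and appending the list's zone name; a minimal sketch (192.0.2.1 is a documentation address used purely as an example):

    import socket

    def dnsbl_lookup(ip, zone):
        # Returns the list's answer (e.g. '127.0.0.10') or None if not listed.
        query = ".".join(reversed(ip.split("."))) + "." + zone
        try:
            return socket.gethostbyname(query)
        except socket.gaierror:
            return None

    ip = "192.0.2.1"  # example address only
    for zone in ("list.dsbl.org", "dnsbl.sorbs.net"):
        print(zone, dnsbl_lookup(ip, zone))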