Hi,
For some time now we have been running Popfile and Spamhalter in tandem,
with the server archiving, rather than delivering, any email detected as
spam by both filters. This has worked well for us, dramatically reducing
the number of spams our users have had to screen. We also run a limited
content filter to detect spams
with obvious spam words in them and the output of the filter is used to
train Spamhalter. Another measure is a pre-dated filter, so any message
with an age of less than -2 days is treated as spam and again used to
train Spamhalter. Actually, I wonder why a date filter of this type is not more straightforward in Mercury, as I can see no legitimate reason to pre-date messages and it is a very common spam practice. We had to implement the pre-date filter in our Pegasus Archive account.
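For anyone who wants to reproduce the date check outside Mercury or Pegasus, here is a rough Python sketch of the idea. It is not how our filter is actually implemented. It assumes "age" means the current time minus the message's Date header, and the function name is_predated and its max_future_days parameter are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_predated(date_header, now, max_future_days=2):
    """Return True when the message age is less than -max_future_days.

    Assumption: age = now - Date header, so a negative age means the
    Date header lies in the future relative to the server clock.
    """
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # Headers with a "-0000" zone parse as naive; treat them as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    age = now - sent
    return age < timedelta(days=-max_future_days)


# Example with a fixed "now" so the result does not depend on the clock.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
suspicious = is_predated("Sat, 13 Jan 2024 12:00:00 +0000", now)
normal = is_predated("Wed, 10 Jan 2024 09:00:00 +0000", now)
```

The two-day allowance leaves room for time zone mistakes and badly set clocks on the sending side.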
Anyway, we have recently started to get a few legitimate emails flagged as spam by both filters, which is clearly a cause for concern. We have always had a few false positives from one or other filter.
Up to last week we had Spamhalter set to 'Train On Errors'. Since almost all errors were false negatives, this meant that there were a lot more additions to the black database than the white one, so I have now set it to 'Train Always' in the hope that this will keep the white database more up to date. I have also just increased the spam probability setting from 70 to 75. The other classification settings are Probability for unknown Tokens = 40, Level of not Spam Preference = 3 and count of classified tokens = 20; I am not sure whether to tweak these.
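To make clear what the threshold change is meant to do in our setup, here is a small Python sketch of the decision. It is only an illustration: should_archive is a made-up name, not part of Popfile, Spamhalter or Mercury, and it assumes Spamhalter's score is a percentage compared with "greater than or equal to" against the spam probability setting.

```python
def should_archive(popfile_says_spam, spamhalter_probability, threshold=75):
    """Archive (and do not deliver) only when both filters say spam.

    Assumption: spamhalter_probability is on a 0-100 scale and a message
    counts as spam when the score is at or above the threshold.
    """
    spamhalter_says_spam = spamhalter_probability >= threshold
    return popfile_says_spam and spamhalter_says_spam


# A message scoring 72 that Popfile also calls spam:
archived_at_70 = should_archive(True, 72, threshold=70)
archived_at_75 = should_archive(True, 72, threshold=75)
```

With these assumptions, a borderline message scoring between 70 and 74 is archived under the old setting but delivered under the new one, which is the kind of false positive I am hoping to avoid.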
Does anyone else have a similar setup and practice? If so, what measures do you take to avoid false positives?
Chris Clarke-Williams
I T Manager
Wicks and Wilson Ltd