Community Discussions and Support
False positives in both Popfile and Spamhalter - Measures to avoid this happening.

Ah, the blacklist does now seem to have burst into life. I'm a happy bunny, as it were.


Hi,

For some time now we have been running Popfile and Spamhalter in tandem, with the server archiving and not delivering any emails detected as spam by both filters. This has worked well for us, dramatically reducing the number of spams our users have had to screen. We also run a limited content filter to detect spams with obvious spam words in them, and the output of the filter is used to train Spamhalter. Another measure is a pre-dated filter, so any message with an age of less than -2 days (that is, dated more than two days in the future) is treated as spam and again used to train Spamhalter. Actually, I wonder why a date filter of this type is not more straightforward in Mercury, as I can see no legitimate reason to pre-date messages and it is a very common spam practice. We had to implement the pre-date filter in our Pegasus Archive account.

Anyway, we have recently started to get a few legitimate emails detected as spam by both filters (false positives), which is clearly a cause for concern. We have always had a few false positives from one or the other filter.

Up to last week we had Spamhalter set to 'Train On Errors'. Since almost all errors were false negatives, this meant that there were a lot more additions to the black database than the white one, so I have now set it to 'Train Always' in the hope that this will keep the white database more up to date. I have also just increased the spam probability setting from 70 to 75. The other classification settings are Probability for unknown Tokens = 40, Level of not Spam Preference = 3 and Count of classified tokens = 20; I am not sure whether to tweak these.

Does anyone else have a similar setup and practice? If so, what measures do you take to avoid false positives?
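For anyone wanting to test the pre-date idea outside Mercury, here is a minimal Python sketch of the check described above. It is not the Pegasus rule itself: the function name `is_predated`, the two-day tolerance constant and the choice to treat timezone-less dates as UTC are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Tolerance taken from the "-2 days" rule described in the post.
MAX_FUTURE = timedelta(days=2)

def is_predated(date_header, now=None):
    """Return True if the Date header lies more than MAX_FUTURE ahead of now."""
    now = now or datetime.now(timezone.utc)
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparseable dates are left for other filters to judge.
        return False
    if sent.tzinfo is None:
        # Assumption: a date without a usable zone is treated as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return sent - now > MAX_FUTURE
```

A message for which this returns True would then be filed as spam and fed to the Spamhalter training, as in the setup above.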

 

 

 


We use a fairly tight SMTP transaction filter followed by a Spamcop lookup then Spamhalter.

 

Train Always (for the same reason you stated)

Probability for unknown Tokens = 80 (with train always most good tokens are known if your mail isn't too diverse)

Level of not Spam Preference = 1 (good tokens are worth the same as bad ones rather than 3 times as much; this catches a lot more spam but could increase your FPs. It didn't for me, but YMMV)

Count of classified tokens = 20
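To make the four settings more concrete, here is a rough Python sketch of a Graham-style Bayesian score. This is not Spamhalter's actual code, and the mapping of its settings onto `unknown_prob`, `ham_weight` and `max_tokens` is an assumption based only on the descriptions in this thread; all function and parameter names are hypothetical.

```python
from math import prod

def token_probability(token, spam_counts, ham_counts, n_spam, n_ham,
                      unknown_prob=0.4, ham_weight=3):
    """Spam probability of one token (Graham-style, not Spamhalter's code)."""
    bad = spam_counts.get(token, 0)
    good = ham_counts.get(token, 0) * ham_weight  # "not spam preference"
    if bad == 0 and good == 0:
        return unknown_prob  # token never seen in training
    b = min(1.0, bad / max(n_spam, 1))
    g = min(1.0, good / max(n_ham, 1))
    return min(0.99, max(0.01, b / (b + g)))

def spam_score(tokens, spam_counts, ham_counts, n_spam, n_ham,
               unknown_prob=0.4, ham_weight=3, max_tokens=20):
    """Combine the most decisive token probabilities into one score (0..1)."""
    probs = [token_probability(t, spam_counts, ham_counts, n_spam, n_ham,
                               unknown_prob, ham_weight)
             for t in set(tokens)]
    # Keep only the tokens furthest from the neutral value 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:max_tokens]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)
```

In this sketch a message would be tagged when `spam_score(...) * 100` exceeds the spam probability setting (70 or 75 above). Note that with `unknown_prob=0.8` a message made up entirely of unseen tokens scores 0.8, which is why that setting only makes sense once most good tokens are already in the database.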

 

The transfilters & Spamcop reject > 60% of all SMTP connections.

Of the accepted mails, Spamhalter tags about 35% as spam and misses < 0.1% (just got my first one for 2 months)

Any tagged mails get moved for manual FP review and autodeleted by a batch job after 7 days.

My FP rate is about 0.2% (1-2 per month, usually newly signed up newsletters etc.)
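As a rough illustration of what those percentages mean in combination, the arithmetic can be sketched as below. The function name `funnel` is hypothetical, and treating each SMTP connection as exactly one message is a simplifying assumption, not something the figures above state.

```python
def funnel(connections, reject_pct=60, tag_pct=35):
    """Split incoming connections into rejected, tagged and delivered counts."""
    rejected = connections * reject_pct // 100   # stopped by transfilters/Spamcop
    accepted = connections - rejected
    tagged = accepted * tag_pct // 100           # tagged as spam by Spamhalter
    delivered = accepted - tagged
    return {"rejected": rejected, "accepted": accepted,
            "tagged": tagged, "delivered": delivered}

print(funnel(1000))
```

Under these assumptions, out of every 1000 connections about 600 are rejected at SMTP time, 140 are tagged for review and 260 reach an inbox, so roughly three quarters of the traffic never gets delivered as normal mail.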

 


Thanks for that. I have increased the unknown token probability to 80% but left the not-spam preference at 3, as false positives, if they happen, are potentially more of a problem. The explanation of what each parameter does is particularly useful.

I will look into Spamcop; it's not something we have tried.


I tried a couple of other DNSBLs as well (can't remember which ones now) but found the FP rate way too high.

I was tagging and reviewing to start with, but after six months not one of the Spamcop-tagged mails was an FP, so I switched to rejecting.

I now prefer an 'SMTP reject' wherever possible (transflt.mer is your friend :)) to an 'accept then silently delete' policy (and bounces are just idiotic), as any legit mailer will get a failure message from their own server and, at least in our situation, I will hear from them presently (watch for typos in your transflt rules :)).

My single biggest reducer of spam is this transflt rule:

D, "*.*", B-N, "554 unresolvable host name"

(see transflt.mer for a translation)

A secondary advantage of the 'D'eferred HELO processing is that it captures the MAIL FROM address, even for rejected connections, in the logs.
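For readers without transflt.mer to hand, here is a Python sketch of one plausible reading of that rule: the HELO/EHLO name is compared against the wildcard pattern `*.*`, and the connection is refused with the 554 reply when the name does not match (that is, when it contains no dot). That reading, in particular that `-N` inverts the match, is an assumption and should be checked against the comments in transflt.mer. The function name `helo_rule` is hypothetical.

```python
from fnmatch import fnmatchcase

def helo_rule(helo_name, pattern="*.*", reply="554 unresolvable host name"):
    """Return the rejection reply if helo_name fails the pattern, else None.

    Assumption: the rule fires when the name does NOT match the pattern.
    """
    if fnmatchcase(helo_name, pattern):
        return None   # name contains a dot: let the transaction continue
    return reply      # bare name such as "localhost": refuse

for name in ("mail.example.com", "localhost"):
    print(name, "->", helo_rule(name))
```

This is only a pattern test on the greeting string; no DNS lookup is performed in the sketch.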


I tried a couple of other DNSBL's as well (can't remember which ones now) but found the FP rate way too high,

I was tagging and reviewing for a start, but after six months not one of the Spamcop tagged mails were FP's so I switched to rejecting.

I too went through the same process with blacklisting. I ran for over a year just tagging with the SpamCop and SpamHaus-Zen blacklists with zero false positives, so I also went to rejecting. I do not reject on invalid host names in the EHLO string, since quite a few people here have bad strings; new Mercury users in many cases do not understand the requirements. FWIW, almost all the spam that I receive does use a proper, if false, EHLO string.


I tried adding the blacklist entries in the Mercury SMTP server setup dialogue, more or less copying what you did, but so far, in 24 hours, the blacklists do not seem to have trapped a single message. I have done a quick check on the Spamcop site and we do seem to have received at least one message from a blacklisted sender. I cannot see a problem with what I have set up; here is my MS_SPAM.MER file:

# Mercury/32 SMTP server block query definitions data file.
# Mercury/32 Mail Transport System, Copyright 1993-2006, David Harris.

Begin
Name: Spamcop
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: bl.spamcop.net
Strictness: Normal
Action: Tag
Parameter: X-Blocked see: http://spamcop.net/bl.shtml?
End

Begin
Name: PSBL
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: psbl.surriel.com
Strictness: Normal
Action: Tag
Parameter: X-Blocked: by PSBL See http://psbl.surriel.com for removal instructions
End

Begin
Name: SpamHaus-Zen
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: zen.spamhaus.org
Strictness: Range 127.0.0.2 - 127.0.0.8
Action: Tag
Parameter: X-Blocked by SpamHaus.org See http://spamhaus.org for removal instructions
End

Begin
Name: Spamhaus Zem PBL
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: zen.spamhaus.org
Strictness: Range 127.0.0.10 - 127.0.0.11
Action: Tag
Parameter: X-Blocked:  by SpamHaus.org PBL See http://spamhaus.org for removal instructions
End

Any ideas?

NB: Initially I did not have the ranges set for SpamHaus; I changed that this morning.
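When a blacklist seems to trap nothing, it can help to run the lookup by hand, outside Mercury. The Python sketch below shows the usual DNSBL mechanism: reverse the IPv4 octets, append the list's zone, and treat an answer inside the configured range as a listing. It is a generic illustration, not Mercury's implementation; the function names and the default range are assumptions.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def in_range(answer, low, high):
    """True if the returned address lies within low..high (inclusive)."""
    a = ipaddress.IPv4Address(answer)
    return ipaddress.IPv4Address(low) <= a <= ipaddress.IPv4Address(high)

def is_listed(ip, zone, low="127.0.0.2", high="127.0.0.255"):
    """Query the DNSBL; no answer (lookup failure) is treated as not listed."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return in_range(answer, low, high)

print(dnsbl_query_name("192.0.2.99", "bl.spamcop.net"))
```

For example, `is_listed(sender_ip, "zen.spamhaus.org", "127.0.0.2", "127.0.0.8")` would mirror the first SpamHaus entry above, where `sender_ip` stands for an address taken from a received spam. Many lists also publish a test entry for 127.0.0.2, so checking that address is a quick way to confirm that the server's DNS resolver can reach the list at all.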


I noted your setup for blacklists from another thread and used the Mercury SMTP Server setup dialogue to define Spamcop, Spamhaus and PSBL, then later copied the ranges you used for Spamhaus. I decided to tag the messages, at least initially, to see how effective the blacklists were. I also set a filter to file any blacklisted emails in my archive account. So far, after just over 24 hours, I am surprised to note that apparently no emails have been tagged with the defined text. I did a quick double check in our spam trap folder and discovered that some of our spam was indeed from senders blacklisted on Spamcop. I presume the fact that none of our spam appears to have been blocked means that the blacklisting is not working at all. Is there something else we need to turn on to make this work, or have I made a mistake? Our MS_SPAM.MER file follows.

# Mercury/32 SMTP server block query definitions data file.
# Mercury/32 Mail Transport System, Copyright 1993-2006, David Harris.

Begin
Name: Spamcop
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: bl.spamcop.net
Strictness: Normal
Action: Tag
Parameter: X-Blocked see: http://spamcop.net/bl.shtml?
End

Begin
Name: PSBL
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: psbl.surriel.com
Strictness: Normal
Action: Tag
Parameter: X-Blocked: by PSBL See http://psbl.surriel.com for removal instructions
End

Begin
Name: SpamHaus-Zen
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: zen.spamhaus.org
Strictness: Range 127.0.0.2 - 127.0.0.8
Action: Tag
Parameter: X-Blocked by SpamHaus.org See http://spamhaus.org for removal instructions
End

Begin
Name: Spamhaus Zem PBL
Enabled: Y
QueryType: Blacklist
QueryForm: Address
Hostname: zen.spamhaus.org
Strictness: Range 127.0.0.10 - 127.0.0.11
Action: Tag
Parameter: X-Blocked:  by SpamHaus.org PBL See http://spamhaus.org for removal instructions
End

Thanks

 

Chris 
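A quick way to review a definitions file like the one above is to load it into a list of records and print the fields side by side. The Python sketch below assumes only what is visible in the file itself: `#` comment lines, `Begin`/`End` markers and `Key: value` lines. The function name `parse_definitions` is hypothetical.

```python
def parse_definitions(text):
    """Parse Begin/End blocks of 'Key: value' lines into a list of dicts."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower() == "begin":
            current = {}
        elif line.lower() == "end":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return blocks

sample = """Begin
Name: Spamcop
Enabled: Y
Hostname: bl.spamcop.net
Action: Tag
Parameter: X-Blocked see: http://spamcop.net/bl.shtml?
End
"""
for d in parse_definitions(sample):
    print(d["Name"], d["Hostname"], d["Action"], "|", d["Parameter"])
```

Printing the `Parameter` values of all four definitions this way shows that the tag texts are not formatted identically (the colon after `X-Blocked` sits in different places), which may matter if a later filter rule matches on the exact header text.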
