Community Discussions and Support
Spamhalter VS Popfile - Slow to learn?

[quote user="Moshe"]

Hi Thomas,

I did enable the password feature and am using the correct syntax in the "to" field (spam+password@site.com where "spam" is the account, "password" is replaced by a real password and the "site" is replaced by the site name).

 The thing is, the spamhalter config page says "password for offsite corrections" and I do specify the password...

 I am sending the correction mails using an authenticated (password on SMTP) account.  I'm using the same domain for both the server and the mail account.

I'm still confused as to what's wrong.

[/quote]

 

As am I.  I do exactly what you are doing when sending the mail to Spamhalter.  The logs show that the correction is being made using the password.

D 20071116 110119.824 MG000004 Mercury version >= 4.1
D 20071116 110119.824 MG000004 jobfile: C:\MERCURY\QUEUE\MG000004.QDF
D 20071116 110119.824 MG000004 spamdir: \\THOMAS\SYS\MAIL\6000001
D 20071116 110119.824 MG000004 nospamdir: \\THOMAS\SYS\MAIL\7000001
D 20071116 110119.840 MG000004 origin by MercuryD
  20071116 110119.840 MG000004 from: support@tstephenson.com
D 20071116 110119.840 MG000004 > Internet sender
D 20071116 110119.840 MG000004 > Need to test
P 20071116 110119.840 MG000004 Correction enabled by password
_ 20071116 110119.840 MG000004 Correction request saved as: \\THOMAS\SYS\MAIL\7000001\AAB85WGU.CNM

My only guess is that the password is wrong on your setup.
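For reference, the addressing scheme described above (account, then "+", then the correction password, then the domain) can be sketched in Python. The helper names and sample values are made up for illustration; only the `spam+password@site.com` shape comes from the post, and this is not Spamhalter's own code.

```python
def build_correction_address(account: str, password: str, domain: str) -> str:
    """Return account+password@domain, the shape quoted in the post."""
    for part in (account, password, domain):
        # Reject empty parts and characters that would break the address shape.
        if not part or any(c in part for c in "@+ "):
            raise ValueError(f"unusable address part: {part!r}")
    return f"{account}+{password}@{domain}"


def split_correction_address(address: str) -> tuple[str, str, str]:
    """Split account+password@domain back into (account, password, domain)."""
    local, _, domain = address.partition("@")
    account, plus, password = local.partition("+")
    if not (domain and plus and account and password):
        raise ValueError(f"not in account+password@domain form: {address!r}")
    return account, password, domain


# Hypothetical values, for illustration only.
print(build_correction_address("spam", "s3cret", "site.com"))
```

Splitting an address you are about to send to is a quick way to spot a missing "+" or an empty password before blaming the server.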


 


We have just switched from using popfile version 2.1.1 to using the version of spamhalter which ships with Mercury 4.51.   Spamhalter was certainly fairly easy and seamless to set up and emailing spams to account@domain.co.uk to update it does seem simpler than the Popfile method of using a web interface.   However Spamhalter seems to be very slow to learn, in fact it often lets through emails identical in content to ones which were sent to the spam learning account!   Have other people had this problem or am I doing something wrong?   I note that none of the mails sent to the spam learning account ever actually get there, is this normal, does the system automatically delete them having learnt from them?   I have also noted that every email I've ever checked the stats for in raw view has a spam probability of 0.0% if not detected as spam and 100% if detected as spam.   On the plus side the only false positives we have had were in the first few minutes of operation!

 Is anyone using both Spamhalter and Popfile in tandem?   I'm thinking of putting a later version of popfile on our server, we once had 2.2.4 working so I should be able to make it work again.   I quite like being able to tag commercial (not spam but also probably not solicited) emails with a separate status so that users can choose whether or not to filter them.   Popfile of course does allow one to do this.

 We are also now using Greywall and that does seem quite effective, although we are slightly nervous about delayed inquiries from our web site. We will of course be OK if people use our web form to contact us, but we might lose mails sent by website users we do not already know if they just use our email address.

 A benefit of our changes is that the mailserver is now MUCH faster to respond, especially noticeable when using an IMAP connection.
 

 

 


We're using SpamHalter and Greywall, and are enjoying a huge reduction in the amount of Spam.  When a message is sent to the spam learning account, it should show up in the Mercury Core Process with an entry like this:  06:27:13: Job MG000392:  * killed by Daemon.  We have seen percentages all through the range from 0-100%, but mostly they lie at 0% or 100%.  We currently tag spam at the 80% level.  Not sure if this is much help, but good luck, and I'll be happy to share my settings if you need them.

-=Glen
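On the scores sitting at 0% or 100%: this is typical of Bayesian filters in general. The sketch below uses the textbook naive-Bayes combination of per-token probabilities; whether Spamhalter uses exactly this formula is an assumption, and the token probabilities are invented. The 80% threshold is the one mentioned above.

```python
from math import prod


def combine(token_probs: list[float]) -> float:
    """Combine per-token spam probabilities, naive-Bayes style.

    Assumes every probability is strictly between 0 and 1.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


def is_spam(token_probs: list[float], threshold: float = 0.80) -> bool:
    """Tag as spam at or above the threshold (80% in the setup above)."""
    return combine(token_probs) >= threshold


# A handful of moderately spammy (or hammy) tokens is enough to push
# the combined score very close to one extreme or the other.
print(f"{combine([0.9, 0.9, 0.9, 0.8]):.4f}")
print(f"{combine([0.1, 0.2, 0.1]):.4f}")
```

Because the products multiply up so quickly, the exact threshold matters less than it first appears: most messages land far from it either way.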

 


Hi,

Yes I'm using both Spamhalter and PopFile in tandem for two reasons:

1. Spamhalter only allows two classifications, ham or spam, whereas with PopFile I can use as many classifications [buckets] as I like.

2. Spamhalter gets the first look at the mail arriving, and it is simpler to send corrections to the relevant e-mail accounts set up to retrain it. PopFile [latest version using PopfileD] gets to see the message afterwards, and I can then fine-tune any of the more granular misclassifications it makes, such as Phishing Scams, 419 Scams, Malware, etc., via the PopFile web interface. Also, I wasn't sure if I would continue to use Spamhalter at first, but it seems to be quite good.

I also used Graywall as an experiment to see how much difference it made, and although it was pretty good I did find out that some e-mails were lost because of poor mail server implementations en route to me. I also noticed that spammers, scammers and malware authors have already started to adapt their techniques so that their creations will retry and therefore defeat graylisting. I have since turned off Graywall because of the lost mail and the fact that the 'Bad Guys and Girls' are already able to bypass it. I use a mixture of transaction filters, DNS blacklists and other techniques instead.

Hope this helps?

Regards,

Martin
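The greylisting trade-off described above (mail is lost when the sending server never retries, and spam gets through when it does) follows from how the technique works. Here is a toy sketch. The class, the 300-second delay and the choice of 451 as the temporary reply code are illustrative assumptions, not Graywall's actual implementation or defaults.

```python
import time
from typing import Optional


class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay_seconds: float = 300.0):
        self.min_delay = min_delay_seconds  # assumed value, not a Graywall default
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, ip: str, sender: str, recipient: str,
              now: Optional[float] = None) -> int:
        """Return an SMTP reply code: 451 (try again later) or 250 (accept)."""
        now = time.time() if now is None else now
        key = (ip, sender.lower(), recipient.lower())
        # Remember when this triplet was first seen; later attempts reuse it.
        seen = self.first_seen.setdefault(key, now)
        if now - seen < self.min_delay:
            return 451  # temporary failure; a compliant server retries later
        return 250


g = Greylist()
print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=0.0))
print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=600.0))
```

A sender that never comes back after the first 451 simply never delivers, which is the lost-mail case; a spam tool that retries after the delay is accepted like any other sender.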
 


 


[quote user="chriscw"]

We have just switched from using popfile version 2.1.1 to using the version of spamhalter which ships with Mercury 4.51.   Spamhalter was certainly fairly easy and seamless to set up and emailing spams to account@domain.co.uk to update it does seem simpler than the Popfile method of using a web interface.   However Spamhalter seems to be very slow to learn, in fact it often lets through emails identical in content to ones which were sent to the spam learning account!

[/quote]

I agree it's slower than POPFile to learn, but you might try shifting to TOE from TA to see what happens.  I also lower the "level of not spam" from 3 to 1.  That said, I find both systems operate at about 99% effectiveness when trained, with POPFile a bit more accurate at detecting spam (99.81% against 99.32%).

[quote user="chriscw"]

Have other people had this problem or am I doing something wrong?   I note that none of the mails sent to the spam learning account ever actually get there, is this normal, does the system automatically delete them having learnt from them?

[/quote]

A message sent to the SPAM or NOSPAM account should in fact go to that account and be processed.  Upon completion of the processing it is deleted.

[quote user="chriscw"]

I have also noted that every email I've ever checked the stats for in raw view has a spam probability of 0.0% if not detected as spam and 100% if detected as spam.   On the plus side the only false positives we have had were in the first few minutes of operation!

 Is anyone using both Spamhalter and Popfile in tandem?   I'm thinking of putting a later version of popfile on our server, we once had 2.2.4 working so I should be able to make it work again.   I quite like being able to tag commercial (not spam but also probably not solicited) emails with a separate status so that users can choose whether or not to filter them.   Popfile of course does allow one to do this.

[/quote]

I've got both working on my main server, and it seems to work just fine for the months I've had this going.

[quote user="chriscw"]

 We are also now using Greywall and that does seem quite effective although we are slightly nervous about delayed inquiries from our web site, we will of course be OK if people use our web form to contact us but might lose mails sent by website users we do not already know if they just use our email address.

[/quote]

You are right to be nervous.  There are a number of servers out there that do not retry when they get a 400 series temporary error.  They are broken (many are being fixed), but there are also some big ones among them.  If the e-mail is business critical you might want to rethink this one.

[quote user="chriscw"]

 A benefit of our changes is that the mailserver is now MUCH faster to respond, especially noticeable when using an IMAP connection.

[/quote]

Not at all sure why this should have any effect on the speed of the system if you were using the latest POPFileD with POPFile.

Thanks to everyone for sharing their experience; it is reassuring to know that the system is doing what it should and that other people seem to face similar issues.

 I'm going to have a look at the popfile site for the latest versions of everything and get that working too.

 I think the mail server is running faster simply because Greywall is stopping well over 50% of the spam from ever needing to be processed by either Mercury or Spamhalter!   I don't think it really affects the speed of the mail just the speed that other things happen like IMAP connections and remote desktop admin sessions.   I was not using the latest popfile as I could not get it working again after our email server died a while back and did not have time to fiddle around.   I'm sure it will not be a problem now!
 


I might need to start a new thread for this, but should popfile 2.2.5 be compatible with the 2.2x version of PopfileD?

 I have reinstalled Popfile, updating a 2.1.1 version, and Popfile will start. I can access it with Firefox if I start Firefox and type in the local host address and port number, but choosing popfileUI for the system tray results in a "This file does not have a program associated with it...." error.   I can look at the history and advanced tabs but none of the others appear to open, and although the system messages do indicate PopfileD starting, as far as I can see no new messages get to Popfile.   I have not changed any of the default port settings.

 I am still using Spamhalter again with default settings except that I changed to only train on errors.

 Any suggestions as to what I've done wrong or what more information might help would be welcome.
 


[quote user="chriscw"] Any suggestions as to what I've done wrong or what more information might help would be welcome. [/quote]

Chris I'm running Popfile v0.22.5 with the latest PopfileD and it is working fine. Did you follow the instructions supplied with PopfileD and copy the files to the right directories after upgrading Popfile? The supplied files overwrite the default ones that Popfile came with. I suspect that is why the UI is not working properly for you.

Quote from the POPFileD.html from the latest version ZIP file [Version 1.22.4]

"Copy the file MERC.pm to the "proxy" subdirectory of your POPFile installation. The name of the file is case sensitive.

Copy all the files in the provided "languages" directory to the "languages" subdirectory of your POPFile installation, replacing the files by the same names that may already there. Then double-click on the copy of the install.bat script that you just put in the "languages" subdirectory of your POPFile installation. POPFileD adds a few lines to each of the .msg files in that directory, which contain text strings used by POPFile to construct its browser interface. If you are using a language other than English, I have done my best to translate to all the languages POPFile supports.

Copy the files merc-configuration.thtml and merc-security-local.thtml to the "skins\default" subdirectory of your POPFile installation (there should be many other ".thtml" files in there already. These files define the HTML forms used in configuring the Mercury interfaces"
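The copy steps quoted above are easy to get partly wrong after an upgrade, so here is a rough Python sketch of them. It assumes MERC.pm and the two .thtml files sit at the top level of the unzipped POPFileD folder next to a "languages" directory; that layout, the function name and any paths you pass in are assumptions, so check them against your own ZIP. Running install.bat in the "languages" directory is still a manual step.

```python
import shutil
from pathlib import Path


def copy_popfiled_files(unzipped: Path, popfile_dir: Path) -> list[Path]:
    """Copy the POPFileD files into a POPFile installation, per the quoted steps."""
    proxy = popfile_dir / "proxy"
    languages = popfile_dir / "languages"
    skin = popfile_dir / "skins" / "default"
    for target in (proxy, languages, skin):
        if not target.is_dir():
            raise FileNotFoundError(f"expected POPFile directory missing: {target}")

    copied = []

    # MERC.pm goes into the "proxy" subdirectory (the name is case sensitive).
    copied.append(Path(shutil.copy2(unzipped / "MERC.pm", proxy)))

    # Everything in the provided "languages" directory replaces same-named files.
    for item in sorted((unzipped / "languages").iterdir()):
        if item.is_file():
            copied.append(Path(shutil.copy2(item, languages)))

    # The two templates go into skins\default.
    for name in ("merc-configuration.thtml", "merc-security-local.thtml"):
        copied.append(Path(shutil.copy2(unzipped / name, skin)))

    return copied
```

The returned list makes it easy to confirm that MERC.pm really ended up in "proxy", which is the step most easily forgotten.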

Hope this helps?

Regards,

Martin
 


Thanks toaster,

I had forgotten to copy the merc.pm file.   Let's see how far I get now...

The system is working well now. I even set up a filter to detect messages that Popfile recognised as spam and Spamhalter did not, and then forward them to my spam training account.   That helped transfer knowledge from our existing popfile database to spamhalter.   I will have to turn the rule off eventually because Spamhalter seems to have far fewer false positives than popfile (none at all as opposed to maybe one a week).

 I actually set up Spamhalter to tag spam with [[spam]] and popfile to use [spam]; this meant that users' existing rules, set up when we were using only popfile, still work.
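One catch with this pair of tags is that "[[spam]]" contains "[spam]", so a plain substring test cannot tell the two filters apart. Here is a small sketch of the "POPFile caught it, Spamhalter did not" check, assuming both tags end up in the Subject line. The function names are made up, and this is not the actual Mercury filter rule.

```python
def classify_tags(subject: str) -> tuple[bool, bool]:
    """Return (spamhalter_tagged, popfile_tagged) for a Subject line."""
    lowered = subject.lower()
    spamhalter = "[[spam]]" in lowered
    # Strip the double-bracket tag first, because "[[spam]]" contains "[spam]".
    popfile = "[spam]" in lowered.replace("[[spam]]", "")
    return spamhalter, popfile


def needs_spamhalter_training(subject: str) -> bool:
    """True when POPFile flagged the message but Spamhalter did not."""
    spamhalter, popfile = classify_tags(subject)
    return popfile and not spamhalter


# Invented subjects, for illustration only.
for subject in ("[spam] Cheap watches", "[[spam]] [spam] Cheap watches", "Lunch?"):
    print(subject, needs_spamhalter_training(subject))
```

Only the first subject would be forwarded to the spam training account.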
 


I'm a bit at a loss with Spamhalter... Similar to the original post, I installed SpamHalter, read the entire manual, made sure I understood what every setting does (including setting up the spam/nospam accounts and enabling the "+" in the email address so that the account would work as instructed).

I then downloaded the existing database off the SpamHalter web site and integrated it into my own (empty) database.  I then for about a week sent every spam I got (about 1000-1200 spams) to the training account.  I double checked within the mercury console that the email was removed by the daemon, so I assume that it was processed correctly.

The SpamHalter web site mentioned that you need about 200 spams to begin.  I sent way more than that, but the only tagged (as spam) mails I get either contain the word pharmacy or rolex.  Nearly no other spam seems to be detected, even though I received and sent multiple identical spam mails to the daemon in order to train it.

So...  What is a reasonable amount of spam that needs to pass through the daemon to get a 90%+ identification ratio?  Is it possible that integrating the big database from the spamhalter site was a mistake?  Should I reset the database and re-send all the spam to the training account?




> I'm a bit at a loss with Spamhalter... Similar to the
> original post, I installed SpamHalter, read the entire
> manual, made sure I understood what every setting does
> (including setting up the spam/nospam accounts and enabling
> the "+" in the email address so that the account would work
> as instructed).

How did you set it up?  
Train Always (TA) or Train on Error (TOE)?
What is your Not spam setting?
What level have you set for spam?

I use TOE, Not spam 1 and a spam level of 60%, and I'm running a spam detection rate of well over 95% with a false positive rate (FPR) of less than 1%.

>
> I then downloaded the existing database off the SpamHalter
> web site and integrated it into my own (empty) database.  I
> then for about a week sent every spam I got (about 1000-1200
> spams) to the training account.  I double checked within the
> mercury console that the email was removed by the daemon, so
> I assume that it was processed correctly.

I did not use the database; I just started with about 200 spam and about 500 good messages.  Also, do not assume they were properly processed.  Check the logs to verify that the daemon got the mail and properly processed it as either spam or nospam.
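One way to do that check without reading the log by eye is sketched below. The line layout (flag, date, time, job id, message) and the "Correction request saved as:" text are taken from the log excerpt posted earlier in this thread; the function name is made up, and other Spamhalter versions may log differently.

```python
import re

# flag, date, time, job id, message, as in the excerpt posted earlier
LINE = re.compile(
    r"^(?P<flag>.) (?P<date>\d{8}) (?P<time>\d{6}\.\d{3}) (?P<job>\S+) (?P<msg>.*)$"
)
SAVED = "Correction request saved as: "


def saved_corrections(lines):
    """Map job id -> saved file for every correction the log says was stored."""
    result = {}
    for raw in lines:
        match = LINE.match(raw.rstrip("\r\n"))
        if match and match["msg"].startswith(SAVED):
            result[match["job"]] = match["msg"][len(SAVED):]
    return result


sample = [
    r"P 20071116 110119.840 MG000004 Correction enabled by password",
    r"_ 20071116 110119.840 MG000004 Correction request saved as: \\THOMAS\SYS\MAIL\7000001\AAB85WGU.CNM",
]
print(saved_corrections(sample))
```

A training message whose job never shows up in the result was probably not stored as a correction, whatever the Mercury console showed.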


>
> The SpamHalter web site mentioned that you need about 200
> spams to begin.  I sent way more than that, but the only
> tagged (as spam) mails I get either contain the word pharmacy
> or rolex.  Nearly no other spam seems to be detected, even
> though I received and sent multiple identical spam mails to
> the daemon in order to train it.
>
> So...  What is a reasonable amount of spam that needs to pass
> through the daemon to get a 90%+ identification ratio?  Is it
> possible that integrating the big database from the
> spamhalter site was a mistake?  Should I reset the database and
> re-send all the spam to the training account?

SpamHalter is slower to learn than POPFile for me.  However once you have fed it enough corrections and the corrections were acted on then it works well over 90%.


It's quite possible your spam training messages did not get received.  I've had a lot of trouble getting the spam training bin working correctly.  In the end, I opted to simply dump .cnm files into it by hand which works perfectly.  You can get the files out of Pegasus by either right-clicking on the message and selecting 'save to data file', or just open your new mail folder in Windows and copy out the relevant messages.

If you really want to verify that your spams are getting to the training bin, you need to watch the server as you send the message.  Sometimes spamhalter will pick them up almost immediately, sometimes it will be a few minutes before they disappear.  However, if you never see them at all, chances are they are not being received.
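For anyone who wants to script the "dump .cnm files in by hand" approach, a minimal sketch follows. Both directories are placeholders you would point at your own saved messages and at the mailbox directory of your spam (or nospam) training account; the function name is made up.

```python
import shutil
from pathlib import Path


def drop_into_training_bin(messages_dir: Path, training_dir: Path) -> int:
    """Copy every .cnm file from messages_dir into training_dir; return the count."""
    if not training_dir.is_dir():
        raise FileNotFoundError(f"training directory not found: {training_dir}")
    count = 0
    for message in sorted(messages_dir.iterdir()):
        if message.is_file() and message.suffix.lower() == ".cnm":
            # A file with the same name already in the bin is overwritten.
            shutil.copy2(message, training_dir / message.name)
            count += 1
    return count
```

Copying rather than moving leaves the originals in place in case the training needs to be repeated.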
 


[quote user="Moshe"]I then downloaded the existing database off the SpamHalter web site and integrated it into my own (empty) database.  I then for about a week sent every spam I got (about 1000-1200 spams) to the training account.  I double checked within the mercury console that the email was removed by the daemon, so I assume that it was processed correctly.[/quote]

Have you sent your valid (non-spam) messages to train Spamhalter?

To quote from the website where you downloaded the spam database:

[quote]Starter database for SpamWall/SpamHalter 4.x.x You can merge it with your database!(contains lot of spams... you must add your legal messages before use!)[/quote]

 

 


OK, this is something I did indeed forget: I never sent any valid emails to the nospam account. I'll send a few hundred to see if it improves the prediction.

I have SpamHalter set to Train Always. I did check the logs, and the mails are being processed correctly by SpamHalter. The Not-Spam level was set to 3. I've now set it to 2 and reduced the spam probability from 80 to 75. I feel a bit uneasy going down to 60.

I did find a bug in SpamHalter. If the message doesn't have a subject, SpamHalter will not add its own tag to the subject, even if the message was detected as spam (the mail headers get tagged, just not the subject).


Spamhalter will add a tag to the Subject header if one exists, but it won't create a Subject header if it doesn't. I believe this is intentional.

/Rolf

 


There are compliance rules in Mercury SMTP which allow messages with no subject field and no date field to be refused.

I have these refusals turned on, as I can see no valid reason for a legitimate email to have either of these fields missing. The rule still allows messages with a blank subject through, which is fine, as even I (when using primitive clients) occasionally omit a subject. Pegasus of course reminds me that I am being a berk, for which I am invariably grateful!

I would suggest refusing messages with no Subject field.
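To make the difference between a missing Subject field and a blank one concrete, here is a small sketch using Python's standard `email` module. It only illustrates the distinction and is not how Mercury or Spamhalter implement the check; `subject_status` is a name invented for this example.

```python
from email import message_from_string


def subject_status(raw_message):
    """Classify the Subject header of a raw message.

    Returns "missing" if there is no Subject field at all,
    "blank" if the field exists but is empty,
    and "present" otherwise.
    """
    msg = message_from_string(raw_message)
    subject = msg["Subject"]  # None when the header is absent
    if subject is None:
        return "missing"
    if subject.strip() == "":
        return "blank"
    return "present"
```

A message classed as "missing" is the kind the compliance rule refuses and the kind Spamhalter cannot tag in the subject line. A "blank" one still has a Subject field, so it gets through the rule and there is a field for a tag to go into.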
 


We actually use the Compliance controls in Mercury SMTP Server to refuse mail with no Subject field. This still allows mail with blank subjects through, of course, so we never see the problem of Spamhalter (or, for that matter, POPFile) being unable to add a tag.

I know of no reason why a legitimate email should have no Subject field, so I'm happy to block such emails.
 


I'm still getting lousy results :(

I submitted over 500 non-spam and over 1500 spam messages and am still getting near-zero spam identification.


[quote user="Moshe"]

I'm still getting lousy results :(

I submitted over 500 non-spam and over 1500 spam messages and am still getting near-zero spam identification.

[/quote]

 

I suspect that the corrections are not actually being applied. Check the logs to verify that the spam/nospam mail is actually getting to Spamhalter and that the corrections are being applied.
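If the log is large, a short script can pull out just the correction entries. This is a sketch only: it assumes your log lines look like the excerpt posted earlier in this thread, where an accepted correction ends with "Correction request saved as:" followed by the file path and the job ID is the last field before that text. Other Spamhalter versions or settings may log differently, and `saved_corrections` is a name invented here.

```python
MARKER = "Correction request saved as:"


def saved_corrections(log_lines):
    """Return (job_id, saved_path) pairs for every saved correction in log_lines.

    Assumes the line layout "<flag> <date> <time> <job id> <message>",
    as seen in the log excerpt quoted in this thread.
    """
    results = []
    for line in log_lines:
        if MARKER not in line:
            continue
        before, _, after = line.partition(MARKER)
        fields = before.split()
        job_id = fields[-1] if fields else ""
        results.append((job_id, after.strip()))
    return results
```

Read your log file into a list of lines and pass it in. If you sent 1500 spam messages and this returns only a handful of entries (or none), the corrections are not reaching Spamhalter.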

 
