SpamHalter failing to catch junk mail

 

Some time after claiming the SH hiccup was fixed, I found that it wasn't true: Pmail/SpamHalter was failing again. I decided to find out more about it, especially about what was happening behind the curtains.

One thing was very evident: every time a new mail message or a reclassification produced an error in the System Messages pane, Pmail/SpamHalter had lost its handle on the words4.db3 file. Something in my hardware, OS or Pmail led it to lose control of the SH database, hence the error.

Using Sysinternals' Process Explorer I found that, when working properly, words4.db3 had this status:

[screenshot: http://img99.imageshack.us/img99/9342/pmailonuncpath.png]

When the SH error occurred, the above pane was completely blank, that is, no application was using the words4.db3 file. Following a hunch, I then changed the mailbox address from UNC to mapped-path notation, so after the change it looked like this in Process Explorer's Search pane:

[screenshot: http://img11.imageshack.us/img11/8064/pmailonmappedpath.png]

As you can see, the System handle doesn't appear with this setup. This removed the Lanman Redirector (a middleman) from the process, with a gain in speed and no more SpamHalter hiccups so far. My fingers are still crossed anyway...
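
For readers unsure which notation their own mailbox uses, here is a minimal Python sketch that tells a UNC path from a mapped-drive path. The server, share and drive names are hypothetical placeholders, not taken from the setup described above.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True when a Windows path is written in UNC notation."""
    # For a UNC path, PureWindowsPath reports the "drive" as \\server\share;
    # for a mapped drive it is just the drive letter and a colon.
    return PureWindowsPath(path).drive.startswith("\\\\")


# Hypothetical mailbox locations, for illustration only.
unc_mailbox = r"\\MYPC\PMAIL\MAIL\Admin"
mapped_mailbox = r"M:\PMAIL\MAIL\Admin"

print(is_unc_path(unc_mailbox))     # True
print(is_unc_path(mapped_mailbox))  # False
```

`PureWindowsPath` only parses the string, so the check works on any operating system and never touches the network.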

I still think something may be misconfigured in my WinXP, or maybe it's just the old hardware signaling that it will give up some day, but Pegasus Mail has been running much better (more stably?) after this change. Unless I return to LAN usage one day (which I very much doubt), it will stay like this.

Just to mention, Dear Wife's machine (a WinXP SP3 desktop) still uses UNC notation without problems, probably because she uses it very little and the SH hiccup occurs only after some time online.

I hope this gives you more to think about. Any comments are welcome.

 


-- Euler

Pegasus Mail 4.81.1154 Windows 7 Ultimate
IERenderer: 2.7.1.5 AttachMenu: 1.0.1.2
PMDebug: 2.5.8.34 BearHTML 4.9.9.6

Hello folks,

I've been using Pmail with SpamHalter for a long time and it has proved very effective. Strangely, it has lately been failing at random and without a known cause, most likely related to its SQLite3 database, as pointed out in Pegasus' System Messages pane here: http://img17.imageshack.us/img17/7205/20120625162417.png

Unfortunately I couldn't get the whole error message because, AFAIK, System Messages does not wrap long lines. I also couldn't find a text file containing its text in any of the \Pmail folders or sub-folders. It may be somewhere else.

Anyway, as it fails to execute the SQLite 3 query, it fails at spam filtering, thus leaving junk mail in the New Mail folder, as shown here: http://img716.imageshack.us/img716/8948/20120625162606.png

Shutting down Pegasus Mail and running it again resolves the problem. As I've said, it may happen any day, at any time, without warning. A corpus cleanup did not help at all. Below is some more information about SpamHalter.

SpamHalter plugin version: 4.5.2.179
Tokens in database: 24188
Statistics collected from: 21.06.2007 8:26:14

All classified messages: 70791
Classified messages as spam: 32207
 ... it is: 45.50%

Corrected classification mistakes
Missed spams: 2238
 ... it is: 3.1614%
False positives: 395
 ... it is: 0.5580%


Not sure if I'm using the latest plugin. At least I think I am. Any hints will be very much appreciated.
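
As a quick sanity check, the percentages in the statistics above can be reproduced from the raw counts. A small Python sketch; the helper name `rate` is made up for illustration, and it assumes each percentage is taken against the total of all classified messages.

```python
def rate(part: int, total: int, digits: int) -> str:
    """Format part/total as a percentage with the given number of decimals."""
    return f"{100 * part / total:.{digits}f}%"


# Counts copied from the SpamHalter statistics quoted in the post.
all_classified = 70791
classified_as_spam = 32207
missed_spams = 2238
false_positives = 395

print(rate(classified_as_spam, all_classified, 2))  # 45.50%
print(rate(missed_spams, all_classified, 4))        # 3.1614%
print(rate(false_positives, all_classified, 4))     # 0.5580%
```

All three values match the figures SpamHalter reports, which suggests the "Missed spams" and "False positives" rates are indeed relative to all classified messages rather than to spam or non-spam alone.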

<p>Hello folks, I've been using Pmail with SpamHalter for a long time and it has proved very effective. Strangely, it has been failing randomly and without a known cause lately, most likely related to its SQLite3 database as pointed out in Pegasus' System Messages pane <a href="http://img17.imageshack.us/img17/7205/20120625162417.png" target="_blank" mce_href="http://img17.imageshack.us/img17/7205/20120625162417.png">here</a>. </p>Unfortunately I couldn't get the whole error message as System Message AFAIK does not wrap long lines. I also couldn't find a text file containing it's text in none of \Pmail folders or sub-folders. It maybe somewhere else. Anyway, as it fails to execute SQLite 3 query it fails spam filtering thus leaving junk mail in New Mail folder, as shown <a href="http://img716.imageshack.us/img716/8948/20120625162606.png" target="_blank" mce_href="http://img716.imageshack.us/img716/8948/20120625162606.png">here</a>. Shutting down Pegasus Mail and running it again resolves the problem. As I've said, it may happen any day, anytime without a warning. A corpus cleanup did not help at all. Below some more information about SpamHalter. SpamHalter plugin version: 4.5.2.179 Tokens in database: 24188 Statistics collected from: 21.06.2007 8:26:14 All classified messages: 70791 Classified messages as spam: 32207  ... it is: 45.50% Corrected classification mistakes Missed spams: 2238  ... it is: 3.1614% False positives: 395  ... it is: 0.5580% Not sure if I'm using the last plugin. At least I think I am. Any hints will be very much appreciated.

-- Euler


Never mind, I found the answer myself. :)


-- Euler


Could you post the solution please?

Martin 


[quote user="irelam"]

Could you post the solution please?

Martin 

[/quote]

Sure! Below is what I answered to Jerry Wise a few days ago:

[quote]

Hi Jerry,

Seems to me not many in the list use SpamHalter.<g>

Anyway, I had a hunch it could be something in the SH database (SQLite 3) indexing. Unfortunately, unlike most professional SQL managers, it's not immune to system crashes, a small inconvenience for a popular, small, free and very fast database manager.

I didn't try to rebuild the indexes. It's much faster and less troublesome to replace the database with a previous backup, with very little (or no) impact on SH accuracy. I always save a 7-zipped copy of the SH database prior to every cleanup, so as to avoid re-training.

[/quote]

Sorry for the delay. <g>
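
For anyone who would rather inspect the database than swap in a backup, SQLite itself offers `PRAGMA integrity_check` and `REINDEX`. Below is a minimal Python sketch using the standard `sqlite3` module. It has not been tried on a real words4.db3; the function names are made up, the demo runs on a throwaway database with a hypothetical schema, and anything like this should only be run on a copy of the file with Pegasus Mail shut down.

```python
import os
import sqlite3
import tempfile
from typing import List


def check_database(db_path: str) -> List[str]:
    """Return SQLite's integrity report; a healthy file yields ["ok"]."""
    con = sqlite3.connect(db_path)
    try:
        return [row[0] for row in con.execute("PRAGMA integrity_check")]
    finally:
        con.close()


def rebuild_indexes(db_path: str) -> None:
    """Rebuild every index in the database from its table data."""
    con = sqlite3.connect(db_path)
    try:
        con.execute("REINDEX")
        con.commit()
    finally:
        con.close()


# Demo on a throwaway database (hypothetical schema, NOT SpamHalter's).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db3")
con = sqlite3.connect(demo_path)
con.execute("CREATE TABLE words (token TEXT, hits INTEGER)")
con.execute("CREATE INDEX words_token ON words (token)")
con.commit()
con.close()

print(check_database(demo_path))  # ['ok'] for a healthy file
rebuild_indexes(demo_path)
```

If the integrity check reports anything other than `ok`, restoring a backup as described in the quote is likely the safer route.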


-- Euler


I don't know why, but SpamHalter is basically not working at all. I have classified at least 100 messages as spam, and even when I get the very same email again, it is not recognized as spam. Can someone help, please?


Check the setup of SpamHalter via the menu Tools/Spam and Content Control/SpamHalter. Is SpamHalter enabled? Does the Spam folder exist, or have you moved the location of Pegasus Mail? Finally, after starting Pegasus Mail, click on Window/System Messages. You should see lines that show SpamHalter starting up.

HTH

Martin 


[screenshot: http://www.bob-betts.com/main.php?g2_view=core.DownloadItem&g2_itemId=17496&g2_serialNumber=1]

Here are the system messages. I had trouble before and lost a bunch of things: folders, messages.
Since then, if I check consistency, recover deleted space and reindex a folder, many emails get marked as "not read".
Maybe that's why?

Still, I'd like to know how to fix this. I did re-add the content control, which is now doing the majority of the spam sorting.

SpamHalter is enabled and the spam is supposedly being sent to a folder named *SPAM*.


Hi Usagi,

Have you noticed whether Pegasus' System Messages pane has any lines like those shown here (marked in red) when it fails to classify? http://img17.imageshack.us/img17/7205/20120625162417.png


-- Euler


No, I don't see any messages at all while it's classifying.

I think it is classifying the emails as spam. The little traffic light is showing red on some of the newly downloaded emails. It's not moving them to the spam folder, though.
I tried renaming the spam folder, thinking that Pegasus might not like the name I chose, *SPAM*.
Renaming didn't help.


OK, we may assume all's fine with SH. Now, as Martin already pointed out, you could review your SH settings. Please pick out your current settings and post them here. You can get them from Tools » Spam and content controls » Spamhalter...: Train on classification errors only (or not), Spam level (%) and Not-spam boost. My own settings are Yes, 50 and 1, respectively.

Click the Select... button to check that it is pointing to the desired folder, *SPAM* in your case. Later you can also check this folder's sanity in the Folder pane.

If all that is fine, then you should review your classification methods. Lukas Gebauer once pointed out that there's a great difference between training SH by moving false-negatives IN to, and false-positives OUT of, your junk folder (again, *SPAM* in your case) and training it with the traffic lights button or the right-click menu options. If you don't move messages IN and OUT, either by drag and drop or by Quick actions (my choice), then this may be your culprit.


-- Euler


OK, you're using the SH default settings. I find 80% too high a probability for spam. In other words, you're telling SH it should expect 8 in 10 messages to be spam. I'd say 6 in 10 (60%) would be a fair guess for the average user. In my particular case I changed it to 50% because most of my mail comes through Gmail, which already has a superb spam filter. I also found that 1 is a better value for Not-spam boost, but I don't have a good reason to support that. Finally, I would drop Train always in favor of Train on classification errors only unless you have very high mail traffic; a database clean-up could also help.

Database clean-ups may reduce SH's accuracy as well as its errors, but they're a good measure from time to time. If you don't feel comfortable with clean-ups, shut down Pmail and save SH's database (words4.db3) from your Home mailbox location folder. I usually just zip a copy of it into words4.db3.zip. If the clean-up affects accuracy too much, I shut down Pmail again and restore the old database over the cleaned one. Below are my SH statistics from Tools » Spam and content controls » Spamhalter » About Spamhalter... » Statistics...:

SpamHalter plugin version: 4.5.2.179
Tokens in database: 24826
Statistics collected from: 21.06.2007 8:26:14

All classified messages: 71540
Classified messages as spam: 32325
 ... it is: 45.18%

Corrected classification mistakes
Missed spams: 2245
 ... it is: 3.1381%
False positives: 414
 ... it is: 0.5787%
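
The backup-and-restore routine described above can also be scripted. A minimal Python sketch, assuming Pegasus Mail is closed while it runs; the function names are invented for illustration, and the demo uses a stand-in file in a temporary folder rather than a real mailbox location.

```python
import tempfile
import zipfile
from pathlib import Path


def backup_database(db_file: Path) -> Path:
    """Zip db_file next to itself, e.g. words4.db3 -> words4.db3.zip."""
    archive = db_file.with_name(db_file.name + ".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(db_file, arcname=db_file.name)
    return archive


def restore_database(archive: Path) -> Path:
    """Extract the archived database over the current one in the same folder."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
    return archive.with_suffix("")  # words4.db3.zip -> words4.db3


# Demo with a stand-in file in a temporary folder (not a real SH database).
folder = Path(tempfile.mkdtemp())
db = folder / "words4.db3"
db.write_bytes(b"state before clean-up")

zipped = backup_database(db)
db.write_bytes(b"state after clean-up")  # pretend the clean-up hurt accuracy
restored = restore_database(zipped)
print(restored.read_bytes())
```

Keeping the archive in the same folder as the database mirrors the words4.db3.zip habit mentioned in the post; a dated archive name would be an easy extension if several generations are wanted.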


-- Euler


Just to add to my prior comments:

It's also important to check whether white-listing (Tools » Spam and content controls » Global whitelist...) could be interfering with SH. White-listed addresses are always handled as not-spam and are very often seen (erroneously) as false-negatives.

As SH makes so few mistakes, I never use white-listing. Anyway, if you do, try not using the Automatically whitelist any address to which I send mail option in its control pane.


-- Euler

