Re: Re: Spam probability always at 0.0%

Joerg

posted Dec 13 '16 at 3:16 pm

Hi Bernward,

I haven't heard from you since I sent you the screenshots of our Spamhalter settings. Were you able to solve the problem in the meantime?

BTW: We are not using honeypots like Jyrki, just a simple "train yourself" setup.

Regards

Joerg

bmpan

posted Nov 17 '16 at 10:17 am

Hello,

I want to give Spamhalter a try. My Mercury version is currently 4.80.

I have two different versions of the Spamhalter installer. One is V4.6.1.433, sized 2766 kB, which I suppose came with my Mercury install. The second is V4.5.0, sized 2301 kB, which I downloaded from Lukas Gebauer's ararat.cz website today.

There is another one provided in the Mercury Downloads section here at pmail.com, 4.5.1.166 Spamhalter for Pegasus. Can I use this for Mercury?

Thank you.

Brian Fluet

posted Nov 17 '16 at 1:34 pm

I can tell you that v4.6.1.433 is the version packaged with the latest version of Mercury (v4.80).

bmpan

posted Nov 25 '16 at 9:46 am

Thank you, Brian. I installed the provided version of Spamhalter, following the PDF instructions exactly.

After that, I did some training as follows:

I exported about 500 spam mails from my existing junk folder in raw text format and used them to train spam with the SpamhalterTools.exe program. I also exported about 5000 mails from some user accounts in raw text format and trained them as nospam. My words4 database is now 2.6 MB in size and contains about 50000 words.

Spamhalter is now active (in parallel with Mercury's spam filter). As I understand it, Spamhalter does no filtering at all. It only calculates a spam probability and, if this value is greater than the threshold of 80% (default), adds the "Spam detected!" flag to the headers. So I would have to create a new Mercury rule to check for this flag. In the meantime, I am watching Spamhalter work...
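
For readers who want to check that decision outside of Mercury, here is a minimal Python sketch. It is not a Mercury rule and not Spamhalter's own code. It only assumes the header line format shown in the debug output further down in this post (`X-MERCURY-SPAMHALTER: probability - 0.0%`) and the default threshold of 80% mentioned above. The function names `spam_probability` and `exceeds_threshold` are made up for illustration.

```python
import re
from email import message_from_string

# Matches values such as "probability - 0.0%" (format taken from the debug headers in this post).
PROBABILITY_RE = re.compile(r"probability\s*-\s*(\d+(?:\.\d+)?)\s*%")

def spam_probability(raw_message):
    """Return the probability (in percent) from the Spamhalter headers, or None if absent."""
    msg = message_from_string(raw_message)
    # A message carries several X-MERCURY-SPAMHALTER headers, so look at all of them.
    for value in msg.get_all("X-MERCURY-SPAMHALTER", []):
        match = PROBABILITY_RE.search(value)
        if match:
            return float(match.group(1))
    return None

def exceeds_threshold(raw_message, threshold=80.0):
    """True if a probability header is present and greater than the threshold (assumed 80%)."""
    probability = spam_probability(raw_message)
    return probability is not None and probability > threshold

sample = (
    "Subject: test\n"
    "X-MERCURY-SPAMHALTER: Passed through antiSPAM test by Spamhalter 4.6.1.433 on xxxxxx.de (282)\n"
    "X-MERCURY-SPAMHALTER: probability - 0.0%\n"
    "\n"
    "body\n"
)
print(spam_probability(sample), exceeds_threshold(sample))
```

A mail scored at 0.0% stays below the threshold, so under these assumptions no flag would be set and a rule checking for the flag would never fire.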

However, the spam probability is always 0.0% for all the mails that Mercury filtered out using its existing ruleset. These are the same kind of messages I used for training.

I am getting the following debug results for example:

X-MERCURY-SPAMHALTER: Passed through antiSPAM test by Spamhalter 4.6.1.433 on xxxxxx.de (282)
X-MERCURY-SPAMHALTER: probability - 0.0%
X-MERCURY-SPAMHALTER: Debug - Arial 0.0008802816901
X-MERCURY-SPAMHALTER: Debug - FONT 0.0015822784810
X-MERCURY-SPAMHALTER: Debug - charset 0.9980079681275
X-MERCURY-SPAMHALTER: Debug - Sales 0.0021449833654
X-MERCURY-SPAMHALTER: Debug - regards 0.0021625915791
X-MERCURY-SPAMHALTER: Debug - Best 0.0022276066906
X-MERCURY-SPAMHALTER: Debug - delivery 0.0029112081514
X-MERCURY-SPAMHALTER: Debug - information 0.0038537549407
X-MERCURY-SPAMHALTER: Debug - send 0.0042447824549
X-MERCURY-SPAMHALTER: Debug - mail 0.0044592113316
X-MERCURY-SPAMHALTER: Debug - Hello 0.0058968058968
X-MERCURY-SPAMHALTER: Debug - This 0.0059199904323
X-MERCURY-SPAMHALTER: Debug - payment 0.0060741687980
X-MERCURY-SPAMHALTER: Debug - DIV 0.0068493150685
X-MERCURY-SPAMHALTER: Debug - price 0.0075819672131
X-MERCURY-SPAMHALTER: Debug - please 0.0084932715641
X-MERCURY-SPAMHALTER: Debug - www 0.0086485542063
X-MERCURY-SPAMHALTER: Debug - utf-8 0.9909090909091
X-MERCURY-SPAMHALTER: Debug - http-equiv 0.9903846153846
X-MERCURY-SPAMHALTER: Debug - the 0.0097377577470
X-MERCURY-SPAMHALTER: Debug - ... 0.0000000000000

What am I doing wrong? Today only one or two mails out of 100 reached ~10%. Should I switch to the Train Always method? And yes, I am already doing corrections and get "C" lines in my Spamhalter log.

Thank you,

Bernward
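
The debug values above can be plugged into the classic Bayesian combination formula to see why the overall result ends up at 0.0%. The following Python sketch assumes the Graham-style rule P = prod(p) / (prod(p) + prod(1 - p)). Whether Spamhalter combines token scores in exactly this way is an assumption, not something confirmed in this thread. The token scores are copied from the debug headers. The "..." entry is left out because its token is elided.

```python
import math

# Token scores copied from the debug headers in the post above.
token_scores = {
    "Arial": 0.0008802816901,
    "FONT": 0.0015822784810,
    "charset": 0.9980079681275,
    "Sales": 0.0021449833654,
    "regards": 0.0021625915791,
    "Best": 0.0022276066906,
    "delivery": 0.0029112081514,
    "information": 0.0038537549407,
    "send": 0.0042447824549,
    "mail": 0.0044592113316,
    "Hello": 0.0058968058968,
    "This": 0.0059199904323,
    "payment": 0.0060741687980,
    "DIV": 0.0068493150685,
    "price": 0.0075819672131,
    "please": 0.0084932715641,
    "www": 0.0086485542063,
    "utf-8": 0.9909090909091,
    "http-equiv": 0.9903846153846,
    "the": 0.0097377577470,
}

def combine(scores):
    """Combine per-token spam scores (each strictly between 0 and 1), working in log space."""
    scores = list(scores)
    log_spam = sum(math.log(p) for p in scores)        # log of prod(p)
    log_ham = sum(math.log(1.0 - p) for p in scores)   # log of prod(1 - p)
    # Equivalent to prod(p) / (prod(p) + prod(1 - p)).
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

combined = combine(token_scores.values())
print(f"combined spam probability: {combined * 100:.1f}%")

# The three spam-leaning tokens on their own would give a result close to 100%.
spammy_only = combine([token_scores["charset"], token_scores["utf-8"], token_scores["http-equiv"]])
print(f"spam-leaning tokens only: {spammy_only * 100:.1f}%")
```

Under this assumed formula, the seventeen tokens scored below 0.01 completely outweigh the three tokens scored above 0.99, so the combined value rounds to 0.0%. That would fit a database in which ordinary words such as "Hello", "please" and "the" were learned almost exclusively from the much larger nospam set.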

Brian Fluet

posted Nov 25 '16 at 2:58 pm

Sorry that I can't help further. I had POPFile in place before Spamhalter was introduced so have never used it. The function of Spamhalter is similar to POPFile but the training method is different so is unfamiliar to me.

bmpan

posted Nov 28 '16 at 9:39 am

Monday Morning - lots of spam.

Seems my initial training with those raw text files had no effect for some reason. So I switched to Train Always last week and started manual training. Now I am getting about 10-15% of my spam marked between 90 and 100% probability, while the rest is stuck at 0.0%. So in principle it works, but it needs more training.

Joerg

posted Nov 28 '16 at 4:41 pm

Hi Bernward, hi Brian,

I was away for a few weeks, but now I'm back in my office.

We are using Spamhalter v4.5.1.411, which came with Mercury 4.8. And you are right: Spamhalter does not filter anything; it only marks spam by adding e-mail headers. If the X-SPAMHALTER header contains "SPAM*", a Mercury Global Filter Rule, which has to be set up manually, filters the mail out and forwards it to another user called "spam". Every normal user can check this "user" for false positives.

In addition, when I activated Spamhalter, I created two more users: "is_spam" and "no_spam". My users can forward undetected spam mails to the local user "is_spam", where Spamhalter can learn from them. In the other direction, I can forward false positives from the user "spam" to the "no_spam" account to train Spamhalter the other way.

I never used SpamhalterTools.exe for training. I simply took all of our spam mails and forwarded them to the local "is_spam" account. Half an hour later, Spamhalter had trained itself and removed all the mails from the "is_spam" account.

In any case, Spamhalter has significantly reduced the amount of spam we receive. I have also gained time, because I no longer need to keep adjusting the Mercury content control, which I used for spam filtering before. All in all, very nice and simple.

Bernward, please tell me where you still have problems. Then I will check how I have configured Spamhalter.

Regards

Joerg

bmpan

posted Nov 30 '16 at 11:16 am

Hi Jörg,

I'm still training. My only problem right now is that so much spam still goes undetected. I thought this Bayesian self-learning thing would do better.

"Viagra" - 0.0000000000% spam probability

"Million dollar transaction" - 0.0000000000% spam probability

"Betriebshaftpflicht" - 0.0000000000% spam probability

I wonder how many more times I will have to train these to get a result > 0%.

However, while doing this training, I found out that the Mercury SMTP server rejects any mail containing the string "viagra" in its subject line by simply cutting the connection. I found something in the docs about MercuryS transaction-level filtering. I will have to check.

Joerg

posted Nov 30 '16 at 1:43 pm

Do you have any additional Mercury Rule Sets (Configuration > Filtering Rules > Global Rules) or Content Control (Configuration > Content Control) in place? Then again, these should not affect Spamhalter's ability to recognize spam. Maybe it would help if I took some screenshots of my Spamhalter configuration. Give me a few minutes ...

J_Aarni

posted Nov 30 '16 at 4:32 pm

Hi!

We have used Spamhalter for years, and it works well. You could try changing "Database update strategy" to "Train on errors", specifying a honeypot address, and putting that address on a web page to collect spam.

This works for us.

Regards, Jyrki
