Mercury and dealing with Spam

GordonM

posted Sep 15 '09 at 4:13 am

I have now switched the X-Originating-IP spam recognition process to Policy from filtering. This now allows me to make a decision on whether to delete a message or not. I have recognized a few issues, but these are minor, e.g. the national IP address range databases that I am using are not 100% accurate (I encountered one IP address which wasn't in my US database, but an address look-up on the web identified it as US) and I will probably add NZ to the admitted address ranges (David's TheThousand address was identified as Spam!).

I have one issue that I would like advice on. I am using global filtering, which is then followed by my Policy check for Spam. I had assumed that messages that were deleted by filtering and messages that were moved to another user by filtering would not be exposed to the Policy check. However, any remaining messages would be tested by Policy. Is this correct? I am asking this as a couple of messages from new, but non-Spam addresses are being subjected to the policy check. I didn't expect this, as I have moved them to another user, prior to the Policy check.

Gordon

GordonM

posted Aug 21 '09 at 10:34 pm

I am looking at additional means of dealing with spam and I would appreciate some opinions/advice. The context here is that I use Mercury as a server in my home, where the "real" users are just my wife and myself. I am using Mercury's Distributing POP3 Client, which picks up mail from several ISP accounts and IMAP for local mail access. Without spam filtering, we would be seeing several hundred spam messages a day. The current way (which has evolved over a couple of years) that I "fight the Spam battle" is as follows:

I have a personal e-mail address that I only give out to close friends and associates. My wife's e-mail address is different in that she uses a single one for all purposes and it is definitely compromised. Mail sent to these two e-mail addresses is only forwarded to the local user accounts if there is a match with a sender list (separate ones for my wife and me). All other messages are classified as candidates for Spam. There is one exception, which is allowed when the userlist test fails: if the Subject contains the word "address", the message is let through, to catch known correspondents who are providing an e-mail address change notification (all these messages are moved by Mercury to my main account). There is a small chance that some real Spam might contain "address" in the Subject, but this hasn't happened yet.
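The sender-list test with its "address" exception can be sketched roughly as follows. This is only an illustration in Python, not Mercury's own rule syntax; the function name `classify` and the three return labels are made up for the example.

```python
from email import message_from_string
from email.utils import parseaddr


def classify(raw_message, allowed_senders):
    """Return 'deliver', 'main-account' or 'spam-candidate' for one message.

    allowed_senders is a set of lower-case addresses (the sender list).
    The labels are hypothetical; in Mercury these would be filter actions.
    """
    msg = message_from_string(raw_message)
    _, sender = parseaddr(str(msg.get("From", "")))
    if sender.lower() in allowed_senders:
        return "deliver"
    # Exception: a change-of-address notice from an unknown sender address.
    if "address" in str(msg.get("Subject", "")).lower():
        return "main-account"
    return "spam-candidate"
```

A message from a listed sender is delivered, an unlisted sender with "address" in the Subject goes to the main account, and everything else stays a spam candidate.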

For commercial and other non-personal contacts, I use individual disposable addresses. I have several hundred of these and, potentially, I could change them if any are compromised. However, this hasn't happened so far (though the common disposable address that I use for sending non-personal mail is currently being exploited by one source).

I use Mercury's content control for weeding-out messages with objectionable words. These messages are deleted by Mercury's filtering process, without any manual inspection. Content Control is also used to detect mail from obvious foreign countries with which I have no connection (these are not automatically deleted, but sent to a "Foreign" account for inspection). This isn't just from senders with foreign e-mail addresses but any foreign connection indicated by any of the header information. I will probably implement a deletion policy on these messages shortly, as no required mail has ever been classified wrongly as Foreign.

I have recently implemented a test for the Character Set used, as discussed in another thread on the forum. This is now working well, and Cyrillic and other non-Western character sets are being reliably detected. This has classified a large amount of the residual spam. I am not yet automatically deleting this mail and will leave it that way for a few months to check for any false positives.
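For anyone wanting to try a character set test outside Mercury, here is a minimal Python sketch. It is not the rule discussed in the other thread. The `WESTERN` set is my own assumption about what counts as acceptable, and note that treating utf-8 as Western will let through Cyrillic text sent as UTF-8.

```python
import re
from email import message_from_string

# Assumption: character sets treated as "Western". Adjust to taste.
WESTERN = {"us-ascii", "iso-8859-1", "iso-8859-15", "windows-1252", "utf-8"}

# Matches the charset part of an RFC 2047 encoded word, e.g. =?koi8-r?B?...?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bBqQ]\?")


def charsets_used(raw_message):
    """Collect charsets named in Content-Type headers and in Subject/From."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()
        if charset:
            found.add(charset.lower())
    for name in ("Subject", "From"):
        for charset in ENCODED_WORD.findall(str(msg.get(name, ""))):
            found.add(charset.lower())
    return found


def has_non_western_charset(raw_message):
    """True if any declared charset falls outside the WESTERN set."""
    return bool(charsets_used(raw_message) - WESTERN)
```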

This now brings me to the main point of this post. There is still a small amount of residual spam, I would guess about 2%, which is classified as potential spam and needs manual inspection. The vast majority of this comes from foreign sources, but not from obvious foreign sources. It is only detectable by doing a search against national IP address range databases using the earliest X-Originating-IP header information. In principle, this information could be found automatically through a call from Mercury's filtering process. I know how to do this, but it will take some effort to implement. I am wondering if anyone has tried this approach and how reliable it is likely to be. IP addresses can be spoofed, but my experience so far is that this approach would catch most of the residual spam.
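Pulling out the earliest X-Originating-IP value might look something like the Python sketch below. It assumes that headers added later sit nearer the top of the message, so the last occurrence is taken as the earliest, and that the value is written in the common bracketed form such as `[198.51.100.23]`. Both are assumptions, and the function name is invented for the example.

```python
import ipaddress
from email import message_from_string


def earliest_originating_ip(raw_message):
    """Return the bottom-most valid X-Originating-IP as an address object.

    Returns None when the header is absent or holds no parsable address.
    """
    msg = message_from_string(raw_message)
    values = msg.get_all("X-Originating-IP") or []
    # Assumption: the last header in the block was added first.
    for value in reversed(values):
        candidate = str(value).strip().strip("[]")
        try:
            return ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip malformed values and try the next header up
    return None
```

Since the header is written by whoever sends or relays the message, a forged value is possible, which is the spoofing caveat mentioned above.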

I know there are other techniques that I could use (Bayesian approaches, Graywall, blacklists etc.), but I haven't chosen to go those routes so far. What I am looking for is zero false positives and a very small amount of residual spam. BTW, Graywall looks like a good approach, but my impression is that I would have to run my own SMTP server, which I am reluctant to do for most purposes.

Thank you

Gordon

PiS

posted Aug 21 '09 at 11:32 pm

Start by installing Lukas Graywall which is included within the setup. Spam will then be reduced by some 80%. Graywall has zero false positives for "legit" mail, as mass spammers seldom build queues to resend.

When you feel comfortable with Graywall and you feel you need more, then dive deeper into other means of fighting spam in a "home server".

GordonM

posted Aug 22 '09 at 12:58 am

Peter - As I intimated in my initial post, I like the idea of Graywall, but I don't see how it's going to work unless I am running an SMTP server. I do have the server installed, but at the moment, I am refusing all connections (I only use it for special purposes, now and again). Given that I am picking up all of my mail with Mercury D, I don't see how Graywall is going to help. If I have misunderstood the function of Graywall, I would be pleased to be "educated".

Thank you

Gordon

Rolf Lindby

posted Aug 22 '09 at 1:52 am

If you collect your email with POP3 the frontline defense against spam is up to the receiving SMTP server (greylisting, DNS blacklists, etc.). Your primary anti-spam tool in Mercury should be SpamHalter, which you can combine with content control and filtering rules.

SpamHalter will need a fairly large number of training messages, both spam and non-spam, to work efficiently.

/Rolf

GordonM

posted Aug 22 '09 at 3:06 am

Rolf - Thank you for the reply. My ISP does have a Spam management capability but, so far as I know, it only works properly if one uses the web-client. It does do something without this but, as with SpamHalter, it needs to be trained (with the web-client). I can't do any training of the ISP's system when I am using a POP3 download to Mercury. It does mark some messages as [Bulk], indicating possible spam, but it's grossly unreliable in its untrained state. I have no idea how well it works when properly trained through the web-client. My ISP may use blacklists, but there is no information about this and getting anything technical from CSRs is almost impossible.

I suppose what I was thinking of doing is basically using a blacklist method, based on certain target countries.

I have tried using spam training at the client end with Thunderbird, but it seems to take a long time to get trained. Unfortunately, spam evolves, so what is good now is not good enough in a few months' time. Certainly my use of Mercury Content Control indicates this. I suppose it is a question of continuous training, and I suspect that this doesn't lead to an absolute absence of false positives. I am probably being too optimistic to expect this without a significant amount of spam!

I may continue to explore the ideas that I put forward in my earlier post, unless I hear from someone that this has been tried and doesn't work. Certainly, my observations so far indicate that this approach should put a fair dent into residual spam, though it's a rather brute force method, perhaps with some significant processing overhead.

Gordon

tBB

posted Aug 23 '09 at 10:11 am

If you use ClamAV (ClamWall) you could have a look at the third party signatures which are highly effective against spam. See http://sanesecurity.com/databases.htm

Best regards,

Nico

PiS

posted Aug 23 '09 at 11:42 pm

[quote user="GordonM"] Given that I am picking up all of my mail with Mercury D, I don't see how Graywall is going to help.[/quote]

You're absolutely correct, Graywall will not help when you collect e-mail from a maildrop service. Sorry, I didn't read your post carefully enough. The only thing you can do then is use other means of post-processing by analyzing the contents of the e-mail, as Rolf has correctly suggested.

For anyone else who runs MercuryS and wants to start fighting off spam with some efficiency and (practically) no risk - start using Graywall.

GordonM

posted Aug 25 '09 at 5:08 am

Whether or not this turns out to be a good idea, it has been an interesting exercise. I have now implemented a script outside of Mercury that examines the earliest X-Originating-IP header of a message and then looks this up in IP range databases of selected countries. Although this causes a fair amount of overhead, it is applied to very few messages, typically fewer than half a dozen or so a day, which can't be dealt with by my Content Control rules or other filtering means. This seems to be working pretty well except for one thing ... which I should probably have thought of at the beginning. When a successful match is obtained, I set a sentinel file. However, I have been unable to do anything as a result of this action using Mercury's filtering rules. There doesn't seem to be a way to take a certain action if the sentinel file becomes present (in the 2 minutes allowed by Mercury's "Wait until a file exists" rule), but take a different action if the sentinel file doesn't appear. In fact, I am wondering what the purpose of sentinel files is, if they can't cause any changes in the filtering rule processing. Am I missing something here?

Thank you

Gordon

PaulW

posted Aug 25 '09 at 9:33 am

You should be running this script in a Policy not as a filter. A sentinel file is a flag for Mercury to determine when the script has finished. It's the presence of the results file that means the policy can block the message.
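To make the division of labour between the two files concrete, here is a rough Python sketch of what the script side could look like. The three paths are placeholders for whatever the policy command line hands to the script (how Mercury supplies them is covered in the help and not shown here), and `is_spam` stands in for your own test. The names are hypothetical.

```python
from pathlib import Path


def run_check(message_path, result_path, sentinel_path, is_spam):
    """Test one message, then signal completion.

    A result file is written only when the message is flagged.
    The sentinel file is always created, and created last, so that
    it only ever means "the script has finished".
    """
    try:
        text = Path(message_path).read_text(errors="replace")
        if is_spam(text):
            Path(result_path).write_text("flagged by origin check\n")
    finally:
        # Created even if the check fails, so the caller is not left waiting.
        Path(sentinel_path).touch()
```

The point is the ordering: decide, write the result file if needed, and only then create the sentinel.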

Check out the help at Configuration / Mercury core / Policy.

GordonM

posted Aug 25 '09 at 1:56 pm

Paul - Thank you for this. I had not really appreciated what policies could do. Whether I can use this approach, I am not sure. What I don't want to do is to run a policy on every message, as this might become a major processing load. I want to understand the application of a policy in conjunction with filtering. This is what I want to confirm: if the filtering is run first (i.e. "The task should be applied before filtering any rules" is not checked) and does things like deleting messages or moving messages to other users, have these messages then left the message queue, so that they won't be processed by any policy that I set up? If this is so, it will only be the "left over" messages, i.e. the potential spam that I have not been able to deal with by other means, that is going to be dealt with by the policy script.

Thank you

Gordon

PaulW

posted Aug 25 '09 at 7:44 pm

That's correct. The policy can be run before all filtering rules, or after. By putting a policy after the filtering, you can reduce the number of messages affected.

Rolf Lindby

posted Aug 25 '09 at 7:45 pm

You can find a description of the processing order for Mercury in this thread:

http://community.pmail.com/forums/thread/152.aspx

/Rolf

GordonM

posted Aug 26 '09 at 12:05 am

Thank you for the confirmation, Paul. I had seen the process flow information, Rolf, but forgotten about it. A useful reminder.

Gordon

GordonM

posted Sep 2 '09 at 11:12 pm

This is an update of my progress in dealing with spam by checking the X-Originating-IP header against databases through a script called by Mercury. Originally, I tried to use IP range databases of countries from which spam often originates. This turned out to be a major management problem: there are just too many of them. I, therefore, reversed the logic and installed databases of only US, GB, AU and CA (the countries from which I receive almost all of my wanted e-mail; individual addresses from other countries can be dealt with by inclusion in a scanned UserList.txt) and identified messages from these 4 countries as not spam. I originally didn't do this, as I thought that the US database would be so large that it would take a long time for the script to go through it. This didn't turn out to be the case.
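For illustration only, the reversed (admit-list) logic can be sketched in Python as below. This is not my actual script. It assumes the country databases have been converted to one CIDR block per line, which may not match the format of the files you download, and the function names are made up.

```python
import ipaddress


def load_networks(lines):
    """Parse CIDR blocks, one per line, ignoring blanks and # comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_admitted(ip_text, networks):
    """True if the address lies inside any admitted country range."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip in network for network in networks)


# Documentation-only ranges standing in for the real country databases.
admitted = load_networks([
    "192.0.2.0/24      # stand-in for one admitted country",
    "198.51.100.0/24",
])
```

With these stand-in ranges, `is_admitted("192.0.2.55", admitted)` is true and `is_admitted("203.0.113.9", admitted)` is false. A linear scan like this is slow in principle, but for a handful of messages a day that did not matter in practice.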

According to Mercury statistics, I am fairly consistently receiving about 220 e-mail messages a day (this is for both my wife and myself). Of these, probably about 180 are spam. In the last few days no spam has leaked through (though I have not deleted everything identified as spam, while I am testing). I'll leave it to run with this arrangement until I am satisfied that nothing is being misidentified. I am pleased with the results so far. In fact, I am still running the script from a global filter, as my initial attempt to call it as part of a Policy didn't work. My next step is to implement things as a Policy.

One thing that I would find useful is for my script to send e-mail to a Mercury account from within my LAN, without having to go outside. I have had trouble with this, but I'll deal with this issue as a separate posting, as it may be of more general interest than for my anti-spam application.

Gordon
