The Spam Firewall is a mechanism designed to reduce the amount of junk mail that enters an email inbox. When a large volume of email comes in, some messages will be useful and desired and others will be unsolicited advertisements and unwanted debris. A spam filter tries to separate these messages into the proper categories, sending through the good while leaving out the bad.
The problem is that the criteria for what is good and what is bad are not clear. If the filter is set too tight, many legitimate messages will be blocked. If the filter is set too loose, the inbox will get peppered with garbage messages. The difficult trick is finding the optimal tradeoff to get the best of both worlds.
Spam filtering is a difficult process for several reasons:
First, setting the filter must be done delicately so that the good messages get through and the bad messages are left behind. This is not easy. Consider this analogy:
Filtering is a lot like catching fish by setting the nets for correctly sized fish. If you set the net size too small, most fish never make it into the net and you have no catch. If you set the net too large, you get all of the fish, but also dolphins, squid, octopus, and a variety of other unwanted creatures. The key in this case is how to set the nets to best catch fish while leaving the dolphins behind. At the risk of overextending the metaphor, this is made more difficult because fish and dolphins come in many sizes.
Second, spammers do not want their messages filtered. They continuously attempt to fool, circumvent, or negate the filtering mechanism using a variety of creative techniques. This prompts a response from the antispam community, which improves the filters, only to have the cycle repeat once more.
The nature of the problem is such that there will always be mistakes on both sides: some trash will get in and some good messages will be blocked. The Spam Firewall attempts to remedy the latter condition. When potentially good messages enter the system and are classified for some reason as bad, the filter holds the message in a special area. The messages in this area can be viewed, whitelisted, deleted, or delivered to the email inbox.
A one week period of email traffic (including Moodle traffic handled offsite) is shown below.
Total emails received: 2,182,887
Blocked: 1,803,114 (83%)
Quarantined: 44,315 (2%)
Allowed: 335,458 (15%)
From the breakdown, we see that roughly 85% of all emails received were classified as spam or problem emails (83% blocked plus 2% quarantined). This suggests that without the filter, considerably more spam would reach the inbox. The filtering process works behind the scenes, protecting end users from these messages before they are ever seen. The small number of quarantined messages suggests that most messages received are clearly spam or clearly legitimate, with little ambiguity in either direction.
*** Note *** Email figures were updated 9-17-2012 and now include Moodle traffic which was not included in the previous statistics. The older figures showed approximately two-thirds of incoming email was classified as undesirable and blocked. One-third of email traffic was allowed to reach the user inboxes. The new numbers reflect the changes. Even with the Moodle email, it is apparent that an overwhelming majority of message traffic is blocked, now by an even more pronounced margin.
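The percentages in the weekly breakdown can be re-derived from the raw counts. The short sketch below does only that arithmetic; the variable and function names are illustrative and are not part of the Spam Firewall itself.

```python
# Weekly counts copied from the figures listed above.
counts = {"Blocked": 1_803_114, "Quarantined": 44_315, "Allowed": 335_458}
total = sum(counts.values())


def share(n, total):
    """Return n as a whole-number percentage of total."""
    return round(100 * n / total)


# Percentage for each category, rounded to the nearest whole number.
percentages = {name: share(n, total) for name, n in counts.items()}

# Everything the filter held back: blocked plus quarantined.
filtered = share(counts["Blocked"] + counts["Quarantined"], total)

for name, pct in percentages.items():
    print(f"{name}: {counts[name]:,} ({pct}%)")
print(f"Blocked or quarantined: {filtered}%")
```

The three counts add up to the reported total of 2,182,887, and the blocked and quarantined categories together come to about 85% of the week's traffic.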
If there are questionable messages that enter the system, the Spam Firewall may either block them outright or place them into a special quarantine area. In this quarantine area, messages are stored for later review. A user may log into the Spam Firewall and see all outstanding filtered messages in the quarantine area. These messages may be marked for deletion, delivered to the inbox to be handled like any other email, or labeled for whitelisting. A message whitelisted in the filter will have the originating address added as a trusted sender. Future messages from a trusted sender are allowed through the filter unimpeded and will not show up in the quarantine list again.
The quarantine area is specific to an individual user. Every user has his or her own quarantine area containing only that user's messages.
The amount of time that the quarantine area holds questionable messages is 31 days. Any message in the quarantine area during this time may be scheduled for delivery or have the sender added to the whitelist. After this period expires, the message will be deleted from this area.
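As a rough illustration of the 31-day retention rule, the sketch below checks whether a quarantined message is still available. Only the 31-day figure comes from this page; the function name and the exact cutoff behavior at the boundary are assumptions.

```python
from datetime import datetime, timedelta

# Retention period stated above.
RETENTION = timedelta(days=31)


def is_expired(quarantined_at, now):
    """Return True once a message has been held longer than the retention period.

    Assumption: a message is still available at exactly 31 days and is
    removed only after that point.
    """
    return now - quarantined_at > RETENTION


held_since = datetime(2012, 9, 1, 8, 0)
print(is_expired(held_since, datetime(2012, 9, 20, 8, 0)))  # still available
print(is_expired(held_since, datetime(2012, 10, 3, 8, 0)))  # past the window
```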
Factors involved in messages blocked, allowed, or quarantined
The filter uses a mathematical matching routine which evaluates the message and assigns a score to it based on several criteria. A message with a high score is blocked from the system. This means it will never be seen at all, even within the quarantine area. Moderately scored messages are sent to the quarantine area where they may be viewed, deleted, or marked for delivery to the email inbox.
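The score-based decision described above can be pictured as two cutoffs on a single scale. This is a minimal sketch only: the threshold values and the name classify are hypothetical, and the actual scoring criteria used by the Spam Firewall are not documented here.

```python
# Hypothetical cutoffs for illustration; the real values are not given on this page.
QUARANTINE_THRESHOLD = 3.5
BLOCK_THRESHOLD = 7.0


def classify(score):
    """Map a spam score to one of the three outcomes described above."""
    if score >= BLOCK_THRESHOLD:
        return "blocked"       # never seen, not even in quarantine
    if score >= QUARANTINE_THRESHOLD:
        return "quarantined"   # held for the user to review
    return "allowed"           # delivered to the inbox


for score in (0.8, 4.2, 9.1):
    print(score, classify(score))
```

Moving either cutoff changes the tradeoff discussed earlier: lower values catch more spam but hold back more legitimate mail, and higher values do the reverse.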
Another factor in determining what is and is not spam is something called spam reputation. Basically this is a blacklist of all known spam websites. Whenever a message comes in, the sending address is checked against this list and handled accordingly. Anyone sending messages from a blacklisted address will have those messages deleted before anyone ever sees them.
One safeguard in place is a rate control mechanism which limits how many emails may be sent over a period of time. Since spammers often attempt to flood the email dispatcher with as many emails as possible in the shortest amount of time, such flooding is often an indicator of spam. If someone tries to send out too many emails from a single location in a short period of time, the spam filter may stop that behavior.
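One common way to implement this kind of rate control is a sliding window per sending location. The sketch below is an assumption about how such a check could work, not the Spam Firewall's actual algorithm; the class name, the limit, and the window length are all hypothetical.

```python
from collections import defaultdict, deque


class RateControl:
    """Allow at most max_messages per source within window_seconds."""

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # source -> timestamps of accepted messages

    def allow(self, source, now):
        """Return True if a message from source at time now (seconds) is accepted."""
        times = self._seen[source]
        # Forget messages that have fallen out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) >= self.max_messages:
            return False  # too many messages in too short a time
        times.append(now)
        return True


# Hypothetical limit: 3 messages per 60 seconds from one location.
control = RateControl(max_messages=3, window_seconds=60)
results = [control.allow("203.0.113.7", t) for t in (0, 1, 2, 3, 61)]
print(results)
```

In this example the fourth message arrives while three are still inside the window and is refused; by second 61 the earliest messages have aged out, so sending is accepted again.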
Sending Origin Blacklist
The filters match according to several different attributes, including IP address, email address, or domain. Emails originating from locations that have one of these attributes included on the blacklist get dumped into quarantine.
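The attribute matching described above might look something like the following. This is an illustrative sketch: the blacklist entries are made-up placeholder values (reserved documentation addresses and domains), and the function name blacklist_match is hypothetical.

```python
# Placeholder blacklist entries for illustration only.
BLACKLIST = {
    "ip": {"203.0.113.7"},
    "address": {"offers@bulk.example"},
    "domain": {"spam.example"},
}


def blacklist_match(sender_ip, sender_address):
    """Return which attribute matched the blacklist, or None if nothing did."""
    address = sender_address.strip().lower()
    domain = address.rpartition("@")[2]  # text after the last "@"
    if sender_ip in BLACKLIST["ip"]:
        return "ip"
    if address in BLACKLIST["address"]:
        return "address"
    if domain in BLACKLIST["domain"]:
        return "domain"
    return None


print(blacklist_match("203.0.113.7", "someone@mail.example"))
print(blacklist_match("198.51.100.4", "Offers@bulk.example"))
print(blacklist_match("198.51.100.4", "news@spam.example"))
print(blacklist_match("198.51.100.4", "friend@school.example"))
```

A message for which the function returns anything other than None would be sent to quarantine under the rule above.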
Email messages containing various types of attachments are blocked to prevent problems with virus infection. Since many viruses conveyed through email use attachments to do their damage, keeping those problem files out of the system reduces the risk to everyone on the network. Some files are more dangerous than others (e.g. EXE files). A list of blocked attachment types can be found below:
Blocked outright (these file types are stopped)
Quarantined (these types are contained for review)
Banned or Quarantined Content
Some content filters are set up to quarantine messages containing specific words known to be spam. Many advertisers, get-rich-quick schemes, fantastic products, and the like often are in this blocked category and their messages go to quarantine.
Other triggers include emails with foreign character sets, emails that originate on a foreign domain (the domain name maps back to a foreign country – domains such as .com can exist in any country), or domains known to send spam.
The benefit of the spam firewall is obvious when one checks the quarantine queue. The usually large number of messages, good and bad, takes time to read, save, or delete. A spam filter saves the user from having to cull through great gobs of potential garbage while still providing an area of last resort for those useful messages accidentally misclassified.
The key to making use of this resource is to remember it exists. If a user knows to check the filter, then missing messages may be located and retrieved. Check the filter regularly and you may find a wealth of treasure buried among the refuse.