2004-02-26
Editor's note: a French translation of this article, courtesy of Jerome Athias, is available here as a PDF document. Other requests for translation can be sent to the editors.
1. Overview
In a recent survey, 93% of respondents reported dissatisfaction with the large volume of unsolicited email (spam) they receive. [ref 1] The problem has grown to the point where nearly 50% of the world's email is spam [ref 2], yet only a few hundred groups are responsible. [ref 3] Many anti-spam solutions have been proposed and a few have been implemented. Unfortunately, these solutions do not prevent spam as much as they interfere with everyday email communications. The problems posed by spam have grown from simple annoyances to significant security issues. The deluge of spam costs up to an estimated $20 billion each year in lost productivity, and the same source puts the cost of spam within a company at between $600 and $1,000 per year for every user. [ref 4]
1.1 Security issues
In addition to the time wasted viewing and deleting it, spam also poses security risks, including email viruses, exploits, and identity theft.
The existing and proposed anti-spam solutions attempt to mitigate the spam problem and address security needs. By correctly identifying spam, the impact from email viruses, exploits, and identity theft can be reduced. These solutions implement various types of security in an effort to thwart spam.
Current anti-spam solutions fall into four primary categories: filters, reverse lookups, challenges, and cryptography. Each of these solutions offers some relief to the spam problem, but they also have significant limitations. The first part of this two-part paper looks at filters and reverse lookup solutions. The second part focuses on the various types of challenges, such as challenge-response and computational challenges, as well as cryptographic solutions. While there are many different aspects to these solutions, this paper only discusses the most common and significant concerns; it is not intended to be a complete listing of implementation options, solutions, and issues.
1.2 Common terminology
1.3 Filters
Filters are used by a recipient system to identify and organize spam. There are many different types of filter systems including:
Filters are ranked based on their false-negative and false-positive results. A false negative is an actual spam message that manages to pass the filter. In contrast, a false positive is a non-spam email that was incorrectly classified as spam. An ideal spam filter would generate no false positives and very few false negatives. These filter-based anti-spam approaches have three significant limitations:
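The false-positive and false-negative measures described above can be made concrete with a small calculation. The following is only a minimal sketch: the function name `score_filter`, the `FilterScore` type, and the sample data are hypothetical and do not come from any particular filter product.

```python
from dataclasses import dataclass


@dataclass
class FilterScore:
    false_positives: int        # non-spam messages the filter labeled as spam
    false_negatives: int        # spam messages the filter let through
    false_positive_rate: float  # share of non-spam that was misclassified
    false_negative_rate: float  # share of spam that was missed


def score_filter(results):
    """Score a filter from a list of (is_spam, flagged_as_spam) pairs."""
    fp = sum(1 for is_spam, flagged in results if not is_spam and flagged)
    fn = sum(1 for is_spam, flagged in results if is_spam and not flagged)
    ham_total = sum(1 for is_spam, _ in results if not is_spam)
    spam_total = sum(1 for is_spam, _ in results if is_spam)
    return FilterScore(
        false_positives=fp,
        false_negatives=fn,
        false_positive_rate=fp / ham_total if ham_total else 0.0,
        false_negative_rate=fn / spam_total if spam_total else 0.0,
    )


# Hypothetical sample: four spam messages and four legitimate ones.
sample = [
    (True, True), (True, True), (True, True),
    (True, False),    # spam that passed the filter (false negative)
    (False, False), (False, False), (False, False),
    (False, True),    # desirable email misfiled as spam (false positive)
]
score = score_filter(sample)
print(score)
```

Keeping the two rates separate matters because they are not equally costly: a false negative is one more spam message to delete, while a false positive is a desirable email the recipient may never see.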
More important than the limitations of spam filters is the common myth around the success of filters -- there is a widely held belief that filters stop spam. Spam filters do not stop spam. In all cases, the spam is still generated, still traverses the network, and still gets delivered. And unless the user does not mind missing the occasional misclassified desirable email, the spam is still viewed. While filters do help organize and separate email into spam and non-spam groupings, filters do not prevent spam.
1.4 Reverse lookup
Nearly all spam uses forged sender ("From:") addresses; very few spam emails use the sender's true email address. Furthermore, most forged email addresses appear to come from trusted domains. For example, in 15 months our spam archive collected 9300 emails that claimed to come from 2400 unique domains. The "yahoo.com" domain accounted for nearly 20% of sender addresses in the archive, but spam that actually came from the "yahoo.com" domain accounted for less than 1%. Similarly, "aol.com" and "hotmail.com" accounted for 5% each, and "msn.com" accounted for 3%, even though spam that actually originated from all of these domains combined accounted for less than 1% of all spam received. Spam senders forge email for numerous reasons.
If the forgery problem is addressed, spam senders will lose the ability to remain anonymous. Without being able to operate anonymously, laws such as the U.S.-based CAN-SPAM Act will become enforceable for spammers operating from and in the United States. In an effort to limit the ability to forge sender addresses, a number of proposed systems have surfaced for validating a sender's email. These systems include Reverse Mail Exchanger (RMX), Sender Permitted From (SPF), and Designated Mailers Protocol (DMP).
These approaches are very similar to each other and in many ways they are identical. DNS is a global network service used to match IP addresses with hostnames and vice versa. In 1986 DNS was extended to associate mail exchanger ("MX") records. [ref 7] When delivering email, a mail server determines where to pass the message based on the MX record associated with the recipient's domain name. Similar to MX records, the reverse lookup solutions define reverse-MX records ("RMX" for RMX, "SPF" for SPF, and "DMP" for DMP) for determining whether email from a particular domain is permitted to originate from any particular IP address. The basic idea is that forged email addresses do not originate from the correct RMX (or SPF or DMP) address range and therefore can be immediately identified as forged. While these solutions are viable in certain situations, they share some significant limitations.
1.4.1 Host-less and vanity domains
The reverse lookup approach requires email to originate from a known and trusted mail server located at a well-known IP address (the reverse-MX record). Unfortunately, the majority of domain names are not associated with static IP addresses. Omitting cyber squatters, the general case includes individuals and small companies that want to use their own domain rather than their ISP's, but cannot afford their own static IP address and mail server. DNS registration hosts, such as GoDaddy, provide free mail forwarding services to people who register host-less or vanity domains. Although these mail forwarding services can manage incoming email, they do not offer free outgoing email access.
Reverse-lookup solutions cause a few problems for these host-less and vanity domain users:
In both cases, someone who uses a vanity domain, or a domain that does not have its own mail server, will be blocked by reverse-lookup systems.
1.4.2 Mobile computing
Mobile computing is a very common practice. People take their laptops to conferences, off-site meetings, and home in order to work away from the office or in a location that is convenient. Hotels, airports, and even coffee shops cater to the mobile computing crowd. Unfortunately, the reverse-lookup solution will likely prevent many mobile users from sending email.
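The reverse-lookup check, and the way it turns away vanity-domain and mobile users, can be sketched roughly as follows. This is an illustration under stated assumptions, not the RMX, SPF, or DMP specification: the `PERMITTED_SENDERS` table is a hypothetical stand-in for the reverse-MX records a real system would fetch from DNS, the function names are invented for this example, the addresses come from reserved documentation ranges, and the sketch assumes a strict policy that rejects any domain with no published record.

```python
import ipaddress

# Hypothetical stand-in for reverse-MX (RMX, SPF, or DMP) records that a real
# system would look up in DNS. Addresses are reserved documentation ranges.
PERMITTED_SENDERS = {
    "example.com": ["192.0.2.0/24"],
    "example.org": ["198.51.100.25/32"],
}


def sender_domain(address):
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def is_permitted(from_address, connecting_ip, records=PERMITTED_SENDERS):
    """Return True if connecting_ip is listed for the sender's domain."""
    networks = records.get(sender_domain(from_address))
    if not networks:
        # Assumption: a strict receiver rejects domains with no record,
        # which is where host-less and vanity domains end up.
        return False
    ip = ipaddress.ip_address(connecting_ip)
    return any(ip in ipaddress.ip_network(net) for net in networks)


# Sent through the company's own mail server: accepted.
print(is_permitted("alice@example.com", "192.0.2.10"))
# The same user sending from a hotel network: treated as forged.
print(is_permitted("alice@example.com", "203.0.113.7"))
# A vanity domain with no published record: rejected.
print(is_permitted("bob@vanity.example", "192.0.2.10"))
```

The check only compares the claimed domain against the connecting IP address, so it cannot tell a legitimate user on an unlisted network apart from a forger.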
While reverse-lookup solutions are viable for internal networks, they are not practical for global, external use. Companies that wish to support host-less domains, vanity domains, and mobile or off-site users may wish to reconsider implementing reverse-lookup anti-spam technologies.
2. Summary
Spam has reached epidemic proportions and people are looking for quick fixes of any kind. Spam filters are the most successful solution to date -- filters attempt to identify spam and limit a recipient's exposure. But filters do not prevent spam any more than recording a television show with a VCR prevents TV commercials. Reverse-lookup systems attempt to address the forgery problem. While reverse lookups are viable in closed environments, such as a corporate internal network, the solutions are not general enough for worldwide acceptance. Part II of this investigation will focus on challenge-based systems and proposed cryptographic solutions.
About the author
Neal Krawetz has a Ph.D. in Computer Science and over 15 years of computer security experience. Dr. Krawetz is considered one of the leading experts in spam research and anti-spam technologies.
References
[ref 1] "Majority in Favor of Making Mass-Spamming Illegal Rises to 79% of Those Online." The Harris Poll ® #38. July 16, 2003.
[ref 2] "Spam On Course to Be Over Half of All Email This Summer," Brightmail press release. July 1, 2003.
[ref 3] According to SpamHaus, a spam content tracking organization, fewer than 200 spam groups generate more than 90% of spam messages. SpamHaus ROKSO, September 22, 2003.
[ref 4] "Spam Costs $20 Billion Each Year in Lost Productivity", by Jay Lyman. December 29, 2003.
[ref 5] "Phishing e-mail fraud rises 52% in January, report says", February 18, 2004.
[ref 6] "Multiple Browser URI Display Obfuscation Weakness"
[ref 7] "Domain System Changes and Observations", RFC973 by Paul Mockapetris. January 1986.