Cloudmark Authority is a forward-thinking product that not only breaks all the spam-fighting rules, it doesn’t even bother with them. The spam-fighting software sits on the e-mail gateway and uses predictive statistical analysis to adapt itself, keeping up with the changing nature of spam. There's no database with thousands of rules to update daily, just a small collection of about 200 statistical models that are updated monthly. Authority is easy to install and invisible in operation.


Varies; based on number of seats in an enterprise environment. Free 30-day trial.

San Francisco, Calif.


Other names in rules-based spam-fighting are Brightmail and MailFrontier. Postini is an outside service to which all e-mail is diverted for scrubbing. Also check out Lyris and McAfee.

- Runs as a Windows DLL or Unix shared object

- Small memory footprint

- Updated roughly monthly



Cloudmark Authority


By Joel Shore

May 8, 2003

Spam sucks. Over the last month I’ve been taking a close look at my incoming e-mail. A whopping 47 percent of it is junk. Multiply that by the hundreds or thousands of people in a large corporation, and you’ve got a major headache.

How widespread is spam? According to a forecast from research firm IDC, the spam glut will propel the worldwide daily volume of e-mail from 31 billion messages in 2003 to 60 billion in 2006. My personal guess is that spam is growing even faster than that.


Cloudmark Authority not only breaks the rules of fighting spam, it doesn't even bother with them


The problems with spam are far-flung. Individual workers waste time looking at and deleting junk e-mail, taking a toll on productivity. The network infrastructure has to be robust enough to deal with this incoming deluge. And workers offended by the content in some spam may seek recourse, exposing the employer to potential litigation for not doing enough.

The answer is pretty simple: You’ve got to stop spam before it reaches users. And that means blocking spam at the server handling the incoming message stream.

It turns out there are a couple of different ways to do this. Some spam-blocking software relies on rules to hunt down and delete spam. Not surprisingly, these rules need continuous updating. Another method is to examine the structure of an e-mail message, not its content, to determine if it’s spam.

Cloudmark, a small company based in San Francisco, has adopted the latter approach. Its gateway spam-blocking product, called Authority, uses predictive statistical analysis methods to keep up with the ever-changing nature and sophistication of spam.

Authority is built on Bayesian statistics, which grew out of the work of one Thomas Bayes. If you think this guy is some whiz kid from MIT, Cal Tech, or Stanford, then think again. Thomas Bayes (1701–1761), who would have turned 300 in 2001, was, of all things, an English Presbyterian minister.

Bayesian statistics says that you can combine current data (like today’s incoming e-mail) with historical data (a database of spam and spam characteristics) to predict an outcome—though with varying levels of confidence.
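To make that concrete, here is a minimal Python sketch of Bayes' rule applied to a single clue. It is not Cloudmark's code. The function name and the two likelihood values are invented for illustration; the only number taken from this review is the 47 percent spam share of my own inbox, used here as the historical starting point (the "prior").

```python
def posterior_spam(prior_spam, p_clue_given_spam, p_clue_given_legit):
    """Bayes' rule: probability a message is spam, given that a clue is present."""
    # Total probability of seeing the clue at all, in spam or in legitimate mail.
    evidence = (p_clue_given_spam * prior_spam
                + p_clue_given_legit * (1.0 - prior_spam))
    return p_clue_given_spam * prior_spam / evidence


# Prior: 47 percent of incoming mail is junk (from the inbox tally above).
# Likelihoods are hypothetical: suppose the clue shows up in 60 percent of
# spam but only 2 percent of legitimate mail.
confidence = posterior_spam(0.47, 0.60, 0.02)
print(f"Confidence that the message is spam: {confidence:.1%}")
```

The point of the exercise: a clue that is common in spam and rare in real mail pushes the confidence well above the starting 47 percent, but never all the way to certainty.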

And that’s just what Authority does. It looks at incoming mail and assigns a “confidence factor” to it. If Authority is 99 percent sure that a message is spam, you can take several actions including deleting it or sending a refusal error message back to the sender. Mail can be sent to the addressee with a “SPAM ALERT” warning stuffed into the subject field. Or you might not send it to the addressee at all. Alternatively, if Authority is only one percent sure the message is spam (meaning it almost certainly is not), it is untouched and routed to the addressee.

You’ve already seen Bayesian theory at work. That obnoxious animated little paper clip character in Microsoft Office is an example. The help engine observes a user’s usage patterns and pops up to offer help, based on its internal database and user analysis. As time passes and the help engine continually adds its observations, it pops up less and less often. (Of course, you probably disabled the darn thing right after it popped up for the first time.)

What’s cool about Authority is that it doesn’t need daily updating to a set of rules, potentially an enormous headache. Instead of tens of thousands of rules, Authority has only about 200 Bayesian statistical models. Cloudmark refers to each of these as a spamGene. Collectively, they constitute spamDNA. Instead of daily updates, spamDNA is updated about once a month.

Authority examines not just content, but the structure of an e-mail message, the IP relay path from sender to all servers to addressee, and more. IP spoofing or forging is always caught as is any attempt to bypass the e-mail server by encoding the body of a message in base64. The use of nonsense text, embedded spaces in words, upper case, exclamation points, presence of external links, presence of graphics, recipient’s name embedded in the subject or sender name fields, and a whole lot more are all considered. Authority's spamDNA looks at these factors as a whole, making judgments based on its ever-changing experience.
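Cloudmark doesn't publish what goes into a spamGene, so the following is only a rough sketch of the general idea, using a textbook naive Bayes combination (which assumes the clues are independent) rather than whatever Authority really does. The clue names, the likelihood table, and both function names are hypothetical.

```python
import math
import re

# Hypothetical likelihoods: (P(clue | spam), P(clue | legitimate mail)).
LIKELIHOODS = {
    "many_exclamations": (0.40, 0.02),
    "mostly_upper_case": (0.30, 0.01),
    "external_link": (0.80, 0.20),
    "name_in_subject": (0.25, 0.02),
}


def extract_clues(subject, body, recipient_name):
    """Turn a message into yes/no clues of the kind listed above."""
    words = re.findall(r"[A-Za-z]+", body)
    shouting = sum(1 for w in words if len(w) > 1 and w.isupper())
    return {
        "many_exclamations": (subject + body).count("!") >= 3,
        "mostly_upper_case": bool(words) and shouting / len(words) > 0.5,
        "external_link": re.search(r"https?://", body, re.IGNORECASE) is not None,
        "name_in_subject": bool(recipient_name)
        and recipient_name.lower() in subject.lower(),
    }


def spam_confidence(clues, prior_spam=0.5):
    """Combine all clues into one confidence value (naive Bayes, log-odds form)."""
    log_odds = math.log(prior_spam / (1.0 - prior_spam))
    for name, present in clues.items():
        p_spam, p_legit = LIKELIHOODS[name]
        if present:
            log_odds += math.log(p_spam / p_legit)
        else:
            # An absent clue is evidence too, usually in the message's favor.
            log_odds += math.log((1.0 - p_spam) / (1.0 - p_legit))
    return 1.0 / (1.0 + math.exp(-log_odds))


spammy = extract_clues("BOB, FREE MONEY!!!",
                       "CLICK NOW TO WIN BIG http://example.com", "Bob")
quiet = extract_clues("Lunch on Friday?",
                      "Are you free at noon? The usual place.", "Bob")
print(spam_confidence(spammy), spam_confidence(quiet))
```

Because every clue contributes to a single score, no one factor decides the outcome by itself, which is the "as a whole" behavior described above.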

The genetic reference is apt. As human genes sometimes mutate, so too does spam. And because spamDNA is predictive, it can modify itself to a certain degree.

And if you're wondering where Cloudmark gets a never-ending torrent of spam messages to add to its database, look no further than its own SpamNet product. This inexpensive plug-in for Microsoft Outlook filters out most incoming spam. Whatever slips through is reported back to Cloudmark with a single mouse click. The worldwide SpamNet community is nearly 450,000 strong and growing.

Testing the Product

Authority is less than a megabyte in size. It runs at the mail transfer agent (MTA) as either a .dll on a Windows SMTP server or as an .so shared object on a Unix or Linux Sendmail server. The spamDNA module, which Cloudmark oddly calls a “cartridge,” is about 460 Kbytes. That’s minuscule.

In a Unix or Linux Sendmail environment, Authority is installed via the Milter interface. The product plugs into Sendmail, using it to relay messages. In a Windows server environment, Authority uses the SMTP Server that is part of Windows 2000 Server. Authority interoperates with all SMTP-compliant e-mail servers (Lotus, Exchange, Eudora, Postfix, etc.) via standard SMTP relay methods.

The way Authority works is pretty simple: It intercepts the incoming stream of SMTP mail traffic through either the Sendmail Milter interface (in Unix and Linux environments) or the Windows SMTP Server (Windows 2000 Server or later). It examines the stream of SMTP messages, filters the spam, and then returns the filtered stream to the same SMTP source. Spam never reaches the corporate e-mail server.

We looked at Authority on a Windows 2000 Server.

Installation was simple. Authority installs via the included InstallShield utility. In a Unix/Linux environment, it installs from a console command. One caveat for Unix/Linux environments: If Milter support is not compiled into Sendmail, you’ll have to recompile Sendmail to enable it. Not a big deal, but an extra task nonetheless.

Next, we defined confidence levels and how messages at each level should be handled. Normally, you’d simply delete messages with a high spam likelihood, but since we wanted to keep track of each message, we chose to save each message to a quarantine folder. That allowed us to keep an overall tally. Several actions are possible: delete, delete and send a refusal to the sender, save to a quarantine area and do not route to the addressee, insert a warning message and deliver, or take no action and deliver.
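As a rough illustration of that kind of setup, here is a small Python sketch that maps a confidence value to one of the five actions listed above. The thresholds, the names, and the policy format are all assumptions made for this example; Authority's real configuration screens and file formats are not shown in this review. The "SPAM ALERT" subject tag is the warning mentioned earlier.

```python
from enum import Enum


class Action(Enum):
    DELETE = "delete"
    DELETE_AND_REFUSE = "delete and send a refusal to the sender"
    QUARANTINE = "save to quarantine, do not route to the addressee"
    TAG_AND_DELIVER = "insert a warning and deliver"
    DELIVER = "take no action and deliver"


# Hypothetical policy, similar in spirit to our test setup: quarantine rather
# than delete so that every message can still be counted afterward.
POLICY = [
    (0.99, Action.QUARANTINE),
    (0.80, Action.TAG_AND_DELIVER),
]


def choose_action(confidence, policy=POLICY):
    """Return the action for the highest threshold the confidence reaches."""
    for threshold, action in sorted(policy, key=lambda rule: rule[0], reverse=True):
        if confidence >= threshold:
            return action
    return Action.DELIVER


def tag_subject(subject):
    """Stuff a warning into the subject line before delivery."""
    return "SPAM ALERT: " + subject


for score in (0.995, 0.85, 0.01):
    print(score, "->", choose_action(score).value)
```

Keeping the thresholds in a plain list makes it easy to add a stricter rule later, such as deleting outright above a very high confidence level, without touching the lookup logic.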

We used three different message streams. The first was all legitimate mail. No messages from this stream should be filtered out. The second stream was all spam. All the messages in this stream should be filtered out. The third was a combination, pretty much what you’d expect to see in day-to-day operation of a business.

Filtering out legitimate mail can be, at best, a mere inconvenience or, at worst, a very, very bad thing. There’s the “false positive,” in which, say, a newsletter you subscribe to is considered spam. Though that’s incorrect, it's not a big problem. Far worse is the dreaded “false critical.” Suppose the company CEO sends an urgent e-mail message with “Fix this problem now!!!” in the subject field. Because three consecutive exclamation points raise the likelihood that this message is spam, filtering it out could have dire consequences—not just for the person who failed to receive it, but for the IT department that removed it. Of course, many other factors are taken into consideration.

To see where the mail is going, we used a client PC running Outlook 2002. For Unix/Linux power users who’d scoff at a client e-mail viewer, the server-based Mutt utility does just fine.

Excellent Results

First, we detected no humanly noticeable delay in processing the mail stream. Even if a mail message is held up by a quarter millisecond, that’s more than made up for by the productivity saved at the end-user level and by the cost of network infrastructure expansion that might otherwise be necessary.

In our test, 98.5 percent of known spam was filtered out. That’s really good. For every 10,000 incoming spam messages, only 150 will get through. Even better was the stream of all-legitimate mail. Only three legit messages were considered potential spam: one was a newsletter, one a routine e-mail blast from Expedia, and one a survey from a major airline sent to its frequent fliers. On the “false positive” front, that’s a success rate exceeding 99.99 percent. Simply assign a low enough confidence level, and these messages will get delivered—perhaps with a potential spam warning, but delivered nonetheless. Best of all, not one “false critical” occurred.
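For anyone who wants to check the arithmetic, here is a short Python sketch. The function names are made up for this example. Note that the size of the legitimate-mail stream is not stated in this review; the last function only works out how large that stream would have to be for three flagged messages to square with a success rate above 99.99 percent.

```python
def spam_getting_through(spam_messages, catch_rate):
    """Number of spam messages that slip past a filter with the given catch rate."""
    return round(spam_messages * (1 - catch_rate))


def false_positive_rate(flagged_legit, total_legit):
    """Share of legitimate messages wrongly flagged as potential spam."""
    return flagged_legit / total_legit


def min_legit_messages(flagged_legit, max_flagged_per_10k=1):
    """Smallest stream size for which the flagged share stays strictly below
    max_flagged_per_10k out of every 10,000 messages."""
    return flagged_legit * 10_000 // max_flagged_per_10k + 1


print(spam_getting_through(10_000, 0.985))   # a 98.5 percent catch rate
print(min_legit_messages(3))                 # three flagged, under 1 in 10,000
```

In other words, three flagged messages and a success rate above 99.99 percent together imply a legitimate stream of more than 30,000 messages.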

Bottom Line

Face it, fighting spam is a fact of life. Who these people are and what profits they may reap from inundating every e-mail user with spam beats the hell out of me, but legions of them are out there.

For now, Authority is sold by a small direct sales team. It will be interesting to see if demand will overwhelm that team, pushing the product into the distribution channel and the hands of solutions integrators.

Pricing is by the seat, based on an annual subscription. As the number of seats climbs, the price per seat drops.

- Easy installation

- Compatible with Unix, Linux, and Windows servers

- Updated roughly monthly; needs no daily update

- Requires no dedicated server

- No security risk; it keeps all e-mail in-house


- Company isn't well known

- Predictive statistical method is not familiar to IT execs

