
[Boston.pm] Mail and spam: msg#00031

lang.perl.perl-mongers.boston

Subject: [Boston.pm] Mail and spam

We talked a little bit about spam tonight, and I mentioned my messed-up
setup. Here's what I have:

Emails enter my network at a sendmail server. All it does is
graylisting (rejecting everything with a temporary failure code), then
forwarding to another server running qmail.
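
To make the graylisting step concrete, here is a minimal sketch in Python. It is not the sendmail configuration described above (that isn't shown in this post); the class name, the 300-second delay, and the use of the (client IP, sender, recipient) triplet as the key are all assumptions chosen for illustration.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay   # assumed: seconds a sender must wait before a retry is accepted
        self.clock = clock           # injectable clock, so the logic can be tested without sleeping
        self.first_seen = {}         # triplet -> time of the first delivery attempt

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style (code, text) reply for one delivery attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            # 4xx is a temporary failure: a well-behaved MTA queues and retries.
            return 451, "Greylisted, please try again later"
        return 250, "OK"
```

A sender that retries after the delay gets through; one that never retries is never accepted, which is where the savings come from.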

The qmail server uses maildrop to filter out duplicate emails (based on
message IDs), does whitelist filtering (mailing lists and such), then
feeds the remaining emails to CRM (crm114.sf.net).
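
The duplicate and whitelist stage could look roughly like the Python sketch below. The real setup uses maildrop rules that aren't shown here, so this is only an illustration: `make_prefilter`, the in-memory set of seen IDs, and matching mailing lists on the `List-Id` header are assumptions.

```python
from email import message_from_string


def make_prefilter(whitelisted_lists):
    """Build a filter that drops repeated Message-IDs and passes known lists."""
    seen_ids = set()  # assumed: kept in memory; a real setup would persist this

    def prefilter(raw_message):
        msg = message_from_string(raw_message)

        # Duplicate check: a Message-ID we have already seen is dropped.
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id:
            if msg_id in seen_ids:
                return "duplicate"
            seen_ids.add(msg_id)

        # Whitelist check: mail from a known mailing list goes straight through.
        list_id = (msg.get("List-Id") or "").lower()
        if any(name in list_id for name in whitelisted_lists):
            return "deliver"

        # Everything else moves on to the statistical classifier.
        return "classify"

    return prefilter
```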

CRM does an amazing job filtering spam. If it says something isn't
spam, I take its word for it, and deliver it to my inboxes. If it does
say it's spam, I feed the email to my own filtering script, which
does whitelisting and blacklisting based on addresses in the headers.
If something is whitelisted, it goes to the inboxes. If something is
blacklisted, it goes to /dev/null. If it can't decide one way or the
other, the email moves on...
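
A header-address check of this kind can be sketched in Python as follows. This is not the actual script (which isn't included in this post); the function name, the set of headers examined, and the choice to let the whitelist win over the blacklist are assumptions.

```python
from email import message_from_string
from email.utils import getaddresses

# Assumed: which headers get scanned for addresses.
ADDRESS_HEADERS = ("From", "To", "Cc", "Reply-To", "Sender")


def classify_by_address(raw_message, whitelist, blacklist):
    """Return 'inbox', 'drop', or 'undecided' based on header addresses."""
    msg = message_from_string(raw_message)

    # Collect every value of every address header, then parse out the addresses.
    values = []
    for header in ADDRESS_HEADERS:
        values.extend(msg.get_all(header, []))
    addresses = {addr.lower() for _name, addr in getaddresses(values) if addr}

    if addresses & whitelist:   # assumed: whitelist is checked first
        return "inbox"
    if addresses & blacklist:
        return "drop"
    return "undecided"
```

Both lists are expected to hold lowercase addresses, since the parsed addresses are lowercased before comparison.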

The last step is feeding the email to spamassassin. If SA says it's
spam (with a really high threshold), the mail goes to /dev/null. If
not, it goes to a spam folder for me to look at later.
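
Putting the last three stages together, the routing decision can be summarized in a short Python sketch. The function, its argument names, and the threshold of 15.0 are hypothetical; the post only says the threshold is "really high".

```python
def route(crm_says_spam, address_verdict, sa_score, drop_threshold=15.0):
    """Decide where one message ends up after CRM, the address script, and SA.

    address_verdict is one of 'whitelisted', 'blacklisted', 'undecided'.
    drop_threshold is an assumed value standing in for "really high".
    """
    if not crm_says_spam:
        return "inbox"              # CRM's "not spam" is trusted outright
    if address_verdict == "whitelisted":
        return "inbox"
    if address_verdict == "blacklisted":
        return "discard"            # the /dev/null case
    if sa_score >= drop_threshold:
        return "discard"            # both filters agree, with high confidence
    return "spam-folder"            # kept for manual inspection later
```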

Here's how the numbers end up looking (on Dec 11):

Emails before graylisting : 209,856
Emails after graylisting : 185,657 (24,199 dropped, 11.5%)
Emails after dropping duplicates : 93,685 (91,972 dropped, 49.5%)
Emails after maildrop filtering : 93,312 (373 delivered, 0.4%)
Emails CRM identified as spam : 93,134 (178 delivered to inbox, 0.2%)
Emails my script let through : 889 (92,245 dropped, 99%)
Emails spamassassin agreed is spam : 844 (45 delivered to spambox, 5%)

So at the end of the day, 551 were delivered to various mailboxes (0.26%
of the original mass), and 45 had to be manually inspected (0.02%). Out
of those, I think 1 was actually ham, which means a false positive rate
of 0.00047%, which isn't bad.
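
For anyone who wants to check the arithmetic, the short Python sketch below recomputes the per-stage differences and percentages from the counts in the table. The counts are copied from above; the stage labels are paraphrased and the helper name `funnel` is just for illustration.

```python
# Message counts remaining after each stage, taken from the table above.
stages = [
    ("before graylisting", 209_856),
    ("after graylisting", 185_657),
    ("after dropping duplicates", 93_685),
    ("after maildrop filtering", 93_312),
    ("identified as spam by CRM", 93_134),
    ("let through by the address script", 889),
    ("agreed to be spam by spamassassin", 844),
]


def funnel(stages):
    """For each stage, return (label, count, removed, percent of previous stage)."""
    rows = []
    for (_, prev), (label, count) in zip(stages, stages[1:]):
        removed = prev - count
        rows.append((label, count, removed, 100.0 * removed / prev))
    return rows


rows = funnel(stages)
for label, count, removed, pct in rows:
    print(f"{label:36s} {count:>8,d}  (-{removed:,d}, {pct:.2f}%)")

total = stages[0][1]
delivered = rows[2][2] + rows[3][2]  # whitelisted by maildrop + passed by CRM
inspected = rows[5][2]               # left in the spam folder
print(f"delivered: {delivered} ({100 * delivered / total:.2f}% of total)")
print(f"inspected: {inspected} ({100 * inspected / total:.2f}% of total)")
```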

Anyway, that's my sad story. Let me know if anyone has any questions!

--
Dan Boger
dan-rlx3YLNxYWXQT0dZR+AlfA@xxxxxxxxxxxxxxxx

