HEADSUP: changes to GNATS spam filtering

Mark Linimon linimon at lonesome.com
Sun Oct 16 19:23:23 UTC 2005

One of the tasks that I've been performing for quite some time is to go
into the GNATS database and close spam PRs.  These generally wind up in
the 'pending' state.  On any given day there can be 5-10 of these, and
sometimes even more.  It's getting really boring to go clean these up.

It turns out that spamassassin is set up as a first line of defense but
the rules have not been tweaked for ages.  I've now tweaked the
rulesets, running regression tests both on PRs that should have been
rejected but weren't and on PRs that must still make it through.
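
For illustration only, a regression check of this kind can be sketched
in Python as below.  This is not the actual harness or the actual
spamassassin rules: `regression_check`, `toy_rule`, and the sample
messages are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, List, Tuple


def regression_check(
    is_spam: Callable[[str], bool],
    known_spam: Iterable[str],
    known_legit: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (missed_spam, false_positives) for a candidate ruleset."""
    # Spam that the candidate rules still let through.
    missed = [msg for msg in known_spam if not is_spam(msg)]
    # Legitimate PRs that the candidate rules would now reject.
    false_positives = [msg for msg in known_legit if is_spam(msg)]
    return missed, false_positives


def toy_rule(message: str) -> bool:
    """Hypothetical rule: flag a message that is nothing but HTML."""
    body = message.strip().lower()
    return body.startswith("<html") and body.endswith("</html>")


if __name__ == "__main__":
    spam = ["<html><font size=7>Refinance now!</font></html>"]
    legit = ["Patch attached.\n<html> appears in the diff context\n"]
    missed, false_positives = regression_check(toy_rule, spam, legit)
    print("missed spam:", len(missed))
    print("false positives:", len(false_positives))
```

A ruleset change is only acceptable when the second list stays empty;
the first list shows how much spam would still get through.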

Here's the status:

 - It will now be difficult for a PR initially formatted in nothing but
   HTML to get through these filters.  Regression testing against all
   ports PRs from 75000 onward that contained HTML (either in patches
   or in email replies) shows that zero legitimate PRs would have been
   rejected by these changes.  (The rules that I have emphasized are the
   usual suspects of garish fonts, specially formatted URLs, images, and
   similar nonsense.)  In particular, if anything submitted from the web
   interface would have been affected by this, I expect my testing would
   have shown it.

 - It will now be extremely difficult for you to enter PRs about your
   home financing, mutually beneficial business relationships, stock
   portfolio, eBay and PayPal accounts, and medical problems.

 - I reasonably believe that no other PRs will be affected.

 - I will start actively monitoring the reject bin to look for any new
   problems.  It turns out that on occasion things may have already been
   landing in there and no one noticed (one legitimate PR wound up there
   today with no clue as to why it was rejected; the entire log
   consisted of "0/0").  However, note that more often, legitimate PRs
   get "lost" only temporarily -- they get a PR number and can be
   accessed individually from GNATSWeb, but can fail to show up in the
   overall index until our next forced index rebuild (run from cron).
   This is a separate issue.

Please let me know *offlist* if you suspect that a new PR you submitted
has been mishandled by GNATS and I'll investigate.

Later today I will add some text to the submission guidelines
deprecating the use of HTML to submit PRs.

Thanks for your patience and understanding.

Mark Linimon
bugmeister at FreeBSD.org

