Protecting Mailman with bogofilter against spam
Tired of manually cleaning trapped spam out of the Mailman mail queue? You can
easily insert the bogofilter spam filter
into the mail flow. Here is how to set it up.
Naturally, you need a working installation of Mailman
(the GNU Mailing List Manager) and an installation of bogofilter.
Setup of bogofilter: training the filter
After adjusting the settings in '/etc/bogofilter.cf' (not much to change), we first train the filter.
This requires two "mbox" style email files, one containing the good
emails (ham) and one containing the bad emails (spam). The Mutt e-mail client
is handy for selecting messages and creating these files. We generate them as
'/var/spool/mail/bogotrain-as-spam' and '/var/spool/mail/bogotrain-as-good'.
Next, set the permissions of '/var/spool/bogofilter/' so that
bogofilter can auto-update the wordlist there.
Later, all filtered spam will arrive in '/var/spool/mail/spam.bogofilter' (see below).
- train on spam:
cat /var/spool/mail/bogotrain-as-spam | bogofilter -s -v
- train on ham (non-spam):
cat /var/spool/mail/bogotrain-as-good | bogofilter -n -v
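Before training, it can help to check how many messages each training file actually holds, since a nearly empty ham or spam file gives the filter little to learn from. The following is a minimal sketch, not part of the original setup: the function name 'count_mbox_messages' is made up for illustration, and it assumes the files are in classic mbox format, where every message starts with a "From " separator line.

```shell
# Count the messages in an mbox file by counting its "From " separator lines.
# Prints 0 for an empty file; returns non-zero if the file cannot be read.
# NOTE: the function name is hypothetical, it is not a bogofilter tool.
count_mbox_messages() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    # grep -c exits non-zero when nothing matches but still prints 0,
    # so the exit status is ignored here.
    grep -c '^From ' "$1" || true
}
```

For example, 'count_mbox_messages /var/spool/mail/bogotrain-as-spam' prints the number of spam messages that are about to be registered.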
Procmail definitions for bogofilter
We have to use procmail to filter the incoming emails with bogofilter.
Get the bogofilterrc file and store it in
Modifying the Mailman definitions
The next step is to pass incoming list emails through bogofilter before handing them
over to Mailman. Basically, one line per list is modified in '/etc/aliases' (we assume a
working Mailman installation here):
## grass-dev mailing list
#grass-dev: "|/usr/lib/mailman/mail/mailman post grass-dev"
grass-dev: "|/usr/bin/procmail -m MAILMAN=grass-dev /etc/mail/procmail/bogofilterrc"
grass-dev-admin: "|/usr/lib/mailman/mail/mailman admin grass-dev"
grass-dev-bounces: "|/usr/lib/mailman/mail/mailman bounces grass-dev"
grass-dev-confirm: "|/usr/lib/mailman/mail/mailman confirm grass-dev"
grass-dev-join: "|/usr/lib/mailman/mail/mailman join grass-dev"
grass-dev-leave: "|/usr/lib/mailman/mail/mailman leave grass-dev"
grass-dev-owner: "|/usr/lib/mailman/mail/mailman owner grass-dev"
grass-dev-request: "|/usr/lib/mailman/mail/mailman request grass-dev"
grass-dev-subscribe: "|/usr/lib/mailman/mail/mailman subscribe grass-dev"
grass-dev-unsubscribe: "|/usr/lib/mailman/mail/mailman unsubscribe grass-dev"
For other lists, just replace the 'MAILMAN=xxxx' parameter accordingly.
Do this for all lists which you are running. Easy, no?
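When several lists need the same change, the modified "post" line can be generated instead of typed by hand. This is only a sketch: the function name 'bogofilter_alias' is made up for illustration, and the paths are simply the ones from the grass-dev example, so they may differ on your system.

```shell
# Print the /etc/aliases "post" entry that routes one list through procmail.
# $1 is the list name. The procmail and bogofilterrc paths are copied from
# the grass-dev example; the function name itself is hypothetical.
bogofilter_alias() {
    printf '%s: "|/usr/bin/procmail -m MAILMAN=%s /etc/mail/procmail/bogofilterrc"\n' \
        "$1" "$1"
}
```

Calling 'bogofilter_alias grass-dev' prints the replacement line for that list. The output is meant to be reviewed and pasted into '/etc/aliases' by hand, not appended blindly.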
After modifying '/etc/aliases', don't forget to rebuild the alias database as 'root' (typically with the 'newaliases' command).
Define training cronjob for life-time learning
In the beginning you will observe that some mails aren't yet properly
classified. For the GRASS mailing lists it took less than 3 days to
get it working almost perfectly, so there is no need to be nervous about this.
We simply define an overnight cronjob to re-train bogofilter from
the spam/ham collection. Save this as '/usr/bin/bogolearn.sh':
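#!/bin/sh
# TRAIN bogofilter CRONJOB
# save this as /usr/bin/bogolearn.sh
# The binary path is an assumption; the mbox paths are the training files from above.
BOGOFILTER=/usr/bin/bogofilter
SPAMTRAINFILE=/var/spool/mail/bogotrain-as-spam
NOSPAMTRAINFILE=/var/spool/mail/bogotrain-as-good
# train bogofilter with new spam and non-spam (ham)
cat "$SPAMTRAINFILE" | "$BOGOFILTER" -s -v
cat "$NOSPAMTRAINFILE" | "$BOGOFILTER" -n -v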
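A nightly job like this will also run on days when a training file is missing or empty. One possible guard is sketched below. It is not part of the original script: the function name 'train_if_nonempty' is made up, and the default path in 'BOGOFILTER' is an assumption about where the binary lives.

```shell
# Path to the bogofilter binary; an assumption, adjust to your installation.
BOGOFILTER=${BOGOFILTER:-/usr/bin/bogofilter}

# Register one mbox file with bogofilter, skipping missing or empty files.
# $1 is the registration flag (-s for spam, -n for ham), $2 the mbox file.
# The function name is hypothetical.
train_if_nonempty() {
    if [ -s "$2" ]; then
        "$BOGOFILTER" "$1" -v < "$2"
    else
        echo "skipping $2: missing or empty" >&2
    fi
}
```

In the script, the two training lines would then become 'train_if_nonempty -s "$SPAMTRAINFILE"' and 'train_if_nonempty -n "$NOSPAMTRAINFILE"'.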
Make the script executable:
chmod a+x /usr/bin/bogolearn.sh
As root, define the following cronjob:
# run bogolearn.sh every morning at 3:30
30 3 * * * sh /usr/bin/bogolearn.sh
Verify the job list afterwards, for example with 'crontab -l'.
You can now simply store wrongly classified emails in
'/var/spool/mail/bogotrain-as-spam' or '/var/spool/mail/bogotrain-as-good',
respectively. Bogofilter will learn from them during the next nightly run.
Watch it working...
Don't forget to check this mail folder from time to time:
mutt -f /var/spool/mail/spam.bogofilter
Save wrongly classified emails into '/var/spool/mail/bogotrain-as-spam'
or '/var/spool/mail/bogotrain-as-good', respectively.
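For a quick look without opening Mutt, the subjects of the caught messages can be listed from the command line. This is a rough sketch: the function name 'list_mbox_subjects' is made up for illustration, it assumes a classic mbox file with "From " separator lines, and it silently skips messages without a Subject header.

```shell
# Print the Subject of every message in an mbox file, one per line.
# Only header lines are inspected, so "Subject:" text inside a message
# body is ignored. The function name is hypothetical.
list_mbox_subjects() {
    awk '
        /^From / { in_header = 1; next }      # mbox separator starts a new message
        in_header && /^$/ { in_header = 0 }   # first blank line ends the headers
        in_header && /^Subject:/ { sub(/^Subject:[ \t]*/, ""); print }
    ' "$1"
}
```

For example, 'list_mbox_subjects /var/spool/mail/spam.bogofilter' gives a rough overview that makes misclassified list mail easier to spot.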
© 2007 Markus Neteler (neteler AT itc.it)
$Date: 2007-01-15 18:17:33 +0100 (Mo, 15 Jan 2007) $