Basic operation: Scripts
There is generally little point in running the commands above by hand, unless you want to understand how dbacl(1) operates or want to experiment with switches.
Note, however, that simple scripts often do not check for error and warning messages on STDERR. It is always worth rehearsing the operations you intend to script, as dbacl(1) will let you know on STDERR if it encounters problems during learning. If you ignore warnings, you will likely end up with suboptimal classifications, because the dbacl system prefers to do what it is told predictably, rather than stop when an error condition occurs.
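One way a script could avoid silently ignoring such messages is to capture STDERR and refuse to continue when anything was written there. The following is only a sketch: the function name run_checked and the log file path are illustrative and not part of dbacl, and treating any STDERR output as a problem is an assumption that may be stricter than you need.

```shell
#!/bin/sh
# run_checked LOGFILE COMMAND [ARGS...]
# Run COMMAND, keep its STDERR in LOGFILE, and return non-zero if the
# command failed or wrote anything at all to STDERR.
# (Hypothetical helper, not part of dbacl.)
run_checked() {
    log=$1
    shift
    "$@" 2>"$log"
    status=$?
    if [ "$status" -ne 0 ] || [ -s "$log" ]; then
        echo "problem running: $*" >&2
        cat "$log" >&2
        return 1
    fi
    return 0
}

# Example: learn the spam category and stop if dbacl complained.
# run_checked /tmp/dbacl-spam.log \
#     dbacl -T email -l "$HOME/.dbacl/spam" "$HOME/mail/spam" || exit 1
```

Because the messages are kept in the log file, they can still be read after an unattended run.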
Once you are ready for spam filtering, you need to handle two issues.
The first issue is when and how to learn.
You should relearn your categories whenever you've received an appreciable number of new emails, or whenever you like. Unlike other spam filters, dbacl cannot
learn new emails incrementally and update its category files. Instead, you
must keep your messages organized, and dbacl(1) will take a snapshot each time it learns.
This limitation is actually advantageous in the long run, because it forces you to
keep usable archives of your mail and gives you control over every message
that is learned. By contrast, with incremental learning you must remember
which messages have already been learned, how many times, and whether to unlearn
them if you change your mind.
A dbacl category model normally doesn't change dramatically if you add a single new email (provided the
original model depends on more than a handful of emails). Over time, you
can even stop learning altogether when your error rate is low enough.
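To make "an appreciable number of emails" concrete, a script could count the messages in a mailbox and relearn only once the mailbox has grown by some threshold. The sketch below assumes mbox format, where each message starts with a "From " line. The function names, the state file and the threshold of 50 are all illustrative choices and not part of dbacl.

```shell
#!/bin/sh
# count_messages MBOX: count messages by their "From " separator lines.
# Prints 0 if the file does not exist.
count_messages() {
    if [ -f "$1" ]; then
        grep -c '^From ' "$1" || true
    else
        echo 0
    fi
}

# needs_relearn MBOX STATE THRESHOLD: succeed if MBOX holds at least
# THRESHOLD more messages than the count recorded in STATE.
needs_relearn() {
    mbox=$1 state=$2 threshold=$3
    now=$(count_messages "$mbox")
    last=0
    [ -f "$state" ] && last=$(cat "$state")
    [ $((now - last)) -ge "$threshold" ]
}

# record_learned MBOX STATE: remember how many messages were just learned.
record_learned() {
    count_messages "$1" > "$2"
}

# Example (the state file name and threshold are hypothetical):
# if needs_relearn "$HOME/mail/spam" "$HOME/.dbacl/spam.count" 50; then
#     dbacl -T email -l "$HOME/.dbacl/spam" "$HOME/mail/spam" &&
#         record_learned "$HOME/mail/spam" "$HOME/.dbacl/spam.count"
# fi
```

Since dbacl relearns from the whole mailbox each time, the count is only used to decide when to relearn, not which messages to feed it.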
The simplest strategy for continual learning is a cron(1) job run once a day:
% crontab -l > existing_crontab.txt
Edit the file existing_crontab.txt with your favourite editor and add the following
three lines at the end:
CATS=$HOME/.dbacl
5 0 * * * dbacl -T email -H 18 -l $CATS/spam $HOME/mail/spam
10 0 * * * dbacl -T email -H 18 -l $CATS/notspam $HOME/mail/notspam
Now you can install the new crontab file by typing
% crontab existing_crontab.txt
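As an alternative to one crontab line per category, the nightly learning could be collected in a small script that cron calls once. The sketch below assumes each category file is named after the mailbox it is learned from, as in the crontab entries above. The function name relearn_all and the variables MAILBOXES and DBACL are illustrative and not part of dbacl.

```shell
#!/bin/sh
# Defaults follow the paths used in the crontab entries above; all three
# variables may be overridden from the environment.
CATS=${CATS:-$HOME/.dbacl}
MAILBOXES=${MAILBOXES:-$HOME/mail}
DBACL=${DBACL:-dbacl}

# relearn_all NAME...: relearn each category NAME from the mailbox of the
# same name, stopping at the first failure.
relearn_all() {
    for name in "$@"; do
        "$DBACL" -T email -H 18 -l "$CATS/$name" "$MAILBOXES/$name" || return 1
    done
}

# Typical use at the end of a script started by cron:
# relearn_all spam notspam
```

Deriving both paths from the same name means a category cannot accidentally be learned from the wrong mailbox.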
The second issue is how to invoke dbacl(1) and what to do with its classification.
Many UNIX systems offer procmail(1)
for email filtering. procmail(1) can pipe a copy of each incoming email into dbacl(1), and use the
resulting category name to write the message directly to the appropriate mailbox.
To use procmail, first verify that the file
$HOME/.forward exists and contains the single line:
|/usr/bin/procmail
Next, create the file $HOME/.procmailrc and make sure it contains something like this:
PATH=/bin:/usr/bin:/usr/local/bin
SHELL=/bin/bash
MAILDIR=$HOME/mail
DEFAULT=$MAILDIR/inbox
#
# this line runs the spam classifier
#
:0
YAY=| dbacl -vT email -c $HOME/.dbacl/spam -c $HOME/.dbacl/notspam
#
# this line writes the email to your mail directory
#
:0:
* ? test -n "$YAY"
# if you prefer to write the spam status in a header,
# comment out the first line and uncomment the second
$MAILDIR/$YAY
#| formail -A "X-DBACL-Says: $YAY" >>$DEFAULT
#
# last rule: put mail into mailbox
#
:0:
$DEFAULT
The above script will automatically file your incoming email into one of two folders named $HOME/mail/spam and $HOME/mail/notspam respectively. If you have a POP account and your mail reader contacts your ISP directly, this won't work, because the mail never passes through procmail(1). In that case, try using fetchmail(1) to retrieve your mail.
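Before trusting the recipe with live mail, it can help to check where a saved message would end up. The following sketch mirrors the decision made in the .procmailrc above: if the classifier prints a category name, the message goes to the mailbox of that name, otherwise to the inbox. The function name destination_for and the variables CLASSIFY and MAILBOXES are illustrative. The sketch only prints a path and delivers nothing.

```shell
#!/bin/sh
# CLASSIFY is the classifier command line; the default is the dbacl
# invocation from the .procmailrc above. It is split on whitespace, so
# this sketch assumes the paths contain no spaces.
CLASSIFY=${CLASSIFY:-"dbacl -vT email -c $HOME/.dbacl/spam -c $HOME/.dbacl/notspam"}
MAILBOXES=${MAILBOXES:-$HOME/mail}

# destination_for: read one email on stdin and print the mailbox the
# recipe would choose for it.
destination_for() {
    # Only the printed category name is used; the exit status of the
    # classifier is deliberately ignored.
    category=$($CLASSIFY) || true
    if [ -n "$category" ]; then
        echo "$MAILBOXES/$category"
    else
        echo "$MAILBOXES/inbox"
    fi
}

# Example: destination_for < some_saved_message
```

Running this over a few messages whose category you already know gives a quick sanity check of the category files before procmail starts filing mail for you.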