Can a Bayesian spam filter play chess?
by Laird A. Breyer
Introduction
Many people these days depend on Bayesian filters to
protect them from the ever-present email scourge that is spam.
Unlike older technologies, these programs' claim to fame is that
they learn the spam patterns automatically, and more importantly,
learn personalized spam (bad) and ham (good) email patterns.
Like many others, I wrote a Bayesian filter, which I called dbacl,
to protect me from unwanted email.
My implementation functions as a Unix command line text classifier,
with special email support, and can be used with procmail.
People are often astonished at how well statistical mail filtering works
after they first try it, and it's tempting to imagine that such programs
actually understand the emails being delivered, rather than merely matching
patterns.
Now chess has always been a popular gauge of intelligence that everyone
can understand, so if we put all these ideas together,
then the question "Can a Bayesian spam filter play chess?"
seems like a fun experiment with a lot of appeal.
Let's put down some ground rules: This experiment will test a real spam
filter, not a specially designed chess program. It won't aim to beat
Deep Thought (I wouldn't know where to start, and I have a feeling this
could be difficult anyway ;-), but it will aim to show signs of
"intelligence", or we won't claim success.
Finally, since dry tables and graphs are no fun, a theoretical proof of
concept is not enough: the spam filter must really play chess
in a way that everyone can see, and try out at home.
The account below is designed so that you can follow and reproduce
it yourself. All you need is a Unix-compatible computer. You'll have
to open a terminal
and be ready to type shell commands. All shell commands below are preceded
by % to indicate the prompt, but you don't type the '%'.
Instructions are fairly detailed, and
various scripts can be downloaded when needed, but it helps if you're
familiar with the shell. Ask a friend if you need help.
Important: You must follow these instructions if you want to actually play chess
against the spam filter. You must also download some training games and
teach the filter beforehand. Running the scripts alone is not enough.
The instructions have been tested and work correctly on a GNU system with
the bash shell.
Start by making a directory to keep all the
workings in one place.
% mkdir chess
% cd chess
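If you come back to the experiment later, the plain mkdir above will complain that the directory already exists. A minimal sketch of a repeatable variant, assuming a POSIX shell (the -p option and the final pwd check are my additions, not part of the original instructions, and the % prompt is left out so the lines can be pasted as they are):

```shell
# Create the working directory if it is missing, and enter it only on success.
# -p makes mkdir succeed quietly when chess already exists.
mkdir -p chess && cd chess
# Print the current directory to confirm where the following steps will run.
pwd
```

Either form leaves you inside the chess directory, which is where the rest of the account assumes you are.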