TREC
The (United States) National Institute of Standards and Technology
organises an annual conference on text retrieval called TREC, which in
2005 began a new track on spam filtering. A goal of this conference is to
develop over several years a set of standard methodologies for evaluating
and comparing spam filtering systems.
For 2005, the initial goal is to compare spam filters in a laboratory
environment, not directly connected to the internet. An identical
stream of email messages addressed to a single person is shown in
chronological order to all participating filters, which can learn from
the messages incrementally and must predict the type of each message
(spam or legitimate mail) as it arrives.
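
The evaluation protocol described above can be sketched as a simple loop. This is only an illustration, not the actual TREC evaluation software or its interface: the class TokenCountFilter, the function evaluate_stream, and the sample messages are all hypothetical, and the sketch assumes that the true label of each message is revealed to the filter right after it makes its prediction.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


class TokenCountFilter:
    """Toy incremental filter (hypothetical; not dbacl or any TREC entrant)."""

    def __init__(self) -> None:
        self.counts = {"spam": Counter(), "ham": Counter()}

    def predict(self, text: str) -> str:
        # Score each class by how often it has already seen these tokens.
        tokens = text.lower().split()
        spam_score = sum(self.counts["spam"][t] for t in tokens)
        ham_score = sum(self.counts["ham"][t] for t in tokens)
        return "spam" if spam_score > ham_score else "ham"

    def learn(self, text: str, label: str) -> None:
        # Incremental update: add this message's tokens to its class.
        self.counts[label].update(text.lower().split())


def evaluate_stream(
    filt: TokenCountFilter, stream: Iterable[Tuple[str, str]]
) -> Dict[str, int]:
    """Predict each message as it arrives, then train on its true label."""
    errors = {"ham_as_spam": 0, "spam_as_ham": 0}
    for text, truth in stream:
        guess = filt.predict(text)  # prediction happens before learning
        if guess != truth:
            key = "ham_as_spam" if truth == "ham" else "spam_as_ham"
            errors[key] += 1
        filt.learn(text, truth)  # assumed feedback step
    return errors


# Messages in chronological order, each paired with its true type.
stream = [
    ("cheap pills buy now", "spam"),
    ("meeting notes attached", "ham"),
    ("buy cheap pills today", "spam"),
    ("notes from the meeting", "ham"),
]
result = evaluate_stream(TokenCountFilter(), stream)
print(result)
```

Because every filter sees the same messages in the same order, the error counts produced this way are directly comparable between filters. The two error types are tallied separately since misfiling legitimate mail as spam is usually the more costly mistake.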
The spamjig is the automated system that performs this evaluation. You can
download it yourself and run it on your own email collections to test any
of the participating filters. Special instructions for dbacl can be found
in the TREC subdirectory of the source package. Many other open source
spam filters can also be tested in this framework.