Introduction
dbacl is a UNIX/POSIX command line
toolset which can be used in scripts to classify a single email among one
or more previously learned categories.
dbacl(1) supports several different statistical models and several different tokenization
schemes, which can be adjusted to trade speed and memory performance for
statistical sophistication. dbacl(1) also permits the user to select cost weightings
for different categories, thereby permitting simple adjustments to the
type I and type II errors (a.k.a. false positives and false negatives). Finally,
dbacl(1) can print confidence percentages which help decide when
a decision is ambiguous.
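To make the effect of cost weightings concrete, here is a minimal sketch of a minimum expected cost decision. It is an illustration of the general idea only, not a description of how dbacl(1) or bayesol(1) are implemented; the function name, the category names and all numbers below are hypothetical.

```python
def min_cost_category(posteriors, costs):
    """Pick the category with the smallest expected misclassification cost.

    posteriors: dict mapping category -> probability the message belongs to it
    costs: dict mapping (decided, actual) -> cost of deciding `decided`
           when the true category is `actual`; missing pairs cost 0
    (Hypothetical helper for illustration, not part of the dbacl suite.)
    """
    def expected_cost(decided):
        return sum(costs.get((decided, actual), 0.0) * p
                   for actual, p in posteriors.items())
    return min(posteriors, key=expected_cost)


# Made-up posterior probabilities for a single message.
posteriors = {"spam": 0.7, "notspam": 0.3}

# Equal costs: the most probable category wins.
equal = {("spam", "notspam"): 1.0, ("notspam", "spam"): 1.0}

# Losing a good message (false positive) is ten times worse than
# letting a spam through (false negative).
cautious = {("spam", "notspam"): 10.0, ("notspam", "spam"): 1.0}

print(min_cost_category(posteriors, equal))
print(min_cost_category(posteriors, cautious))
```

With equal costs the message is filed as spam, but once false positives are made ten times as expensive the same probabilities lead to the safer notspam decision. Raising one cost lowers one type of error at the expense of the other.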
dbacl(1) is a general purpose text classifier which can understand email message formats.
The tutorial explains general classification only. It is worth reading (of course :-), but doesn't describe the extra steps necessary
to enable the email functionality. This document describes the necessary switches and caveats from first principles.
You can learn more about the dbacl suite of utilities (e.g. dbacl, bayesol,
mailinspect, mailcross) by typing, for example:
% man dbacl