Support Vector Machine Active Learning with Applications to Text Classification
Simon Tong, Daphne Koller; 2(Nov):45-66, 2001.
Abstract
Support vector machines have met with significant success in numerous
real-world learning tasks. However, like most machine learning
algorithms, they are generally applied using a randomly selected
training set classified in advance. In many settings, we also have
the option of using pool-based active learning. Instead of using a
randomly selected training set, the learner has access to a pool of
unlabeled instances and can request the labels for some number of
them. We introduce a new algorithm for performing active learning
with support vector machines, i.e., an algorithm for choosing which
instances to request next. We provide a theoretical motivation for the
algorithm using the notion of a version space. We present
experimental results showing that employing our active learning method
can significantly reduce the need for labeled training instances in
both the standard inductive and transductive settings.
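
The pool-based loop described above can be illustrated with a minimal
sketch. The abstract does not state the selection rule, so the code
assumes one common margin-based heuristic: query the unlabeled instance
that lies closest to the current decision boundary. The trainer is a
small subgradient-descent linear SVM written for illustration only, and
all names (train_linear_svm, query_closest_to_boundary, active_learn,
oracle) are hypothetical rather than taken from the paper.

```python
def decision_value(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Illustrative linear SVM: subgradient descent on the L2-regularized
    hinge loss. Labels must be +1 or -1. Not a tuned or exact solver."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_value(w, b, xi)
            # Shrink the weights (regularization step).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def query_closest_to_boundary(w, b, pool, candidates):
    """Assumed selection rule: pick the candidate with the smallest
    absolute decision value, i.e. the one nearest the hyperplane."""
    return min(candidates, key=lambda i: abs(decision_value(w, b, pool[i])))


def active_learn(pool, oracle, seed_indices, n_queries):
    """Pool-based active learning loop.

    pool         -- list of feature vectors (the unlabeled pool)
    oracle       -- callable mapping a pool index to its label (+1 / -1)
    seed_indices -- initially labeled indices (should cover both classes)
    n_queries    -- number of additional labels to request
    """
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(n_queries):
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        idx = sorted(labeled)
        w, b = train_linear_svm([pool[i] for i in idx],
                                [labeled[i] for i in idx])
        pick = query_closest_to_boundary(w, b, pool, candidates)
        labeled[pick] = oracle(pick)
    # Retrain on everything labeled so far and return the final model.
    idx = sorted(labeled)
    w, b = train_linear_svm([pool[i] for i in idx],
                            [labeled[i] for i in idx])
    return w, b, labeled
```

A possible use: build a pool of feature vectors, wrap the human
annotator (or a held-back label array) as the oracle, and call
active_learn(pool, oracle, seed_indices=[0, 10], n_queries=4). Replacing
query_closest_to_boundary with a uniformly random choice gives the
randomly selected training set that the abstract uses as its baseline.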