Algorithmic Luckiness
Ralf Herbrich, Robert C. Williamson; 3(Sep):175-212, 2002.
Abstract
Classical statistical learning theory studies the generalisation
performance of machine learning algorithms rather indirectly. One
of the main detours is that algorithms are studied in terms of the
hypothesis class from which they draw their hypotheses. In this paper,
motivated by the luckiness framework of Shawe-Taylor et al. (1998), we
study learning algorithms more directly and in a way that allows us
to exploit the serendipity of the training sample. The main difference
from previous approaches lies in the complexity measure: rather than
covering all hypotheses in a given hypothesis space, it is only necessary
to cover the functions that could have been learned by the fixed
learning algorithm. We show how the resulting framework relates to
the VC, luckiness and compression frameworks. Finally, we present
an application of this framework to the maximum margin algorithm for
linear classifiers, which yields a bound that exploits the margin,
the sparsity of the resulting weight vector, and the degree of clustering
of the training data in feature space.
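To make the covering idea concrete, here is a minimal, hypothetical sketch (not code from the paper): a fixed one-dimensional threshold-learning algorithm is run on every size-m subsample of a double sample, and we count how many distinct behaviours on that double sample it can actually produce. Only these behaviours need to be covered, rather than all 2^{2m} possible labellings.

```python
from itertools import combinations

import numpy as np

def learn_threshold(sample):
    """Hypothetical stand-in for a fixed learning algorithm on the line:
    returns the midpoint between the largest negative and smallest
    positive training point (0.0 if one class is absent)."""
    pos = [x for x, y in sample if y > 0]
    neg = [x for x, y in sample if y < 0]
    if not pos or not neg:
        return 0.0
    return (min(pos) + max(neg)) / 2.0

m = 4
rng = np.random.default_rng(1)
xs = np.sort(rng.uniform(-1.0, 1.0, size=2 * m))
ys = np.where(xs >= 0, 1, -1)  # labels from a true threshold at 0
double_sample = list(zip(xs, ys))

# Behaviours on the double sample of hypotheses the algorithm can output
# when trained on any size-m subsample.
patterns = {
    tuple(np.where(xs >= learn_threshold([double_sample[i] for i in idx]), 1, -1))
    for idx in combinations(range(2 * m), m)
}
print(f"{len(patterns)} distinct behaviours, versus 2**{2 * m} = {2 ** (2 * m)} a priori")
```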
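Similarly, two of the quantities entering the maximum margin bound can be read off a trained linear classifier. Below is a minimal sketch, assuming scikit-learn and synthetic data (both our additions, not the paper's): the geometric margin and the sparsity of the weight vector, measured by the number of support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic linearly separable data (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+2.0, 0.5, size=(20, 2)),
               rng.normal(-2.0, 0.5, size=(20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])

# Maximum margin (hard-margin) classifier, approximated with a large C.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_.ravel()
margin = 1.0 / np.linalg.norm(w)   # geometric margin (functional margin is 1)
n_sv = clf.support_.size           # sparsity: number of support vectors

print(f"geometric margin: {margin:.3f}")
print(f"support vectors:  {n_sv} of {len(X)} training points")
```

The third quantity in the bound, the degree of clustering of the training data in feature space, would require an additional estimate (e.g. via covering numbers of the sample) and is omitted from this sketch.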