The Subspace Information Criterion for Infinite Dimensional Hypothesis Spaces
Masashi Sugiyama, Klaus-Robert Müller
3(Nov):323-359, 2002.
Abstract
A central problem in learning is the selection of an appropriate model. This
is typically done by estimating the unknown generalization error of each
model in a set of candidates and then choosing the model with the minimal
generalization error estimate. In this article, we discuss the problem of
model selection and generalization error estimation in the context of kernel
regression models, e.g., kernel ridge regression, kernel subset regression
or Gaussian process regression.
Previously, a non-asymptotic generalization error estimator called the
subspace information criterion (SIC) was proposed and successfully applied
to finite dimensional subspace models. SIC is an unbiased estimator of the
generalization error for the finite sample case under the conditions that
the learning target function belongs to a specified reproducing kernel
Hilbert space (RKHS) H and the reproducing kernels centered on training
sample points span the whole space H. These conditions hold only if
dim H < l, where l < infinity is the number of training examples.
Therefore, SIC could be applied only to finite dimensional RKHSs.
In this paper, we extend the range of applicability of SIC and show that,
even if the reproducing kernels centered on training sample points do not
span the whole space H, SIC is an unbiased estimator of an essential part of
the generalization error. Our extension allows the use of any RKHS,
including infinite dimensional ones, i.e., richer function classes commonly
used in Gaussian processes, support vector machines or boosting. We further
show that when the kernel matrix is invertible, SIC can be expressed in a
much simpler form, making its computation highly efficient. In computer
simulations on ridge parameter selection with real and artificial data sets,
SIC compares favorably with other standard model selection techniques, for
instance leave-one-out cross-validation or an empirical Bayesian method.
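
To make the ridge parameter selection task concrete, here is a minimal sketch of the leave-one-out cross-validation baseline for kernel ridge regression. It is not SIC and is not taken from the paper. The Gaussian kernel, its width, the candidate grid and all function names (`gaussian_kernel`, `loo_error`, `select_ridge`) are assumptions chosen for illustration. The sketch uses the standard closed-form identity for linear smoothers, so no model is refitted per held-out point.

```python
import numpy as np


def gaussian_kernel(A, B, width=1.0):
    """Gaussian kernel matrix between the rows of A and B (illustrative choice)."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width ** 2))


def loo_error(K, y, ridge):
    """Mean squared leave-one-out error of kernel ridge regression.

    Fitted values are H y with H = (K + ridge * I)^{-1} K, and the
    leave-one-out residual for sample i is (y_i - (H y)_i) / (1 - H_ii).
    """
    n = K.shape[0]
    H = np.linalg.solve(K + ridge * np.eye(n), K)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))


def select_ridge(K, y, candidates):
    """Return the candidate ridge parameter with the smallest LOO error."""
    scores = [loo_error(K, y, r) for r in candidates]
    return candidates[int(np.argmin(scores))], scores


# Hypothetical toy data: noisy samples of a sine curve.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3.0, 3.0, size=(30, 1))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.standard_normal(30)

K_train = gaussian_kernel(X_train, X_train)
grid = [10.0 ** p for p in range(-4, 2)]
best_ridge, loo_scores = select_ridge(K_train, y_train, grid)
print("selected ridge parameter:", best_ridge)
```

Any other generalization error estimate could stand in for `loo_error` in `select_ridge`, which is the role SIC plays in the comparison described above.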