Prior Knowledge and Preferential Structures in Gradient Descent Learning Algorithms
Robert E. Mahony, Robert C. Williamson;
1(Sep):311-355, 2001.
Abstract
A family of gradient descent algorithms for learning linear
functions in an online setting is considered. The family
includes the classical LMS algorithm as well as new variants such
as the Exponentiated Gradient (EG) algorithm due to Kivinen and
Warmuth. The algorithms are based on prior distributions defined
on the weight space. Techniques from differential geometry are
used to develop the algorithms as gradient descent iterations
with respect to the natural gradient in the Riemannian structure
induced by the prior distribution. The proposed framework
subsumes the notion of "link-functions".
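
To make the contrast between the classical LMS algorithm and the Exponentiated Gradient (EG) algorithm concrete, here is a minimal sketch of the two online update rules for squared loss. This is an illustrative reconstruction of the standard updates (LMS is additive; EG, due to Kivinen and Warmuth, is multiplicative with renormalization onto the probability simplex), not code from the paper; the step size `eta` and the helper names are assumptions.

```python
import numpy as np

def lms_step(w, x, y, eta=0.1):
    # Classical LMS: additive gradient step on the squared loss
    # (1/2)(w . x - y)^2, i.e. w <- w - eta * (w . x - y) * x
    return w - eta * (w @ x - y) * x

def eg_step(w, x, y, eta=0.1):
    # Exponentiated Gradient (Kivinen & Warmuth): multiplicative
    # update with the same loss gradient in the exponent, followed
    # by renormalization so the weights stay on the simplex
    v = w * np.exp(-eta * (w @ x - y) * x)
    return v / v.sum()
```

For example, starting from uniform weights `w = [1/3, 1/3, 1/3]` and observing `(x, y) = ([1, 0, 0], 1)`, both updates shift mass toward the first coordinate; EG does so while keeping the weights a probability vector, which reflects the different prior structure on the weight space discussed in the paper.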