Journée Mathematical Foundations of Learning Theory
On Optimal and Universal Estimators in Learning Theory
Vladimir Temlyakov (University of South Carolina), 2 June 2006
This talk addresses some problems of supervised learning. Supervised learning, or learning from examples, refers to a process that uses available data of inputs x_i and outputs y_i, i = 1, ..., m, to build a function that best represents the relation between the inputs x ∈ X and the corresponding outputs y ∈ Y. The goal is to find an estimator f_z, based on the given data z := ((x_1, y_1), ..., (x_m, y_m)), that approximates well the regression function f_ρ (or its projection) of an unknown Borel probability measure ρ defined on Z = X × Y. We assume that the pairs (x_i, y_i), i = 1, ..., m, are independent and distributed according to ρ.
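For orientation, the objects named above can be written out as follows. This is the standard formulation from the learning theory literature, not a quotation from the talk; it assumes that y has a finite second moment under ρ, and the symbols ρ(y | x) (conditional distribution), ρ_X (marginal of ρ on X), and 𝓔 (mean-square error) are notation introduced here.

```latex
% Regression function: the conditional mean of y given x.
f_\rho(x) := \int_Y y \, d\rho(y \mid x)

% Mean-square error of a candidate function f.
\mathcal{E}(f) := \int_Z \bigl(f(x) - y\bigr)^2 \, d\rho

% f_\rho minimizes \mathcal{E}, and the excess error is an L_2(\rho_X) distance.
\mathcal{E}(f) - \mathcal{E}(f_\rho) = \| f - f_\rho \|_{L_2(\rho_X)}^2
```

The last identity is why the quality of an estimator f_z is naturally measured by its L_2(ρ_X) distance to f_ρ.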
There are several important ingredients in the mathematical formulation of this problem. We follow the approach that has become standard in approximation theory and has been used in recent papers. In this approach we first choose a function class W (a hypothesis space H) to work with. After selecting a class W, there are two ways to proceed. The first is based on the idea of studying the approximation of the L_2(ρ_X) projection f_W := (f_ρ)_W of f_ρ onto W. Here, ρ_X is the marginal probability measure of ρ on X. This setting is known as the improper function learning problem or the projection learning problem. In this case we do not assume that the regression function f_ρ comes from a specific (say, smoothness) class of functions. The second is based on the assumption f_ρ ∈ W. This setting is known as the proper function learning problem. For instance, we may assume that f_ρ has some smoothness. We will give some upper and lower estimates in both settings.
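The projection in the first setting can be made explicit as below. This is a sketch under an added assumption, not a statement from the talk: W is taken to be a closed convex subset of L_2(ρ_X), so that the projection exists and is unique, and 𝓔(f) denotes the mean-square error ∫_Z (f(x) − y)^2 dρ (notation introduced here).

```latex
% Projection of the regression function onto W (W closed and convex, by assumption).
f_W := \operatorname*{arg\,min}_{f \in W} \| f - f_\rho \|_{L_2(\rho_X)}

% For every f \in W, the distance to the projection is controlled by the excess error over W.
\| f - f_W \|_{L_2(\rho_X)}^2 \le \mathcal{E}(f) - \mathcal{E}(f_W)

% In the proper setting f_\rho \in W, the projection is f_\rho itself and equality holds.
f_\rho \in W \;\Longrightarrow\; f_W = f_\rho
```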
In the problem of universal estimators we assume that the unknown measure ρ satisfies some conditions. Following the standard approach from nonparametric statistics, we formulate these conditions in the form f_ρ ∈ Θ. Next, we assume that the only a priori information available is that f_ρ belongs to some class Θ, itself unknown, from a known collection {Θ} of classes. We want to build an estimator that approximates f_ρ with an error close to the optimal one for the class Θ. We use the standard method of penalized least squares estimators to construct universal estimators.
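To illustrate the idea of a penalized least squares estimator, here is a minimal sketch, not the construction from the talk. It assumes a nested family of polynomial spaces on a real interval as the candidate hypothesis spaces and a penalty proportional to (dimension × log m) / m; the function name `penalized_least_squares` and the parameters `max_degree` and `penalty_const` are hypothetical choices made for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def penalized_least_squares(x, y, max_degree=8, penalty_const=1.0):
    """Select a polynomial degree by minimizing empirical error plus a penalty.

    Illustrative assumptions (not from the talk): the candidate spaces are
    polynomials of degree n = 0..max_degree, and the penalty for degree n is
    penalty_const * (n + 1) * log(m) / m, where m is the sample size.
    Returns (selected_degree, coefficients in increasing-power order).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    best_score, best_degree, best_coeffs = np.inf, None, None
    for n in range(max_degree + 1):
        # Least squares fit within the space of polynomials of degree <= n.
        coeffs = P.polyfit(x, y, n)
        residuals = y - P.polyval(x, coeffs)
        empirical_error = np.mean(residuals ** 2)
        # The penalty grows with the dimension of the space and shrinks with m.
        penalty = penalty_const * (n + 1) * np.log(m) / m
        score = empirical_error + penalty
        if score < best_score:
            best_score, best_degree, best_coeffs = score, n, coeffs
    return best_degree, best_coeffs
```

The estimator does not need to know in advance which space fits the data: the penalty discourages unnecessarily large spaces, while the empirical error discourages spaces that are too small. This trade-off is the sense in which such an estimator adapts to an unknown class from a known collection.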