Global and Local Complexity Measures for Transductive Learning

Ilya Tolstikhin
Germany, Max Planck Institute for Intelligent Systems, Department of Empirical Inference
Transductive learning considers situations when a learner observes labelled training points together with unlabelled test points with the final goal of giving correct answers for the test points. These types of problems naturally appear in many modern applications, including text mining, recommender systems, and computer vision, where the objects to be classified are often available beforehand.
One common way to model a data generating process in transductive learning is to assume that the training points are sampled uniformly without replacement from a fixed finite population of examples. This breaks
an independence assumption, which is in the core of a standard inductive i.i.d. learning setting, and makes objects in the training set interdependent. As a result, one can not straightforwardly apply a number of key tools used in analysis of the i.i.d. learning setting, including numerous concentration inequalities and standard results of empirical process theory.
In this talk we will give a short overview of techniques for deriving risk bounds in transductive learning. Following a classic VC-approach, we will base our study of learning algorithms on the analysis of empirical processes. However, in contrast to i.i.d. learning, these processes will be re- lated to sampling without replacement. First we will discuss concentration inequalities, available in this setting, including versions of Hoeffding’s, McDiarmid’s, and Talagrand’s inequalities. Next we will use these results to link the performance of a transductive learning algorithm to various complexity measures of hypothesis classes, such as a standard i.i.d. Rademacher complexity, certain localized complexity measures, and a recently introduced permutational Rademacher complexity. During the talk we will discuss both a worst-case analysis, when nothing is known about the problem at hand, and a refined analysis based on different “niceness” assumptions.