Feature Selection by Conditional Distribution Contrasting

Dr. Varvara Tsurko
Russia, V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow
This presentation investigates the feature selection problem for two classes. Each object x is an n-dimensional vector whose coordinates are features, and it belongs to one of two classes indicated by the class label y ∈ {0, 1}.
The classes are described by the conditional distributions of the features. The goal of the proposed feature selection algorithm is to find the set of features for which the class-conditional distributions differ the most.
The difference between the distributions is measured by the symmetrized Kullback-Leibler divergence between the conditional density functions of the two classes, p(x|y = 0) and p(x|y = 1). We seek the set of features for which this divergence is maximal.
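For reference, the symmetrized divergence can be written out as follows. The notation S (a candidate feature subset), x_S (the restriction of x to S) and J(S) is introduced here for illustration only and does not come from the abstract.

```latex
\[
\begin{aligned}
J(S) &= \int p(x_S \mid y=0)\,\ln\frac{p(x_S \mid y=0)}{p(x_S \mid y=1)}\,dx_S
      + \int p(x_S \mid y=1)\,\ln\frac{p(x_S \mid y=1)}{p(x_S \mid y=0)}\,dx_S \\
     &= \int \bigl(p(x_S \mid y=0) - p(x_S \mid y=1)\bigr)\,
        \ln\frac{p(x_S \mid y=0)}{p(x_S \mid y=1)}\,dx_S .
\end{aligned}
\]
```

The second line shows that J(S) is non-negative and equals zero only when the two conditional densities coincide almost everywhere on S, which is why a larger value indicates features that separate the classes better.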
The density functions are unknown, but they can be estimated by solving Fredholm integral equations. We use the method presented in [3], where both the right-hand side of the equation and the operator are known only approximately, and the Stefanyuk-Vapnik theory is used to solve the equation.
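As an illustration of the kind of equation involved, the one-dimensional density estimation problem can be posed as below. The abstract does not state which exact equations are solved, so this is a sketch under that assumption; the symbols F, F_ℓ, θ, R and the sample x_1, ..., x_ℓ are notation chosen here.

```latex
\[
\int_{-\infty}^{x} p(t)\,dt = F(x),
\qquad
F(x) \approx F_\ell(x) = \frac{1}{\ell}\sum_{i=1}^{\ell}\theta(x - x_i),
\qquad
\theta(u) = \begin{cases} 1, & u \ge 0,\\ 0, & u < 0. \end{cases}
\]
\[
\int \theta(x - t)\,R(t)\,dF_{1}(t) = F_{0}(x),
\qquad
R(t) = \frac{p(t \mid y=0)}{p(t \mid y=1)} .
\]
```

In the first equation only the right-hand side is replaced by its empirical estimate. The second equation is a density ratio form of the type treated in [3]: there the operator itself depends on the unknown distribution function F_1, so it is also replaced by an empirical counterpart, which matches the situation where both the operator and the right-hand side are approximate.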
The other approach is to formalize the difference as an average risk functional whose maximization is equivalent to the maximization of the Kullback-Leibler divergence. This functional was proposed in [2] and is estimated from empirical data using data-dependent bounds [1].
The author uses both approaches to estimate the difference between the two distributions, compares the results, and proposes a feature search procedure.
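The general idea of scoring features by a symmetrized divergence can be sketched in a few lines. The sketch below is not the author's procedure: it swaps in a much simpler estimator that fits a univariate Gaussian to each feature in each class and assumes the features are independent, so that the divergence of a subset is the sum of per-feature terms and ranking by per-feature score suffices. The function names `symmetrized_kl_gaussian` and `rank_features`, the parameter `k`, and the synthetic data are all hypothetical.

```python
import random
from statistics import fmean, pvariance


def symmetrized_kl_gaussian(a, b, eps=1e-12):
    """Symmetrized KL divergence between univariate Gaussians fitted to samples a and b.

    Assumption: each class-conditional density is Gaussian (not the estimator
    described in the abstract). eps guards against zero variance.
    """
    m0, m1 = fmean(a), fmean(b)
    v0 = pvariance(a, m0) + eps
    v1 = pvariance(b, m1) + eps
    # KL(N0||N1) + KL(N1||N0); the log-variance terms cancel in the sum.
    return 0.5 * (v0 / v1 + v1 / v0 - 2.0 + (m0 - m1) ** 2 * (1.0 / v0 + 1.0 / v1))


def rank_features(X, y, k):
    """Return the indices of the k highest-scoring features and all scores.

    X is a list of feature vectors, y a list of labels in {0, 1}.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        col0 = [row[j] for row, label in zip(X, y) if label == 0]
        col1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(symmetrized_kl_gaussian(col0, col1))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:k], scores


# Synthetic example: feature 0 shifts its mean between classes, feature 1 is noise.
rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(200):
        X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)

selected, scores = rank_features(X, y, k=1)
print(selected, [round(s, 3) for s in scores])
```

Under the independence assumption the search reduces to a ranking. Without it, the divergence of a subset is no longer additive, and a search over subsets (for example, greedy forward selection) would be needed together with a multivariate density or risk estimate.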
  1. Koltchinskii V. Rademacher penalties and structural risk minimization // IEEE Transactions on Information Theory (1999).
  2. Tsurko V., Michalski A. Feature Selection by Distributions Contrasting // Artificial Intelligence: Methodology, Systems and Applications, G. Agre, P. Hitzler, A. A. Krisnadhi, S. O. Kuznetsov (eds.), LNAI 8722. Springer-Verlag. pp. 139–149 (2014).
  3. Vapnik V., Izmailov R. Statistical Inference Problems and Their Rigorous Solutions // Statistical Learning and Data Sciences, A. Gammerman, V. Vovk, H. Papadopoulos (eds.), LNAI 9047. Springer-Verlag. pp. 33–71 (2015).