Locally Isometric and Conformal Low-dimensional Data Representation in Data Analysis

Prof. Alex Bernshtein (speaker), Alexander Kuleshov,
Yuri Yanovich

National Research University Higher School of Economics, Russia
In many data analysis tasks, one deals with data presented in high-dimensional spaces. However, real-world high-dimensional data typically satisfy the manifold assumption: they lie on or near an unknown low-dimensional manifold (the data manifold) embedded in an ambient high-dimensional “observation” space. Non-linear machine learning problems studied under this assumption are usually referred to as manifold learning problems.
In practice, the original high-dimensional data are transformed into lower-dimensional representations (features) that preserve certain subject-driven data properties. The constructed low-dimensional representations can then be handled efficiently, avoiding the “curse of dimensionality”. Manifold learning provides a mathematical foundation for such procedures.
Preserving the distances and angles between the original high-dimensional vectors and their features is an important and desirable property of the representation. However, an exact isometric mapping onto a low-dimensional vector space exists only when the data manifold is flat, i.e., has zero sectional curvature.
We propose a new geometrically motivated locally isometric and conformal representation method that preserves distances and angles between nearby points; the method employs a data-based construction of smooth orthonormal tangent vector fields on the data manifold. In numerical experiments, the proposed method compares favorably with popular manifold learning methods, both in embedding isometry and in the accuracy of reconstructing the data manifold from the sample.
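The abstract does not specify how the orthonormal tangent bases are estimated from data; a common building block in locally isometric methods is local PCA over nearest neighbors. The following is an illustrative sketch of that standard step only (not the authors' algorithm): it estimates an orthonormal basis of the tangent space at a sample point from the top principal directions of its neighborhood.

```python
# Illustrative sketch (assumed technique, not the authors' method):
# estimating an orthonormal tangent basis at a sample point via local PCA.
import numpy as np

def tangent_basis(X, i, k, d):
    """Orthonormal basis of the estimated d-dimensional tangent space at
    X[i], from the top-d principal directions of its k nearest neighbors."""
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = X[np.argsort(dists)[:k + 1]]        # the point itself + k neighbors
    centered = nbrs - nbrs.mean(axis=0)
    # SVD of the centered neighborhood; rows of Vt are principal directions
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:d].T                            # columns form an orthonormal basis

# Points sampled on a 2-D manifold (unit sphere) embedded in R^3
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)

Q = tangent_basis(X, 0, k=20, d=2)
print(Q.shape)                                 # (3, 2)
print(np.allclose(Q.T @ Q, np.eye(2)))         # True: orthonormal columns
```

A full locally isometric embedding would additionally align such bases into smooth tangent vector fields across neighborhoods, which is the non-trivial part of the proposed construction.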