Data Properties Estimation As a Statistical Inverse Problem

Sc.D. Anatoli Michalski
Russia, V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow
Many problems of data analysis are inverse by nature. In physics, astronomy and many other branches of science, one observes a resulting process but is interested in the process that causes the observations. Demography and epidemiology provide examples: one estimates the death rate from longevity statistics, or the infection rate from the number of sick people in the population. In large telecommunication systems such as the Internet, estimating structure and performance by probing methods is an inverse problem as well.
Mathematically, the relationship between the process that generates the observations and the observed process can be expressed as an operator equation. It is well known that most operators arising in practical applications have an unbounded inverse. As a result, even if the disturbance in the observations tends to zero, the disturbance in the solution of the operator equation need not be small. This can lead to a misunderstanding of the data-generating process. Proper data analysis techniques help to clarify the intrinsic features of empirical data.
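The effect of an unbounded inverse can be illustrated with a textbook case that is not taken from the report itself: the integration operator on [0, 1], whose inverse is differentiation. The symbols A, f, u and the perturbation below are assumptions chosen for illustration only.

```latex
% Operator equation: f is the unknown generating process, u the observed one.
A f = u, \qquad (A f)(t) = \int_0^t f(s)\,ds, \quad 0 \le t \le 1 .

% Perturb the observation by a small, rapidly oscillating term:
u_\delta(t) = u(t) + \delta \sin\!\left(\frac{t}{\delta^{2}}\right),
\qquad
\| u_\delta - u \|_{C[0,1]} \le \delta \;\longrightarrow\; 0 .

% The corresponding solutions f = u' and f_\delta = u_\delta' differ by
f_\delta(t) - f(t) = \frac{1}{\delta} \cos\!\left(\frac{t}{\delta^{2}}\right),
\qquad
\| f_\delta - f \|_{C[0,1]} = \frac{1}{\delta} \;\longrightarrow\; \infty
\quad (\delta \to 0).
```

The error in the observation vanishes in the uniform norm while the error in the solution grows without bound, which is exactly the situation described above.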
This report considers the solution of an integral equation when both the observed function and the kernel of the equation are estimated from empirical data. An example of such a problem is failure rate estimation.
In this case, one observes an independent sample of failure times. The cumulative probability function of the failure time forms both the observed function and the kernel of an integral equation whose solution is the failure rate function. The solution of the approximate equation, obtained by replacing the cumulative probability function with the empirical cumulative probability function, is discussed and results are presented.
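The following is a minimal sketch of this setting, not the method of the report. It assumes the integral equation is written in the common form F(t) = ∫₀ᵗ (1 − F(s)) λ(s) ds, where F is the cumulative probability function and λ the failure rate; the report does not state the equation explicitly. Replacing F by the empirical function F_n gives a step-function solution for the cumulative rate Λ_n, which for uncensored data coincides with what is usually called the Nelson-Aalen estimator. The sketch then differentiates Λ_n naively to show how unstable the failure rate itself is. All function names, the sample size, the exponential test distribution and the bin widths are illustrative assumptions.

```python
import bisect
import random


def cumulative_hazard_steps(times):
    """Return sorted failure times and Lambda_n just after each of them.

    Assumed form of the approximate equation:
        F_n(t) = sum over failures s <= t of (1 - F_n(s-)) * dLambda(s).
    The i-th ordered failure then contributes
        (1/n) / (1 - (i-1)/n) = 1 / (n - i + 1).
    """
    ts = sorted(times)
    n = len(ts)
    values, total = [], 0.0
    for i in range(n):  # n - i units are still at risk before this failure
        total += 1.0 / (n - i)
        values.append(total)
    return ts, values


def cumulative_hazard(ts, values, t):
    """Evaluate the step function Lambda_n at time t."""
    k = bisect.bisect_right(ts, t)  # number of failures <= t
    return values[k - 1] if k > 0 else 0.0


def naive_rate(ts, values, t_max, h):
    """Difference quotients of Lambda_n on a grid of width h over [0, t_max]."""
    m = int(round(t_max / h))
    return [
        (cumulative_hazard(ts, values, (j + 1) * h)
         - cumulative_hazard(ts, values, j * h)) / h
        for j in range(m)
    ]


# Illustrative data: exponential failure times with constant true rate 1.
random.seed(0)
TRUE_RATE = 1.0
sample = [random.expovariate(TRUE_RATE) for _ in range(2000)]
ts, values = cumulative_hazard_steps(sample)

hazard_at_1 = cumulative_hazard(ts, values, 1.0)
fine = naive_rate(ts, values, 1.0, 0.001)   # almost direct differentiation
coarse = naive_rate(ts, values, 1.0, 0.25)  # heavy smoothing by wide bins
max_err_fine = max(abs(r - TRUE_RATE) for r in fine)
max_err_coarse = max(abs(r - TRUE_RATE) for r in coarse)

print(f"Lambda_n(1) = {hazard_at_1:.3f} (true value 1.0)")
print(f"max rate error, h = 0.001: {max_err_fine:.2f}")
print(f"max rate error, h = 0.25:  {max_err_coarse:.2f}")
```

The cumulative quantity Λ_n is estimated stably, but its difference quotient with a small step reproduces the ill-posedness of differentiation: any bin without a failure yields a rate of zero. The wide bins act as a crude regularization, trading resolution for stability.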