# Course objectives

- Introducing the main approaches to defining the notions of information and of the quantity of information: the combinatorial approach (Hartley information), the probabilistic approach (Shannon entropy) and the algorithmic approach (Kolmogorov complexity).
- Studying the properties of these notions.
- Applying information theory in different areas: proving upper and lower bounds on the number of operations (in different contexts), coding and information transmission, and learning theory.

# Tentative contents

- Hartley information (the binary logarithm of the number of outcomes).
- Applying Hartley information to sorting and other comparison-based problems: the n log n lower bound on the number of comparisons needed to sort n numbers, and lower bounds on the number of comparisons needed to find the fake coin (which is lighter or heavier than the genuine ones) among several coins.
- The informational approach in communication complexity: the rectangle method.
- Informational analysis in the logic of knowledge (the problem of the dirty wise men, the problem of two-sided cards, the problem of gentlemen's hats).
- Probability distributions over an alphabet (random variables taking values in a finite set). Variable-length uniquely decodable codes and prefix codes. Expected length of code words.
- Shannon entropy and its relation to the minimal expected length of a prefix code. Shannon-Fano codes.
- The Kraft-McMillan theorem and the lower bound on the expected length of any uniquely decodable code.
- Real texts modeled as Markov chains of small order, and an upper bound on the quantity of information in them. Redundancy.
- Pairs of jointly distributed random variables taking values in finite sets. The inequality for the Shannon entropy of a pair of random variables.
- Conditional Shannon entropy and its properties.
- Independent random variables and their entropy. The quantity of information in a random variable about another random variable.
- The bit-guessing game on a randomly chosen string. The cost of insider information and Shannon entropy.
- Information inequalities. Basic inequalities and their consequences (Shannon type inequalities).
- Two random variables that take the same value with high probability and Fano's inequality.
- Shannon's noiseless coding theorem.
- Capacity of noisy channels and Shannon's theorem on block codes for noisy channels.
- Information transmission when the receiver has some partial information. The Slepian-Wolf theorem.
- Shannon entropy and mathematical statistics: the information in the data about the model and the minimal sufficient statistic.
- Information compression and universal decompressors. The amount of information in a given text (file) according to Kolmogorov (Kolmogorov complexity).
- Properties of Kolmogorov complexity: complexity does not exceed the length (up to an additive constant), does not increase under algorithmic transformations, and is not computable, though it is upper semi-computable.
- The number of words with low complexity, incompressible words.
- An application of Kolmogorov complexity to one-tape Turing machines: a quadratic lower bound on the number of steps needed to copy a given input string.
- Conditional Kolmogorov complexity. The complexity of a pair of strings and the Kolmogorov-Levin theorem.
- Similarities between Kolmogorov complexity, Shannon entropy and Hartley information. The relation between Kolmogorov complexity and Shannon entropy: with high probability, the Kolmogorov complexity of a word consisting of independent, identically distributed random symbols is close to its Shannon entropy.
- Applications of Kolmogorov complexity in mathematical statistics: separating useful information in the data from accidental noise; algorithmic minimal sufficient statistic.
- R. Solomonoff's theory of predicting the bits of a sequence that is random with respect to an unknown probability distribution; universal predictors.
- PAC learning and the Vapnik-Chervonenkis (VC) dimension.
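
The Hartley-information lower bounds mentioned in the contents (sorting and the fake-coin puzzle) can be sketched in a few lines of Python; the function names here are illustrative, not part of the course material:

```python
import math

# Hartley information of a set of outcomes: the binary log of its size.
def hartley(n_outcomes: int) -> float:
    return math.log2(n_outcomes)

# Sorting n numbers by comparisons: n! possible orderings, and each
# comparison yields at most 1 bit, so at least ceil(log2(n!)) ~ n log n
# comparisons are needed.
def sorting_lower_bound(n: int) -> int:
    return math.ceil(hartley(math.factorial(n)))

# Fake-coin puzzle: n coins, one fake that is lighter or heavier gives
# 2n outcomes; each weighing has 3 outcomes (left, right, balance),
# so at least ceil(log3(2n)) weighings are needed.
def coin_lower_bound(n_coins: int) -> int:
    return math.ceil(math.log(2 * n_coins, 3))

print(sorting_lower_bound(5))   # 7: sorting 5 numbers needs 7 comparisons
print(coin_lower_bound(12))     # 3: the classic 12-coin puzzle needs 3 weighings
```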
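
For the entropy and prefix-code topics, a minimal sketch of the Shannon-Fano idea: assigning each symbol a code length of ceil(-log2 p) satisfies the Kraft inequality, so a prefix code with those lengths exists and its expected length stays below H + 1 (the distribution below is a made-up example):

```python
import math
from fractions import Fraction

# Shannon entropy of a distribution, in bits.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Shannon-Fano-style code lengths: ceil(-log2 p) per symbol.
def code_lengths(probs):
    return [math.ceil(-math.log2(p)) for p in probs]

probs = [0.5, 0.25, 0.125, 0.125]       # example dyadic distribution
H = entropy(probs)
lengths = code_lengths(probs)
# Kraft inequality: sum over symbols of 2^(-length) must be at most 1.
kraft_sum = sum(Fraction(1, 2 ** l) for l in lengths)
expected = sum(p * l for p, l in zip(probs, lengths))

print(H, lengths)          # 1.75 bits; lengths [1, 2, 3, 3]
print(kraft_sum <= 1)      # Kraft inequality holds, so a prefix code exists
print(H <= expected < H + 1)
```

For a dyadic distribution like this one the expected length meets the entropy exactly; in general it exceeds H by less than one bit.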
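
The identities behind conditional entropy and mutual information (the chain rule and the subadditivity inequality for a pair) can be checked numerically on any joint distribution; the one below is a made-up example:

```python
import math

# Joint distribution of (X, Y) as a dict {(x, y): probability}.
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.5}

def H(dist):
    """Shannon entropy of a distribution given as {value: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Marginal distribution of one coordinate of the pair."""
    m = {}
    for xy, p in joint.items():
        m[xy[axis]] = m.get(xy[axis], 0.0) + p
    return m

Hxy = H(joint)
Hx = H(marginal(joint, 0))
Hy = H(marginal(joint, 1))
# Chain rule: H(X, Y) = H(X) + H(Y | X), so H(Y | X) = H(X, Y) - H(X).
Hy_given_x = Hxy - Hx
# Mutual information: I(X; Y) = H(X) + H(Y) - H(X, Y), always nonnegative,
# which is exactly the subadditivity inequality H(X, Y) <= H(X) + H(Y).
I = Hx + Hy - Hxy
```

Independence of X and Y corresponds to I = 0; here I > 0 because the value x = 1 forces y = 0.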
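
Kolmogorov complexity itself is not computable, but the output size of any real compressor is an upper bound on it (up to the constant-size decompressor), which gives a quick illustration of incompressibility: a counting argument shows most strings cannot be compressed, and random bytes behave that way in practice. A small sketch using the standard `zlib` module:

```python
import random
import zlib

random.seed(0)

# A highly regular string: a short program ("repeat 'ab' 5000 times")
# describes it, so its complexity is far below its length.
repetitive = b"ab" * 5000

# Random bytes: with high probability incompressible, since there are
# fewer than 2^n descriptions shorter than n bits.
random_bytes = bytes(random.randrange(256) for _ in range(10000))

c_rep = len(zlib.compress(repetitive, 9))
c_rnd = len(zlib.compress(random_bytes, 9))

print(len(repetitive), c_rep)      # compresses to a tiny fraction
print(len(random_bytes), c_rnd)    # stays near (or above) its original size
```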