I am posting this while using free wi-fi on board a VIA Rail train (hooray for socialism!). Yesterday I gave a talk at McGill (The Harvard of Canada!) on the fundamental limitations of adaptive dynamical systems, and now I am on my way to do the same at Queen’s University. Regular posts will resume in November.
This passage from “The Glivenko-Cantelli problem, ten years later” by Michel Talagrand (J. Theoretical Probability, vol. 9, no. 2, pp. 371-384, 1996) will most likely be remembered forever as the best example of wry self-deprecating wit in an academic paper:
Over 10 years ago I wrote a paper that describes in great detail Glivenko-Cantelli classes. Despite the fact that Glivenko-Cantelli classes are certainly natural and important, this paper apparently has not been understood. The two main likely reasons are that the proofs are genuinely difficult; and that the paper displays bad taste of monumental proportion, in the sense that a lot of energy is devoted to extremely arcane measurability questions, which increases the difficulty of the proofs even more.
This is the first post in a series about the versatility and the power of the good old Kullback-Leibler divergence.
Today I will describe a beautiful application of the divergence to the problem of determining mixing rates of finite Markov chains. This argument is quite recent, and comes from a nice paper by Sergey Bobkov and Prasad Tetali (“Modified logarithmic Sobolev inequalities in discrete settings,” Journal of Theoretical Probability, vol. 19, no. 2, pp. 289-336, 2006; preprint). Since my interest here is information-theoretic, I will take for granted certain facts from the theory of finite Markov chains.
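Before diving into the argument, here is a quick numerical illustration of the phenomenon in question: for an irreducible, aperiodic finite chain, the divergence between the time-\(t\) distribution and the stationary distribution is nonincreasing in \(t\) (a consequence of the data processing inequality) and tends to zero. The 3-state transition matrix below is made up purely for illustration; any ergodic chain would do.

```python
import numpy as np

# A hypothetical 3-state transition matrix (rows sum to 1),
# chosen only for illustration.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Stationary distribution pi: the left eigenvector of P with eigenvalue 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()

def kl(p, q):
    """Kullback-Leibler divergence D(p || q), in nats."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

mu = np.array([1.0, 0.0, 0.0])   # start the chain in state 0
divs = []
for t in range(10):
    divs.append(kl(mu, pi))
    mu = mu @ P                  # one step of the chain: mu_{t+1} = mu_t P

# D(mu_t || pi) is nonincreasing in t, since pi P = pi and the KL
# divergence contracts under a common Markov kernel.
assert all(divs[t + 1] <= divs[t] + 1e-12 for t in range(9))
```

The rate at which `divs` shrinks is precisely what a (modified) logarithmic Sobolev inequality controls.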
It’s time to fire up the Shameless Self-Promotion Engine again, for I am about to announce a preprint and a paper to be published. Both deal with more or less the same problem — i.e., fundamental limits of certain sequential procedures — and both rely on the same set of techniques: metric entropy, Fano’s inequality, and bounds on the mutual information through divergence with auxiliary probability measures.
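For readers who want the shape of these techniques in one place, here are the two key inequalities in their standard textbook forms (the notation is generic, not taken from the papers above). Fano's inequality lower-bounds the error probability of identifying a uniformly distributed \(X\) on \(M\) alternatives from an estimate \(\hat{X}\), and the mutual information admits a variational upper bound through an arbitrary auxiliary measure \(Q\):

```latex
% Fano's inequality for X uniform on M alternatives:
\[
\Pr\{\hat{X} \neq X\} \;\ge\; 1 - \frac{I(X;\hat{X}) + \log 2}{\log M}
\]

% Mutual information as a minimum over auxiliary measures Q,
% which yields an upper bound for any particular choice of Q:
\[
I(X;Y) \;=\; \min_{Q}\, \mathbb{E}_X\!\left[ D\!\left(P_{Y|X} \,\|\, Q\right) \right]
\;\le\; \mathbb{E}_X\!\left[ D\!\left(P_{Y|X} \,\|\, Q\right) \right]
\quad \text{for any } Q.
\]
```

Choosing \(Q\) cleverly in the second display is what makes the mutual information in Fano's inequality tractable.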
So, without further ado, I give you: