The Information Structuralist

ISIT 2011: plenaries, the Shannon lecture

Posted in Conference Blogging, Information Theory, Mathematics by mraginsky on August 30, 2011

Better late than never, right? Besides, Anand all but made sure that I would blog about it eventually.

I attended this year’s Shannon lecture and all the plenaries except Wojtek Szpankowski‘s talk on the information theory of algorithms and combinatorics (I bravely fought the jetlag and lost), but you can watch it here. Here are, shall we say, some impressionistic sketches based on the notes I took.

The plenaries

Erdal Arikan, Polar Coding: Status and Prospects
Arikan’s award-winning paper on channel polarization introduced a new way of thinking about and constructing capacity-achieving codes for a class of binary-input channels satisfying a certain symmetry condition. The main idea underlying these polar codes is that, using an encoder with a particular recursive structure and a successive-cancellation decoder, it is possible to transform n parallel uses of such a channel with capacity C — which lies between 0 and 1 and, owing to symmetry, is achieved by equiprobable inputs — into (roughly) nC uses of a nearly noiseless binary-input channel, (roughly) n(1-C) uses of a nearly useless (zero-capacity) binary-input channel, and a vanishing remainder of channels that are neither particularly good nor particularly bad. Arikan gave one motivation for his construction using the chain rule for mutual information, discussed the difference between choosing a channel “polarizer” at random and using his deterministic recursive construction, went over a number of interesting results that followed his initial work, presented a detailed quantitative comparison of polar codes and turbo codes, and finished with a discussion of open questions and promising further directions. One interesting direction that was not mentioned in the talk would be to explore other manifestations of structure vs. randomness in information theory.
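The polarization phenomenon is easiest to see for the binary erasure channel, where a single step of Arikan’s transform has a closed form: two independent copies of a BEC with erasure probability \epsilon are turned into one BEC with erasure probability 2\epsilon - \epsilon^2 and one with erasure probability \epsilon^2. Here is a minimal Python sketch (my own toy illustration, not something from the talk) that iterates this recursion and checks that the synthesized channels split into nearly noiseless and nearly useless ones in proportions roughly C and 1-C:

```python
# Toy illustration of channel polarization for the binary erasure channel (BEC).
# One polarization step maps two copies of BEC(eps) to a "worse" BEC(2*eps - eps**2)
# and a "better" BEC(eps**2); repeating k times yields n = 2**k synthesized channels
# whose capacities 1 - eps_i cluster near 0 and 1.

def polarize(eps_list):
    """One recursion level of the BEC polar transform."""
    out = []
    for e in eps_list:
        out.append(2 * e - e * e)  # degraded ("worse") channel
        out.append(e * e)          # upgraded ("better") channel
    return out

def polarized_fractions(eps=0.5, levels=16, delta=0.01):
    """Fractions of synthesized channels that are nearly noiseless / nearly useless."""
    channels = [eps]
    for _ in range(levels):
        channels = polarize(channels)
    n = len(channels)
    good = sum(1 for e in channels if e < delta) / n      # capacity close to 1
    bad = sum(1 for e in channels if e > 1 - delta) / n   # capacity close to 0
    return good, bad

if __name__ == "__main__":
    good, bad = polarized_fractions(eps=0.5, levels=16)
    # With eps = 0.5 (so C = 0.5), the nearly noiseless and nearly useless fractions
    # should both approach 0.5 as the recursion depth grows.
    print(f"nearly noiseless: {good:.3f}, nearly useless: {bad:.3f}")
```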

Zhanna Reznikova, Ants and Bits
Zhanna Reznikova is an ethologist and the author of Animal Intelligence: From Individual to Social Cognition. She spoke about her work on understanding the communication capabilities of red wood ants, performed in collaboration with Boris Ryabko. She started by listing three possible approaches to studying animal communication: (1) direct decoding of signals; (2) the use of “intermediate” languages; and (3) the use of information theory. The first approach has met with some degree of success in only a few cases (dances of honey bees; acoustic signals of monkeys), one of the main difficulties being that natural animal communication is rarely encountered in repeatable situations. The second approach only works by replacing natural animal communication with adapted human languages and artificial stimuli, so its usefulness is somewhat limited. Finally, the information-theoretic approach, used by Reznikova and Ryabko, bypasses the problem of decoding in favor of studying the patterns of communication (e.g., duration and complexity of messages) in reproducible laboratory settings, in order to detect whether the animals perform any “compression” of observed statistical regularities in their environment. According to Reznikova, their experiments have shown that ants can detect and efficiently represent certain regular patterns in their environment, transmit to one another information about the number of objects of interest, and even add and subtract small numbers. I cannot comment on the experimental protocol used by Reznikova and Ryabko, and I am somewhat skeptical about the general validity of inference from observed duration and complexity of messages to the structure, meaning, and purpose of actual communication processes, but the idea of using information-theoretic “features” to understand the variety of animal languages seems intriguing. Perhaps looking beyond single numbers like entropy and focusing on more refined features, like the entire information spectrum, may lead to interesting advances in the future.
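To make the “compression” idea concrete, here is a toy sketch of my own (not the Reznikova-Ryabko protocol): as I understand it, in their maze experiments a scout’s route is a sequence of left/right turns, and the hypothesis is that regular routes, being more compressible, should take less time to communicate than irregular routes of the same length. The snippet below uses the length of a zlib-compressed description as a crude stand-in for the complexity of a turn sequence.

```python
# Toy sketch: compressed description length as a crude proxy for the complexity
# of a route given as a sequence of left/right turns. (My own illustration; the
# actual experiments measure how long scouts spend "talking" to foragers.)
import random
import zlib

def description_length(turns):
    """Compressed length, in bytes, of a turn sequence such as 'LLRLRR'."""
    return len(zlib.compress(turns.encode("ascii")))

if __name__ == "__main__":
    random.seed(0)
    regular = "LR" * 32                                          # highly regular route
    irregular = "".join(random.choice("LR") for _ in range(64))  # irregular route
    print("regular  :", description_length(regular), "bytes")
    print("irregular:", description_length(irregular), "bytes")
    # If the ants exploit regularities, transmission time should be noticeably
    # shorter for the regular route, even though both routes have 64 turns.
```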

Vladimir Tikhomirov, Entropy in Information Theory and Functional Analysis
The seminal paper of Kolmogorov and Tikhomirov on the metric entropy of function spaces has influenced a staggering number of disciplines, e.g., functional analysis, analysis of operators, ergodic theory, theory of dynamical systems, probability theory, statistical learning theory, geometric functional analysis, and many more. Tikhomirov’s talk was devoted to the roots of this work in Shannon’s information theory, its relation to foundations of mathematics (in particular, Hilbert’s 13th problem), and its impact on Kolmogorov’s thinking about the fundamental notions of complexity and randomness in both continuous and discrete worlds. It is only natural that, since then, we have come full circle: see, for instance, David Donoho’s ingenious use of Shannon’s rate-distortion theory to obtain sharp bounds on the Kolmogorov entropy of a ball in Sobolev space.
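For readers who have not encountered it, the central object is the metric entropy of a compact set K in a metric space: if N_\varepsilon(K) denotes the smallest number of balls of radius \varepsilon needed to cover K, then the metric entropy is H_\varepsilon(K) = \log N_\varepsilon(K). The flavor of the Kolmogorov-Tikhomirov results (stated here from memory, in the simplest setting) is that, for the class of functions on a bounded d-dimensional domain with bounded derivatives up to order s, in the sup norm,

H_\varepsilon \asymp \left(\frac{1}{\varepsilon}\right)^{d/s},

so smoother classes need far fewer bits to describe to accuracy \varepsilon, which is exactly the kind of statement that rate-distortion arguments of the sort used by Donoho can sharpen.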

The 2011 Shannon Lecture

Shlomo Shamai, From Constrained Signaling to Network Interference Alignment via an Information-Estimation Perspective
This was a comprehensive and compelling story of the now-famous I-MMSE relation of Guo, Shamai, and Verdú. Its simplest form is as follows: if X is a real-valued random variable with a finite second moment and Z is a standard Gaussian random variable independent of X, then

\frac{d}{d \rho} I(X; \sqrt{\rho}X + Z) = \frac{1}{2} {\bf E} (X - {\bf E}[X|\sqrt{\rho}X + Z] )^2,

where \rho > 0 is the signal-to-noise ratio (SNR), the quantity on the left is the derivative of the mutual information, and the quantity on the right is one half of the minimum mean squared error (MMSE) of estimating X based on the noisy observation Y = \sqrt{\rho}X + Z. The remarkable thing is that this simple formula holds regardless of the distribution of X (under the second-moment condition, and provided the SNR is not a parameter of this distribution). There are extensions of this basic relation to vector channels, continuous time, channels with feedback, and so on.
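As a quick sanity check (my own numerical sketch, not something from the lecture), one can look at the one case where everything is available in closed form: Gaussian X. If X \sim N(0,1), then I(X; \sqrt{\rho}X + Z) = \frac{1}{2}\log(1+\rho) nats and the MMSE equals 1/(1+\rho), so the derivative of the former should be one half of the latter:

```python
# Numerical check of the I-MMSE relation in the Gaussian-input case, where
# I(rho) = 0.5 * ln(1 + rho) (nats) and mmse(rho) = 1 / (1 + rho) in closed form.
import math

def mutual_information(rho):
    return 0.5 * math.log(1.0 + rho)   # nats

def mmse(rho):
    return 1.0 / (1.0 + rho)

if __name__ == "__main__":
    for rho in (0.1, 1.0, 10.0):
        h = 1e-6
        dI_drho = (mutual_information(rho + h) - mutual_information(rho - h)) / (2 * h)
        print(f"rho={rho:5.1f}  dI/drho={dI_drho:.6f}  0.5*mmse={0.5 * mmse(rho):.6f}")
```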

Shamai’s lecture was a whirlwind tour of applications of the I-MMSE relations to a wide array of problems in information theory (channels with input constraints; multiterminal settings, such as relay, broadcast or wiretap channels and cellular networks; interference alignment), estimation theory (relations between optimal causal and noncausal filtering), and coding (the occasional merits of using “bad” channel codes). The I-MMSE formula and its relatives show that there are intimate links between information theory (which was conceived by Shannon as a theory of digital communication and storage) and estimation theory (which tends to focus on the “analog” world). Just like with Arikan’s channel polarization idea, I’m sure we’ve seen only the tiniest fraction of what one can do with the conceptual framework behind this deceptively simple formula.
