Information and Control in Biology, Part 1: Preliminary Considerations

Disclaimer: I am not a biologist, but I have become interested in biology and related matters over the past couple of years. One reason is, obviously, the pandemic: talk of biology, viruses, mRNA, and the like is everywhere. The other, and main, reason is that I think we will not get anywhere interesting in AI unless we understand the concepts of autonomy, self-directedness, integration, and adaptation in even very simple biological systems.

This will be the first in a series of posts that are meant as an extended response to Yohan John’s old post over at 3 Quarks Daily.

Yohan writes:

We are increasingly employing information as an explanation of phenomena outside the world of culture and technology — as the central metaphor with which to talk about the nature of life and mind. Molecular biology, for instance, tells us how genetic information is transferred from one generation to the next, and from one cell to the next. And neuroscience is trying to tell us how information from the external world and the body percolates through the brain, influencing behavior and giving rise to conscious experience.

But do we really know what information is in the first place? And is it really a helpful way to think about biological phenomena? I’d like to argue that explanations of natural phenomena that involve information make inappropriate use of our latent, unexamined intuitions about inter-personal communication, blurring the line between what we understand and what we don’t quite have a grip on yet.

Similar sentiments are quoted by Carl Bergstrom and Martin Rosvall:

Biologists think in terms of information at every level of investigation. Signaling pathways transduce information, cells process information, animal signals convey information. Information flows in ecosystems, information is encoded in the DNA, information is carried by nerve impulses. In some domains the utility of the information concept goes unchallenged: when a brain scientist says that nerves transmit information, nobody balks. But when geneticists or evolutionary biologists use information language in their day-to-day work, a few biologists and many philosophers become anxious about whether this language can be justified as anything more than facile metaphor.

Yohan argues that information theory is, on the whole, not an appropriate framework with which to reason about biological information. Carl and Martin argue otherwise, but propose their own framework, what they refer to as the transmission sense of information, which purportedly resolves the issues that trouble “a few biologists and many philosophers.” My goal in this series of posts is to argue that information theory can indeed be applied to biology, but that its proper application needs to be built up from first principles, starting with a serious engagement with its entire conceptual framework. Moreover, I agree with Yohan that digital communication is not the right conceptual schema; instead, we should be talking about control, programmability, and behaviors.

Information theory in economics, Part II: Robustness

As we have seen in Part I, the rational inattention framework of Christopher Sims aims to capture the best a rational agent can do when his capacity for processing information is limited. This rationally inattentive agent, however, has no reason to question his statistical model. In this post we will examine the robustness framework of Thomas Sargent, which deals with the issue of model uncertainty, but does not assume any capacity limitations.

Information theory in economics, Part I: Rational inattention

Economic activity involves making decisions. In order to make decisions, agents need information. Thus, the problem of the acquisition, transmission, and use of information has occupied economists’ attention for some time now (there is even a whole subfield of “information economics”). It is not surprising, therefore, that information theory, the brainchild of Claude Shannon, would eventually make its way into economics. In this post and the one to follow, I will briefly describe two specific strands of information-theoretic work in economics: the rational inattention framework of Christopher Sims and the robustness ideas of Thomas Sargent. (As an interesting aside: Sims and Sargent shared the 2011 Nobel Memorial Prize in Economics, although not directly for their information-theoretic work, but rather for their work related to causality.)

Directed stochastic kernels and causal interventions

As I was thinking more about Massey’s paper on directed information and about the work of Touchette and Lloyd on the information-theoretic study of control systems (which we had started looking at during the last meeting of our reading group), I realized that the directed stochastic kernels that feature so prominently in the general definition of directed information are known in the machine learning and AI communities under another name, due to Judea Pearl: interventional distributions.
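
For readers who want the correspondence in symbols, here is a rough sketch. The notation is an assumption on my part and is not taken from the post itself: $x^i = (x_1, \dots, x_i)$, the double bar denotes causal conditioning, and the identification with Pearl's do-operator presumes a sequential model in which each $x_i$ depends only on the observed past, with no hidden confounders.

```latex
% Directed stochastic kernel (causal conditioning) of y^n given x^n:
P(y^n \,\|\, x^n) \;=\; \prod_{i=1}^{n} P\bigl(y_i \mid y^{i-1}, x^{i}\bigr)

% The joint law factors into two directed kernels:
P(x^n, y^n) \;=\; P(x^n \,\|\, y^{n-1}) \, P(y^n \,\|\, x^n),
\qquad
P(x^n \,\|\, y^{n-1}) \;=\; \prod_{i=1}^{n} P\bigl(x_i \mid x^{i-1}, y^{i-1}\bigr)

% Under the assumptions stated above, intervening on x^n deletes the
% factors P(x_i | x^{i-1}, y^{i-1}) (truncated factorization), leaving:
P\bigl(y^n \mid \mathrm{do}(x^n)\bigr) \;=\; P(y^n \,\|\, x^n)
```

Massey's directed information can then be written as $I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1})$, which compares the joint law with the product of the marginal of $y^n$ and the directed kernel of $x^n$.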


Three cheers for open access!

While searching for a paper on the Rényi entropy, I stumbled across Kybernetika: International journal published by Institute of Information Theory and Automation. Since 1965, this journal has been publishing articles on information theory, statistical decisions, optimal control, finite automata, neural nets, mathematical economics, optimization, adaptive behavior, and other subjects that were, during the heyday of cybernetics, viewed as but individual aspects of a soon-to-be-born grand unifying science of natural and artificial adaptive systems. Even though the cyberneticians’ dream never came true (as detailed in Andrew Pickering’s fascinating account The Cybernetic Brain, which I am now reading), it gave rise to numerous offshoots in other disciplines.

Rummaging through the journal archives, I found a few interesting articles by information theorists, such as Mark Pinsker, Albert Perez and the recently deceased Igor Vajda, and even by actual cyberneticians, such as Gordon Pask.

Here are a couple of articles that would be interesting to the readers of this blog:

Albert Perez, Information-theoretic risk estimates in statistical decision, Kybernetika, vol. 3, no. 1, pp. 1-21, 1967

In this paper we give some information-theoretical estimates of average and Bayes risk change in statistical decision produced by a modification of the probability law in action and, in particular, by reducing or enlarging the sample space as well as the parameter space sigma-algebras. These estimates, expressed in terms of information growth or generalized f-entropy not necessarily of Shannon’s type, are improved versions of the estimates we obtained in previous papers.

Flemming Topsøe, Information-theoretical optimization techniques, Kybernetika, vol. 15, no. 1, pp. 8-27, 1979

It is the object of this paper to show that a game theoretical viewpoint may be taken to underlie the maximum entropy principle as well as the minimum discrimination information principle, two principles of well known significance in theoretical statistics and in statistical thermodynamics. Our setting is very simple and certainly calls for future expansion.

Oddly, the latter paper does not seem to be very well known. However, recent work by Peter Grünwald and Philip Dawid extends Topsøe’s game-theoretic viewpoint and develops generalized notions of entropy and divergence for statistical decision problems with arbitrary loss functions:

Peter Grünwald and Philip Dawid, Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory, Annals of Statistics, vol. 32, no. 4, pp. 1367-1433, 2004

We describe and develop a close relationship between two problems that have customarily been regarded as distinct: that of maximizing entropy, and that of minimizing worst-case expected loss. Using a formulation grounded in the equilibrium theory of zero-sum games between Decision Maker and Nature, these two problems are shown to be dual to each other, the solution to each providing that to the other. Although Topsøe described this connection for the Shannon entropy over 20 years ago, it does not appear to be widely known even in that important special case.

We here generalize this theory to apply to arbitrary decision problems and loss functions. We indicate how an appropriate generalized definition of entropy can be associated with such a problem, and we show that, subject to certain regularity conditions, the above-mentioned duality continues to apply in this extended context. This simultaneously provides a possible rationale for maximizing entropy and a tool for finding robust Bayes acts. We also describe the essential identity between the problem of maximizing entropy and that of minimizing a related discrepancy or divergence between distributions. This leads to an extension, to arbitrary discrepancies, of a well-known minimax theorem for the case of Kullback–Leibler divergence (the “redundancy-capacity theorem” of information theory).

For the important case of families of distributions having certain mean values specified, we develop simple sufficient conditions and methods for identifying the desired solutions. We use this theory to introduce a new concept of “generalized exponential family” linked to the specific decision problem under consideration, and we demonstrate that this shares many of the properties of standard exponential families.

Finally, we show that the existence of an equilibrium in our game can be rephrased in terms of a “Pythagorean property” of the related divergence, thus generalizing previously announced results for Kullback–Leibler and Bregman divergences.

The actual paper is quite lengthy (over 60 pages of generalized entropy goodness!), but well worth the time.
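
To make the duality from the abstract concrete, here is a minimal numerical sketch for the Shannon (log-loss) special case. It is not taken from either paper; the toy setup is an assumption chosen for illustration: distributions on the support {0, 1, 2} with mean fixed at 1.3, and helper names (`maxent_with_mean`, `feasible_pmf`, `worst_case_loss`) that are hypothetical. The maximum-entropy distribution in this constraint set is an exponential tilt, its expected log loss is the same against every feasible distribution, and any other predictive distribution does at least as badly in the worst case.

```python
import math

SUPPORT = (0, 1, 2)   # toy sample space (assumption for illustration)
MEAN = 1.3            # mean constraint defining the feasible set (assumption)


def maxent_with_mean(support, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Max-entropy pmf with a given mean: p_i proportional to exp(lam * x_i).

    The mean is increasing in lam, so lam is found by bisection.
    """
    def pmf(lam):
        exps = [lam * x for x in support]
        m = max(exps)  # subtract the max exponent for numerical stability
        w = [math.exp(e - m) for e in exps]
        z = sum(w)
        return [wi / z for wi in w]

    def mean(p):
        return sum(pi * x for pi, x in zip(p, support))

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(pmf(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def expected_log_loss(p, q):
    """Expected loss -log q(X) when X ~ p (Nature plays p, Decision Maker plays q)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def feasible_pmf(t, mean=MEAN):
    """Pmf on {0, 1, 2} with the given mean, parametrized by t = P(X = 2).

    Valid for max(0, mean - 1) <= t <= mean / 2.
    """
    return [1.0 - mean + t, mean - 2.0 * t, t]


def worst_case_loss(q, grid):
    """Largest expected log loss of q over a grid of feasible distributions."""
    return max(expected_log_loss(feasible_pmf(t), q) for t in grid)


# Interior grid of feasible distributions (t ranges over [0.3, 0.65] here).
GRID = [0.35 + 0.25 * k / 50 for k in range(51)]

p_star = maxent_with_mean(SUPPORT, MEAN)
h_star = entropy(p_star)

# Playing p_star gives the same loss against every feasible p (an equalizer),
# so its worst-case loss equals the maximum entropy.
minimax_value = worst_case_loss(p_star, GRID)

# Some other predictive distribution, for comparison (arbitrary choice).
q_other = [0.2, 0.3, 0.5]
other_value = worst_case_loss(q_other, GRID)

print("max entropy:            ", h_star)
print("worst-case loss, p_star:", minimax_value)
print("worst-case loss, other: ", other_value)
```

The equalizer property is what the game-theoretic reading hinges on: because log p_star is affine in x, the expected log loss under any distribution with the prescribed mean is the same constant, which is exactly the entropy of p_star.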