The state of matter that we define as life is different from anything else we have encountered so far in the universe. Living systems not only perpetuate their existence out of equilibrium against the will of the second law of thermodynamics, but they do so while keeping up with an ever-changing environment. A key part of this capacity to adapt to environmental changes is the ability of organisms to gather information from their surroundings and assemble an adequate response to the challenges presented to them. This thesis presents an effort to understand, from first principles, this fundamental feature of information gathering that all life on Earth shares. We dig into the physics behind one of the most pervasive mechanisms through which living systems sense and respond to the environment: the ability to turn genes on and off. In doing so, we hope to uncover general principles of how organisms deal with the problem of collecting information about the world that surrounds them.
In Chapter 1, we develop the theoretical and conceptual tools to navigate the rest of the thesis. We introduce the idea of gene regulation, as well as different theoretical models of this pervasive biological phenomenon. We also delve into the realm of information theory and learn how the plastic concept of information can be mathematically defined and quantified.
The second stop in our exploration (Chapter 2) asks the following question: can we understand, from first principles, how proteins allow cells to regulate their genes on demand upon sensing environmental cues? For this, we explore the physics behind transcriptional control by allosteric transcription factors. This type of regulation involves two processes: the regulation of the gene by the binding and unbinding of the transcription factor, and the regulation of the activity of the transcription factor itself by the binding and unbinding of an effector molecule. Using simple quasi-equilibrium models of both processes, we are able to predict the input-output function of a simple genetic circuit and compare these predictions with experimental determinations of the mean response of a population of bacterial cells.
We then expand on these insights to ask questions about the inescapable cell-to-cell variability that isogenic cells exhibit. For this, we have to leave behind the purely thermodynamic framework and work in the language of chemical kinetics. This allows us to make predictions beyond the mean input-output gene expression response of cells by reconstructing full gene expression distributions. With these probabilistic input-output functions, in Chapter 3 we formalize the question of how much information cells can gather from the environment. For this, we turn to the information-theoretic concept of maximal mutual information (otherwise known as channel capacity) between the state of the environment and the gene expression response of bacterial cells. Finally, we compare our predictions of the maximum amount of information, measured in bits, that cells can gather with inferences of this quantity from single-cell measurements.