2012 Annual Science Report
University of Hawaii, Manoa | Reporting period: SEP 2011 – AUG 2012
Amino Acid Alphabet Evolution
A genetically encoded alphabet of just 20 amino acids has produced the universe of protein structures and functions found throughout Earth’s biosphere. Relationships within this amino acid alphabet are responsible for fundamental biological phenomena, such as protein folding and patterns of molecular evolution. In attempting to unravel these relationships, considerable scientific ingenuity has been spent developing systems to simplify the genetically encoded alphabet of 20 amino acids while minimizing the associated loss of chemical diversity. These efforts present an opportunity to generate a composite picture of the properties that link the amino acids as a set. We are therefore investigating whether different simplification schemes (“simplified amino acid alphabets”), including those derived from very different approaches, can be combined to create a coherent description of amino acid similarity. By understanding the organization and relationships between amino acids on Earth, we hope to shed light on the chemical logic to be expected as a product of evolution in extraterrestrial environments.
An extensive scientific literature has converged on surprisingly clear agreement that a subset of only around half of the 20 genetically encoded amino acids was likely present from the inception of genetic coding (the “early” amino acids), and that an equal-sized subset was incorporated through subsequent evolution (the “late” amino acids). A further widespread assumption is that, as the set expanded, natural selection favored the addition of amino acids that extended the range of protein structures and functions. We initiated a quantitative investigation of the consilience between these two important ideas.
We have developed a new method for objectively comparing multiple simplified amino acid alphabets, however they were generated. We applied this method to an all-against-all comparison of 34 simplification schemes reported in the scientific literature. The comparison revealed that simplification schemes did not cluster according to the approach by which they were calculated. To understand why, we extended our alphabet comparison methods to produce a detailed view of amino acid similarity according to the consensus of two or more simplified alphabets. This calculation showed a strong and unanticipated match between the consensus view of all “top down” approaches (which derive their measurements of amino acid similarity by considering the roles that amino acids play within proteins and protein evolution) and that of all “bottom up” approaches (which derive their measurements of amino acid similarity from the physics and chemistry of individual amino acid molecules). Taken together, these results indicate that although individual simplified alphabets differ from one another, and new ones continue to emerge, as a whole they can be deconstructed to reveal a coherent and meaningful picture of amino acid similarity. This understanding makes fundamental questions, from astrochemistry to synthetic biology, more tractable.
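The report does not specify the comparison metric, so the following is only a minimal sketch of how such an all-against-all comparison and consensus view might be computed. It assumes each simplified alphabet is a partition of the 20 amino acids, uses the Rand index as a stand-in agreement score, and defines consensus similarity as the fraction of alphabets that place two amino acids in the same group. The function names and the two toy alphabets are hypothetical illustrations, not any of the 34 published schemes.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 encoded amino acids


def parse_alphabet(groups):
    """Map each amino acid to a group index; the groups must partition all 20."""
    assignment = {}
    for index, group in enumerate(groups):
        for aa in group:
            if aa in assignment:
                raise ValueError(f"{aa} appears in more than one group")
            assignment[aa] = index
    if set(assignment) != set(AMINO_ACIDS):
        raise ValueError("groups must cover exactly the 20 amino acids")
    return assignment


def rand_index(a, b):
    """Fraction of amino acid pairs on which two alphabets agree
    (both group the pair together, or both keep it apart)."""
    pairs = list(combinations(AMINO_ACIDS, 2))
    agree = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agree / len(pairs)


def consensus_similarity(alphabets):
    """For every pair of amino acids, the fraction of alphabets grouping them together."""
    return {
        (x, y): sum(a[x] == a[y] for a in alphabets) / len(alphabets)
        for x, y in combinations(AMINO_ACIDS, 2)
    }


# Hypothetical toy alphabets, invented purely for illustration.
toy_a = parse_alphabet(["AVLIMFWC", "GPSTNQYH", "DE", "KR"])
toy_b = parse_alphabet(["AVLIMC", "FWYH", "GPST", "NQDE", "KR"])

agreement = rand_index(toy_a, toy_b)
consensus = consensus_similarity([toy_a, toy_b])
```

In this sketch a pairwise matrix of `rand_index` scores over all schemes could feed a clustering step, while `consensus_similarity` applied separately to "top down" and "bottom up" schemes would give two pair-level views that can be compared directly.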
We developed a novel method to represent the protein-building potential of a set of amino acids in terms of the region(s) of chemistry space that these molecules populate. Using this method, we conducted the first direct test of the adaptive hypothesis that the “early” amino acid alphabet grew through the addition of amino acids that enabled the construction of new protein structures and functions. Specifically, our analysis revealed remarkably strong support for adaptive growth of the amino acid alphabet in terms of amino acid size (van der Waals volume) and hydrophobicity (LogP), producing a clear signal despite the simplicity of our test. However, amino acid charge (measured as pI) showed no such tendency. We discussed a range of interpretations, including the reasonable expectation that an optimal set of amino acid pI values might entail a more sophisticated distribution than our current test would detect. This work has been submitted for publication in the Journal of Systems Chemistry.
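The report does not describe how coverage of chemistry space was quantified. As a rough sketch under stated assumptions, coverage along a single property axis could be summarized by the range of values and by how evenly those values are spaced, and the "early" set compared with the full set. The function names are hypothetical, and the numbers below are arbitrary placeholders rather than measured volumes, LogP, or pI values.

```python
from statistics import pvariance


def coverage_range(values):
    """Breadth of the property axis spanned by a set of molecules."""
    return max(values) - min(values)


def gap_variance(values):
    """Variance of the gaps between neighbouring values (lower means more even spacing).
    Requires at least two values."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return pvariance(gaps)


def compare_coverage(early, late):
    """Summarize how adding the late set changes coverage of one property axis."""
    full = list(early) + list(late)
    return {
        "range_gain": coverage_range(full) - coverage_range(early),
        "gap_variance_early": gap_variance(early),
        "gap_variance_full": gap_variance(full),
    }


# Arbitrary placeholder values for one property; NOT real amino acid data.
toy_early = [2.0, 3.0, 4.0]
toy_late = [1.0, 5.0]

summary = compare_coverage(toy_early, toy_late)
```

A fuller test along these lines would compare the observed summary against the same statistics for many randomly drawn "late" sets, so that any gain in range or evenness can be judged against chance.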
PROJECT MEMBERS: Stephen Freeland
RELATED OBJECTIVES: Objective 3.2
Origins and evolution of functional biomolecules
Earth's early biosphere
Production of complex life
Adaptation and evolution of life beyond Earth
Biosignatures to be sought in Solar System materials
Biosignatures to be sought in nearby planetary systems