2015 Annual Science Report
University of Illinois at Urbana-Champaign Reporting | JAN 2015 – DEC 2015
Project 10: Evolution Through the Lens of Codon Usage
Project Summary
The sequences of protein encoding genes are subject to multiple levels of selection. First, amino acid changes that adversely alter protein function are unlikely to survive. In addition, the genetic code is degenerate; it includes alternative (synonymous) codons for most of the amino acids. The codon usages of genes reflect a balance between drift and selection for rapid and accurate translation of mRNAs into proteins, and in the case of horizontally transferred genes, the codon usages of their sources. Our studies of genes and their codon usages have led us to discover that: (i) most of the recently acquired genes come from such closely related organisms that their distinctive codon usages cannot be attributed to a phylogenetically distant source; (ii) the transfers commonly exceed recognized boundaries of microbial species; (iii) some genes do not drift to match the native codon usage of their current genome, but resemble the most recently acquired genes; (iv) many of the genes that are most up-regulated under starvation conditions also have this codon usage; and (v) a distinctive stress/starvation-associated codon usage is a recurring theme that is observed in diverse Bacteria and Archaea.
These results have changed our understanding of the dynamics with which genetic novelties are shared in the biosphere, and revealed that there are selective forces on codon usage beyond those currently appreciated in the field.
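To make the notion of codon usage concrete, the following minimal Python sketch tabulates, for one coding sequence, the fraction of each amino acid's occurrences that is encoded by each of its synonymous codons. It assumes the standard genetic code and an in-frame DNA sequence. The function name and the short example sequence are hypothetical and are not taken from the project's analysis tools.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codon_frequencies(cds):
    """Return {amino_acid: {codon: fraction}} for an in-frame coding sequence.

    Stop codons and codons containing ambiguous bases are ignored.
    Amino acids that do not occur in the sequence are omitted.
    """
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # drop any trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable_length, 3))

    # Group the observed counts by the amino acid each codon encodes.
    by_amino_acid = defaultdict(dict)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_amino_acid[aa][codon] = counts.get(codon, 0)

    frequencies = {}
    for aa, codon_counts in by_amino_acid.items():
        total = sum(codon_counts.values())
        if total:
            frequencies[aa] = {c: n / total for c, n in codon_counts.items()}
    return frequencies


# Hypothetical toy sequence: Met, Leu, Leu, Leu, Lys, stop.
example = synonymous_codon_frequencies("ATGCTGCTGCTAAAATAA")
print(example["L"])
```

Profiles of this kind, computed per gene or pooled over a set of genes, are the sort of quantity that can be compared between recently acquired genes and the rest of a genome.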
Project Progress
Project 10: In our previous work on horizontally acquired genes and codon usage we found that (i) most of the recently acquired genes come from such closely related organisms that their distinctive codon usages cannot be attributed to a phylogenetically distant source; (ii) the transfers commonly exceed recognized boundaries of microbial species; (iii) some genes do not drift to match the native codon usage of their current genome, but resemble the most recently acquired genes; (iv) many of the genes that are most up-regulated under starvation conditions also have this codon usage; and (v) a distinctive stress/starvation-associated codon usage is a recurring theme that is observed in diverse Bacteria and Archaea. In our more recent work, we have begun
Project 10A: Evolution through the lens of codon usage: vertically inherited genes resist amelioration. Our earlier work showed that this distinctive codon usage is not limited to recently acquired genes, but is also characteristic of some genes that have been present in the genome for long enough that they would be expected to have drifted to match “native” codon usages of their genome. This was most dramatically illustrated by the major Salmonella Pathogenicity Islands: SPI1 and SPI2. We continue to examine genes with “alien” codon usage in these and other genomes, seeking factors that are correlated with its occurrence.
Project 10B: Evolution through the lens of codon usage: alien gene codon usage is an intrinsic adaptation to stress. In previous analyses, we concluded that the set of transferred genes and the SPI genes must be subject to positive selection for a novel category of codon usage. In examining possible biological correlates, we saw that the genes most up-regulated (≥10-fold) by ppGpp (the cellular signal for the starvation-induced stringent response) also share this codon usage. If stress (quite likely starvation) is present when the genes are expressed, then the codon usage is likely to facilitate gene expression under this condition. The limited literature that addresses possible connections between starvation and codon usage suggests that starvation for an amino acid leads to preferential depletion of the charged tRNAs for the most commonly used codons, so that messenger RNAs that must be expressed under these conditions would benefit from using these codons less frequently. Although the experimental data on charging levels of tRNAs are largely consistent with the proposed depletion models, these models do not fully explain our observations. Most importantly, the theoretical and experimental scenarios are contrived: they require mutant strains that are unable to synthesize a single amino acid, and so do not address a global state of starvation.
Project 10C: Evolution through the lens of codon usage: starvation codon usage is recurring in nature. Our exploration of the distribution of “alien” codon usage in genomes (a usage that is neither the typical one nor that of highly expressed genes) supports the notion that it is very common throughout Archaea and Bacteria. As we have extended our examinations to diverse genomes, we find that many (probably most) genomes fit this pattern of a typical codon usage, a codon usage for abundant proteins, and a third codon usage that we associate with stress or starvation. However, we have found recurrent evidence for yet another surprising departure from our expectations. Specifically, although ribosomal proteins are generally viewed as the prototype of abundant proteins, and are often used as landmarks to recognize the high-expression codon usage, we have found genomes in which ribosomal protein codon usages are largely distinct from the three “expected” codon usages. Our exploration of these departures is in its early stages, but we are convinced that they are not an artifact of our analysis tools.
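The three-way pattern described above (typical, abundant-protein, and stress/starvation-associated codon usage) can be illustrated with a minimal Python sketch that assigns a gene to whichever reference profile its codon frequencies most resemble. This is an illustration only. The distance measure (mean absolute difference), the function names, and all numbers in the toy profiles are assumptions made for the example, not the statistics or data used in this project.

```python
def usage_distance(profile_a, profile_b):
    """Mean absolute difference between two {codon: frequency} profiles.

    Codons missing from a profile are treated as having frequency 0.
    """
    codons = set(profile_a) | set(profile_b)
    if not codons:
        return 0.0
    total = sum(abs(profile_a.get(c, 0.0) - profile_b.get(c, 0.0)) for c in codons)
    return total / len(codons)


def nearest_usage_class(gene_profile, reference_profiles):
    """Return the name of the reference profile closest to the gene profile."""
    return min(
        reference_profiles,
        key=lambda name: usage_distance(gene_profile, reference_profiles[name]),
    )


# Invented toy profiles over three leucine codons (illustrative values only).
references = {
    "typical": {"CTG": 0.50, "CTA": 0.10, "TTA": 0.40},
    "high_expression": {"CTG": 0.90, "CTA": 0.02, "TTA": 0.08},
    "stress": {"CTG": 0.25, "CTA": 0.45, "TTA": 0.30},
}
gene = {"CTG": 0.30, "CTA": 0.40, "TTA": 0.30}
print(nearest_usage_class(gene, references))
```

A gene set such as the ribosomal proteins mentioned above would stand out under this kind of comparison if its profile were far from all three references, which is one simple way to flag a possible fourth category for closer inspection.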
Project 10D: Evolution through the lens of codon usage: the dynamics of DNA acquisition. Our original studies on the codon usages of recently acquired genes used comparative genome analysis to find protein coding sequences that were unique to a single genome in a set of diverse strains. Although mechanisms of DNA transfer and acquisition have been studied for decades, the balance of these mechanisms in nature varies and depends upon the particular data. We have begun a systematic examination of the endpoints of the recently acquired DNAs, with the aim of improving our understanding of their acquisition and of the factors that circumscribe the phylogenetic range of this frequent sharing.
Project 10E: Evolution through the lens of codon usage: the evolutionary dynamics of being “occasionally useful.” This facet of the project is constantly being revisited conceptually, but is only in the formative stages of being converted to concrete tests. In brief, the variable genes of the pangenome are, by definition, present in only a subset of the strains of a species. If these genes were routinely useful, they would be fixed in the genomes of all strains. We are exploring two questions. First, what bounds can we reasonably place on how commonly these genes are useful, given their abundance in the phylogenetic group (usually low), known rates of mutation, and the relative paucity of obvious pseudogenes (indicating that they are being occasionally selected)? Second, what additional constraints are imposed by the observation that most of these genes share the distinctive stress-adapted codon usage?
Publications
- Brettin, T., Davis, J. J., Disz, T., Edwards, R. A., Gerdes, S., Olsen, G. J., … Xia, F. (2015). RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Scientific Reports, 5, 8365. doi:10.1038/srep08365
- Davis, J. J., Gerdes, S., Olsen, G. J., Olson, R., Pusch, G. D., Shukla, M., … Yoo, H. (2016). PATtyFams: Protein Families for the Microbial Genomes in the PATRIC Database. Frontiers in Microbiology, 7. doi:10.3389/fmicb.2016.00118
PROJECT INVESTIGATORS:
PROJECT MEMBERS:
Scott Dawson
Co-Investigator
Katherine Karberg
Co-Investigator
Rachel Whitaker
Co-Investigator
RELATED OBJECTIVES:
Objective 5.1
Environment-dependent, molecular evolution in microorganisms
Objective 5.2
Co-evolution of microbial communities
Objective 5.3
Biochemical adaptation to extreme environments
Objective 6.1
Effects of environmental changes on microbial ecosystems