2001 Annual Science Report
Marine Biological Laboratory Reporting | JUL 2000 – JUN 2001
Origin of Life: Evolution of Proteins
Project Progress
We continue to make steady progress toward enumerating and characterizing the basic protein families that were present in the last common ancestor of life on Earth. In addition, our study of genotype-phenotype relationships for all the genes of a single microorganism continues to yield information on exactly how the collection of genes in an organism's DNA translates into a complete, free-living, self-reproducing cell. The complete genotype of Escherichia coli is known down to the DNA sequence. The phenotype conferred by some of these genes is known experimentally, for others it is predicted, and for the remainder it is still unknown.
The organism on Earth that we know the most about is Escherichia coli. This single-celled bacterium has been studied experimentally in minute detail for more than 60 years. We have collected information on the biological function of each of its 4408 genes and their gene products (protein or RNA). We have assembled the experimentally established information from the literature, and we have painstakingly assigned functions to uncharacterized proteins by analogy to similar proteins of known function. The function of 19% of the genes and gene products is not yet known. Of the remaining 81%, we documented over half as experimentally known and the rest as predicted. This body of work offers an unparalleled opportunity to see in full detail exactly what makes a living cell. Complete knowledge of the relationship of genotype to phenotype for all the genes of one organism has implications for understanding the entirety of what is required to spark primitive life.
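As a rough check on the figures above, the percentages can be converted into approximate gene counts. This is only a sketch: the report gives rounded percentages, so the counts below are estimates, and the variable names are illustrative rather than taken from the project.

```python
# Approximate gene counts implied by the reported percentages.
# The report states 4408 genes, 19% of unknown function, and
# "over half" of the remaining 81% experimentally known.
TOTAL_GENES = 4408
UNKNOWN_FRACTION = 0.19  # rounded figure from the report

# Genes (or gene products) with no known function, to the nearest gene.
unknown_count = round(TOTAL_GENES * UNKNOWN_FRACTION)

# Genes with an assigned function, experimental or predicted.
assigned_count = TOTAL_GENES - unknown_count

# "Over half" of the assigned genes means strictly more than 50%,
# so this is a lower bound on the experimentally known genes.
min_experimental = assigned_count // 2 + 1

print(f"unknown function:   about {unknown_count}")
print(f"assigned function:  about {assigned_count}")
print(f"experimentally known: at least about {min_experimental}")
```

Because the inputs are rounded, these counts should be read as approximate, not as figures reported by the project.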
We have also been studying protein families in order to determine the basic set of proteins that was sufficient to sustain a primitive cell on early Earth. Knowing the basic composition of that early cell should provide clues to the kinds of proteins we might expect to detect on other planets in the solar system if life there is based on carbon, oxygen and nitrogen.
To work toward this goal, we have grouped proteins into families both by amino acid sequence similarity and by elements of tertiary structure. The two approaches, sequence and structure, identify some protein families in common, and also some families seen only by sequence or only by structure. We have found that some protein families are uniform in biological action, differing only in the specificity of the molecules acted on. The origins of such families are easy to visualize in terms of molecular evolution as the consequence of gene duplications followed by changes in substrate specificity. Other protein families, however, are more complex. Our analysis has shown that some families, called superfamilies, can be more varied in function. In these cases duplicated genes have used the same basic themes of sequence and structure to branch out to a variety of functions of the same general type. We have studied examples in which different enzymes in the same superfamily catalyze only remotely related reactions. Even more extreme are examples in which a single protein uses one catalytic site, but different amino acid residues at that site, to catalyze different reactions. This model of efficiency and economy could be characteristic of early evolution, giving more catalytic capacity to a single gene product.
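The idea of grouping proteins into families by sequence similarity can be illustrated with a minimal sketch. This is not the method used in the project: the similarity score here is Python's generic `difflib.SequenceMatcher` ratio, not a biological alignment score, and the sequences, names, and threshold are invented for illustration. Proteins are linked whenever a pair exceeds the threshold, and linked proteins are merged into one family (single-linkage grouping).

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy sequences; not real E. coli proteins.
TOY_SEQS = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "MKTAYIAKQRQISFVKAHFSRQ",  # one substitution relative to protA
    "protC": "GSSHHHHHHSSGLVPRGSHMAS",
    "protD": "GSSHHHHHHSSGLVPRGSHMTS",  # one substitution relative to protC
}


def similarity(a, b):
    """Crude string similarity in [0, 1]; a stand-in for an alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def group_into_families(seqs, threshold=0.7):
    """Single-linkage grouping: merge any two proteins whose similarity
    meets the (assumed) threshold, using a small union-find structure."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if similarity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in seqs:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    for family in group_into_families(TOY_SEQS):
        print(family)
```

With these toy inputs the near-identical pairs fall into two separate families. A real analysis would use alignment-based scores and, as the text notes, would be complemented by grouping on tertiary structure, which can reveal families that sequence comparison alone misses.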
In such ways we are reconstructing molecular modes of evolution in order to be prepared to understand comparable events that may have occurred on other bodies in the solar system.
PROJECT MEMBERS:
Monica Riley, Project Investigator
Ping Liang, Postdoc
Tom McCormack, Unspecified Role
Laila Nahum, Unspecified Role
Margrethe Serres, Unspecified Role
RELATED OBJECTIVES:
Objective 2.0
Develop and test plausible pathways by which ancient counterparts of membrane systems, proteins and nucleic acids were synthesized from simpler precursors and assembled into protocells.
Objective 4.0
Expand and interpret the genomic database of a select group of key microorganisms in order to reveal the history and dynamics of evolution.