Notice: This is an archived and unmaintained page. For current information, please browse astrobiology.nasa.gov.

2004 Annual Science Report

NASA Ames Research Center Reporting  |  JUL 2003 – JUN 2004

Early Metabolic Pathways

4 Institutions
3 Teams
0 Publications
0 Field Sites

Project Progress

The evolutionary optimization of the de novo evolved non-biological ATP-binding protein has been completed. Previously, a family of non-biological ATP-binding proteins was isolated from an unconstrained random-sequence library. One of these proteins was further optimized for high-affinity binding to ATP, but its biophysical characterization proved impossible because of poor solubility, possibly caused by non-unique folding. To improve folding stability, multiple rounds of mRNA-display selection were performed under increasingly denaturing conditions. One of the resulting proteins was chosen for further characterization, and several biophysical methods demonstrated that it has a unique folded structure. This work shows that even when the initially obtained proteins fold poorly and bind weakly, subsequent evolutionary optimization readily yields sequence variants with improved folding and improved ligand binding.


While this work was in progress, the three-dimensional structure of earlier high-affinity variants was obtained by X-ray crystallography (LoSurdo et al., Nature Struct. Biol., 2004). It appears that the presence of a charged N-terminal sequence greatly improved solubility, providing a different pathway to protein optimization. The availability of this structure has allowed a structural interpretation of the mutations observed during our evolutionary optimization of this protein. The structure of our ATP-binding protein reveals a new protein fold not previously seen in biological proteins. This result provides preliminary support for the idea that only a subset of all possible protein folds is used in biological systems; the determination of additional non-biological protein structures will be required to determine whether biological protein folds represent a small or large fraction of all possible folds.


Studies aimed at solving the solution structure of the optimized protein by nuclear magnetic resonance (NMR) spectroscopy have been initiated in collaboration with James Chou at Harvard Medical School. This work is expected to be completed in the coming year. The results will enable a comparison between the structure of the folding-optimized protein and that of the high-affinity variant protein solved by X-ray crystallography. These two proteins differ in about 25% of their residues, so it will be of interest to see how much their structures have diverged.
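The "about 25% of their residues" figure is a fraction of aligned positions at which the two sequences differ. A minimal sketch of that arithmetic follows, assuming the two sequences are already aligned to the same length. The function name and the 8-residue fragments are hypothetical and chosen only for illustration; they are not the sequences of the proteins described here.

```python
def fraction_differing(seq_a: str, seq_b: str) -> float:
    """Return the fraction of aligned positions at which two sequences differ.

    Assumes the sequences are pre-aligned and of equal length (no gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Count positions where the residues do not match.
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# Made-up fragments (hypothetical, not the real proteins): 2 of 8 positions differ.
variant_a = "MKTAYIAK"
variant_b = "MKSAYLAK"
print(f"{fraction_differing(variant_a, variant_b):.0%} of positions differ")
```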


A simple model of reaction (metabolic) networks catalyzed by functional proteins existing among random sequences has been developed and studied computationally. Biochemically plausible rules for identifying populations of functional proteins were formulated in previous years of this project. Investigation of large populations of networks demonstrated that a subset of them can self-organize and evolve toward increasing complexity even in the absence of a genome. Networks can be classified into families (species) that persist even though individual networks disintegrate or transform with time. As the environmental conditions change, so do the relative populations of different families. These results indicate that many concepts developed in the context of genomic evolution, such as speciation, also hold in the absence of a genome.


Evolutionary progress of the systems was, however, limited by the absence of the memory storage mechanism that a genome provides. Supporting the increasing complexity of the system required proteins of increasing efficiency and specificity. Such proteins, however, are quite rare among random sequences and are not encountered on a consistent basis. The results support the hypothesis that initial protobiological evolution could have progressed without a genome but could not have reached a complexity approaching that of real cellular systems.
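The rarity argument can be made concrete with a simple probability estimate. If functional proteins occur independently at frequency f among random sequences, the chance of encountering at least one in a sample of N sequences is 1 - (1 - f)^N. The sketch below evaluates this expression. The function name, the frequency, and the sample sizes are illustrative assumptions, not values or code from the model described in this report.

```python
import math


def prob_at_least_one(frequency: float, sample_size: int) -> float:
    """P(at least one functional sequence) = 1 - (1 - f)**N.

    Assumes each random sequence is functional independently with
    probability `frequency` (an illustrative simplification).
    """
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie in [0, 1]")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    if frequency == 1.0:
        return 1.0 if sample_size > 0 else 0.0
    # log1p and expm1 preserve precision when the frequency is very small.
    return -math.expm1(sample_size * math.log1p(-frequency))


# Hypothetical frequency: one functional sequence per 10**11 random ones.
assumed_frequency = 1e-11
for assumed_sample in (10**9, 10**11, 10**13):
    p = prob_at_least_one(assumed_frequency, assumed_sample)
    print(f"N = {assumed_sample:.0e}: P(at least one) = {p:.4f}")
```

Under these assumed numbers, a sample much smaller than 1/f almost never contains a functional protein, which is one way to read the statement that such proteins "are not encountered on a consistent basis."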