2000 Annual Science Report
Pennsylvania State University Reporting | JUL 1999 – JUN 2000
Timescale for the Early Evolution of Life on Earth: Molecular Evolutionary Approach - Nei
The main goal of the research in Task III is to develop statistical methods for estimating the times of origin of the major groups of organisms from molecular data and to infer the major events in the early stages of evolution on Earth using a large amount of sequence data. For this purpose, it is essential to have reliable and efficient statistical methods for constructing phylogenetic trees. At present, there are three major methods of constructing trees from molecular data: maximum parsimony (MP), minimum evolution (ME), and maximum likelihood (ML). In each method, a phylogenetic tree is constructed by optimizing a specific optimality score. All of these methods are very time-consuming when an extensive tree search is conducted. However, it is unclear whether extensive tree-search algorithms are really necessary for obtaining the true tree with high probability. We therefore conducted an extensive computer simulation comparing the efficiencies of simple and extensive search algorithms in obtaining the true tree. The results show that although extensive search algorithms always give better optimality scores than simple algorithms, the deviations of the inferred trees from the true tree are nearly the same. This indicates that extensive search algorithms are unnecessary and that simple algorithms are sufficient for phylogenetic inference (Takahashi and Nei 2000; Piontkivska and Nei, unpublished).
We also investigated statistical methods that would give reliable estimates of evolutionary times when many genes evolving at different rates are used. We have shown that it is generally better to concatenate the distances for different genes and then estimate times from the concatenated distances (concatenated distance approach) than to estimate times by averaging the estimates obtained for individual genes (individual gene approach) (P. Xu and M. Nei, unpublished). However, the best way of weighting the distances for different genes is still unclear, and we are currently investigating this problem.
Takahashi, K., and M. Nei. 2000. Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Mol. Biol. Evol. (in press).
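The difference between the two time-estimation approaches described above can be illustrated with a minimal sketch. It is not the procedure used in the unpublished study. It assumes a simple molecular clock with a single calibration point, so that a divergence time is the calibration time multiplied by the ratio of the target distance to the calibration distance. It also assumes that distances are concatenated by weighting each gene by its alignment length, which is only one possible weighting. The class name, function names, and all numerical values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GeneDistances:
    """Pairwise distances for one gene (all values are hypothetical)."""
    length: int           # number of aligned sites in this gene
    d_calibration: float  # distance between the calibration pair
    d_target: float       # distance between the pair whose time is wanted


def individual_gene_time(genes, calibration_time):
    """Individual gene approach: estimate a time per gene, then average."""
    return mean(calibration_time * g.d_target / g.d_calibration for g in genes)


def concatenated_distance_time(genes, calibration_time):
    """Concatenated distance approach: pool distances first, then estimate.

    Assumption: genes are weighted by alignment length. Other weightings
    are possible and may behave differently.
    """
    total_sites = sum(g.length for g in genes)
    d_cal = sum(g.length * g.d_calibration for g in genes) / total_sites
    d_tar = sum(g.length * g.d_target for g in genes) / total_sites
    return calibration_time * d_tar / d_cal


# Made-up example: three genes evolving at different rates,
# with a calibration point at 100 (arbitrary time units).
genes = [
    GeneDistances(length=300, d_calibration=0.10, d_target=0.20),
    GeneDistances(length=1200, d_calibration=0.40, d_target=0.80),
    GeneDistances(length=600, d_calibration=0.05, d_target=0.15),
]
print("individual gene approach:     ", individual_gene_time(genes, 100.0))
print("concatenated distance approach:", concatenated_distance_time(genes, 100.0))
```

In this made-up example the third gene has a small calibration distance, so its ratio has a large influence on the average in the individual gene approach. The pooled estimate gives that gene less weight, which is one intuition for why pooling distances can be more stable.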
PROJECT MEMBERS: Masatoshi Nei
RELATED OBJECTIVES: Objective 4.0
Expand and interpret the genomic database of a select group of key microorganisms in order to reveal the history and dynamics of evolution.