Michael J. Sanderson

Research Interests

My research applies statistical and computational techniques to problems in phylogenetics and evolution. Until recently the empirical side of these studies focused on specific taxa: legumes (especially Astragalus and relatives), angiosperms and seed plants. Across these various levels I have been attracted to phylogenetic problems that pose methodological obstacles or present unusual quantitative challenges. The same is true of the evolutionary problems I have addressed, most of which concern quantitative analyses of rates of molecular evolution (especially in relation to divergence time problems) or taxonomic diversification.

In the last few years, however, my research has shifted toward computational phylogenetic problems at the scale of the "tree of life", with an empirical emphasis on all green plants. Much of this work is aimed at developing algorithms and software for assembling data from the large sequence databases for the purpose of building comprehensive phylogenetic trees. GenBank, for example, presently archives data on over 165,000 species, a sizable fraction of all described biodiversity. My lab is currently funded through two NSF AToL (Assembling the Tree of Life) grants to develop tools and techniques for acquiring sequence data and assembling it in a pre-processing pipeline upstream of phylogenetic inference proper. We are collaborating with computer scientists and other phylogeneticists to develop algorithms and test them primarily on plant phylogenetic and genomic data sets. These data sets range from taxonomically broad collections across sizable parts of the tree of life to genome-scale EST data sets and BAC-end sequence data sets (in collaboration with the OMAP rice genomics project) on smaller groups of taxa. Analysis of data at these extremes requires novel phylogenetic inference methods such as supertree construction, another active area of research in our group. Having completed some initial work on supertree methods, we and our math and computer science collaborators are now looking at problems associated with defining optimal inputs for supertree construction, and developing methods for estimating their confidence limits. Finally, we have recently started working in the area of biodiversity informatics, developing methods for examining patterns of phylogenetic diversity in local floristic assemblages. This dovetails with the phylogenomic work in unexpected ways through the common currency of taxonomic names associated with the sequence data needed to build reliable phylogenetic histories.

Publication list