Data sets to accompany manuscripts in review are HERE
All sequence data sets below are in Nexus format:
Data files:
Species tree--gene tree file (maximum parsimony gene tree collection) (uncompressed file)
Species tree--gene tree file (maximum likelihood gene tree collection) (uncompressed file)
Data files:
Dense supermatrix file (uncompressed file, 76 MB)
Dense supermatrix file (compressed .Z file, 2.4 MB)
Sparse supermatrix file (uncompressed file, 100 MB)
Sparse supermatrix file (compressed .Z file, 2.4 MB)
Please note that the uncompressed versions may not display in entirety, depending on the browser.
Tree files in Nexus format:
Dense supermatrix, 5000 equally parsimonious trees (compressed .gz file, 0.7 MB)
Dense supermatrix, 5000 equally parsimonious trees (uncompressed Nexus file, 72 MB)
Dense supermatrix, strict consensus of 5000 equally parsimonious trees
README regarding data files
Swiss-Prot data used in the analyses [5.8 MB file]
GenBank green plant data used in the analyses [9.0 MB file]
Swiss-Prot metazoan supermatrix [35 MB file]
GenBank green plant supermatrix [7.2 MB file]
Aligned Nexus files for all informative Swiss-Prot clusters [4.7 MB tar.gz file]
Aligned Nexus files for all informative green plant clusters [2.9 MB tar.gz file]
ITS
Sanderson, M. J., M. F. Wojciechowski, J.-M. Hu, T. Sher Khan, and S.
G. Brady. 2000. Error, bias, and long-branch attraction in data for
two chloroplast photosystem genes in seed plants. Mol. Biol.
Evol. 17:782-797.
psaA
ITS
rbcL and 18S combined data set
Sanderson, M. J. 2003. Molecular data from 27 proteins do not support
a Precambrian origin of land plants. Amer. J. Bot.
90:954-956.
27-protein data set
Sanderson, M. J., A. C. Driskell, R. H. Ree, O. Eulenstein, and S.
Langley. 2003. Obtaining maximal concatenated phylogenetic data sets
from large sequence databases. Mol. Biol. Evol.
20:1036-1042.
15-protein, 15-taxon data set
57 chloroplast alignments used in this paper, with a table that translates our cluster numbers to the gene symbol and product as given in the GenBank files.
Here is a Perl script that aligns DNA sequences from an existing alignment of amino acid sequences.
Programs implementing the test
Seq-gen-cov: a modification of Seq-Gen. C program that simulates DNA sequences under a variety of models, including covarion models. (.gz file)
HERE is a zipped file containing scripts and an example data file to implement an alpha-quasi-biclique search.
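For readers unfamiliar with the idea behind the Perl script listed above (aligning DNA sequences from an existing amino acid alignment), the following is a minimal Python sketch of the general technique, not the script itself. The function names `thread_dna` and `thread_alignment` and the example sequences are hypothetical. The sketch assumes each unaligned DNA sequence is in frame, starts at the first codon, and encodes exactly the residues in its protein row; it does not check that codons translate to the aligned residues.

```python
def thread_dna(aligned_protein: str, dna: str, gap: str = "-") -> str:
    """Thread one unaligned DNA sequence onto its aligned protein row.

    Each residue consumes the next codon (3 bases) of the DNA; each gap
    in the protein row becomes a 3-column gap in the DNA row.
    """
    codons = []
    pos = 0  # index of the next unused base in the DNA sequence
    for residue in aligned_protein:
        if residue == gap:
            codons.append(gap * 3)
        else:
            codon = dna[pos:pos + 3]
            if len(codon) < 3:
                raise ValueError("DNA sequence is too short for the protein row")
            codons.append(codon)
            pos += 3
    return "".join(codons)


def thread_alignment(protein_alignment: dict, dna_sequences: dict) -> dict:
    """Apply thread_dna to every taxon in a protein alignment."""
    return {
        name: thread_dna(row, dna_sequences[name])
        for name, row in protein_alignment.items()
    }


# Hypothetical two-taxon example (made-up sequences, for illustration only).
protein_alignment = {"taxon_A": "MK-L", "taxon_B": "M-QL"}
dna_sequences = {"taxon_A": "ATGAAACTG", "taxon_B": "ATGCAGCTG"}

codon_alignment = thread_alignment(protein_alignment, dna_sequences)
for name, row in codon_alignment.items():
    print(name, row)
```

Because gaps are inserted in whole-codon units, the resulting DNA alignment keeps the reading frame of the protein alignment, and every row is three times the length of the corresponding protein row.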