Phylogenetic position of turtles among amniotes: evidence from mitochondrial and nuclear genes

Gene 259 (2000) 139–148
www.elsevier.com/locate/gene

Ying Cao a, Michael D. Sorenson b, Yoshinori Kumazawa c, David P. Mindell d, Masami Hasegawa a,*

a The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569, Japan
b Department of Biology, Boston University, Boston, MA 02215, USA
c Department of Earth and Planetary Sciences, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan
d Department of Biology and Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA

Received 18 April 2000; received in revised form 19 July 2000; accepted 8 September 2000
Received by T. Gojobori

Abstract

Maximum likelihood analysis, accounting for site-heterogeneity in evolutionary rate with the Γ-distribution model, was carried out with amino acid sequences of 12 mitochondrial proteins and nucleotide sequences of mitochondrial 12S and 16S rRNAs from three turtles, one squamate, one crocodile, and eight birds. The analysis strongly suggests that turtles are closely related to archosaurs (birds+crocodilians), and it supports both Tree-2: (((birds, crocodilians), turtles), squamates) and Tree-3: ((birds, (crocodilians, turtles)), squamates). A more traditional Tree-1: (((birds, crocodilians), squamates), turtles) and a tree in which turtles are basal to other amniotes were rejected with high statistical significance. Tree-3 has recently been proposed by Hedges and Poling [Science 283 (1999) 998–1001] based mainly on nuclear genes. Therefore, we re-analyzed their data using the maximum likelihood method, and evaluated the total evidence of the analyses of mitochondrial and nuclear data sets. Tree-1 was again rejected strongly. The most likely hypothesis was Tree-3, though Tree-2 remained a plausible candidate. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Diapsid affinity of turtles; Maximum likelihood; Molecular phylogeny; Reptile; Total evaluation

Abbreviations: AIC, Akaike Information Criterion; ATP6, ATPase 6; COB, cytochrome b; COX1, cytochrome oxidase subunit 1; KH test, Kishino–Hasegawa test; LDH, lactate dehydrogenase; ML, maximum likelihood; MS test, multiple-comparisons test of the standardized statistics; mtDNA, mitochondrial DNA; mt-protein, mitochondrial protein; mtREV model, general reversible Markov model for amino acid substitution of mt-proteins; ND1, NADH dehydrogenase subunit 1; rRNA, ribosomal RNA.

* Corresponding author. Tel.: +81-3-5421-8748; fax: +81-3-3446-1695/+81-3-5421-8796. E-mail address: hasegawa@ism.ac.jp (M. Hasegawa)

0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0378-1119(00)00425-X

1. Introduction

Phylogenetic relationships among the major groups of amniotes are not yet well resolved, and the position of Testudines (turtles) in particular remains uncertain. Among morphologists, alternative views have been presented by Romer (1966) and Løvtrup (1980). While both agree on placing crocodiles as the sister-group of birds, in monophyletic Archosauria, Romer placed archosaurs closest to squamates (lizards and snakes), whereas Løvtrup placed archosaurs closest to turtles. Romer's view has been the consensus, while Løvtrup's has been a minority view. Most morphological analyses since the late 1980s (e.g. Gauthier et al., 1988; Laurin and Reisz, 1995; Lee, 1995) consistently supported the basal phylogenetic position of turtles as a sister-group of both archosaurs and squamates.

Traditionally, living turtles have been considered the basal lineage in the amniote tree, mainly because of their lack of temporal fenestrae in the skull (anapsid), a condition that has been interpreted as primitive (e.g. Benton, 1990). This view, however, was recently challenged in a morphological analysis by Rieppel and deBraga (1996), who suggested an affinity between turtles and diapsids, as did Løvtrup (1980). In contrast to Løvtrup, however, Rieppel and deBraga placed turtles as sister to lepidosaurs (squamates plus tuatara). In either case, turtles must have lost both the upper and lower temporal fenestrae (deBraga and Rieppel, 1997). The validity of Rieppel and deBraga's interpretation was questioned by Wilkinson et al. (1997) and Lee (1997), although Wilkinson et al. (1997) acknowledged the difficulty of determining the phylogenetic placement of turtles and suggested molecular approaches to be promising in this respect.

Recent molecular studies bearing on turtle relationships all suggest a diapsid affinity for turtles, but also point to an apparent contradiction in data from the mitochondrial and nuclear genomes. Analyses based on complete mtDNA sequences of Eastern painted turtle (Chrysemys picta; Mindell et al., 1999) and African side-necked turtle (Pelomedusa subrufa; Zardoya and Meyer, 1998) both placed turtles as sister to archosaurs (birds and alligators), although representative lepidosaurs were not available for analyses of the complete mt genome. With lepidosaurs included in an analysis of mt rRNAs only, turtles grouped with archosaurs (Zardoya and Meyer, 1998). Kumazawa et al. (1998) and Kumazawa and Nishida (1999) obtained the first complete mtDNA sequences for lepidosaurs, from the Ryukyu odd-tooth snake (Dinodon semicarinatus) and the blue-tailed mole skink (Eumeces egregius lividus). With additional data from another turtle species (green turtle, Chelonia mydas), they obtained strong support for the turtle/archosaur grouping (Tree-2 in Fig. 1; Kumazawa and Nishida, 1999). Similarly, Platz and Conlon (1997) found a sister relationship between turtles and archosaurs, excluding lepidosaurs as an outgroup, based on amino acid sequences of the pancreatic polypeptide.

Other studies using evidence from the nuclear genome have reached a somewhat different conclusion. Using mainly nuclear-encoded proteins and ribosomal RNAs, Hedges and Poling (1999) suggested a sister relationship between turtles and crocodilians with birds placed as the sister taxon to this group (Tree-3 in Fig. 1). In other words, Archosauria was found to be paraphyletic with turtles placed within the group.
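The difference between the candidate topologies can be made concrete with a small sketch. The snippet below encodes Tree-1, Tree-2 and Tree-3, as written in the abstract, as nested Python tuples and checks which groupings form a clade in each. The tuple encoding and the helper names `tips`, `clades` and `is_monophyletic` are illustrative assumptions introduced here; they are not part of the original analysis.

```python
# Candidate trees from the abstract, encoded as nested tuples (an illustrative
# encoding chosen for this sketch; it is not a format used in the paper).
TREES = {
    "Tree-1": ((("birds", "crocodilians"), "squamates"), "turtles"),
    "Tree-2": ((("birds", "crocodilians"), "turtles"), "squamates"),
    "Tree-3": (("birds", ("crocodilians", "turtles")), "squamates"),
}


def tips(tree):
    """Return the set of tip labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def clades(tree):
    """Return the tip sets of all internal nodes, i.e. every clade in the tree."""
    if isinstance(tree, str):
        return set()
    found = {tips(tree)}
    for child in tree:
        found |= clades(child)
    return found


def is_monophyletic(tree, group):
    """True if `group` is exactly the tip set of some internal node."""
    return frozenset(group) in clades(tree)


for name, tree in TREES.items():
    archosaurs = is_monophyletic(tree, {"birds", "crocodilians"})
    with_turtles = is_monophyletic(tree, {"birds", "crocodilians", "turtles"})
    print(name, "| birds+crocodilians clade:", archosaurs,
          "| birds+crocodilians+turtles clade:", with_turtles)
```

Under this encoding only Tree-3 breaks up the birds+crocodilians grouping, which is the sense in which it makes Archosauria paraphyletic, while Tree-2 and Tree-3 both contain a birds/crocodilians/turtles clade that Tree-1 lacks.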
The same result was obtained by Kirsch and Mayer (1998) using DNA–DNA hybridization data. In Kumazawa and Nishida's maximum likelihood analysis of mt-proteins, the Hedges and Poling suggestion of a turtle/crocodile grouping was rejected at the 5% level. Thus, nuclear and mitochondrial data appear to provide a different resolution of the placement of turtles within Diapsida. However, phylogenetic inferences can vary with taxonomic sampling (Philippe and Douzery, 1994; Adachi and Hasegawa, 1996c; Halanych, 1998), so we analyze here the updated mtDNA database along with the nuclear sequences used by Hedges and Poling (1999).

Fig. 1. Three candidate trees for relationships among birds, crocodiles, turtles, and squamates.

2. Materials and methods

2.1. Sequence data

The complete mtDNA sequences used in this study are from the following 21 species: Macropus robustus (wallaroo; Janke et al., 1997; database accession number Y10524); Didelphis virginiana (opossum; Janke et al., 1994; Z29573); Ornithorhynchus anatinus (platypus; Janke et al., 1996; X83427); Corvus frugilegus (rook; Härlid and Arnason, 1999; Y18522); Gallus gallus (chicken; Desjardins and Morais, 1990; X52392); Aythya americana (redhead duck; Mindell et al., 1999; AF090337); Rhea americana (greater rhea; Mindell et al., 1999; AF090339); Falco peregrinus (peregrine falcon; Mindell et al., 1999; AF090338); Vidua chalybeata (village indigobird; Mindell et al., 1999; AF090341); Smithornis sharpei (grey-headed broadbill; Mindell et al., 1999; AF090340); Struthio camelus (ostrich; Härlid et al., 1997; Y12025); Alligator mississippiensis (American alligator; Mindell et al., 1999; AF069428); Eumeces egregius lividus (blue-tailed mole skink; Kumazawa and Nishida, 1999; AB016606); Pelomedusa subrufa (African side-necked turtle; Zardoya and Meyer, 1998; AF039066); Chrysemys picta (eastern painted turtle; Mindell et al., 1999; AF069423); Chelonia mydas (green turtle; Kumazawa and Nishida, 1999; AB012104); Xenopus laevis (African clawed toad; Roe et al., 1985; M10217); Cyprinus carpio (carp; Chang et al., 1994; X61010); Crossostoma lacustre (loach; Tzeng et al., 1992; M91245); Oncorhynchus mykiss (trout; Zardoya et al., 1995; L29771); Mustelus manazo (gummy shark; Cao et al., 1998; AB015962). We did not use the snake sequence (Kumazawa et al., 1998) because of its extremely rapid evolutionary rate. In the extensive analyses, we did not use eutherian mammals because of the ambiguity of the relationship among eutherians, marsupials and monotremes (Janke et al., 1996, 1997), but a preliminary analysis including eutherians did not give a significantly different result (data not shown).

The 12 proteins encoded in the H-strand of mtDNA were carefully aligned by eye. All positions with gaps or ambiguous alignment and overlapping regions between ATP6 and ATP8 and between ND4 and ND4L were excluded from phylogenetic analyses. The total number of remaining codons is 3235.

The small (12S) and the large (16S) mitochondrial rRNA sequences were aligned manually taking into consideration models of secondary structure (Neefs et al., 1993; Gutell et al., 1993) following Cao et al. (1994). Alignment positions with gaps and regions of ambiguous alignment were excluded from all analyses. Total numbers of remaining sites are 781 and 1074, respectively, for 12S and 16S.

The alignments of the following nuclear genes used by Hedges and Poling (1999) were also analyzed: 18S rRNA, 28S rRNA, α-crystallin A, α-enolase, α-globin, β-globin, c-mos proto-oncogene, lactate dehydrogenase A (LDHa), lactate dehydrogenase B (LDHb), and myoglobin. We did not use calcitonin and insulin sequences because of their short length. See Hedges and Poling (1999) for details on the number of taxa included in the analyses for each gene.

The alignments used in this work are available at http://www.evol.ism.ac.jp

2.2. Phylogenetic methods

All 15 possible trees for Aves, Crocodylia, Squamata and Testudines with Mammalia, Amphibia, Osteichthyes and Chondrichthyes as the outgroup were analyzed by the maximum likelihood (ML) method (Felsenstein, 1981; Kishino et al., 1990). Since the sister-group status of Mammalia to other groups of Amniota has been established by previous analyses (e.g. Mindell et al., 1999), we assumed this relationship in our study. Further, the relationships within each group were assumed to be fixed as they are in Fig. 2, such that the 15 trees represent rearrangements only of the main branches between groups. The ProtML program in the MOLPHY package (vers. 2.3) (Adachi and Hasegawa, 1996b) and the CodeML program in the PAML package (Yang, 1997) were applied to the protein sequences with the mtREV-F model for mitochondrial proteins (Adachi and Hasegawa, 1996a) and with the JTT-F model for the nuclear-encoded proteins (Jones et al., 1992). The BaseML program in the PAML package was applied to the rRNA sequences with the HKY85 model (Hasegawa et al., 1985). For the CodeML and BaseML programs, the discrete Γ-distribution model with eight categories was used to accommodate site-heterogeneity. To evaluate the total evidence from separate analyses of individual genes, the TotalML program in MOLPHY was applied to the output files of the ProtML, CodeML, and BaseML programs. For each of the 15 possible trees, a standard error of the log-likelihood difference from the ML tree was estimated by Kishino and Hasegawa's (1989) formula, and the P-value of the Kishino–Hasegawa (KH) test was calculated.

Hedges and Poling (1999) concatenated the sequences of several genes for combined analyses, but in preparing the concatenated sequences, they chose only one sequence from each group and omitted other sequences. This procedure did not make optimal use of all the available information. Furthermore, the analysis of concatenated sequences under a single model of sequence evolution does not satisfactorily take into account rate heterogeneity among genes, even if the Γ-distribution model is used, as it has been shown that the total evaluation based on a summation of log-likelihood scores from separate analyses of individual genes with the Γ model is preferable (Cao et al., 1999).

Fig. 2. A maximum likelihood tree of vertebrates estimated from the concatenated amino acid sequences of the 12 mt-proteins using ProtML with the mtREV-F model (Adachi and Hasegawa, 1996b). Relationships among birds were constrained to the traditional view as shown. For each internal branch, the percentage bootstrap probability (after fixing the relationships within subtrees attached to that branch; local bootstrap probability (Adachi and Hasegawa, 1996b)) was estimated with the RELL method using 10⁴ replications. The horizontal length of each branch is proportional to the estimated number of amino acid substitutions.

3. Results

A basal divergence for the passeriform lineage relative to other birds was suggested in earlier analyses based on 12 mt protein-coding genes and 2 mt-rRNA genes combined (Mindell et al., 1997) and the mt cytb gene (Härlid et al., 1997, 1998). This was an unexpected result, differing from the traditional view (e.g. Storer, 1971). Introduction of the Γ-distribution (Yang, 1996) in the ML analysis of the 12 mt-proteins, however, greatly reduced the log-likelihood difference between a more traditional ratite-basal tree and the ML passerine-basal tree (Mindell et al., 1999; table 5), suggesting a possible artifactual attraction between a passerine bird and alligator which is an outgroup taxon to birds. Recent studies of nuclear and mitochondrial genes by van Tuinen et al. (2000) and Groth and Barrowclough (1999) are consistent with this view. Fig. 2 shows the ProtML tree of the concatenated 12 mt-proteins with the constraint that ratites are basal among birds. Although rooting the avian tree with the Smithornis lineage gives a higher log-likelihood than this tree by 34.8±26.3 (P=0.19) without Γ, the difference reduces to 2.0±16.8 (P=0.90) with Γ. Therefore, we assumed that ratites are basal among birds in the rest of our analyses.

Fig. 2 is consistent with the tree of Kumazawa and Nishida (1999) which was based on a more limited sampling of species. Turtle is the closest relative of archosaurs (birds+crocodilians) in this tree. The tree in Fig. 2 was estimated by assuming that all sites are equally free to vary, but phylogeny estimation can sometimes be inconsistent under such an assumption (Lockhart et al., 1996; Yang, 1996; Sullivan and Swofford, 1997). Therefore, we accommodate rate-heterogeneity among sites (Yang, 1997) as well as among different genes. It has been shown that rate heterogeneity among genes is not sufficiently well approximated by a single discrete Γ-distribution model of concatenated sequences, and that the summation of log-likelihood scores from the separate analyses of individual genes is preferable to a single analysis of concatenated sequences (Cao et al., 1999; Adachi et al., 2000). Therefore, individual genes were analyzed separately (branch lengths were estimated by maximum likelihood for each individual gene with a different shape parameter for the Γ-distribution estimated for each gene) and the log-likelihoods for each gene were summed to determine the ML tree for a total analysis of all the data.

Inclusion of the Γ-distribution for individual mitochondrial genes did not change the ML tree; i.e. Tree-2 (Fig. 1) is preferred based on the combined likelihood score for the mitochondrial data in total (Fig. 3A). For all the individual mitochondrial genes, either Tree-2 or 3 has the best likelihood score except for ND3 and COB which, respectively, yield Trees-4 and 9; however, their scores are only slightly better than the scores for Trees-2 and 3 (P-values are 0.95 and 0.67, respectively, for ND3 and 0.70 and 0.80 for COB). None of the individual mitochondrial genes discriminate significantly against Trees-2 and 3 (the lowest P-value is 0.23 for Tree-3 in the analysis of ND5). Although the combined analysis of all mitochondrial genes (including rRNAs) yield Tree-2 as optimal, the log-likelihood of Tree-3 is lower by only 2.7±10.8 (±1 S.E.) than that of Tree-2 (P=0.80). While the combined likelihood scores for the nuclear genes do not strongly reject any of the 15 alternative trees (the lowest P-value is 0.01 for Tree-7) (Fig. 3B), the total analysis of the mitochondrial genes rejects all trees other than Trees-2, 3 and 4 with P-values lower than 0.001 (Table 1).

The basal position of squamates among extant reptiles is well established by our total analysis of the mitochondrial and nuclear data (Table 1; P-values for other relationships are lower than 0.0004), as well as by previous studies (Kumazawa and Nishida, 1999; Hedges and Poling, 1999). Although Tree-3 is best supported, Tree-2 has a log-likelihood score lower by only 25.2±17.6 (P=0.15) and cannot be rejected. P-values by the multiple-comparisons of the standardized statistics (MS) method (Shimodaira and Hasegawa, 1999) for Trees-2 and 4 are 0.315 and 0.012, respectively, which are larger than 0.152 and 0.003 given by the KH test presented in Table 1. In this sense the MS test is more conservative than the Kishino–Hasegawa test. It is noteworthy that even the conservative MS test gave P-values lower than 0.002 for all trees other than Trees-2, 3 and 4.

An additional concern in the analysis of nuclear-encoded proteins is accurate identification of the homologous relationship (orthology versus paralogy) for LDHa and LDHb genes across species. This is crucial for phylogenetic analyses, though not always easy to establish (Stock et al., 1997), and when LDHa and LDHb were excluded from the analysis, the combined likelihood scores for the mitochondrial plus nuclear data were almost identical for Trees-2 and 3; Tree-2 has a log-likelihood score lower than that of Tree-3 by only 3.7±14.6 (P=0.80).

Fig. 3. P-values of KH test for the 15 possible trees among birds, crocodilians, turtles, and squamates with mammals as an outgroup for (A) the 12 individual mt-proteins, the 12S and 16S mt-rRNAs, and the total of the separate analyses of mitochondrial genes and for (B) the 18S and 28S rRNAs, the 10 nuclear proteins, and the total of the nuclear genes. Tree topologies:
(1) (((bird, crocodile), squamate), turtle)
(2) (((bird, crocodile), turtle), squamate)
(3) ((bird, (turtle, crocodile)), squamate)
(4) (((bird, turtle), crocodile), squamate)
(5) ((bird, crocodile), (turtle, squamate))
(6) (((bird, turtle), squamate), crocodile)
(7) ((bird, turtle), (crocodile, squamate))
(8) (bird, ((turtle, crocodile), squamate))
(9) ((bird, squamate), (turtle, crocodile))
(10) ((bird, (crocodile, squamate)), turtle)
(11) (((bird, squamate), crocodile), turtle)
(12) (bird, (turtle, (crocodile, squamate)))
(13) (bird, ((turtle, squamate), crocodile))
(14) (((bird, squamate), turtle), crocodile)
(15) ((bird, (turtle, squamate)), crocodile)

Table 1
Comparison of log-likelihood scores among the 15 possible trees. Mitochondria refers to the total of the separate analyses of individual genes for 12 mt-proteins plus 12S and 16S rRNAs, nuclear to the total of the 10 nuclear genes, and total to the total of the individual analyses of the mitochondrial and nuclear genes. The log-likelihood values of the highest likelihood trees are given in angle brackets, and the differences in log-likelihood of alternative trees from that of the ML tree are shown with their S.E. following ±. P-values are shown for the total analysis

Tree  Mitochondria  Nuclear       Total                     Total (excl. LDHs)
                                  Log-likelihood  P-value   Log-likelihood  P-value
1     70.5±19.3     36.0±18.8     103.8±26.9      1.1×10⁻⁴  84.1±23.6       3.7×10⁻⁴
2     ⟨−67098.8⟩    27.9±16.8     25.2±19.0       0.18      3.7±15.6        0.81
3     2.7±8.5       ⟨−17255.3⟩    ⟨−84356.8⟩      1         ⟨−80693.2⟩      1
4     15.8±11.6     33.2±17.2     46.3±20.7       0.03      26.6±17.3       0.12
5     68.6±19.7     34.1±17.4     100.0±26.2      1.4×10⁻⁴  77.4±23.2       8.5×10⁻⁴
6     59.1±18.1     40.3±19.4     96.7±26.5       2.6×10⁻⁴  76.4±23.3       1.0×10⁻³
7     57.4±18.2     43.6±19.0     98.4±26.4       1.9×10⁻⁴  79.1±22.9       5.5×10⁻⁴
8     78.1±18.8     3.3±11.9      78.6±22.2       4.0×10⁻⁴  80.6±21.6       1.9×10⁻⁴
9     73.9±18.7     9.5±13.6      80.8±23.1       4.7×10⁻⁴  85.6±23.1       2.1×10⁻⁴
10    88.5±21.6     41.4±19.1     127.2±28.9      1.1×10⁻⁵  109.2±25.6      2.0×10⁻⁵
11    91.7±21.0     29.7±17.5     118.7±27.3      1.4×10⁻⁵  110.4±25.3      1.3×10⁻⁵
12    95.1±21.3     29.7±16.3     122.1±26.8      5.2×10⁻⁶  108.1±24.3      8.7×10⁻⁶
13    95.4±21.6     27.4±15.2     120.0±26.5      6.0×10⁻⁶  103.7±24.0      1.6×10⁻⁵
14    92.5±21.1     27.2±17.6     117.1±27.5      2.1×10⁻⁵  105.9±25.9      4.3×10⁻⁵
15    90.7±22.0     33.7±18.2     121.7±28.6      2.1×10⁻⁵  100.2±25.8      1.0×10⁻⁴

4. Discussion

Table 2 presents log-likelihood scores for the three trees including the birds/crocodilians/turtles clade when the 12 mt-proteins were analyzed with the mtREV-F model. The separate and concatenated analyses of 12 mt-proteins were compared in terms of the Akaike Information Criterion (AIC), where the AIC score = −2 ln L + 2 × (number of parameters). The minimum AIC estimate is a natural extension of the classical maximum likelihood estimate when comparing models with different numbers of parameters, and the model that minimizes AIC is considered to be the most appropriate model (Akaike, 1974; Sakamoto et al., 1986). The mtREV-F model uses the amino acid frequencies of the data (the number of parameters is 19) and, for a bifurcating tree with 21 species, 39 branch lengths must be estimated. For the Γ-distribution model, one additional parameter, α (shape parameter of the Γ-distribution), must be estimated. Therefore, the number of estimated parameters is 59 (=39+19+1) for each ML analysis with the Γ-distribution. For Tree-2 with the concatenated analysis of the 12 proteins, AIC was −2 × (−49413.4) + 2 × 59 = 98944.8, while AIC for the separate analysis was reduced to −2 × (−48522.8) + 2 × 59 × 12 = 98461.6. This indicates that the separate analysis better approximates the underlying evolutionary process than the concatenated analysis, which does not explicitly assume heterogeneity of the substitution process across genes. This holds even when site-heterogeneity is taken into account with the Γ-distribution in the concatenated analysis. Separate analyses with the Γ-distribution model for each of the individual proteins provided the best approximation of the data among models we considered, and should be more reliable than analyses which yield higher AIC values (Cao et al., 1999; Adachi et al., 2000). There seems to be a tendency to exaggerate support for a particular tree, whether or not the tree is true, when the assumed model clearly fails to accommodate the full complexity of the substitution process (e.g. Hasegawa and Adachi, 1996).

Table 2
Comparison of log-likelihood scores for trees based on 12 mitochondrial proteins in which birds, crocodilians and turtles form a monophyletic group relative to squamates

Tree  Separate      Concatenated
2     ⟨−48522.8⟩    ⟨−49413.4⟩
3     2.8±7.9       7.1±6.4
4     14.5±9.7      10.3±5.4

Why does the concatenated analysis not approximate the data as well as the separate analysis, even though the Γ-distribution model is applied? We suggest that the pattern of rate variation among lineages and among nucleotide or amino acid positions differs between different genes, such that a single Γ-distribution for all the genes provides a less accurate approximation of the evolutionary process.

Although molecular phylogenetics has become a powerful tool in elucidating the evolutionary history of organisms, a single gene does not necessarily contain sufficient information to resolve the problem at hand, and therefore, it is necessary to consider as many different loci as possible and to evaluate the total evidence. The ML method is particularly suitable for this purpose. Given a model, one can calculate the likelihood as the probability that one tree yielded the observed data, and each gene can reasonably be regarded as evolving independently from other genes. Therefore, the total support for a particular tree can be evaluated by simply summing up the estimated log-likelihoods of individual genes for that tree, and the total log-likelihoods for different trees can then be compared (Adachi and Hasegawa, 1996b; Hasegawa et al., 1997; Cao et al., 1999).

Hedges and Poling's (1999) analysis of the 11 nuclear proteins gave 100% bootstrap support for Tree-3 using ML analysis (99 and 97% support by the neighbor-joining and parsimony). These support values seem much higher than that obtained by our analysis. This discrepancy may be due to their neglect of heterogeneity among sites as well as among genes. Furthermore, Mannen and Li (1999) obtained Tree-3 with very high bootstrap support by collecting and analyzing sequence data of LDHa, LDHb, and α-enolase from several reptiles. However, they analyzed nucleotide sequences and ignored rate variation among different codon positions. Reanalysis of their data at the amino acid sequence level suggests that the data does not contain sufficient information to reliably discriminate among trees (although Mannen and Li's data contain more species than those of Hedges and Poling, results do not differ much from those presented in Fig. 3B; data not shown).

In general agreement with results presented here, Mindell et al. (1999) found slight preference for a tree with alligator and turtle as sisters based on 12 mt-proteins and ML analysis assuming equal rates of change across sites; however, this changed to slight preference for a sister relationship between turtle and archosaurs on accommodating rate heterogeneity across sites. In our ML analyses using concatenated mitochondrial proteins (Table 2), Tree-3 was not significantly worse than Tree-2 (7.1±6.4; P=0.27). Similar ML analyses by Kumazawa and Nishida (1999) statistically rejected Tree-3 (19.2±8.9; P=0.03), but the log-likelihood differences in their analyses were slightly overestimated due to errors in the mtREV24 matrix of PUZZLE version 4.0 (Strimmer and von Haeseler, 1996) as noted in the PUZZLE online manual (http://www.treepuzzle.de/manual.html). The corresponding difference using PUZZLE version 4.0.2, in which the mtREV error was fixed, was 15.0±7.6 (P=0.05), still supporting the rejection of Tree-3 at the 5% significance level. We have conducted extensive analyses using the data sets analyzed in this paper and that used by Kumazawa and Nishida (1999). These analyses suggest that the apparent discrepancy is attributable to subtle differences in the alignment (i.e. higher stringency in our alignment using 3235 sites than in theirs using 3465 sites), different choice of outgroup taxa (i.e. mammals versus fishes), and, most importantly, different sampling of ingroup taxa (i.e. birds and turtles) (data not shown). These results point to the importance of sampling sufficient numbers of ingroup taxa for more reliable phylogenetic analyses. Further efforts to determine complete mtDNA sequences from representative crocodilian and squamate lineages should thus be encouraged in this respect.

Tree-3 is the best supported tree by our total analysis of mitochondrial and nuclear data, but it is not supported by presently available morphological data. If Tree-3 is the true tree, turtles may have lost the morphological characteristics indicating a close relationship with crocodilians. The total evaluation of the accumulated molecular data does not reliably discriminate between Trees-2 and 3, and more data are needed to resolve this issue. Concerning mitochondrial data, denser species sampling, particularly for Crocodylia, might be helpful, because only American alligator with a long branch is represented in our data set. Alternatively, failure to find a single resolution for the branching order among birds, crocodilians, and turtles might indicate that divergences for these three lineages occurred successively in a short time period.

The diapsid affinity of turtles has recently been supported from a morphological standpoint (see e.g. a review by Rieppel and Reisz, 1999), but the currently available morphological data do not seem to have sufficient resolution as to which diapsids are closest relatives of turtles. In this respect, it is noteworthy that the total molecular analyses of the present study rejected possibilities other than the archosaurian affiliation of turtles. We feel our results provide reasonable topological constraints with which to re-evaluate the morphological characters.

Acknowledgements

We thank Hidetoshi Shimodaira for discussions and for his help in the analysis of the multiple comparisons, Mitsuko Kitahara for drawing Fig. 1. This work was supported by grants from the Japan Society for the Promotion of Sciences (Y.C. and M.H.), Yamada Science Foundation (M.H.) and U.S. National Science Foundation (M.D.S. and D.P.M.).

References

Adachi, J., Hasegawa, M., 1996a. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J. Mol. Evol. 42, 459–468.

Adachi, J., Hasegawa, M., 1996b. MOLPHY: Programs for Molecular Phylogenetics vers. 2.3. In: Computer Science Monographs No. 28. Institute of Statistical Mathematics, Tokyo.
Adachi, J., Hasegawa, M., 1996c. Instability of quartet analyses of molecular sequence data by the maximum likelihood method: the Cetacea/Artiodactyla relationships. Mol. Phyl. Evol. 6, 72–76.
Adachi, J., Waddell, P., Martin, W., Hasegawa, M., 2000. Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA. J. Mol. Evol. 50, 348–358.
Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Contr. AC-19, 716–723.
Benton, M., 1990. Vertebrate Palaeontology. Unwin Hyman, London.
Cao, Y., Adachi, J., Hasegawa, M., 1994. Eutherian phylogeny as inferred from mitochondrial DNA sequence data. Jpn. J. Genet. 69, 455–472.
Cao, Y., Waddell, P., Okada, N., Hasegawa, M., 1998. The complete mitochondrial DNA sequence of the shark Mustelus manazo: evaluating rooting contradictions to living bony vertebrates. Mol. Biol. Evol. 15, 1637–1646.
Cao, Y., Kim, K., Ha, J., Hasegawa, M., 1999. Model dependence of the phylogenetic inference: relationship among carnivores, perissodactyls and cetartiodactyls as inferred from mitochondrial genome sequences. Genes Genet. Syst. 74, 211–217.
Chang, Y.-s., Huang, F.-l., Lo, T.-b., 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38, 138–155.
deBraga, M., Rieppel, O., 1997. Reptile phylogeny and the interrelationships of turtles. Zool. J. Linn. Soc. 120, 281–354.
Desjardins, P., Morais, R., 1990. Sequence and gene organization of the chicken mitochondrial genome: a novel gene order in higher vertebrates. J. Mol. Biol. 212, 599–634.
Felsenstein, J., 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376.
Gauthier, J., Kluge, A.G., Rowe, T., 1988. Amniote phylogeny and the importance of fossils. Cladistics 4, 105–209.
Groth, J.G., Barrowclough, G.F., 1999. Basal divergences in birds and the phylogenetic utility of the nuclear RAG-1 gene. Mol. Phyl. Evol. 12, 115–123.
Gutell, R., Gray, M., Schnare, M., 1993. A compilation of large subunit (23S and 23S-like) ribosomal RNA structures: 1993. Nucl. Acids Res. 21, 3055–3074.
Halanych, K., 1998. Lagomorph misplaced by more characters and fewer taxa. Syst. Biol. 47, 138–146.
Hasegawa, M., Adachi, J., 1996. Phylogenetic position of cetaceans relative to artiodactyls: reanalysis of mitochondrial and nuclear sequences. Mol. Biol. Evol. 13, 710–717.
Hasegawa, M., Kishino, H., Yano, T., 1985. Dating of the human–ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174.
Hasegawa, M., Adachi, J., Milinkovitch, M., 1997. Novel phylogeny of whales supported by total molecular evidence. J. Mol. Evol. 44, Suppl. 1, 117–120.
Härlid, A., Arnason, U., 1999. Analyses of mitochondrial DNA nest ratite birds within the Neognathae-supporting a neotenous origin of ratite morphological characters. Proc. R. Soc. London, Ser. B 266, 305–309.
Härlid, A., Janke, A., Arnason, U., 1997. The mtDNA sequence of the ostrich and the divergence between paleognathous and neognathous birds. Mol. Biol. Evol. 14, 754–761.
Härlid, A., Janke, A., Arnason, U., 1998. The complete mitochondrial genome of Rhea americana and early avian divergence. J. Mol. Evol. 46, 669–679.
Hedges, S., Poling, L., 1999. A molecular phylogeny of reptiles. Science 283, 998–1001.
Janke, A., Feldmaier-Fuchs, G., Thomas, W., von Haeseler, A., Pääbo, S., 1994. The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137, 243–256.
Janke, A., Gemmell, N., Feldmaier-Fuchs, G., von Haeseler, A., Pääbo, S., 1996. The mitochondrial genome of a monotreme, the platypus (Ornithorhynchus anatinus). J. Mol. Evol. 42, 153–159.
Janke, A., Xu, X., Arnason, U., 1997. The complete mitochondrial genome of the wallaroo (Macropus robustus) and the phylogenetic relationship among Monotremata, Marsupialia and Eutheria. Proc. Natl. Acad. Sci. USA 94, 1276–1281.
Jones, D., Taylor, W., Thornton, J., 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282.
Kirsch, J., Mayer, G., 1998. The platypus is not a rodent: DNA hybridization, amniote phylogeny and the palimpsest theory. Philos. Trans. R. Soc. London, Ser. B 353, 1221–1237.
Kishino, H., Hasegawa, M., 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29, 170–179.
Kishino, H., Miyata, T., Hasegawa, M., 1990. Maximum likelihood inference of protein phylogeny, and the origin of chloroplasts. J. Mol. Evol. 31, 151–160.
Kumazawa, Y., Nishida, M., 1999. Complete mitochondrial DNA sequences of the green turtle and blue-tailed mole skink: statistical evidence for archosaurian affinity of turtles. Mol. Biol. Evol. 16, 784–792.
Kumazawa, Y., Ota, H., Nishida, M., Ozawa, T., 1998. The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions. Genetics 150, 313–329.
Laurin, M., Reisz, R.R., 1995. A reevaluation of early amniote phylogeny. Zool. J. Linn. Soc. 113, 165–223.
Lee, M.S.Y., 1995. Historical burden in systematics and the interrelationships of parareptiles. Biol. Rev. 70, 459–547.
Lee, M.S.Y., 1997. Reptile relationships turn turtle. Nature 389, 245–246.
Lockhart, P., Larkum, A., Steel, M., Waddell, P., Penny, D., 1996. Evolution of chlorophyll and bacteriochlorophyll: the problem of invariant sites in sequence analysis. Proc. Natl. Acad. Sci. USA 93, 1930–1934.
Løvtrup, S., 1980. The Phylogeny of Vertebrata. Plenum Press, New York.
Mannen, H., Li, S.-L., 1999. Molecular evidence for a clade of turtles. Mol. Phyl. Evol. 13, 144–148.
Mindell, D., Sorenson, M., Huddleston, C., Miranda Jr., H., Knight, A., Sawchuk, S., Yuri, T., 1997. Phylogenetic relationships among and within select avian orders based on mitochondrial DNA. In: Mindell, D. (Ed.), Avian Molecular Evolution and Systematics. Academic Press, San Diego, pp. 213–247.
Mindell, D., Sorenson, M., Dimcheff, D., Hasegawa, M., Ast, J., Yuri, T., 1999. Interordinal relationships of birds and other reptiles based on whole mitochondrial genomes. Syst. Biol. 48, 138–152.
Neefs, J.-M., Van de Peer, Y., De Rijk, P., Chapelle, S., De Wachter, R., 1993. Compilation of small ribosomal subunit RNA structures. Nucl. Acids Res. 21, 3025–3049.
Philippe, H., Douzery, E., 1994. The pitfalls of molecular phylogeny based on four species, as illustrated by the Cetacea/Artiodactyla relationships. J. Mammal. Evol. 2, 133–152.
Platz, J., Conlon, J., 1997. ... and turn back again. Nature 389, 246–246.
Rieppel, O., deBraga, M., 1996. Turtles as diapsid reptiles. Nature 384, 453–455.
Rieppel, O., Reisz, R.R., 1999. The origin and early evolution of turtles. Annu. Rev. Ecol. Syst. 30, 1–22.
Roe, B., Ma, D.-P., Wilson, R., Wong, J.-H., 1985. The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260, 9759–9774.
Romer, A., 1966. Vertebrate Paleontology. University of Chicago Press, Chicago.
Sakamoto, Y., Ishiguro, M., Kitagawa, G., 1986. Akaike Information Criterion Statistics. Reidel, Dordrecht.
Shimodaira, H., Hasegawa, M., 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16, 1114–1116.
Stock, D.W., Quattro, J.M., Whitt, G.S., Powers, D.A., 1997. Lactate dehydrogenase (LDH) gene duplication during chordate evolution: the cDNA sequence of the LDH of the tunicate Styela plicata. Mol. Biol. Evol. 14, 1273–1284.
Storer, R.W., 1971. Classification of birds. In: Farner, D.S., King, J.R. (Eds.), Avian Biology vol. 1. Academic Press, New York, pp. 1–18.
Strimmer, K., von Haeseler, A., 1996. Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13, 964–969.
Sullivan, J., Swofford, D., 1997. Are guinea pigs rodents? The importance of adequate models in molecular phylogenetics. J. Mammal. Evol. 4, 77–86.
Tzeng, C.-S., Hui, C.-F., Shen, S.-C., Huang, P., 1992. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates. Nucl. Acids Res. 20, 4853–4858.
van Tuinen, M., Sibley, C., Hedges, S., 2000. The early history of modern birds inferred from DNA sequences of nuclear and mitochondrial ribosomal genes. Mol. Biol. Evol. 17, 451–457.
Wilkinson, M., Thorley, J., Benton, M., 1997. Uncertain turtle relationships. Nature 387, 466–466.
Yang, Z., 1996. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11, 367–372.
Yang, Z., 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13, 555–556.
Zardoya, R., Meyer, A., 1998. Complete mitochondrial genome suggests diapsid affinities of turtles. Proc. Natl. Acad. Sci. USA 95, 14 226–14 231.
Zardoya, R., Garrido-Pertierra, A., Bautista, J., 1995. The complete nucleotide sequence of mitochondrial DNA genome of the rainbow trout, Oncorhynchus mykiss. J. Mol. Evol. 41, 942–951.
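
Supplementary sketch. The total-evidence evaluation described in the Discussion (summing the estimated log-likelihoods of individual genes for each tree, then comparing the totals) can be illustrated with a short Python sketch. The function names and the per-gene numbers in the usage note are hypothetical and are not taken from the paper. Two points are assumptions of this sketch rather than statements of the authors' procedure: per-gene standard errors are combined by adding variances (which follows from treating genes as independent), and the P value is computed from a two-sided standard normal approximation to difference/SE. That approximation happens to reproduce, to two decimals, the P values quoted in the text for 7.1±6.4, 19.2±8.9 and 15.0±7.6.

```python
import math


def total_log_likelihood(per_gene_lnl):
    """Sum per-gene log-likelihoods for each tree.

    per_gene_lnl is a list of dicts, one per gene, mapping a tree label
    to that gene's log-likelihood for the tree. Genes are treated as
    independent, so their log-likelihoods simply add.
    """
    trees = per_gene_lnl[0].keys()
    return {tree: sum(gene[tree] for gene in per_gene_lnl) for tree in trees}


def combine_differences(diffs_with_se):
    """Combine per-gene (difference, SE) pairs into a total difference and SE.

    Assumption of this sketch: genes are independent, so the differences
    add and the variances (SE squared) add.
    """
    total_diff = sum(diff for diff, _ in diffs_with_se)
    total_se = math.sqrt(sum(se * se for _, se in diffs_with_se))
    return total_diff, total_se


def two_sided_p(diff, se):
    """Two-sided P value for diff/se under a standard normal approximation."""
    z = abs(diff) / se
    return math.erfc(z / math.sqrt(2.0))
```

For example, `two_sided_p(7.1, 6.4)` rounds to 0.27 and `two_sided_p(15.0, 7.6)` rounds to 0.05. With hypothetical per-gene values, `total_log_likelihood([{"Tree-2": -100.0, "Tree-3": -101.5}, {"Tree-2": -200.0, "Tree-3": -199.0}])` gives totals of -300.0 for Tree-2 and -300.5 for Tree-3, showing how genes that individually favor different trees are weighed together.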