Chinese Science Bulletin, Vol. 48, No. 19, October 2003, 2050–2054

Complete mitochondrial DNA sequence of Chinese alligator, Alligator sinensis, and phylogeny of crocodiles

WU Xiaobing 1,3, WANG Yiquan 1,2, ZHOU Kaiya 1, ZHU Weiquan 1, NIE Jishan 4 & WANG Chaolin 4
1. College of Life Sciences, Nanjing Normal University, Nanjing 210097, China;
2. School of Life Sciences, Xiamen University, Xiamen 361005, China;
3. College of Life Sciences, Anhui Normal University, Wuhu 241000, China;
4. Anhui Research Center for Chinese Alligator Reproduction, Xuanzhou 242034, China
Correspondence should be addressed to Wang Yiquan (e-mail: wangyqnj@jlonline.com)

Abstract The 16746-nucleotide (nt) sequence of the mitochondrial DNA (mtDNA) of the Chinese alligator, Alligator sinensis, was determined using the Long-PCR and primer walking methods. As is typical in vertebrates, the mtDNA encodes 13 proteins, 2 rRNAs and 22 tRNAs, and contains a noncoding control region. The base composition is 29.43% A, 24.59% T, 14.86% G and 31.12% C. The gene arrangement differs from the common vertebrate arrangement but is similar to that of other crocodilians. DNA sequence data from the 12S rRNA, 16S rRNA and protein-coding genes, and from the combined sequences, were used to reconstruct the phylogeny of reptiles with the MP and ML methods. With this large data set and an appropriate range of outgroup taxa, we demonstrate that, among the three crocodilian species analysed, the Chinese alligator is most closely related to the American alligator, which supports the traditional view. According to the branch lengths of the ML tree from the combined data set, the primary divergence between the genera Alligator and Caiman was dated at about 74.9 Ma, and the split between the Chinese alligator and the American alligator at about 50.9 Ma.

Keywords: Chinese alligator, mitochondrial genome, complete sequence, phylogeny, divergence time.
DOI: 10.1360/03wc0076

Complete mitochondrial DNA (mtDNA) sequences have been determined for more than 100 chordate species since the first complete human mtDNA sequence was determined in 1981 [1], and they now cover all the classes of Chordata [2–8]. However, the sampling is strongly biased toward mammals, fishes and birds, with only a few amphibians and reptiles. To date, complete mtDNA sequences have been determined for only 8 reptile species: Alligator mississippiensis [4], Caiman crocodilus [9], Iguana iguana [9], Eumeces egregius [10], Dinodon semicarinatus [11], Chelonia mydas [10], Chrysemys picta [12] and Pelomedusa subrufa [13].

Vertebrate mitochondrial DNA is a closed circular, double-stranded molecule. The genes are arranged very compactly, with no introns and few intergenic nucleotides. This small organelle genome normally encodes 2 ribosomal RNAs (srRNA and lrRNA), 22 tRNAs and 13 proteins, and contains a major noncoding region, the control region (or D-loop) [14]. Complete mtDNA sequences have been increasingly used to infer the phylogeny of vertebrates, and some of the resulting conclusions conflict with those drawn from morphology [4,9,10,15,16]. Regarding the systematic position of crocodilians, the evidence from complete mtDNA sequences supports a close relationship between crocodilians and birds [4,9,10], with turtles as the sister group of birds and crocodilians [9,10,13].

Alligator and Caiman are 2 genera of Alligatoridae. Alligator sinensis and A. mississippiensis are the only two living species of Alligator. Although several biologists have questioned its classification since the Chinese alligator was named, the species is still placed in the genus Alligator.
Because the skull of the Chinese alligator differs significantly from that of the American alligator and resembles that of Caiman [17,18], Deraniyagala proposed in 1947 that the Chinese alligator be placed in a new genus, Caigator [17]. Although he reaffirmed this proposal in a subsequent paper, it has not been accepted to date. Since then, many biologists have compared the 2 alligators in terms of histology [19–21], karyotype [22], and biochemistry and immunology [23], but there is still no consensus on this issue. Mo et al. [24] studied the evolutionary relationships of 4 crocodilians by DNA-DNA hybridization and suggested an evolutionary route of American alligator → Crocodylus porosus → Crocodylus siamensis → Chinese alligator, which poses a new challenge to the accepted relationship of the 2 alligators. In this report, we sequenced the mitochondrial genome of the Chinese alligator with the objective of establishing the origin and evolutionary relationships of the 2 living alligators based on complete mitochondrial genome sequences.

1 Materials and methods

Materials. Fresh liver from a female alligator was transported from the Anhui Research Center for Chinese Alligator Reproduction (ARCCAR) to the laboratory immersed in liquid nitrogen and was stored at −80°C until mtDNA isolation.

mtDNA isolation, Long-PCR, cloning and sequencing. Mitochondrial DNA was extracted from frozen liver following the procedure described by Arnason et al. [2]. A fragment of about 14500 bp was amplified from the mitochondrial DNA template by Long-PCR on a PTC-200™ thermal cycler (MJ Research) using primers L1091, 5′-AAACTGGGATTAGATACCCCACTAT-3′, and H15149, 5′-AAACTGCAGCCCCTCAGAATGATATTTGTCCTCA-3′ [25]. The enzyme for Long-PCR was LA-Taq polymerase (Takara Co.), and the PCR conditions followed the manufacturer's manual. After purification by DNA gel extraction, the Long-PCR product was cut with the restriction endonucleases PstI and SpeI, ligated into pGEM-3Zf(+) and pGEM-5Zf(+), and sequenced with the universal primers M13, M13 Reverse or T7. Large inserts were sequenced by primer walking. The regions not covered by clones were sequenced after PCR amplification with specifically designed primers. Both universal primers and many specific primers were used in the sequencing process. All sequencing was performed on an ABI 310 sequencer.

Data analysis. Gene arrangement was identified by comparison with the mtDNA sequences of the American alligator and Caiman crocodilus. tRNAs were additionally identified by constructing their cloverleaf structures. Amino acid sequences were translated using the DNAclub program (Chen, DNAclub, Inc.).

Phylogenetic analysis. The complete mtDNA sequence of the Chinese alligator was analysed together with sequences obtained from GenBank: Alligator mississippiensis (Y13113), Caiman crocodilus (AJ404872), Iguana iguana (AJ404872), Eumeces egregius (AB016606), Dinodon semicarinatus (AB008539), Chelonia mydas (AB012104), Chrysemys picta (AF069423), Pelomedusa subrufa (AF039066), Xenopus laevis (M10217) and 2 birds (Y12025, NC_001323). The 13 protein-coding genes and the 12S rRNA and 16S rRNA genes were considered for the combined analysis. Because NADH6 is encoded by the light strand and NADH4 in Caiman crocodilus is highly divergent from that in the other crocodilians, these two genes were excluded from subsequent calculations. Sequences were aligned with Clustal X (version 1.8) [26], and ambiguously aligned regions were removed manually.
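The assembly of the combined data set described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name, the taxon labels and the toy sequences are hypothetical, and the manual removal of ambiguously aligned regions is not modelled. Only the exclusion of NADH6 and NADH4 is taken from the text.

```python
from typing import Dict, Iterable

# Genes excluded from the combined analysis, as stated in the text.
EXCLUDED_GENES = {"NADH6", "NADH4"}


def concatenate_alignments(
    alignments: Dict[str, Dict[str, str]],
    excluded: Iterable[str] = EXCLUDED_GENES,
) -> Dict[str, str]:
    """Join per-gene alignments into one sequence per taxon.

    `alignments` maps gene name -> {taxon name -> aligned sequence}.
    Every retained gene must contain the same taxa, and all sequences
    within a gene must have the same (aligned) length.
    """
    excluded = set(excluded)
    kept = [gene for gene in alignments if gene not in excluded]
    if not kept:
        raise ValueError("no genes left after exclusion")
    taxa = sorted(alignments[kept[0]])
    combined = {taxon: [] for taxon in taxa}
    for gene in kept:
        block = alignments[gene]
        if sorted(block) != taxa:
            raise ValueError(f"taxon set differs in gene {gene}")
        if len({len(seq) for seq in block.values()}) != 1:
            raise ValueError(f"gene {gene} is not aligned")
        for taxon in taxa:
            combined[taxon].append(block[taxon])
    return {taxon: "".join(parts) for taxon, parts in combined.items()}


# Toy input: HYPOTHETICAL sequences, for illustration only.
toy = {
    "12S": {"A_sinensis": "ACGT-A", "A_mississippiensis": "ACGTTA"},
    "NADH4": {"A_sinensis": "GGG", "A_mississippiensis": "GGA"},
    "COI": {"A_sinensis": "ATGC", "A_mississippiensis": "ATGA"},
}
combined = concatenate_alignments(toy)
```

Genes are joined in the order in which they appear in the input mapping, so the same column positions refer to the same gene in every taxon; gap characters are carried through unchanged and can later be treated as missing data.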
Xenopus laevis was selected as the outgroup. The maximum parsimony (MP) method was applied to the 12S rRNA, 16S rRNA and protein-coding gene data sets and to the combined data, using a heuristic search with TBR branch swapping as implemented in PAUP 4.0 [27] to find the most parsimonious tree. Each nucleotide was treated as an unordered character with four alternative states, and gaps were treated as missing data in all analyses. The combined data set was also analyzed by maximum likelihood (ML) as implemented in PUZZLE 5.0 [28]. Different weighting schemes were used; the HKY model of evolution was selected, and the transition/transversion ratio and the nucleotide frequencies were estimated with PAUP 4.0. The robustness of the MP and ML trees was tested by bootstrapping with 1000 replicates.

2 Results and discussion

General features of the genomic organization of Chinese alligator mtDNA. The mitochondrial DNA of the Chinese alligator is a closed circular, double-stranded molecule of 16746 bp (Accession No. AF511507). It encodes 2 ribosomal RNAs (srRNA and lrRNA), 22 tRNAs and 13 protein subunits. The mtDNA of the Chinese alligator is similar in length to that of the American alligator and shorter than that of Caiman crocodilus, because the control region of Caiman crocodilus contains longer repeat units [9,10]. The gene arrangement of the Chinese alligator mitochondrial genome conforms to that of the American alligator and Caiman crocodilus. As in the other crocodilian mitochondrial genomes studied so far, the tRNA for phenylalanine is located on the 5′ side of the control region [4,9]. A remarkable feature of Chinese alligator mtDNA is its highly compact organization. Most genes either abut directly or are separated by very few nucleotides. In addition, there are some overlaps between protein-coding genes, or between a tRNA gene and a protein-coding gene.
Two of these overlaps are very long (COI–tRNA-Ser, 9 nt; ATP8–ATP6, 25 nt). The overall base composition of the L-strand is 29.43% A, 24.59% T, 14.86% G and 31.12% C. The GC content of the whole mtDNA is 45.98%; that of the 12S rRNA gene is the highest, at 49.20%, and that of the protein-coding genes is 46.23%. The GC skew of Chinese alligator mtDNA is −0.35, which is slightly higher than that of other vertebrates.

The length of the mtDNA control region varies among vertebrates, and especially among crocodilians. Of the 2 crocodilians studied previously, Caiman crocodilus has a control region of 1992 bp, which is 1004 bp longer than that of the American alligator (988 bp). As far as we know, it is the longest control region in vertebrates, owing to the insertion of tandemly repeated sequences [9]. The control region of the Chinese alligator mitochondrial genome is 1045 bp long and is located between the tRNA-Phe and srRNA genes. It contains a total of 16 repeats of 4 repeat units (5′-TTATAGGGCCATAAATTTATA-3′, 21 bp, 4 repeats; 5′-TTATAGGGCCATAAATTTACAT-3′, 22 bp, 6 repeats; 5′-TTATAGAGCCATAAATTTATA-3′, 21 bp, 3 repeats; 5′-CCATAAACTTATATTATAGAA-3′, 21 bp, 3 repeats). The first three repeat units are very similar and may have arisen by mutation of one basal unit. In addition, a simple poly(C)16 sequence is inserted at site 516 of the Chinese alligator control region.
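The composition figures quoted above can be cross-checked with a short Python sketch. It assumes the usual definition of GC skew, (G − C)/(G + C), which the text does not state explicitly; the function and variable names are illustrative. With the reported L-strand composition this definition gives a negative skew of about −0.35.

```python
def gc_content(comp):
    """GC content in percent from a base-composition mapping (values in %)."""
    return 100.0 * (comp["G"] + comp["C"]) / sum(comp.values())


def gc_skew(comp):
    """GC skew, assumed here to be (G - C) / (G + C)."""
    return (comp["G"] - comp["C"]) / (comp["G"] + comp["C"])


# L-strand base composition reported in the text (percent).
l_strand = {"A": 29.43, "T": 24.59, "G": 14.86, "C": 31.12}

# Control-region repeat units and copy numbers reported in the text.
repeat_units = [
    ("TTATAGGGCCATAAATTTATA", 4),
    ("TTATAGGGCCATAAATTTACAT", 6),
    ("TTATAGAGCCATAAATTTATA", 3),
    ("CCATAAACTTATATTATAGAA", 3),
]

gc_percent = gc_content(l_strand)
skew = gc_skew(l_strand)
unit_lengths = [len(seq) for seq, _ in repeat_units]
total_repeats = sum(copies for _, copies in repeat_units)

print(f"GC content: {gc_percent:.2f}%")
print(f"GC skew: {skew:.2f}")
print(f"repeat unit lengths: {unit_lengths}, total repeats: {total_repeats}")
```

The same check confirms that the four repeat units have the stated lengths (21, 22, 21 and 21 bp) and that their copy numbers add up to the 16 repeats reported.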
Table 1 The results of phylogenetic analyses using the MP method for different data sets

Comparison                              12S rDNA    16S rDNA    11 protein-coding genes    Combined data
Number of sites                         1054        1765        11086                      12540
Number of variable sites                215         289         1554                       1853
Number of parsimony-informative sites   482         811         5866                       6484
Number of MP trees                      2           2           1                          1
Tree length                             1803        2915        22687                      25111
Consistency index (CI)                  0.6040      0.6336      0.5627                     0.5701
Retention index (RI)                    0.4550      0.4860      0.3935                     0.4050
Rescaled consistency index (RC)         0.2748      0.3080      0.2214                     0.2309

Fig. 1. Reptile maximum-likelihood tree based on the combined 12S rRNA + 16S rRNA + protein-coding gene sequences, reconstructed with the PUZZLE 5.0 program. The values above the branches are bootstrap values (1000 replicates), and the values below the branches are maximum-likelihood branch lengths for the external and internal branches.

Phylogenetic trees. Using Xenopus laevis as the outgroup, the results of the phylogenetic analyses with the MP method for the different data sets are listed in Table 1. Fig. 1 shows that the tree resulting from the ML analysis of the combined data is very similar to the MP strict consensus trees. The distance from the Chinese alligator to Caiman crocodilus is much greater than that to the American alligator, indicating a closer evolutionary relationship between the Chinese alligator and the American alligator. The bootstrap support for this relationship in the ML tree is 99%.

Classification position of the Chinese alligator. The classification of the Chinese alligator and the relationship between the 2 alligators have been questioned for a century. After Boulenger found some morphological similarities between the Chinese alligator and Caiman crocodilus, Deraniyagala [17] found 10 skull characters shared between them, but only 2 skull characters shared between the Chinese alligator and the American alligator. He therefore suggested that the Chinese alligator should be placed in a new genus, Caigator, but his view was ignored.
Other morphological data also support this view. Chen [29] compared measurements of the Chinese alligator with those of other living crocodilians such as Caiman crocodilus, Crocodylus porosus and Crocodylus cataphractus, and also found a closer relationship between the Chinese alligator and Caiman crocodilus. The DNA-DNA hybridization renaturation kinetics data indicated that the Chinese alligator was located at the base of the clade of 3 crocodilians [24].

To avoid the potential risks and inconsistencies of a single method or data set, the phylogenetic reconstructions were based on two different approaches, ML and MP, applied to the different sequence data sets. Fig. 1 shows the ML tree based on the combined data as constructed by PUZZLE 5.0; the branch lengths are given below the branches. All the phylogenetic trees from the 12S rRNA, 16S rRNA, protein-coding genes and combined data consistently clustered the Chinese alligator with the American alligator, and not with Caiman crocodilus (Fig. 1). The bootstrap support for the lineage of the 2 alligators with the MP method is 65%, 72%, 100% and 100% for the 12S rRNA, 16S rRNA, protein-coding genes and combined data, respectively, and 99% with the ML method for the combined data. Overall, these results strongly support the affinity of the 2 alligators, reject a closer relationship between the Chinese alligator and Caiman crocodilus, and also refute the proposed evolutionary route American alligator → Crocodylus porosus → Crocodylus siamensis → Chinese alligator.

Fig. 2. Evolution of crocodiles inferred from the molecular phylogeny and paleontological records.

Dating of the Alligator divergence. Because evolutionary rates differ among groups, divergence times can only be estimated roughly. The ML branch lengths of Fig. 1 were used to estimate the divergence times among groups. Janke and Arnason [4] suggested that the divergence between the avian and crocodilian lineages took place at ~254 Ma. On the basis of an origin of crocodilians at 254 Ma and assuming a constant rate along the crocodilian lineage, Janke et al. [9] estimated that the alligator and the caiman diverged ca. 80 Ma ago. Following the estimation method of Janke and Arnason [4], the split between the Chinese alligator and the American alligator was placed at ~50.9 Ma and the divergence between the genera Alligator and Caiman at ~74.9 Ma. Fig. 2 shows the evolution of crocodiles inferred from the molecular phylogeny of this study and paleontological records.
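The arithmetic behind a constant-rate date estimate of this kind can be illustrated as follows. This is a minimal sketch of one simple strict-clock scaling and is not necessarily the exact procedure of Janke and Arnason [4], which is not spelled out in the text. The only value taken from the text is the 254 Ma calibration; the path lengths are hypothetical numbers invented for illustration and are not read from Fig. 1, and the function names are assumptions.

```python
from statistics import mean
from typing import Sequence


def node_depth(tip_path_lengths: Sequence[float]) -> float:
    """Mean distance (substitutions per site) from a node to its descendant tips."""
    if not tip_path_lengths:
        raise ValueError("at least one path length is required")
    return mean(tip_path_lengths)


def clock_age(depth: float, calib_depth: float, calib_age_ma: float) -> float:
    """Age of a node (Ma) under a strict clock, scaled to a calibration node."""
    if calib_depth <= 0:
        raise ValueError("calibration depth must be positive")
    return calib_age_ma * depth / calib_depth


CALIBRATION_AGE_MA = 254.0  # bird/crocodilian divergence, from the text

# HYPOTHETICAL path lengths, invented for illustration; not taken from Fig. 1.
calib_depth = node_depth([0.52, 0.48])  # calibration node to its tips
query_depth = node_depth([0.09, 0.11])  # a younger node to its tips

age = clock_age(query_depth, calib_depth, CALIBRATION_AGE_MA)
print(f"estimated age: {age:.1f} Ma")
```

Averaging the paths from a node to its tips is one way to smooth modest rate differences between sister lineages; when rates differ strongly among groups, as the text cautions, such estimates remain rough.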
According to fossil records, Alligatoridae first appeared in the late Cretaceous of the Mesozoic (100–140 Ma) in North America [30,31]. The American alligator and Caiman crocodilus represent the 2 genera of Alligatoridae. The divergence between them took place at 80 Ma [9], which is consistent with the geological epoch inferred from fossils. However, no fossil of Alligatoridae has been discovered in strata of the same age in Eastern Asia. In the Paleocene strata of China, by contrast, a fossil of this family was discovered that is morphologically very similar to the living Chinese alligator [31]. This indicates that the Alligatoridae of Eastern Asia may have spread from North America. According to the estimates of this study, the genus Alligator arose in the late Cretaceous (74.9 Ma), and the divergence between the 2 living alligators occurred in the late Paleocene (50.9 Ma). The divergence time of the 2 living alligators estimated from mitochondrial genome sequence data is earlier than that estimated from DNA-DNA renaturation kinetics data (45 Ma) [24]. However, the earliest known fossil of the genus Alligator dates from the Oligocene (25–37 Ma) [30], which is later than the divergence time estimated from molecular data.

Acknowledgements The authors would like to thank Prof. Chen Bihui for his good advice. This work was supported by the Ministry of Education of China (Grant No. GG-180-2100-2403-1740) and the SRF for ROCS of the Ministry of Education of China.

References
1. Anderson, S., Bankier, A. T., Barrell, B. G. et al., Sequence and organization of the human mitochondrial genome, Nature, 1981, 290: 457–465.
2. Arnason, U., Gullberg, A., Widegren, B., The complete nucleotide sequence of the mitochondrial DNA of the fin whale, Balaenoptera physalus, J. Mol. Evol., 1991, 33: 556–568.
3. Desjardins, P., Morais, R., Sequence and gene organization of the chicken mitochondrial genome: a novel gene order in higher vertebrates, J. Mol. Biol., 1990, 212: 599–634.
4. Janke, A., Arnason, U., The complete mitochondrial genome of Alligator mississippiensis and the separation between recent Archosauria (birds and crocodiles), Mol. Biol. Evol., 1997, 14: 1266–1272.
5. Sumida, M., Kanamori, Y., Kaneda, H. et al., Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the Japanese pond frog, Rana nigromaculata, Genes Genet. Syst., 2001, 76: 311–325.
6. Miya, M., Nishida, M., Organization of the mitochondrial genome of a deep-sea fish, Gonostoma gracile (Teleostei: Stomiiformes): First example of transfer RNA gene rearrangements in bony fishes, Mar. Biotechnol., 1999, 1: 416–426.
7. Boore, J. L., Daehler, L. L., Brown, W. M., Complete sequence, gene arrangement, and genetic code of mitochondrial DNA of the Cephalochordate, Branchiostoma floridae (Amphioxus), Mol. Biol. Evol., 1999, 16(3): 410–418.
8. Yokobori, S. I., Ueda, T., Feldmaier-Fuchs, G. et al., Complete DNA sequence of the mitochondrial genome of the ascidian, Halocynthia roretzi (Chordata, Urochordata), Genetics, 1999, 153: 1851–1862.
9. Janke, A., Erpenbeck, D., Nilsson, M. et al., The mitochondrial genomes of the iguana (Iguana iguana) and the caiman (Caiman crocodylus): implications for amniote phylogeny, Proc. R. Soc. Lond. B, 2001, 268: 623–631.
10. Kumazawa, Y., Nishida, M., Complete mitochondrial DNA sequences of the green turtle and blue-tailed mole skink: statistical evidence for archosaurian affinity of turtles, Mol. Biol. Evol., 1999, 16(6): 784–792.
11. Kumazawa, Y., Ota, H., Nishida, M. et al., The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions, Genetics, 1998, 150: 313–329.
12. Mindell, D. P., Sorenson, M. D., Dimcheff, D. E. et al., Interordinal relationships of birds and other reptiles based on whole mitochondrial genomes, Syst. Biol., 1999, 48: 138–152.
13. Zardoya, R., Meyer, A., Complete mitochondrial genome suggests diapsid affinities of turtles, Proc. Natl. Acad. Sci. USA, 1998, 95: 14226–14231.
14. Shadel, G. S., Clayton, D. A., Mitochondrial DNA maintenance in vertebrates, Annu. Rev. Biochem., 1997, 66: 409–435.
15. Xu, X., Gullberg, A., Arnason, U., The complete mitochondrial DNA (mtDNA) of the donkey and mtDNA comparisons among four closely related mammalian species-pairs, J. Mol. Evol., 1996, 43(5): 438–446.
16. Curole, J. P., Kocher, T. D., Mitogenomics: digging deeper with complete mitochondrial genomes, Trends Ecol. Evol., 1999, 14: 394–398.
17. Deraniyagala, P. E. P., A new genus for the Chinese alligator, Proceedings of the Third Annual Sessions of the Ceylon Association of Science, 1947, 2: 12.
18. Mook, C. C., Skull characters of Recent Crocodilia with notes on the affinities of the Recent genera, Bulletin of the American Museum of Natural History, 1921, 44(13): 123–268.
19. Chen, B. H., Jiang, X. L., Wang, C. L., Observation on the structure and development of derma gland of Chinese alligator, Acta Zool. Sin. (in Chinese), 1991, 1: 16–21.
20. Chen, B. H., Tang, J. Y., Research on the lingual glands of Chinese alligator, Acta Zool. Sin. (in Chinese), 1989, 1: 28–33.
21. Wu, X. B., Chen, B. H., Wang, C. L., Research on the histology of visual organ of Chinese alligator, Acta Zool. Sin. (in Chinese), 1993, 39(3): 244–250.
22. Cohen, M. M., Gans, C., The chromosomes of the order Crocodilia, Cytogenetics, 1970, 9: 81–105.
23. Densmore, L. D., Biochemical and immunological systematics of the order Crocodilia, in Evolutionary Biology, Vol. 16, New York: Plenum Press, 1983, 397–465.
24. Mo, X. Q., Zhao, T. J., Qin, P. C., The origin of Chinese alligator, Science in China, Ser. B (in Chinese), 1991, 10: 1047–1053.
25. Kocher, T. D., Thomas, W. K., Meyer, A. et al., Dynamics of mitochondrial evolution in animals: amplification and sequencing with conserved primers, Proc. Natl. Acad. Sci. USA, 1989, 86: 6196–6200.
26. Thompson, J. D., Gibson, T. J., Plewniak, F. et al., The CLUSTAL-X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools, Nucleic Acids Res., 1997, 25(24): 4876–4882.
27. Swofford, D. L., PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods), Version 4.08, Sinauer, Sunderland, MA, 2001.
28. Schmidt, H. A., Strimmer, K., Vingron, M. et al., Maximum likelihood analysis for nucleotide, amino acid, and two-state data, Version 5.0, 2000.
29. Chen, B. H., Research on the genus name of Chinese alligator, in Proceedings of International Herpetologic Conference in Huangshan of China (in Chinese), Beijing: Forestry Press, 1993, 18–21.
30. Buffetaut, E., The evolution of the crocodilians, Scientific American, 1979, 241(4): 130–144.
31. Xu, Q. Q., Huang, C. C., Some problems in evolution and distribution of Alligator, Vertebrata PalAsiatica (in Chinese), 1984, 22(1): 49–53.

(Received May 5, 2003; accepted July 25, 2003)