GEODIS 2.0 DOCUMENTATION


1999-2000 David Posada and Alan Templeton
Contact: David Posada, Department of Zoology, 574 WIDB, Provo, UT 8460-555, USA
Fax: (801) 78 74
e-mail: dp47@email.byu.edu

1. INTRODUCTION

GeoDis is a program written in C and Java (two separate programs that implement the same calculations) that performs the nested cladistic analysis developed by Templeton et al. (1987). Its input is the description of a nested cladogram (Templeton & Sing 1993) estimated from RFLPs or DNA sequences. The theory and applications are described elsewhere (see recommended reading).

The first step is to estimate a cladogram and define a nested structure. Cladogram estimation is described in Templeton et al. (1992); the nesting rules are described elsewhere and extended in Crandall (1996). We are currently developing software for cladogram estimation. Meanwhile, you can find some tools to help you build the cladogram at http://bioag.byu.edu/zoology/crandall_lab/programs.htm. Outgroup probabilities (Castelloe & Templeton 1994) can also be included in the analysis.

Here is a typical nested cladogram:

[Figure: nested cladogram]

This cladogram consists of 1 individuals corresponding to 15 haplotypes. The nested cladogram is described below in the input file for GeoDis.

2. INPUT FILE

The first line of the file is the name of the data set being analyzed. After that, the population information is given.

2.1 Populations

Populations can be described by their coordinates and sample size. However, for riparian or coastal species, distances are not adequately measured by geographical coordinates alone, and a matrix of pairwise distances among the locations better describes the geographical distribution in these one-dimensional habitats.

2.1.1 Coordinates (2-dimensions)

2.1.1.1) Degrees, minutes and seconds

Latitude and longitude can be specified in the standard notation of degrees, minutes and seconds, followed by the letter N (North) or S (South) for latitude and E (East) or W (West) for longitude. For example:

    45 00 N 4 56 78 E

2.1.1.2) Decimal degrees

Latitude and longitude can also be specified as decimal degrees. In this case latitude is expressed as 0-90 degrees (North {+} and South {-}), while longitude is expressed as 0-180 degrees (East {+} and West {-}).

For each population the format is:

Line 1: the population number and name, for example:

    1 Green Mountain

Line 2: the sample size, the latitude and the longitude, for example:

    7 60 01 N 15 0 4 E

or

    7 60.5 15.41

2.1.2 User-defined population pairwise distances (1-dimension)

This information is specified as a lower triangle matrix without a diagonal (the diagonal would consist of zeroes). The number of populations (i.e. the dimension of the matrix) is specified above the matrix. The population number, name and size are specified on each line. The distance can be specified in any unit.
A matrix for 5 populations would look like:

    5
    1 Pop-1 name Pop-1 size
    2 Pop-2 name Pop-2 size distance 2-1
    3 Pop-3 name Pop-3 size distance 3-1 distance 3-2
    4 Pop-4 name Pop-4 size distance 4-1 distance 4-2 distance 4-3
    5 Pop-5 name Pop-5 size distance 5-1 distance 5-2 distance 5-3 distance 5-4
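The lower triangle layout can be sketched in Python as follows. This is only an illustration of the layout described above, not part of GeoDis: the function name `format_distance_block`, the population names and the distances are all hypothetical, and it assumes that fields are separated by single spaces.

```python
def format_distance_block(populations, distances):
    """Return the lines of a user-defined distance block.

    populations: list of (name, sample_size), in population order.
    distances:   dict mapping (i, j) with i > j (1-based population
                 numbers) to the distance between populations i and j.
    Assumption: fields are separated by single spaces.
    """
    # The number of populations goes above the matrix.
    lines = [str(len(populations))]
    for i, (name, size) in enumerate(populations, start=1):
        fields = [str(i), name, str(size)]
        # Row i holds the distances to populations 1 .. i-1 (no diagonal).
        fields += [str(distances[(i, j)]) for j in range(1, i)]
        lines.append(" ".join(fields))
    return lines


# Hypothetical river populations and distances (any single unit).
pops = [("Upstream", 7), ("Midstream", 6), ("Downstream", 8)]
dist = {(2, 1): 12.5, (3, 1): 30.0, (3, 2): 17.5}
block = format_distance_block(pops, dist)
print("\n".join(block))
```

Keying the distances by (i, j) with i > j mirrors the "distance i-j" labels in the matrix above, so a missing pair raises a KeyError instead of silently producing a short row.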

2.2 Clades

The next step in the input file is the description of the nested cladogram. Clades without geographical or genetic variation (e.g. 1-8) are not included in the analysis. Clades at one level are subclades at the next one (e.g., clade 1-5 is a subclade in the nesting clade 2-1). 0-step clades are haplotypes. The information is specified using the nesting clade as the unit: for each nesting clade, the composition of the clades nested within it is described. The clades nested within a nesting clade are simply called clades. Hence the specification of the cladogram starts at the 1-step level.

Each nesting clade follows this format:

Line 1: name of the nesting clade, for example Clade 1-1.
Line 2: number of clades nested within this nesting clade.
Line 3: names of the clades nested within this nesting clade. At the 1-step level, the clades nested within are haplotypes, which can be given names such as I, II, III, ... At higher nesting levels (2-step, 3-step, 4-step, ..., Total Cladogram), the names of these clades would be something like Clade 1-2, Clade 2-3, ...
Line 4: for each clade, its topological position (tip = 1; interior = 0).
Line 5: number of populations represented in the nesting clade.
Line 6: the populations, specified by their numbers.
Line 7: first line of the observation matrix. The number of rows in this matrix corresponds to the number of clades specified in line 2, while the number of columns corresponds to the number of locations specified in line 5. For each row, starting with the first clade (following the order specified in line 3), the number of individuals or copies of the clade is specified for each population.
Line (6 + number in line 2): last line of the observation matrix.

This structure is repeated for each nesting clade.
After the last nesting clade (the total cladogram), the word "END" on the next line marks the end of the input file.

2.2.1 Outgroup weights

Outgroup probabilities for each clade can be included in the analysis (see Castelloe and Templeton 1994). If so, they must be specified for all the clades. The outgroup weights are given for each clade as an extra line after line 4.

Line 4': for each clade, the corresponding outgroup probability.
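The line-by-line layout of one nesting clade, including the optional line 4', can be sketched in Python as follows. This is a minimal sketch of the format as described above, not code from GeoDis: the function name `format_nesting_clade` and all the data values are hypothetical, and space-separated fields are an assumption.

```python
def format_nesting_clade(name, clades, positions, populations, counts,
                         outgroup=None):
    """Return the input-file lines for one nesting clade.

    clades:      names of the clades nested within (line 3)
    positions:   1 for tip, 0 for interior, one per clade (line 4)
    populations: numbers of the populations represented (line 6)
    counts:      observation matrix, one row per clade and one column
                 per population (lines 7 onwards)
    outgroup:    optional outgroup probabilities, one per clade (line 4')
    """
    if not (len(clades) == len(positions) == len(counts)):
        raise ValueError("need one position and one matrix row per clade")
    if any(len(row) != len(populations) for row in counts):
        raise ValueError("need one matrix column per population")
    if outgroup is not None and len(outgroup) != len(clades):
        raise ValueError("need one outgroup probability per clade")

    lines = [
        name,                                  # line 1
        str(len(clades)),                      # line 2
        " ".join(clades),                      # line 3
        " ".join(str(p) for p in positions),   # line 4
    ]
    if outgroup is not None:
        lines.append(" ".join(str(w) for w in outgroup))  # line 4'
    lines.append(str(len(populations)))                   # line 5
    lines.append(" ".join(str(p) for p in populations))   # line 6
    for row in counts:                                    # lines 7 onwards
        lines.append(" ".join(str(n) for n in row))
    return lines


# Hypothetical 1-step nesting clade: three haplotypes in two populations.
plain = format_nesting_clade(
    "Clade 1-1", ["I", "II", "III"], [0, 1, 1], [1, 2],
    [[4, 1], [0, 2], [3, 0]],
)
weighted = format_nesting_clade(
    "Clade 1-1", ["I", "II", "III"], [0, 1, 1], [1, 2],
    [[4, 1], [0, 2], [3, 0]], outgroup=[0.8, 0.1, 0.1],
)
print("\n".join(weighted))
```

Without outgroup weights the block has 6 + (number of clades) lines, which matches the "Line (6 + number in line 2)" rule above; with weights it has one more.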

3. RUNNING GeoDis

To run GeoDis, an input file must be specified. If an output file is not specified, the results are echoed to the screen. The C version prompts the user for all the needed information. In the Java version, the appropriate checkboxes must be selected.

Number of permutations

A minimum of 1000 permutations is recommended for a 5% level of statistical significance.

4. GeoDis OUTPUT

The output of GeoDis is saved to a file with the same name as the input file plus the extension .out. The value of each statistic is reported for each nesting clade and its nested clades at each level. Two probabilities are reported, corresponding to significantly small (P <=) and significantly large (P >=) values of the test statistic. Users are strongly encouraged to follow the reference key in Templeton et al. (1995) for a consistent interpretation of the output.
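The two probabilities can be read as one-sided permutation probabilities. The Python sketch below shows one common way to compute them from a list of permuted statistics. It is an assumption about the general technique, not a description of GeoDis internals: the function name `permutation_p_values` is hypothetical, and this document does not state how the statistics are calculated or whether the observed value is counted among the permutations.

```python
def permutation_p_values(observed, permuted):
    """Return (P <=, P >=) for an observed statistic.

    P <= is the fraction of permuted statistics that are less than or
    equal to the observed value; P >= is the fraction that are greater
    than or equal to it. Assumption: the observed value is not added
    to the permuted set.
    """
    n = len(permuted)
    if n == 0:
        raise ValueError("at least one permutation is required")
    p_small = sum(1 for s in permuted if s <= observed) / n
    p_large = sum(1 for s in permuted if s >= observed) / n
    return p_small, p_large


# Toy values, not GeoDis output.
p_small, p_large = permutation_p_values(2.0, [1.0, 2.0, 3.0, 4.0])
print(p_small, p_large)  # prints "0.5 0.75"
```

Under this reading, 1000 permutations give probabilities in steps of 0.001, so the 5% threshold corresponds to 50 of the permuted values.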

5. INPUT FILE EXAMPLES

1) With DMS coordinates and without outgroup weights

    Hallucigenia mtdna       // Name of the data set
                             // Number of populations
    1 Green Mountain         // Population number and name
    7 15 41 1 N 6 1 E        // Sample size, latitude and longitude
    Blue Mountain
    6 17 16 1 N 61 45 00 E
    Red Mountain
    8 01 5 N 66 00 00 E
    5                        // number of clades in the file
    Clade 1-                 // name of the nested clade
    6                        // number of subclades included in the nested clade
    II III IV V VI VII       // name of subclades in the nested clade
    1 1                      // position of each subclade: tip(1) or interior(0)
                             // number of populations in the nested clade
    1                        // number of each population represented in the nested clade
    0 0                      // number of individuals in subclade II for each population
    0                        // number of individuals in subclade III for each population
    0                        // number of individuals in subclade IV for each population
    0                        // number of individuals in subclade V for each population
    0                        // number of individuals in subclade VI for each population
    1 1                      // number of individuals in subclade VII for each population
    Clade 1-4 IX X 1 1
    Clade -1 5 1-1 1-1- 1-4 1-5 1 1 1 1 0 4 4 0 0 0
    Clade - 1-6 1-7
    Clade - 1-8 1-9 1
    Total Cladogram -1 - - 1 1 6 5 6 1 1
    END
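Example 1 gives coordinates in degrees, minutes and seconds, while section 2.1.1.2 uses signed decimal degrees. The conversion between the two can be sketched in Python as follows. The function name `dms_to_decimal` and the sample values are hypothetical, and the sketch illustrates the sign convention stated in section 2.1.1.2 (North and East positive, South and West negative), not code taken from GeoDis.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to decimal degrees.

    North and East give positive values, South and West negative ones.
    Latitude must fall within 0-90 degrees and longitude within 0-180.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be N, S, E or W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180
    if not 0 <= value <= limit:
        raise ValueError("coordinate out of range")
    return -value if hemisphere in ("S", "W") else value


# Hypothetical coordinates.
print(dms_to_decimal(45, 30, 0, "N"))   # prints 45.5
print(dms_to_decimal(10, 15, 0, "W"))   # prints -10.25
```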

2) With user-defined distances and without outgroup weights

    Hallucigenia mtdna       // Name of the data set
                             // Number of populations
    1 Green 7                // Population number, name, sample size and distance (lower triangle matrix)
    Blue 6 765
    Red 8 4 56
    5                        // number of clades in the file
    Clade 1-                 // name of the nested clade
    6                        // number of subclades included in the nested clade
    II III IV V VI VII       // name of subclades in the nested clade
    1 1                      // position of each subclade: tip(1) or interior(0)
                             // number of populations in the nested clade
    1                        // number of each population represented in the nested clade
    0 0                      // number of individuals in subclade II for each population
    0                        // number of individuals in subclade III for each population
    0                        // number of individuals in subclade IV for each population
    0                        // number of individuals in subclade V for each population
    0                        // number of individuals in subclade VI for each population
    1 1                      // number of individuals in subclade VII for each population
    Clade 1-4 IX X 1 1
    Clade -1 5 1-1 1-1- 1-4 1-5 1 1 1 1 0 4 4 0 0 0
    Clade - 1-6 1-7
    Clade - 1-8 1-9 1
    Total Cladogram -1 - - 1 1 6 5 6 1 1
    END

3) With coordinates (decimal degrees) and with outgroup weights

    Hallucigenia mtdna       // Name of the data set
                             // Number of populations
    1 Green Mountain         // Population number and name
    7 15.41 60.5             // Sample size, latitude, and longitude
    Blue Mountain
    6 17.67 61.81
    Red Mountain
    8.01 65.59
    5                        // number of clades in the file
    Clade 1-                 // name of the nested clade
    6                        // number of subclades included in the nested clade
    II III IV V VI VII       // name of subclades in the nested clade
    1 1                      // position of each subclade: tip(1) or interior(0)
    0.80 0.0.0 0.10 0.06 0.01  // outgroup probabilities
                             // number of populations in the nested clade
    1                        // number of each population represented in the nested clade
    0 0                      // number of individuals in subclade II for each population
    0                        // number of individuals in subclade III for each population
    0                        // number of individuals in subclade IV for each population
    0                        // number of individuals in subclade V for each population
    0                        // number of individuals in subclade VI for each population
    1 1                      // number of individuals in subclade VII for each population
    Clade 1-4 IX X 0.9 0.1 1 1
    Clade -1 5 1-1 1-1- 1-4 1-5 1 1 1 0.75 0.05 0.05 0.10 0.05 1 0 4 4 0 0 0
    Clade - 1-6 1-7 0.09 0.91
    Clade - 1-8 1-9 0.05 0.95 1
    Total Cladogram -1 - - 1 0.0.0.98 1 6 5 6 1 1
    END

4) With user-defined distances and with outgroup weights

    Hallucigenia mtdna       // Name of the data set
                             // Number of populations
    1 Green 7                // Population number, name, sample size and distance (lower triangle matrix)
    Blue 6 765
    Red 8 4 56
    5                        // number of clades in the file
    Clade 1-                 // name of the nested clade
    6                        // number of subclades included in the nested clade
    II III IV V VI VII       // name of subclades in the nested clade
    1 1                      // position of each subclade: tip(1) or interior(0)
    0.80 0.0.0 0.10 0.06 0.01  // outgroup probabilities
                             // number of populations in the nested clade
    1                        // number of each population represented in the nested clade
    0 0                      // number of individuals in subclade II for each population
    0                        // number of individuals in subclade III for each population
    0                        // number of individuals in subclade IV for each population
    0                        // number of individuals in subclade V for each population
    0                        // number of individuals in subclade VI for each population
    1 1                      // number of individuals in subclade VII for each population
    Clade 1-4 IX X 0.9 0.1 1 1
    Clade -1 5 1-1 1-1- 1-4 1-5 1 1 1 0.75 0.05 0.05 0.10 0.05 1 0 4 4 0 0 0
    Clade - 1-6 1-7 0.09 0.91
    Clade - 1-8 1-9 0.05 0.95 1
    Total Cladogram -1 - - 1 0.0.0.98 1 6 5 6 1 1
    END

Recommended reading

The use of this program is pointless without an understanding of the methodology.

Castelloe J, Templeton AR (1994) Root probabilities for intraspecific gene trees under neutral coalescent theory. Molecular Phylogenetics and Evolution 3, 102-113.

Crandall KA (1996) Multiple interspecies transmissions of human and simian T-cell leukemia/lymphoma virus type I sequences. Molecular Biology and Evolution 13, 115-131.

Georgiadis N, Bischof L, Templeton A et al. (1994) Structure and history of African elephant populations: I. Eastern and Southern Africa. The Journal of Heredity 85, 100-104.

Hammer MF, Karafet T, Rasanayagam A et al. (1998) Out of Africa and back again: nested cladistic analysis of human Y chromosome variation. Molecular Biology and Evolution 15, 427-441.

Karafet TM, Zegura SL, Posukh O et al. (1999) Ancestral Asian source(s) of New World Y-chromosome founder haplotypes. American Journal of Human Genetics 64, 817-831.

Templeton AR (1998a) Human Races: A Genetic and Evolutionary Perspective. American Anthropologist 100, 632-650.

Templeton AR (1998b) Nested clade analyses of phylogeographic data: testing hypotheses about gene flow and population history. Molecular Ecology 7, 381-397.

Templeton AR (1998c) The role of molecular genetics in speciation studies. In Molecular Approaches to Ecology and Evolution (ed. De Salle R, Schierwater B), pp. 131-156. Birkhäuser-Verlag, Basel.

Templeton AR (1998d) Species and speciation: geography, population structure, ecology and gene trees. In Endless Forms: Species and Speciation (ed. Howard DJ, Berlocher SH), pp. 32-43. Oxford University Press, Oxford.

Templeton AR (1999) Using gene trees to infer species from testable null hypothesis: cohesion species in the Spalax ehrenbergi complex. In Evolutionary Theory and Processes: Modern Perspectives, Papers in Honour of Eviatar Nevo (ed. Wasser SP), pp. 171-192. Kluwer Academic, Dordrecht.

Templeton AR, Boerwinkle E, Sing CF (1987) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila. Genetics 117, 343-351.

Templeton AR, Crandall KA, Sing CF (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132, 619-633.

Templeton AR, Sing CF (1993) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. IV. Nested analyses with cladogram uncertainty and recombination. Genetics 134, 659-669.

David Posada June 99