Optimizing Phylogenetic Supertrees Using Answer Set Programming

Laura Koponen (1), Emilia Oikarinen (1), Tomi Janhunen (1), and Laura Säilä (2)
(1) HIIT / Dept. Computer Science, Aalto University
(2) Dept. Geosciences and Geography, University of Helsinki
Aalto, Finland

Outline
- Introduction: the supertree problem
- ASP encodings: trees, quartets and projections
- Experiments: Felidae data
- Conclusions

The supertree problem
- Input: a set of overlapping, possibly conflicting phylogenetic trees (rooted, leaf-labeled)
- Output: a phylogenetic tree that covers all taxa from the input and reflects the relationships in the input as well as possible
- Several measures can be used
- The optimal tree is not necessarily unique

Solving the supertree problem
- Typically heuristic methods are used, e.g. Matrix Representation with Parsimony (MRP) [Baum, 1992; Ragan, 1992]
  - input trees are encoded into a binary matrix, and maximum parsimony analysis is then used to construct a tree
  - no guarantee of finding an optimal solution
  - large supertrees (hundreds of species) are still computationally challenging
- There exist earlier constraint-based approaches for related phylogeny reconstruction problems
  - cladistics-based approach using ASP [Brooks et al., 2007]
  - maximum parsimony using ASP [Kavanagh et al., 2006] and MIP [Sridhar et al., 2008]
  - maximum quartet consistency problem using ASP [Wu et al., 2007] and CP [Morgado & Marques-Silva, 2010]
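The matrix-encoding step of MRP can be sketched in a few lines. This is a minimal illustration of the standard Baum-Ragan coding and not the pipeline used to build the MRP supertrees discussed later; the nested-tuple tree format and the function names `clusters` and `mrp_matrix` are assumptions made for this example, and the all-zero outgroup row that MRP analyses usually add is omitted.

```python
def clusters(tree):
    """Return (leaf set, clusters) for a nested-tuple tree.

    A leaf is a string; an inner node is a tuple of subtrees.
    One cluster (the set of leaves below it) is recorded per inner node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)
    return leaves, found


def mrp_matrix(trees):
    """Baum-Ragan coding: one column per non-root inner node of each tree."""
    all_taxa = sorted(set().union(*(clusters(t)[0] for t in trees)))
    matrix = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        leaves, found = clusters(tree)
        for cluster in found:
            if cluster == leaves:  # the root cluster carries no information
                continue
            for taxon in all_taxa:
                if taxon not in leaves:
                    matrix[taxon].append("?")  # taxon missing from this tree
                else:
                    matrix[taxon].append("1" if taxon in cluster else "0")
    return {taxon: "".join(row) for taxon, row in matrix.items()}


# Two small, overlapping input trees (hypothetical taxa A to E).
trees = [
    ((("A", "B"), "C"), "D"),
    (("B", "C"), ("D", "E")),
]
for taxon, row in mrp_matrix(trees).items():
    print(taxon, row)
```

The resulting matrix would then be handed to a maximum parsimony program; that search is the heuristic part and is not shown here.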

In this paper
- We solve the supertree problem using answer set programming (ASP)
  - a rule-based, expressive language for knowledge representation with efficient solvers (moreover, it is possible to enumerate all optimal solutions)
- We present two alternative encodings (with different optimization criteria), solving
  - the maximum quartet consistency problem
  - the maximum projection consistency problem
- We apply the encodings to real data (Felidae) and compare our supertrees to recent supertrees obtained using the heuristic MRP method

Supertree problem: practical considerations
- How to resolve conflicts in the input trees?
- How to localize the information in trees?
[Figure: two example input trees. Leaves of the first tree: outgroup, Felis catus, Neofelis nebulosa, Panthera tigris, Panthera pardus, Panthera leo, Panthera spelaea. Leaves of the second tree: outgroup, Felis catus, Neofelis diardi, Neofelis nebulosa, Panthera pardus, Panthera uncia, Panthera leo, Panthera onca, Panthera tigris.]
- The search space (number of rooted leaf-labeled trees) grows at least exponentially with the number of taxa

Taxa   Different trees
1      1
2      1
3      4
4      26
5      236
...    ...
10     282 137 824
...    ...
15     6 353 726 042 486 112
...    ...
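The growth of the search space can be reproduced with a short recurrence. The sketch below assumes that "rooted leaf-labeled trees" means trees in which every inner node has at least two children (non-binary nodes allowed); the function name `count_rooted_trees` is chosen for this example only.

```python
from math import comb


def count_rooted_trees(max_taxa):
    """Return [a(1), ..., a(max_taxa)], where a(n) counts rooted trees with
    n labeled leaves in which every inner node has at least two children."""
    a = [0, 1]  # a[0] is unused padding; a[1] = 1 (a single leaf)
    for n in range(2, max_taxa + 1):
        total = 0
        for j in range(1, n):
            # j = number of leaves in the root subtree that contains leaf n.
            # The other m = n - j leaves hang below the root as one subtree
            # (a[m] ways) or as several subtrees (again a[m] ways: take a
            # tree on those leaves and remove its root). For m = 1 there is
            # only the single-leaf option.
            m = n - j
            rest = 1 if m == 1 else 2 * a[m]
            total += comb(n - 1, j - 1) * a[j] * rest
        a.append(total)
    return a[1:]


counts = count_rooted_trees(10)
for n in (1, 2, 3, 4, 5, 10):
    print(n, counts[n - 1])
```

For up to ten taxa this reproduces the values in the table above.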

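Representing input trees with substructures: quartets
[Figure: example tree with leaves I, J, K, L, M, N and one of its quartets, on leaves I, J, K, L]
- Quartet: an unrooted tree with four leaf nodes
- A tree with $n$ leaf nodes has $\binom{n}{4}$ quartets; a 50-taxa tree has 230 300 quartets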
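As an illustration of how quartets can be read off a rooted input tree, the sketch below lists every four-leaf subset that the tree resolves into two pairs. The nested-tuple tree format, the example topology, and the names `clusters` and `quartets` are assumptions for this example; they are not taken from the encodings.

```python
from itertools import combinations
from math import comb


def clusters(tree):
    """Return (leaves, clusters): the leaf set of a nested-tuple tree and
    the leaf sets found below each of its inner nodes."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)
    return leaves, found


def quartets(tree):
    """Return the resolved quartets of a rooted tree as sets of two pairs."""
    leaves, found = clusters(tree)
    result = set()
    for four in combinations(sorted(leaves), 4):
        for cluster in found:
            inside = frozenset(four) & cluster
            if len(inside) == 2:
                # Some subtree holds exactly two of the four leaves, so the
                # edge above it separates that pair from the other pair.
                outside = frozenset(four) - inside
                result.add(frozenset([inside, outside]))
    return result


# Hypothetical non-binary tree on the six leaves used in the figure.
tree = (("I", "J"), "K", ("L", "M"), "N")
print(len(quartets(tree)), "resolved quartets out of", comb(6, 4))
print(comb(50, 4))  # number of four-leaf subsets of a 50-taxa tree
```

In a non-binary tree some four-leaf subsets stay unresolved, which is why the first count is smaller than the binomial coefficient.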

Representing input trees with substructures: projections
[Figure: the example tree with leaves I, J, K, L, M, N and its projection onto leaves J, L, M, N]
- $2^n - 1$ different projections for a tree with $n$ leaf nodes; a 50-taxa tree has $1.13 \times 10^{15}$ projections
- To reduce the number, consider only subtree projections
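A projection can be pictured as restricting a tree to a subset of its leaves. The sketch below is an illustration under assumptions: the nested-tuple format, the example topology, and the names `project` and `subtrees` are hypothetical, and "subtree projections" is read here as the subtrees rooted at inner nodes.

```python
def project(tree, keep):
    """Restrict a nested-tuple tree to the leaves in `keep`.

    Returns None if no leaf survives; inner nodes left with a single
    child are suppressed so that the result is again a proper tree.
    """
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    children = [p for p in (project(c, keep) for c in tree) if p is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


def subtrees(tree):
    """All subtrees rooted at inner nodes, the root included."""
    if not isinstance(tree, tuple):
        return []
    found = [tree]
    for child in tree:
        found.extend(subtrees(child))
    return found


tree = ((("I", "J"), "K"), (("L", "M"), "N"))
print(project(tree, {"J", "L", "M", "N"}))
print(2 ** 6 - 1, "leaf subsets versus", len(subtrees(tree)), "subtrees")
print(2 ** 50 - 1)  # about 1.13e15 for a 50-taxa tree
```

Restricting attention to subtrees brings the number of candidates down from exponential in the number of leaves to at most one per inner node.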

Representing canonical trees
- Non-binary, rooted, leaf-labeled trees are encoded using the node/1 and edge/2 predicates
  - inner nodes (inner/1) have larger indices than leaf nodes (leaf/1)
  - edges are directed from larger indices to smaller ones
- Taxa are assigned to leaf nodes using a fixed alphabetical order (asgn/2)
- To further reduce symmetries, a canonical labeling for nodes is introduced
  - a generalization of the condition in [Brooks et al., 2007]
- The special taxon outgroup is placed as a child of the root
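The index conventions listed above can be checked on a small hand-written tree. The sketch mirrors the facts as plain Python sets; the function name `check_encoding`, the argument order of the `asgn` mapping, and the example tree are assumptions for illustration. It also assumes the root is the inner node with the largest index, and it does not cover the canonical labeling condition, whose details are not given here.

```python
def check_encoding(taxa, leaves, inner, edges, asgn):
    """Check the index conventions on explicit node and edge facts."""
    root = max(inner)  # assumption: the root has the largest index
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)
    return all([
        # inner nodes have larger indices than leaf nodes
        min(inner) > max(leaves),
        # edges are directed from larger indices to smaller ones
        all(parent > child for parent, child in edges),
        # every node except the root has exactly one parent
        all(len(parents.get(n, [])) == 1 for n in leaves | inner if n != root),
        # taxa are assigned to leaf nodes in alphabetical order
        [asgn[leaf] for leaf in sorted(leaves)] == sorted(taxa),
        # the outgroup is a child of the root
        any((root, leaf) in edges for leaf in leaves if asgn[leaf] == "outgroup"),
    ])


taxa = ["Felis", "Lynx", "Panthera", "outgroup"]
leaves = {1, 2, 3, 4}
inner = {5, 6, 7}
asgn = {1: "Felis", 2: "Lynx", 3: "Panthera", 4: "outgroup"}
edges = {(7, 4), (7, 6), (6, 5), (6, 3), (5, 1), (5, 2)}
print(check_encoding(taxa, leaves, inner, edges, asgn))

# Reversing one edge breaks the "larger index to smaller index" convention.
bad_edges = (edges - {(6, 5)}) | {(5, 6)}
print(check_encoding(taxa, leaves, inner, bad_edges, asgn))
```

Fixing the leaf order and the edge direction in this way removes many relabelings of the same tree from the search space.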

Quartets displayed by a tree
[Figure: example tree over leaf nodes 1 to 6]
- How to determine whether a tree displays the quartet ((1, 2), (3, 5))?
- Are the pairs (1, 2) and (3, 5) separated by an edge in the tree?

    satisfied(A1, A2, A3, A4) :- quartet(A1, A2, A3, A4), reach(X, A1), reach(X, A2), not reach(X, A3), not reach(X, A4), inner(X).
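The rule above can be mirrored in Python to see what it tests. This is a sketch and not the ASP encoding itself: `reach_sets`, `satisfied` and `displays` are names chosen here, the example tree is hypothetical, and the check of both orientations in `displays` is an assumption about how the rule is meant to be used.

```python
def reach_sets(edges):
    """Map each node X to the set of nodes reachable from X, X included."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    def below(node):
        result = {node}
        for child in children.get(node, []):
            result |= below(child)
        return result

    nodes = {n for edge in edges for n in edge}
    return {node: below(node) for node in nodes}


def satisfied(quartet, edges, inner):
    """Python reading of the rule: some inner node X reaches A1 and A2
    but reaches neither A3 nor A4."""
    (a1, a2), (a3, a4) = quartet
    reach = reach_sets(edges)
    return any(
        a1 in reach[x] and a2 in reach[x]
        and a3 not in reach[x] and a4 not in reach[x]
        for x in inner
    )


def displays(quartet, edges, inner):
    """Assumption: the separating subtree may lie on either side."""
    first, second = quartet
    return (satisfied((first, second), edges, inner)
            or satisfied((second, first), edges, inner))


# Hypothetical tree: 7 -> {1, 2}, 8 -> {7, 3}, 9 (root) -> {8, 4, 5, 6}.
inner = {7, 8, 9}
edges = {(7, 1), (7, 2), (8, 7), (8, 3), (9, 8), (9, 4), (9, 5), (9, 6)}
print(displays(((1, 2), (3, 5)), edges, inner))
print(displays(((1, 3), (2, 5)), edges, inner))
```

In the example, node 7 reaches leaves 1 and 2 only, so the edge above it separates the pair (1, 2) from (3, 5); no such node exists for the pairs (1, 3) and (2, 5).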

Projections displayed by a tree
[Figure: example tree and projections]
- Projections are by default assigned to inner nodes

    asgn(X, P) :- inner(X), not denied(X, P).

- Predicate denied/2 specifies exceptions
  - Projection P cannot be assigned to X if it is assigned to a node below X

    denied(X, P) :- edge(X, Y), reach(Y, P).

  - Distinct child projections of P cannot be mapped onto the same subtree in the phylogeny

    denied(X, P) :- edge(X, Y), reach(Y, A), reach(Y, B), child(A, P), child(B, P), A < B.

- If projection P is assigned at inner node X, then its child projections must have been assigned below X in the tree

Dataset: Felidae
- 38 source trees with 105 species of cats from [Säilä et al., 2011, 2012]
[Figure: number of species (0 to 50) per source tree file, files sorted by size]
- Problem: 105 species are too many for the current encodings

Scalability: genus-specific projections of data

                            CLASP          WASP           ACYC           CLASP-S
Genus        Taxa  Trees    qtet   proj    qtet   proj    qtet   proj    qtet   proj
Leopardus       8      6     0.6    0.1     1.7    0.2     1.1    0.4     0.6    0.1
Dinofelis       9      2     0.1    0.0     0.0    0.1     0.1    0.1     0.0    0.1
Homotherium     9      3     0.7    0.0     0.1    0.1     0.1    0.0     0.0    0.0
Felis          11     12    39.6   21.9     291    121     123   59.6    27.7   20.8
Panthera       11     22    1400, 45.6, 456, 175, 944, 67.1 (two of the eight runs timed out; the timeout marker is lost in this transcript, so these values cannot be placed in columns)

Time (s) to find one optimum for genus-specific data with different solvers, using the quartet (qtet) and projection (proj) encodings (timeout: 1 hour).
- The projection encoding with CLASP appears to be the most promising combination

Genus-level Felidae supertree
- Idea: project trees onto genus level
[Figure: a species-level tree (inner nodes A to J; leaves outgroup, Lynx lynx, Catopuma temmincki, Prionailurus bengalensis, Otocolobus manul, Panthera tigris, Neofelis nebulosa, Panthera leo, Panthera pardus, Panthera uncia, Felis bieti, Felis silvestris, Felis catus) and its genus-level projection (inner nodes D, G, H, I, J; leaves outgroup, Lynx, Catopuma, Prionailurus, Otocolobus, Panthera, Neofelis, Felis)]
- 105 species of cats are reduced to 34 genera, with 28 source trees
[Figure: number of genera (0 to 20) per source tree file, files sorted by size]
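One simple way to picture the genus-level projection is to relabel every leaf with its genus and merge leaves that end up identical. The sketch below is an assumption-laden illustration and not the procedure used for the Felidae data: the genus is taken to be the first word of the species name, the names `genus_of`, `collapse`, `leaf_names` and `to_genus_level` are hypothetical, and a genus that is split across the tree is reported as an error because its handling is not described here.

```python
def genus_of(species):
    """Assumption: the genus is the first word of the species name."""
    return species.split()[0]


def collapse(tree):
    """Relabel leaves by genus and merge sibling leaves of the same genus."""
    if not isinstance(tree, tuple):
        return genus_of(tree)
    children = []
    for child in tree:
        collapsed = collapse(child)
        if isinstance(collapsed, str) and collapsed in children:
            continue  # this genus is already a child of the current node
        children.append(collapsed)
    # a node left with a single child is replaced by that child
    return children[0] if len(children) == 1 else tuple(children)


def leaf_names(tree):
    if not isinstance(tree, tuple):
        return [tree]
    return [name for child in tree for name in leaf_names(child)]


def to_genus_level(tree):
    genus_tree = collapse(tree)
    names = leaf_names(genus_tree)
    if len(names) != len(set(names)):
        raise ValueError("a genus is split across the tree; not handled in this sketch")
    return genus_tree


species_tree = (
    "outgroup",
    (("Lynx lynx", ("Panthera leo", "Panthera pardus")),
     ("Felis bieti", "Felis silvestris", "Felis catus")),
)
print(to_genus_level(species_tree))
```

Projecting in this way shrinks each source tree, which is what makes the genus-level instance tractable for the encodings.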

Genus-level Felidae supertree: results
- The quartet encoding was still too slow (timeout 48 hours)
  - suboptimal solutions could be obtained
- The projection encoding produced optimal supertrees
  - for this data, a unique optimum exists
- The supertrees were compared to recent supertrees computed using MRP [Säilä et al., 2011, 2012]
  - in [Säilä et al., 2011, 2012], MRP trees were selected with best resolution (MRP-R) and best support (MRP-S)
  - these are projected onto genus level to allow for comparison

Supertree comparison: quality measures

Scheme   Quartets %   Resolution   Support
Proj     0.84         0.90         0.43
MRP-S    0.77         0.85         0.45
MRP-R    0.83         0.93         0.42

- Resolution: percentage of resolved nodes in the tree
- Quartets %: percentage of displayed quartets from input
- Support: [Wilkinson et al., 2005]

Supertree comparison
[Figure: two genus-level supertrees side by side, captioned "genus-level MRP-R" and "projection encoding optimum".
Leaf order of the first tree: outgroup, Proailurus, Pseudaelurus, Hyperailurictis, Stenailurus, Metailurus, Dinofelis, Adelphailurus, Promegantereon, Paramachairodus, Smilodon, Megantereon, Nimravides, Machairodus, Amphimachairodus, Xenosmilus, Homotherium, Styriofelis, Neofelis, Pardoides, Panthera, Catopuma, Pardofelis, Leptailurus, Caracal, Profelis, Leopardus, Lynx, Felis, Otocolobus, Prionailurus, Puma, Miracinonyx, Acinonyx.
Leaf order of the second tree: outgroup, Proailurus, Pseudaelurus, Hyperailurictis, Stenailurus, Metailurus, Dinofelis, Adelphailurus, Promegantereon, Paramachaerodus, Smilodon, Megantereon, Nimravides, Machairodus, Amphimachairodus, Xenosmilus, Homotherium, Styriofelis, Neofelis, Panthera, Pardoides, Catopuma, Pardofelis, Leptailurus, Profelis, Caracal, Leopardus, Lynx, Felis, Otocolobus, Prionailurus, Miracinonyx, Puma, Acinonyx.]

Conclusions
- Two encodings for solving the supertree problem
  - the projection-based encoding looks promising in terms of performance and tree quality
- Large supertrees are not possible yet
  - need for a strategy to, e.g., split the instance
  - more analysis of bottlenecks
  - need for more data, both artificial and real
- Furthermore, work is needed on improving the properties of the objective function
  - currently larger trees get more weight, though this is not (always) desirable