Definition of Homologous Synteny Blocks (HSBs)

The gene mapping data were derived from the following publications: mouse and rat GRIMM synteny blocks (Bourque et al. 2004), cat radiation hybrid map (Menotti-Raymond et al. 2003), cattle radiation hybrid map (Larkin et al. 2003; Everts-van der Wind et al. 2004), dog radiation hybrid map (Guyon et al. 2003), pig radiation hybrid maps (Meyers et al. in press; McCoard et al. 2002), and horse radiation hybrid and cytogenetic maps (Chowdhary et al. 2003). Homologous synteny blocks (HSBs) were defined for each species with the human genome as a reference (NCBI Build 33), and required a minimum of two adjacent markers on the same chromosome in both species without interruption. Inversions were counted only if there were three or more perfectly consecutive markers, each >1 megabase pair (Mbp) apart from its neighbor, and in opposite orientation relative to the human genome. Radiation hybrid (RH) markers that were binned were not used to define HSBs under this three-marker rule. Because of the inherent resolution limitations of RH mapping, we allowed single out-of-place markers to jump <2 Mbp into their expected position in the segment based on the order in the human genome; markers that did not satisfy this criterion were classified as singletons. Markers (two or more) were not allowed to jump into a new segment if they were surrounded by segments from other parts of the same or different human chromosomes. Singletons were included for interpretive purposes. Horse segments were defined in the same manner, and were allowed to span multiple linkage groups if supported by cytogenetic data. In addition, many FISH-mapped markers were used to augment the horse data, but only if they were consistent with marker orders derived from the RH-based data (Chowdhary et al. 2003).
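The core two-marker rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Marker` record, the `define_hsbs` function and the example data are hypothetical, and the sketch deliberately ignores marker order within the second species, the <2 Mbp jump allowance, and the three-marker inversion rule.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Marker:
    """Hypothetical marker record: position on human plus chromosome in the other species."""
    name: str
    human_chr: str
    human_pos: float  # Mbp on the human reference
    other_chr: str    # chromosome assignment in the compared species


def define_hsbs(markers: List[Marker], min_markers: int = 2) -> Tuple[List[List[Marker]], List[Marker]]:
    """Group markers, ordered along the human genome, into uninterrupted runs that
    share the same human chromosome and the same chromosome in the other species.
    Runs with at least `min_markers` markers are reported as blocks; markers in
    shorter runs are reported as singletons (simplifying assumption)."""
    ordered = sorted(markers, key=lambda m: (m.human_chr, m.human_pos))
    runs: List[List[Marker]] = []
    for m in ordered:
        last = runs[-1] if runs else None
        if last and last[-1].human_chr == m.human_chr and last[-1].other_chr == m.other_chr:
            last.append(m)   # run continues without interruption
        else:
            runs.append([m])  # a change of chromosome starts a new run
    blocks = [r for r in runs if len(r) >= min_markers]
    singletons = [m for r in runs if len(r) < min_markers for m in r]
    return blocks, singletons


# Toy example: marker "c" interrupts two runs on chromosome 7 of the other species.
example = [
    Marker("a", "1", 1.0, "7"),
    Marker("b", "1", 2.0, "7"),
    Marker("c", "1", 3.0, "12"),
    Marker("d", "1", 4.0, "7"),
    Marker("e", "1", 5.0, "7"),
]
example_blocks, example_singletons = define_hsbs(example)
```

In this toy input the interrupting marker splits the chromosome 7 markers into two separate blocks and is itself reported as a singleton, which mirrors the "without interruption" requirement in the text.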

The Evolution Highway Multispecies Genome Browser

To facilitate our research activities, we utilized the D2K application environment created by the Automated Learning Group at the National Center for Supercomputing Applications at UIUC. This environment allows users to connect programming modules together to build data mining applications and supplies a core set of modules, application templates, and a standard API for software component development. All D2K components are written in Java for maximum flexibility and portability. Evolution Highway is a set of D2K components created to load, correlate and map chromosome and species data to a visual chromosome metaphor for comparative analysis. The Evolution Highway interface employs a zoomable user interface that allows the user to zoom in for detailed information and zoom out for an overview. Users can change the reference genome (currently human or mouse) and alter the visualization of centromeres, telomeres, ancestral chromosome HSBs or any user-defined custom track. The D2K framework enables Evolution Highway to run as a desktop application and as a web service application (http://evolutionhighway.ncsa.uiuc.edu), and is freely available.

Breakpoint Classification and Analysis

An evolutionary chromosome breakpoint boundary (or "breakpoint boundary") was defined as a nucleotide position on a chromosome, identified by a human genome reference coordinate (NCBI Build 33), which is adjacent to a known evolutionary chromosome rearrangement as defined by a pair-wise comparison of gene maps or genome sequences. An evolutionary breakpoint region (or "breakpoint region") was defined as a region between two homologous synteny blocks that is demarcated by an evolutionary chromosome breakpoint boundary on each side. An overlapping breakpoint region was defined as a region with one or both breakpoint boundaries that overlap with a breakpoint region in another genome. An overlapping breakpoint region must have at least one breakpoint boundary that is <500 Kbp from the orthologous breakpoint boundary in another species. A special case occurs when breakpoint boundaries overlap in multiple species, but specific breakpoint boundaries within the breakpoint region are >500 Kbp apart in one of the pair-wise comparisons. In such cases, each of the chained overlapping breakpoint regions was considered distinct, with the smallest overlapping breakpoint region for a pair-wise comparison being defined as the core. In such situations the breakpoint boundaries of the core served as the reference points for applying the 500 Kbp rule. In cases where a putative breakpoint region overlaps with multiple breakpoint regions in a second species, these are referred to as gaps.

We classified each breakpoint region as follows:

1. A lineage-specific breakpoint region was defined as a breakpoint region unique to a single species that does not overlap any breakpoint region of any other species in the interval <1 Mbp up- or downstream.
2. An order-specific breakpoint region was defined as a breakpoint region overlapping within the interval 1 Mbp up- or downstream in two or more species from the same order (e.g., cat and dog; mouse and rat; cattle and pig) that does not overlap with breakpoint regions of any species from other orders.
3. A superordinal breakpoint region must be an overlapping breakpoint region in all species from two or more orders with a recent common ancestor.
4. A reuse breakpoint region must have overlapping breakpoint regions in one or more, but not all, species from different orders.

Table 1. Classification of mammalian breakpoint regions (<4 Mbp).
Classification                                     No. Breakpoints
Superordinal                                         2
Order-specific
  Cetartiodactyla                                   10
  Carnivora                                          9
  Rodentia                                         150
  Total                                            169
Lineage-specific (includes reuse breakpoints) (a)
  cat                                               12
  dog                                               64
  cattle                                            67
  pig                                               46
  human                                             34
  mouse                                             36
  rat                                               20
  Total                                            279

(a) Because reuse breakpoint regions are represented in multiple species categories (see Fig. 3), the total breakpoint count (superordinal + ordinal + lineage-specific) sums to 450 rather than 367.
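The 500 Kbp rule for overlapping breakpoint regions can be sketched as follows. This is only an illustration under stated assumptions: the `BreakpointRegion` type and function names are hypothetical, coordinates are taken to be human reference positions in bp, and "orthologous boundary" is interpreted as comparing left boundary with left boundary and right with right. The chained-overlap and core handling described in the text is not modeled.

```python
from typing import NamedTuple


class BreakpointRegion(NamedTuple):
    """Hypothetical record: a breakpoint region in human reference coordinates (bp)."""
    species: str
    chrom: str
    left: int
    right: int


MAX_BOUNDARY_GAP = 500_000  # the 500 Kbp threshold stated in the text


def intervals_overlap(a: BreakpointRegion, b: BreakpointRegion) -> bool:
    """True if both regions lie on the same human chromosome and share coordinates."""
    return a.chrom == b.chrom and a.left <= b.right and b.left <= a.right


def is_overlapping_region(a: BreakpointRegion, b: BreakpointRegion,
                          max_gap: int = MAX_BOUNDARY_GAP) -> bool:
    """Two regions count as overlapping if they intersect and at least one pair of
    corresponding boundaries (assumed: left/left or right/right) is < max_gap apart."""
    if not intervals_overlap(a, b):
        return False
    return abs(a.left - b.left) < max_gap or abs(a.right - b.right) < max_gap


# Toy regions (invented coordinates, not data from the study).
dog = BreakpointRegion("dog", "1", 1_000_000, 1_800_000)
cat = BreakpointRegion("cat", "1", 1_200_000, 3_000_000)
pig = BreakpointRegion("pig", "1", 1_000_000, 4_000_000)
cattle = BreakpointRegion("cattle", "1", 1_600_000, 3_400_000)
```

Here the dog and cat regions satisfy the rule because their left boundaries are 200 Kbp apart, whereas the pig and cattle regions intersect but have both boundary pairs 600 Kbp apart and so would not be merged under this reading of the rule.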

For the purposes of the present analysis, using the human genome as a reference, breakpoint regions overlapping in 5 or more non-human species were classified as primate-specific breakpoint regions. Primate-specific breakpoint regions were confirmed by pairwise comparison of all segments against at least two non-primate species as the reference species (e.g., using the mouse or pig genome as a reference).

Producing multi-way homologous synteny blocks (HSBs) for genome rearrangement analyses

To make multi-way HSBs, pairwise HSBs were used for each human-species pair: human/cat, human/cattle, human/dog, human/pig, human/horse, human/mouse and human/rat. The latter two pairs were based on previously published, whole genome sequence-based, 1 Mbp synteny blocks (Bourque et al. 2004).

To make int_7way (which excludes unsigned blocks, i.e., singletons), we found all combinations of blocks

b1: a human/mouse block
b2: a human/rat block
b3: a human/cat block
b4: a human/cattle block
b5: a human/dog block
b6: a human/pig block

for which the 6-way intersection of the human coordinates of b1,...,b6 is non-empty. Each such combination results in one 7-way block, B, whose human coordinates are given by that 6-way intersection, and whose human sign is +1. The signs in the other species are inherited from b1,...,b6. The coordinates in the other species were determined by interpolation: in human coordinates, B is somewhere within b1. If it starts 0.3 of the way through b1 and ends 0.45 of the way through b1 in human coordinates, and if the mouse sign is positive, then the mouse coordinates of b1 were interpolated with those same fractions to get the mouse coordinates of B. If the mouse sign of b1 was negative, then the complementary fractions 1-0.45=0.55 and 1-0.3=0.7 were used. This process was repeated for b2,...,b6 to get the rat/cat/.../pig coordinates of B.
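The intersection and interpolation steps described above can be sketched in a few lines. The function names `intersect` and `interpolate` and the example coordinates are hypothetical; the sketch assumes half-open style intervals given as (start, end) pairs and treats a zero-length intersection as empty.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def intersect(intervals: List[Interval]) -> Optional[Interval]:
    """Return the common intersection of human-coordinate intervals, or None if empty."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


def interpolate(block_human: Interval, block_other: Interval,
                sign: int, target_human: Interval) -> Interval:
    """Map the human sub-interval `target_human` of a pairwise block into the other
    species by linear interpolation. For a negative sign the complementary
    fractions (1 - end, 1 - start) are used, as described in the text."""
    h0, h1 = block_human
    o0, o1 = block_other
    f_start = (target_human[0] - h0) / (h1 - h0)
    f_end = (target_human[1] - h0) / (h1 - h0)
    if sign < 0:
        f_start, f_end = 1 - f_end, 1 - f_start
    span = o1 - o0
    return (o0 + f_start * span, o0 + f_end * span)


# Worked example matching the fractions in the text: B spans 0.30 to 0.45 of b1.
b1_human, b1_mouse = (100.0, 200.0), (1000.0, 2000.0)
B_human = intersect([b1_human, (120.0, 180.0), (130.0, 145.0)])
mouse_plus = interpolate(b1_human, b1_mouse, +1, B_human)
mouse_minus = interpolate(b1_human, b1_mouse, -1, B_human)
```

With these toy coordinates the positive-sign case uses the fractions 0.3 and 0.45 directly, and the negative-sign case uses 0.55 and 0.7, in line with the example given in the paragraph above.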
To make the int_7wayu HSBs (which include unsigned 2-way blocks):

a) We performed the same procedure described above for int_7way, except that we included the unsigned 2-way blocks by treating their signs as +1.
b) Each unsigned 2-way block was split into 0 or more 7-way blocks. If 0, then the process is complete. If there are 1 or more, then those blocks are consecutive and we have to decide whether their overall sign is +1 (so they stay in the same order) or -1 (so their order is reversed and their coordinates should be modified to reflect that). A parsimony criterion was used to determine the signs.

Multiple Genome Rearrangement (MGR) scenario

Starting from the 307 seven-way synteny blocks, we first computed the rearrangement distance between each pair of genomes (Table 1). These distances highlight the varying rates of rearrangement across the different lineages. For example, the rearrangement distance between human and cat is less than half the rearrangement distance between human and mouse, even though the divergence time is greater for cat than for mouse. Directly applying the neighbor-joining algorithm (Saitou and Nei, 1987) to the matrix of pairwise distances leads to the following tree:

(((((Human,Cat),Pig),Dog),Cattle),(Mouse,Rat))

The tree is unrealistic as it contradicts many of the known mammalian clades (Springer et al. 2004). To circumvent this limitation, but also to obtain a more descriptive rearrangement scenario that includes a description of the ancestral nodes, we used the MGR algorithm (Bourque and Pevzner, 2002). MGR uses different heuristics and seeks a most parsimonious rearrangement scenario that best explains the contemporary block arrangements. The rearrangements analyzed are inversions, translocations, fusions and fissions. For the current application, the MGR algorithm was adapted to look for a parsimonious rearrangement scenario on a tree with a given topology (see Methods). In this case, the topology used was:

((Human,(Mouse,Rat)),((Cat,Dog),(Pig,Cow)))

The final tree recovered requires a total of 487 rearrangements and is shown in Figure S2. In the process, MGR also generated the block arrangements of 5 putative ancestors: carnivore (cat+dog), cetartiodactyl (cattle+pig), ferungulate (carnivore+cetartiodactyl), murid rodent (mouse+rat), and human-mouse-rat. The minimum number of rearrangements required to convert between two adjacent genomes on the tree is shown at the top of Figure S2.
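As a small illustration of how the pairwise distances are used, the sketch below encodes the rearrangement distances reported in the distance table of this supplement, checks the human-cat versus human-mouse comparison made above, and computes the standard neighbor-joining Q-criterion for the first agglomeration step only. It is not a full neighbor-joining implementation and makes no claim to reproduce the published tree; the function name `first_nj_pair` is hypothetical.

```python
SPECIES = ["Human", "Mouse", "Rat", "Cat", "Cattle", "Dog", "Pig"]

# Rearrangement distances on the 307 seven-way blocks, copied from the supplement's table.
ROWS = [
    [0, 153, 149, 61, 115, 108, 76],
    [153, 0, 60, 169, 209, 184, 172],
    [149, 60, 0, 164, 203, 180, 170],
    [61, 169, 164, 0, 130, 114, 87],
    [115, 209, 203, 130, 0, 157, 135],
    [108, 184, 180, 114, 157, 0, 124],
    [76, 172, 170, 87, 135, 124, 0],
]
DIST = {a: dict(zip(SPECIES, row)) for a, row in zip(SPECIES, ROWS)}


def first_nj_pair(names, d):
    """Return (Q, a, b) for the pair minimizing the neighbor-joining criterion
    Q(a, b) = (n - 2) * d(a, b) - sum_k d(a, k) - sum_k d(b, k).
    Only the first join is computed; later joins would need distance updates."""
    n = len(names)
    totals = {a: sum(d[a][b] for b in names) for a in names}
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            q = (n - 2) * d[a][b] - totals[a] - totals[b]
            if best is None or q < best[0]:
                best = (q, a, b)
    return best


human_cat_less_than_half = DIST["Human"]["Cat"] < DIST["Human"]["Mouse"] / 2
first_join = first_nj_pair(SPECIES, DIST)
```

On this matrix the first pair selected by the Q-criterion is mouse and rat, which is the one grouping that the distance-based tree and the fixed topology used for MGR have in common.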
For many of the ancestors, the ratio between the total number of rearrangements of the 3 incident edges and the number of common blocks was high. For this reason, it was possible to find alternative ancestors that also minimize the total number of rearrangement events on the evolutionary tree, which allows weak and strong areas of the ancestral reconstructions to be distinguished. Specifically, we explored a wide range of alternative ancestors (see below), and we looked for adjacencies that are present in all of the observed alternative ancestors. We call these strong adjacencies. In contrast, adjacencies that were not conserved in at least one of the alternative ancestors are called weak adjacencies (black arrows in Figs. 2 & S2). The number of weak adjacencies identified in this way is actually a lower bound for the true number of weak adjacencies, since we can only explore a subset of all the alternative solutions. We used this relatively stringent criterion (broken once implies weak) because otherwise we would need to characterize more precisely the proportion of explored alternative ancestors, and that is a computationally impractical task at this point in time. More than half of the 168 weak adjacencies were actually found at the ends of the putative ancestral chromosomes and, in most cases, were associated with alternative inter-chromosomal rearrangements. Overall, the total number of weak adjacencies represented about 10% of the predicted ancestral adjacencies.

MGR only produces scenarios associated with unrooted trees. This implies that, with no outgroup data, it is impossible to directly extract the mammalian ancestor from the scenario displayed in Figure S2. However, by assuming comparable rates of rearrangement in the early lineages (see below), we generated a putative mammalian ancestor halfway between the ferungulate and human-mouse-rat ancestors (see Figures 2 and S2). By using the equally parsimonious alternative ancestors identified for both of these ancestors, we also generated alternative mammalian ancestors that allowed for identification of 49 weak adjacencies in the reconstruction (see below).

         Human  Mouse  Rat  Cat  Cattle  Dog  Pig
Human        0    153  149   61     115  108   76
Mouse      153      0   60  169     209  184  172
Rat        149     60    0  164     203  180  170
Cat         61    169  164    0     130  114   87
Cattle     115    209  203  130       0  157  135
Dog        108    184  180  114     157    0  124
Pig         76    172  170   87     135  124    0

Table 1. Rearrangement distance between each pair of genomes computed on the 307 seven-way synteny blocks.

MGR with fixed topology

The original MGR algorithm (Bourque and Pevzner 2002) works in two stages. Suppose we have m genomes: G1, G2, ..., Gm. In the first stage, rearrangements in any of the starting genomes that reduce the distance to all other starting genomes are identified and carried out iteratively. These rearrangements are called good rearrangements. The process is repeated until no good rearrangement is left and we have m modified genomes: G1', G2', ..., Gm'. In the second stage, the tree is initialized by using the 3 closest modified genomes and by computing their median genome. The remaining modified genomes are then added to the tree one at a time. Each remaining modified genome is added to the tree by splitting the existing edge such that the increase in the total number of rearrangements is minimized. If the topology of the tree is given, MGR is modified to work as follows.
The procedure to identify and carry out good rearrangements in the first stage remains unchanged. In the second stage, however, the closest pair (in terms of the rearrangement distance) of topological neighbors is identified (Gi and Gj). Next, the closest topological reference genome that is also the closest in terms of the rearrangement distance is identified (Gk). Finally, the pair (Gi and Gj) is replaced by the solution of the median problem for these 3 genomes, and the procedure is repeated until all ancestors have been found.
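The first selection step of the fixed-topology variant can be sketched as below. This is an assumption-laden illustration, not the MGR code: "topological neighbors" is interpreted here as a pair of sibling leaves (a cherry) in the given topology, the tree is written as nested tuples, "Cattle" stands in for "Cow" to match the distance table, and only the three distances needed for the example are included. Solving the median problem is out of scope.

```python
# Given topology from the text, as nested tuples (hypothetical encoding).
TOPOLOGY = (("Human", ("Mouse", "Rat")), (("Cat", "Dog"), ("Pig", "Cattle")))

# Rearrangement distances for the sibling pairs, taken from the distance table.
PAIR_DIST = {
    frozenset(("Mouse", "Rat")): 60,
    frozenset(("Cat", "Dog")): 114,
    frozenset(("Pig", "Cattle")): 135,
}


def cherries(tree):
    """Return all pairs of sibling leaves in a nested-tuple binary tree."""
    if isinstance(tree, str):
        return []
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return [(left, right)]
    return cherries(left) + cherries(right)


def closest_cherry(tree, dist):
    """Pick the sibling-leaf pair with the smallest rearrangement distance."""
    return min(cherries(tree), key=lambda pair: dist[frozenset(pair)])


candidate_pairs = cherries(TOPOLOGY)
first_pair = closest_cherry(TOPOLOGY, PAIR_DIST)
```

Under this reading, mouse and rat would be the first pair (Gi, Gj) to be replaced by a median genome, after which the procedure would be repeated on the reduced tree.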

Optimizing a rearrangement scenario

Given an unrooted binary tree with m genomes (leaves) and m-2 ancestors (internal nodes) found after running the first two stages of MGR, the internal nodes are reoptimized as follows. The approach implemented iterates between a top-down and a bottom-up traversal of the internal nodes and in this sense relates to the Fitch-Hartigan algorithm for character-based parsimony (Fitch, 1971; Hartigan, 1973). For each internal node, we solve the median problem associated with the 3 adjacent nodes and we replace the internal node by the new median if the score is better. We iterate the process until no new internal node is found in a given top-down or bottom-up pass. For the data set with 307 blocks, the first scenario recovered required a total of 505 rearrangements. Reoptimizing the internal nodes allowed us to find a scenario with 499 rearrangements.

Finding alternative ancestors

We find alternative ancestors in a way very similar to how we optimize a scenario. Once again, given a tree with genomes and ancestors, we traversed the ancestors by alternating between a top-down and a bottom-up approach. For each ancestor, instead of solving the median problem associated with this ancestor, we looked for a rearrangement that could be applied to this ancestor such that the total distance to its 3 adjacent nodes does not increase. When the distances to the neighbors are large, we can typically obtain a list of such rearrangements. Associated with each of these rearrangements is an alternative ancestor, in the sense that an equally good solution (in terms of the number of rearrangements) could be obtained by replacing the current ancestor with this new one. For every list of potential alternative ancestors, we only recorded the alternative ancestor that maximized the distance to the original ancestor associated with this node. We also required that this alternative ancestor was not recorded previously.
The purpose of these two rules was to maximize the range of alternative solutions explored. Finally, we replaced the internal node with the new alternative ancestor identified and we continued traversing the tree. We alternated between top-down and bottom-up traversals until either no new alternative ancestor was found or, more probably, we reached the maximum number of traversals (set to 3000). The large number of potential alternative ancestors can typically be explained by the combinatorial explosion associated with different optimal configurations of just a few breakpoints (weak adjacencies).

While generating the list of alternative ancestors, it is sometimes possible to identify a new ancestor that decreases the overall total number of rearrangements. When that is the case, we use the new ancestor but we do not reset the lists of alternative ancestors, to avoid restricting the procedure to a limited set of alternative ancestors. For the data set with 307 blocks, we started from a scenario with 499 rearrangements. The procedure of modifying ancestors and looking for better or equally good ancestors actually allowed the identification of a scenario with 487 rearrangements, which is displayed in Figure S2. The scenario was found in the 1603rd traversal. After 3000 traversals, we recorded 694, 2018, 3000, 50 and 3000 alternative ancestors for the mouse-rat, cetartiodactyl, carnivore, ferungulate, and human-mouse-rat ancestors, respectively. The small number of alternative ancestors found for the ferungulate ancestor relates to the short edges incident to that node.

Estimating the Boreoeutherian ancestor

The two oldest ancestors reconstructed using the current data set are the ancestor of cat-dog-pig-cow (cetartiodactyl-carnivore ancestor) and the ancestor of human-mouse-rat. In the solution displayed in Figure S2, these two ancestors are only 14 rearrangements away from each other. Under the parsimony assumption, the mammalian ancestor should reside somewhere on the path between the cetartiodactyl-carnivore ancestor (A10) and the human-rodent ancestor (A11), but outgroup data from an additional mammalian genome would be required to identify its exact position unambiguously. To circumvent this limitation, we made the further assumption that the rearrangement rates were comparable in these early branches, and we looked for the putative mammalian ancestor at the midpoint between the ferungulate ancestor and the human-mouse-rat ancestor. Unfortunately these two ancestors, and the path between them, are not unique. Instead of generating a single prediction for the mammalian ancestor, we generated one prediction for each pair of alternative ancestors (A10, A11) by selecting the ancestral configuration halfway along one of the paths between this particular A10 and this particular A11. The result of the prediction made from the A10 and A11 is displayed in Figures 2 and S2. All the other predictions are used to identify strong and weak adjacencies as before. For each pair (A10, A11), we note that the distance between the two ancestors is not always equal to 14 and, moreover, that each pair is not necessarily associated with an optimal scenario of 487 rearrangements. Nevertheless, we find the mammalian ancestors to be relatively stable.
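The strong/weak distinction (an adjacency is weak as soon as it is broken in one alternative ancestor) can be sketched as follows. The representation is an assumption made for illustration: each ancestor is a list of chromosomes, each chromosome a list of signed integer block identifiers, with 0 used as a cap marking chromosome ends; the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Genome = List[List[int]]        # chromosomes as lists of signed block ids
Adjacency = Tuple[int, int]


def adjacencies(genome: Genome) -> Set[Adjacency]:
    """Collect adjacencies between consecutive signed blocks, including chromosome
    ends (capped with 0). Reading a chromosome in the opposite direction turns
    (a, b) into (-b, -a), so each adjacency is stored in a canonical form."""
    result: Set[Adjacency] = set()
    for chrom in genome:
        capped = [0] + list(chrom) + [0]
        for a, b in zip(capped, capped[1:]):
            result.add(min((a, b), (-b, -a)))
    return result


def split_strong_weak(reference: Genome,
                      alternatives: Iterable[Genome]) -> Tuple[Set[Adjacency], Set[Adjacency]]:
    """Strong adjacencies of `reference` are those present in every alternative
    ancestor; the remaining reference adjacencies are weak."""
    ref_adj = adjacencies(reference)
    strong = set(ref_adj)
    for alt in alternatives:
        strong &= adjacencies(alt)
    return strong, ref_adj - strong


# Toy ancestors (invented): one alternative inverts block 3, another reverses
# the orientation and listing order of whole chromosomes without changing adjacencies.
reference = [[1, 2, 3], [4, 5]]
alternatives = [[[1, 2, -3], [4, 5]], [[-5, -4], [1, 2, 3]]]
strong, weak = split_strong_weak(reference, alternatives)
```

In the toy data only the adjacency between blocks 2 and 3 and the chromosome end after block 3 come out as weak, because they are the only ones disrupted by an alternative ancestor; reversing an entire chromosome leaves all of its adjacencies intact.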
Starting from the 50 alternative ferungulate ancestors and the first 100 alternative human-mouse-rat ancestors, we generated 5000 pairs of ferungulate/human-mouse-rat ancestors, and hence 5000 alternative mammalian ancestors. This leads to 45 weak adjacencies that include 22 weak chromosome endpoints. Using the first 1000 alternative ancestors for human-mouse-rat provided 50,000 alternative mammalian ancestors but only increased the number of weak adjacencies to 49 (and 23 weak chromosome endpoints). Those are the weak adjacencies displayed in Figures 2 and S2.

Gene Density at Evolutionary Breakpoints Analysis

To determine gene density in classified evolutionary breakpoints (<4 Mbp), we defined the narrowest breakpoint region interval between flanking HSBs, relative to the human sequence coordinates, as the core breakpoint, and determined its midpoint. In cases where the breakpoint was shared by multiple species, a consensus breakpoint boundary was defined in the human genome, with the left and right ends defined as the most distant boundaries of the overlapping breakpoint region. In cases where the consensus breakpoint interval included an evolutionary breakpoint region in one or more species that is defined by boundaries that are 500 Kbp from the core, the midpoint was calculated as half the distance between the most proximal boundaries of the core and the breakpoint region (as defined in a second species). Given the coordinates of that midpoint in the human genome, we then created discrete 0.5 Mbp windows surrounding the midpoint and counted the number of unique RefSeq and predicted genes (NCBI Build 33) in those windows, extending to 2 Mbp total on each side of the midpoint. Initial analyses showed no significant difference in gene number between the innermost 0.5 Mbp windows on either side of the midpoint and the adjacent 0.5 Mbp windows, so these were combined and compared to the number of RefSeq and predicted genes in chromosome regions outside of the breakpoint regions. The horse genome was excluded from the analysis of gene content because the data set as a whole produces artificially large consensus breakpoint intervals.

Segmental Duplications in Primate-Specific Chromosomes

We examined the segmental duplication content of each primate-specific breakpoint using the 1 Mbp surrounding the midpoint of the core breakpoint, as previously defined. Primate-specific breakpoints were defined as described above under breakpoint classification. These data were tabulated using the coordinates in Table S4 and the segmental duplication data track in the UCSC web browser (April 2003 build). We counted the number of segmental duplications in each 1 Mbp breakpoint region, and determined what fraction mapped to other intrachromosomal sites and how many mapped back either to the same breakpoint region or to breakpoint regions flanking HSBs that were rearranged (inverted) relative to the outgroup mammalian species.

Centromere/Telomere Analysis

The positions of the centromeres and telomeres in each species (except horse, due to the relatively low resolution of its comparative map) were assigned to human-species homologous synteny blocks (HSBs). The acrocentric centromere and telomere locations in HSBs were defined based on the position of the marker on the RH comparative map most proximal to the centromere or telomere in the species chromosomes. In total, the positions of 108 acrocentric centromeres and 233 telomeres were assigned to the HSBs. Due to inconsistency in the rat genome sequence centromere assignments between NCBI and UCSC, the 13 rat metacentric centromeres were excluded from our analysis.
Of the 50 metacentric centromeres assigned, 22 were found on the boundaries of HSBs. These were split in half, and placed on each of the two boundary HSBs. We excluded 5 metacentric centromeres and 4 telomeres from the analysis of centromere/telomere positional conservation because their positions could not be unambiguously determined. Cross-species positional conservation was defined as centromeres/telomeres located in HSBs in more than one species within 2.5 Mbp on the same human chromosome. An exception was made for the centromeres/telomeres located proximal to the human telomeric regions. Conservation of such centromeres/telomeres was counted even when the centromeres/telomeres were located >2.5 human-Mbp apart, while the other representative of the same order showed contiguous homologous synteny with the human genome in this area.

Acrocentric centromeres were classified into three distinct groups:

1) centromeres, if the location of an acrocentric centromere in a human/species HSB visualized on human chromosomes was supported by conservation with the location of a metacentric centromere in another species HSB on the same human chromosome;
2) telomeres, if the location of the acrocentric centromere in a human/species HSB demonstrated conservation with the location of a telomere in another species HSB; and
3) acrocentric centromeres, if there was no conservation with either a metacentric centromere or telomere location in a human-species HSB at the position of the acrocentric centromere visualized on the human chromosome (Fig. S2).

When acrocentric centromere positions showed conservation with both metacentric centromere and telomere positions, the acrocentric centromere was classified using the classification of the closest relative of the species analyzed. For this analysis 85 centromeres were used, including 50 metacentric centromeres, 22 metacentric centromeres at evolutionary breakpoints that were divided in half, and 18 acrocentric centromeres categorized as centromeres (minus five excluded centromeres). For the same analysis 254 telomeres were used: 233 telomeres plus 25 acrocentric centromeres categorized as telomeres, minus four excluded telomeres.

A second analysis was performed to examine the association between centromere/telomere positions and evolutionary breakpoints. Of the original 414 centromere/telomere locations, 78 were excluded because they were assigned to HSBs at the boundary of a gap (evolutionary breakpoint region >4 human-Mbp), 92 were excluded because they were located at human telomeric regions (not classified as breakpoint regions), 10 metacentric centromeres were excluded because they were located inside HSBs, and finally all 64 human centromeres/telomeres were excluded because the reference genome (human) HSBs were not defined. A chi-square test was used to test whether there was an association between the remaining 170 centromere/telomere locations and evolutionary breakpoint types.

Cancer Breakpoint Analysis

The Mitelman database (NCBI, version build 33) was used as a source of the positions for cancer-related chromosome abnormalities in the human genome. The cytogenetic positions of cancer-associated breakpoints were translated into human sequence coordinates using the coordinates of the genes associated with each cancer abnormality in the database.
Of 1,647 cancer chromosomal abnormalities, human sequence coordinates could be obtained for 650 cases. The cases were sorted by gene name; a total of 112 unique genes were identified in the dataset. The number of occurrences for each gene was summed from all cancer cases involving the same gene. Of these, 61 genes had >10 documented occurrences, and 51 had between 2 and 9 occurrences. These two datasets were independently tested for co-occurrence with, and location within 0.4 human-Mbp of, the 367 evolutionary breakpoints (<4 human-Mbp; average breakpoint size = 1.2 Mbp) identified from the multispecies genome comparison. Sixty-one cases were separately tested for co-occurrence with mouse, rat, and rodent-specific breakpoints (N=206; average breakpoint size = 274 Kbp).

References

G. Bourque, P. A. Pevzner, Genome Res. 12, 26 (2002).
G. Bourque, P. A. Pevzner, G. Tesler, Genome Res. 14, 507 (2004).
B. P. Chowdhary et al., Genome Res. 13, 742 (2003).
A. Everts-van der Wind et al., Genome Res. 14, 1424 (2004).
W. Fitch, Systematic Zool. 20 (1971).
R. Guyon et al., Proc. Natl. Acad. Sci. U.S.A. 100, 5269 (2003).
J. A. Hartigan, Biometrics 29, 53 (1973).
D. M. Larkin et al., Genome Res. 13, 1966 (2003).
S. A. McCoard et al., Animal Genet. 33, 178 (2002).
M. Menotti-Raymond et al., Cytogenet. Genome Res. 102, 272 (2003).
S. N. Meyers et al., Genomics, in press (2005).
N. Saitou, M. Nei, Mol. Biol. Evol. 4, 406 (1987).
M. S. Springer, O. Madsen, W. W. de Jong, M. J. Stanhope, Trends Ecol. Evol. 19, 430 (2004).

Figure S1. Homologous synteny blocks (HSBs) between horse, cat, dog, cattle, pig, rat, mouse, and human genomes visualized on each human chromosome (scale to the left is in Mbp, based on NCBI Build 33). Human centromeres are indicated in black, and heterochromatin is indicated with stippled markings. Gray bars correspond to HSBs, with the species chromosome number indicated inside the bars. Lowercase letters inside the segments indicate the order of the blocks in that species' chromosome (in alphabetical order). The positions of telomeres and centromeres in each species (for those that could be accurately determined) are indicated with dark gray rectangles and ovals, respectively. Black ovals correspond to metacentric centromeres; gray ovals to acrocentric centromeres. Split ovals show the positions of metacentric centromeres involved in evolutionary rearrangements. Telomere (T)/centromere (C) clusterings are indicated with red arrowheads. Positions of human genes associated with cancer chromosome aberrations (37) with between 2 and 9 occurrences are shown with numbered black arrowheads. Those indicated by green numbered arrowheads have >10 occurrences. Blue arrowheads labeled RB indicate reuse breakpoints. Colored blocks on the right side of the human chromosome ideograms indicate the homologous segment in the boreoeutherian ancestor (Fig. 2), with the ancestral chromosome number and block depicted inside.

Figure S2. Genome architecture of seven mammalian species, and the ancestors of the three mammalian lineages, computed by MGR from the seven starting genomes, and compared to the human genome (far right). Each human chromosome is assigned a unique color and is divided into 7-way, multispecies homologous synteny blocks (MHSBs). The blocks are proportional to their length in the human genome. Diagonal lines within each block (from top left to bottom right) indicate the relative order and orientation of genes within the block. The number above each colored block refers to the corresponding human chromosome homologue. Gaps in multispecies coverage are shown in white. Gray hashed lines indicate heterochromatic/telomeric regions of human chromosomes. Species chromosome designations are listed to the left of each chromosome, and the blocks are numbered above to indicate their homologous human chromosome. Four ancestral genomes are depicted in between the seven species genomes, and the boreoeutherian genome is shown to the far right. The ferungulate ancestor is the ancestor of Carnivora+Cetartiodactyla. At the top of the figure, the phylogram indicates the number of rearrangements required to convert one genome into the other. Black arrows on the ancestral chromosomes indicate that the two adjacent MHSBs separated by the arrow were not found in every one of the most parsimonious solutions: these are considered weak adjacencies.

[Figure S2 image. Panel labels: cattle, cetartiodactyl anc., pig, dog, carnivore anc., cat, ferungulate anc., rat, murid rodent anc., mouse, human/mouse/rat anc., human, boreoeutherian anc. Values extracted alongside the phylogram, in extraction order: 14, 14, 99, 42, 76, 40, 10, 30, 31, 105, 26. The chromosome and block labels inside the image are not recoverable from the extracted text.]

Fig. S3. The distribution of evolutionary reuse breakpoints in mammalian genomes. The chart shows the number of reuse breakpoints found between individual species from different orders, and between orders (see Fig. S1 for labeled reuse breakpoints). Abbreviations are as follows: Rod=Rodentia, Catl=Cattle, Ceta=Cetartiodactyla, Mou=Mouse, Car=Carnivora.

Fig. S4. Conservation of telomeres within and between mammalian orders. Blocks from bottom to top indicate: no conservation (white), conservation between single species from different orders (gray), conservation within a single order (dark gray), conservation within an order and at least a single species from a different order (light gray), conservation within two orders and at least a single species from a third or fourth order (gray stripes).


Figure S5: Human cancer breakpoint occurrences and their correspondence with evolutionary breakpoint regions

Dataset                   Within +/- 0.4 Mbp   > 0.4 Mbp   % within +/- 0.4 Mbp   % > 0.4 Mbp
2-10 occurrences, N=51    6                    45          12                     88
>9 occurrences, N=61      19                   42          31                     69
>2 occurrences, N=112     25                   87          22                     78

[Bar chart: "Cancer Breakpoints Compared To Evolutionary Breakpoint Regions". Vertical axis: Number of Cancer Aberration Breakpoints (0 to 50). Horizontal axis: Distance of Cancer Aberration Breakpoints to Evolutionary Breakpoint Regions (Within +/- 0.4 Mbp; > 0.4 Mbp). Series: 2-10 occurrences, N=51; >9 occurrences, N=61.]
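The percentage columns of Figure S5 can be recomputed from the reported counts. The short Python sketch below does only that; the dictionary layout and the function name `percentages` are illustrative assumptions, and rounding to whole percentages is assumed because the reported values are whole numbers.

```python
# Recompute the percentage columns of Figure S5 from the reported counts.
# The data layout and function name are illustrative assumptions.

counts = {
    "2-10 occurrences, N=51": (6, 45),    # (within +/- 0.4 Mbp, > 0.4 Mbp)
    ">9 occurrences, N=61": (19, 42),
    ">2 occurrences, N=112": (25, 87),
}

def percentages(within, beyond):
    """Return whole-number percentages of breakpoints within and beyond the window."""
    total = within + beyond
    return round(100 * within / total), round(100 * beyond / total)

for label, (within, beyond) in counts.items():
    pct_within, pct_beyond = percentages(within, beyond)
    print(f"{label}: {pct_within}% within +/- 0.4 Mbp, {pct_beyond}% > 0.4 Mbp")
```

With the counts above, the recomputed values agree with the reported percentages (12/88, 31/69, and 22/78).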

Table S1. Pairwise Homologous Synteny Blocks

Columns, in order (space-separated; Comment may be empty): Human Chromosome; Segment Start/Human Genome; Segment End/Human Genome; Species Chromosome/Segment Order; Segment Start/Species Genome; Segment End/Species Genome; Segment Orientation; Species; Segment #; Comment

13 21693700 113064379 A1a 1 16 1 cat 1
1 245115007 245124084 A1b 17 17 0 cat 2 singleton
1 224713326 224722192 A1c 18 18 0 cat 3 singleton
5 180725965 180732931 A1d 19 19 0 cat 4 singleton
5 124098797 145444492 A1e 20 28 1 cat 5
5 54691047 70405708 A1f 29 34 1 cat 6
5 74019724 88105156 A1g 35 39 -1 cat 7
5 110590510 111490607 A1h 40 41 -1 cat 8
5 171060048 176460698 A1i 42 44 1 cat 9
5 158677334 162035708 A1j 45 46 1 cat 10
5 35857133 52508337 A1k 47 50 1 cat 11
5 148190997 151050039 A1l 51 53 1 cat 12
5 6632091 17309480 A1m 54 56 -1 cat 13
19 1043934 19314097 A2a 57 67 1 cat 14
3 45755641 51753995 A2b 68 76 -1 cat 15
3 53050080 53081586 A2c 77 77 0 cat 16 singleton
3 1451745 13166476 A2d 78 83 1 cat 17
3 130529093 130535798 A2e 84 84 0 cat 18 singleton
7 37986196 50340856 A2f 85 88 -1 cat 19
7 103882428 103884909 A2g 89 89 0 cat 20 singleton
7 80926574 92802263 A2h 90 92 1 cat 21
7 22475068 30980430 A2i 94 96 -1 cat 22
7 120783291 131601627 A2j 97 102 1 cat 23
7 136101824 154560138 A2k 103 107 -1 cat 24
20 33516847 62172325 A3a 108 117 -1 cat 25
20 411734 14155629 A3b 118 124 1 cat 26
2 43408416 75076821 A3c 126 132 1 cat 27
2 21182012 35873540 A3d 133 136 -1 cat 28
2 86969573 113448884 A3e 137 141 1 cat 29
2 39066275 39067575 A3f 142 142 0 cat 30 singleton
2 9729021 16125926 A3g 143 146 -1 cat 31
4 182527928 186652876 B1a 147 149 1 cat 32
8 12616348 19634073 B1b 151 155 1 cat 33
8 27275788 42016234 B1c 156 162 -1 cat 34
4 657730 175316794 B1d 163 185 -1 cat 35
6 282097 28564483 B2a 186 194 -1 cat 36
6 29926866 88825718 B2b 195 216 1 cat 37
6 92138810 170557420 B2c 218 230 -1 cat 38
15 76132408 81314475 B3a 231 233 1 cat 39
15 84268789 89026239 B3b 234 237 -1 cat 40
15 96785448 97094240 B3c 238 238 0 cat 41 singleton
15 27572440 27694790 B3d 239 239 0 cat 42 singleton
15 37452423 74939446 B3e 240 250 -1 cat 43
15 27067283 27070326 B3f 242 242 0 cat 44 singleton_within
14 18913418 104293998 B3g 251 281 1 cat 45
15 32562542 32762771 B3h 262 262 0 cat 46 singleton_within
10 5940612 8267092 B4a 282 284 1 cat 47
10 13414005 17429831 B4b 285 287 -1 cat 48
10 18579845 22770461 B4c 288 289 1 cat 49
10 31706539 35651317 B4d 290 292 -1 cat 50
12 5420747 56835830 B4e 293 310 1 cat 51
12 62139180 71496217 B4f 311 314 -1 cat 52
12 87890468 103277571 B4g 315 318 1 cat 53
22 32688487 37884099 B4h 319 323 -1 cat 54
22 40724570 49015920 B4i 324 326 -1 cat 55
1 1585750 27194712 C1a 327 339 1 cat 56
1 31544491 65460889 C1b 340 347 -1 cat 57
1 70702760 144725443 C1c 348 360 1 cat 58
2 120318716 241838347 C1d 361 384 1 cat 59
21 15255433 46941497 C2a 385 392 -1 cat 60
3 76765804 123699021 C2b 393 405 1 cat 61
3 134746710 197215954 C2c 406 419 -1 cat 62
3 16620157 17806005 C2d 420 421 1 cat 63
3 30497808 38951627 C2e 422 425 1 cat 64
3 19362170 25489220 C2f 426 427 1 cat 65
11 102425313 126166349 D1a 428 444 1 cat 66
11 72389623 101035333 D1b 445 451 -1 cat 67
11 5205733 18468547 D1c 452 457 1 cat 68
11 26740657 32496231 D1d 458 461 -1 cat 69
11 32890555 36658940 D1e 462 467 1 cat 70
11 61259553 68958076 D1f 468 474 1 cat 71
11 181724 2829324 D1g 475 477 -1 cat 72
1 225692983 236369302 D2a 478 481 -1 cat 73
10 60165227 65674798 D2b 482 484 1 cat 74
10 74510175 86044527 D2c 486 491 -1 cat 75

10 43347708 50765558 D2d 492 495 -1 cat 76
10 104718952 106017367 D2e 496 497 1 cat 77
10 108657106 126717000 D2f 498 505 -1 cat 78
10 99666091 102403107 D2g 506 508 -1 cat 79
12 125463278 133089157 D3a 509 510 -1 cat 80
12 110612631 121223165 D3b 511 516 1 cat 81
22 21580600 30831199 D3c 517 520 1 cat 82
18 970800 10113453 D3d 521 525 -1 cat 83
22 17537649 17540897 D3e 523 523 0 cat 84 singleton_within
18 21779040 71611061 D3f 526 531 1 cat 85
15 89033388 89051654 D3g 528 528 0 cat 86 singleton_within
9 11905550 12700223 D4a 533 534 -1 cat 87
9 21067106 35747923 D4b 535 541 -1 cat 88
9 106463545 133317257 D4c 542 553 -1 cat 89
17 14250490 19675749 E1a 554 559 1 cat 90
17 1275985 9751740 E1b 560 564 -1 cat 91
17 26882361 32312773 E1c 565 571 1 cat 92
17 56764618 59679495 E1d 572 574 -1 cat 93
17 38082011 41186168 E1e 575 578 -1 cat 94
17 42293302 48595350 E1f 579 584 -1 cat 95
17 62335374 80354230 E1g 585 595 1 cat 96
19 59061575 62808963 E2a 596 598 1 cat 97
19 34374288 55108809 E2b 599 613 -1 cat 98
19 22153700 22155800 E2c 602 602 0 cat 99 singleton_within
16 47235351 89688677 E2d 614 624 1 cat 100
7 4840400 4840520 E3a 625 625 0 cat 101 singleton
7 99287685 101473628 E3b 626 630 1 cat 102
7 64823489 75669671 E3c 631 640 -1 cat 103
16 672593 29857874 E3d 642 649 -1 cat 104
1 164095901 182950099 F1a 650 657 1 cat 105
1 197794598 210818867 F1b 658 663 -1 cat 106
1 150738543 161580755 F1c 664 670 1 cat 107
1 186338948 186348198 F1d 671 671 0 cat 108 singleton
1 215123128 215219029 F1e 672 672 0 cat 109 singleton
8 53258204 134539163 F2a 673 691 1 cat 110
X 10607902 151592786 Xa 692 724 1 cat 111
21 14665471 34224306 1a 1 8 -1 cattle 1
3 76952785 126014321 1b 9 31 1 cattle 2
3 153834347 198918819 1c 32 47 -1 cattle 3
3 139944679 144060636 1d 48 52 1 cattle 4
3 131894446 137949578 1e 53 56 1 cattle 5
2 243071390 243071502 1f 57 57 0 cattle 6 singleton
21 36362691 45205395 1g 58 65 1 cattle 7
3 15288218 15374753 1h 62 62 0 cattle 8 singleton_within
3 131700576 131701180 1i 66 66 0 cattle 9 singleton
2 188294271 188294465 2a 67 67 0 cattle 10 singleton
2 191265044 191337954 2b 68 68 0 cattle 11 singleton
15 25823012 26034151 2c 69 69 0 cattle 12 singleton
2 127710541 131827274 2d 70 72 1 cattle 13
2 191033384 191148604 2e 73 73 0 cattle 14 singleton
15 20423754 20452813 2f 74 74 0 cattle 15 singleton
2 191172219 191200413 2g 75 75 0 cattle 16 singleton
2 143909211 183695912 2h 76 87 1 cattle 17
3 149697209 149742427 2i 82 82 0 cattle 18 singleton_within
2 114554351 121015788 2j 88 91 -1 cattle 19
2 135235659 135943001 2k 92 94 -1 cattle 20
2 191797784 209083854 2l 96 103 1 cattle 21
2 216140711 224431140 2m 104 113 -1 cattle 22
2 228300887 232637976 2n 114 119 -1 cattle 23
1 32828690 32856534 2o 116 116 0 cattle 24 singleton_within
1 15996102 32178408 2p 120 138 -1 cattle 25
1 58606121 164602310 3a 139 186 -1 cattle 26
1 224655903 224672451 3b 151 151 0 cattle 27 singleton_within
1 38920455 46148253 3c 189 194 1 cattle 28
2 234232819 242142249 3d 195 198 -1 cattle 29
1 35070436 36041724 3e 199 202 1 cattle 30
7 79362390 96967711 4a 203 215 -1 cattle 31
4 174837773 174838785 4b 208 208 0 cattle 32 singleton_within
7 23219402 23274348 4c 210 211 0 cattle 33 singleton_within
7 103313520 107193133 4d 216 220 1 cattle 34
7 111393372 116849416 4e 221 226 -1 cattle 35
7 12320540 12403564 4f 227 227 0 cattle 36 singleton
7 26039787 45667651 4g 228 238 1 cattle 37
7 55456517 55488695 4h 232 232 0 cattle 38 singleton_within
8 52071452 52071773 4i 239 239 0 cattle 39 singleton
7 121053887 151447936 4j 240 251 1 cattle 40
4 657092 657757 4k 243 243 0 cattle 41 singleton_within
12 80101134 93768986 5a 252 261 1 cattle 42
12 40589623 54689770 5b 262 272 -1 cattle 43
12 55795103 70543513 5c 273 283 -1 cattle 44
12 96596135 104099339 5d 284 288 1 cattle 45

22 31522195 35668902 5e 289 293 1 cattle 46
12 4262199 32950041 5f 292 307 -1 cattle 47
22 36413032 36413528 5g 298 298 0 cattle 48 singleton_within
14 43749441 43749643 5h 302 302 0 cattle 49 singleton_within
22 16644972 16685103 5i 308 308 0 cattle 50 singleton
22 36929737 49316132 5j 309 313 1 cattle 51
4 89113522 120943179 6a 314 328 -1 cattle 52
4 19943231 88978287 6b 329 357 1 cattle 53
4 4262159 5815903 6c 359 360 -1 cattle 54
4 6710544 17202205 6d 365 369 -1 cattle 55
1 224713326 224980083 7a 371 372 -1 cattle 56
19 13109089 19463359 7b 373 379 -1 cattle 57
19 2209523 5093605 7c 381 383 1 cattle 58
5 131427210 132144366 7d 384 386 1 cattle 59
19 737411 752327 7e 387 387 0 cattle 60 singleton
19 7879606 12837529 7f 388 395 0 cattle 61
5 122712498 122790149 7g 396 396 0 cattle 62 singleton
5 176747058 180306851 7h 397 400 1 cattle 63
1 245030235 245030340 7i 401 401 0 cattle 64 singleton
19 765097 1921309 7j 402 405 1 cattle 65
5 132242576 146444172 7k 406 415 1 cattle 66
5 149764362 161261955 7l 416 421 1 cattle 67
5 80664539 108558257 7m 422 431 1 cattle 68
8 9782861 11567741 8a 432 434 -1 cattle 69
8 27275794 28021411 8b 435 437 1 cattle 70
9 356238 32563350 8c 438 444 -1 cattle 71
9 64714529 72692449 8d 445 452 1 cattle 72
9 33094081 37856457 8e 453 459 -1 cattle 73
9 81194075 81194393 8f 460 460 0 cattle 74 singleton
8 19606071 22698277 8g 461 466 -1 cattle 75
9 84136387 110814313 8h 467 479 1 cattle 76
6 71327637 83026802 9a 480 485 1 cattle 77
6 117882140 127599865 9b 486 491 1 cattle 78
6 99848361 116447426 9c 493 500 -1 cattle 79
6 129139472 168047271 9d 501 517 1 cattle 80
14 51010044 51093480 10a 518 518 0 cattle 81 singleton
15 63482603 70139318 10b 519 522 1 cattle 82
14 18904698 22900668 10c 523 531 -1 cattle 83
15 36359442 41092619 10d 532 537 1 cattle 84
14 48565394 48573670 10e 538 538 0 cattle 85 singleton
15 46749230 62574353 10f 539 553 -1 cattle 86
14 51314031 84084304 10g 554 571 1 cattle 87
15 42582853 42589480 10h 572 572 0 cattle 88 singleton
2 95220395 111831391 11a 573 578 1 cattle 89
2 71516603 74400121 11b 579 583 -1 cattle 90
2 31515982 33582120 11c 584 586 1 cattle 91
2 74643151 74714596 11d 587 588 -1 cattle 92
2 37386420 51217330 11e 589 596 -1 cattle 93
2 61372309 67595383 11f 597 601 -1 cattle 94
2 79341841 85513118 11g 602 605 1 cattle 95
2 54711216 58344712 11h 606 609 -1 cattle 96
2 75677152 75816416 11i 610 610 0 cattle 97 singleton
2 85733916 86962867 11j 611 614 -1 cattle 98
2 113438308 113867305 11k 615 618 1 cattle 99
2 70465664 70466012 11l 619 619 0 cattle 100 singleton
2 20416154 31448970 11m 620 629 1 cattle 101
2 69926836 70274004 11n 630 631 1 cattle 102
2 9568707 10756219 11o 632 635 -1 cattle 103
9 117660930 133812198 11p 636 647 1 cattle 104
13 20858048 52250584 12a 648 660 -1 cattle 105
13 72266179 113064619 12b 661 673 1 cattle 106
10 33189173 33189252 12c 666 666 0 cattle 107 singleton_within
13 20823750 20824838 12d 670 670 0 cattle 108 singleton_within
10 33857582 36080601 13a 674 675 1 cattle 109
10 13470039 23561177 13b 677 681 -1 cattle 110
10 28113926 30174969 13c 682 685 1 cattle 111
20 17422551 23566574 13d 686 693 1 cattle 112
10 170643 5600634 13e 694 698 -1 cattle 113
20 2587041 5854003 13f 699 701 -1 cattle 114
20 55572444 63246709 13g 702 705 -1 cattle 115
20 336697 367193 13h 706 707 1 cattle 116
20 31478307 49194684 13i 708 725 1 cattle 117
20 16200750 16670419 13j 726 727 -1 cattle 118
20 8061296 10555474 13k 729 731 1 cattle 119
8 122294338 144938033 14a 732 741 -1 cattle 120
2 242637815 242639521 14b 742 742 0 cattle 121 singleton
8 53258203 59218602 14c 743 747 -1 cattle 122
8 59441162 74465892 14d 748 756 1 cattle 123
8 118201750 120105393 14e 757 758 -1 cattle 124
8 79311167 82118658 14f 759 761 -1 cattle 125
8 117325870 117447292 14g 762 762 0 cattle 126 singleton

8 101383937 103242039 14h 763 765 1 cattle 127
8 104079660 109800606 14i 766 769 0 cattle 128
8 96694899 96695260 14j 770 770 0 cattle 129 singleton
8 120411813 120513872 14k 771 772 0 cattle 130 singleton
8 82293375 86141302 14l 773 774 -1 cattle 131
11 94750718 104299321 15a 775 786 -1 cattle 132
11 107647966 122966757 15b 787 807 1 cattle 133
1 656945 657101 15c 800 800 0 cattle 134 singleton_within
11 3606922 18086368 15d 808 829 -1 cattle 135
11 71865223 76385891 15e 830 837 1 cattle 136
11 26370088 59884931 15f 838 867 1 cattle 137
1 659326 659491 16a 868 868 0 cattle 138 singleton
1 199519571 203961724 16b 869 875 1 cattle 139
1 189614633 189622181 16c 876 876 0 cattle 140 singleton
1 217415368 223312565 16d 877 883 1 cattle 141
1 237235402 237817065 16e 884 884 0 cattle 142 singleton
1 166355840 166559098 16f 885 887 -1 cattle 143
1 238368064 241073844 16g 888 890 -1 cattle 144
1 170027236 180080269 16h 891 895 1 cattle 145
1 193593332 198313571 16i 897 901 -1 cattle 146
1 204701027 211126688 16j 903 907 1 cattle 147
1 182785859 183169425 16k 908 909 -1 cattle 148
1 168061951 168143429 16l 910 910 0 cattle 149 singleton
1 1185659 15066499 16m 911 918 -1 cattle 150
4 123569220 156604414 17a 919 936 -1 cattle 151
4 157100884 160110729 17b 937 940 -1 cattle 152
12 108971835 132906549 17c 941 952 -1 cattle 153
22 23920379 30013052 17d 953 958 1 cattle 154
22 17274763 22451056 17e 959 968 -1 cattle 155
19 34374216 34380163 18a 969 969 0 cattle 156 singleton
16 69883602 70570481 18b 970 974 0 cattle 157
16 74224531 89509172 18c 975 986 1 cattle 158
16 46463883 69496123 18d 987 1017 1 cattle 159
16 71660730 72828038 18e 1018 1019 1 cattle 160
19 34773239 63627610 18f 1020 1072 1 cattle 161
17 58039142 59812060 19a 1073 1075 -1 cattle 162
17 27236434 33264971 19b 1076 1088 -1 cattle 163
17 962865 18056321 19c 1089 1107 1 cattle 164
5 176496367 176658350 19d 1105 1105 0 cattle 165 singleton_within
17 45950561 48905415 19e 1108 1114 -1 cattle 166
17 37473192 40823819 19f 1113 1122 1 cattle 167
5 162800120 162807563 19g 1123 1123 0 cattle 168 singleton
17 62245622 62956701 19h 1124 1126 1 cattle 169
17 64625694 81429122 19i 1127 1139 -1 cattle 170
5 168999836 169660774 20a 1140 1141 1 cattle 171
5 6747464 74052866 20b 1142 1158 -1 cattle 172
15 83879506 99785313 21a 1159 1165 1 cattle 173
15 27140673 29349843 21b 1166 1168 -1 cattle 174
15 72711998 80399292 21c 1169 1173 -1 cattle 175
14 29333759 37562478 21d 1174 1179 1 cattle 176
15 41244346 41673925 21e 1180 1183 1 cattle 177
14 89727710 103233692 21f 1184 1192 1 cattle 178
3 27264015 40167131 22a 1194 1201 1 cattle 179
19 23786424 23786909 22b 1202 1202 0 cattle 180 singleton
3 4526981 10160699 22c 1203 1209 -1 cattle 181
17 43206153 43206254 22d 1210 1210 0 cattle 182 singleton
3 42442574 42528308 22e 1211 1212 -1 cattle 183
3 56978865 71608550 22f 1213 1218 -1 cattle 184
3 128598894 130535798 22g 1219 1221 1 cattle 185
3 11286246 11296261 22h 1222 1222 0 cattle 186 singleton
3 45550087 50149543 22i 1223 1231 1 cattle 187
6 55621652 56893536 23a 1232 1235 1 cattle 188
6 32604636 53135692 23b 1236 1261 1 cattle 189
6 2938393 32560586 23c 1262 1286 -1 cattle 190
18 71705506 77632403 24a 1287 1291 1 cattle 191
18 61339916 61356094 24b 1292 1292 0 cattle 192 singleton
18 32803951 40737968 24c 1293 1296 -1 cattle 193
18 148520 901054 24d 1297 1301 1 cattle 194
18 24323969 28824451 24e 1302 1304 1 cattle 195
17 26646708 26647099 24f 1305 1305 0 cattle 196 singleton
18 44278910 48362259 24g 1306 1307 -1 cattle 197
18 7069996 13875520 24h 1308 1316 1 cattle 198
18 52829119 60432639 24i 1317 1321 1 cattle 199
16 1218338 11341619 25a 1322 1325 -1 cattle 200
9 79830829 79831533 25b 1327 1327 0 cattle 201 singleton
16 18721728 31200072 25c 1328 1344 1 cattle 202
7 55740033 55775620 25d 1336 1336 0 cattle 203 singleton_within
7 65068128 75845912 25e 1345 1355 1 cattle 204
7 97844204 100328878 25f 1356 1361 -1 cattle 205
7 2104812 6456942 25g 1362 1366 -1 cattle 206
10 53349423 58013421 26a 1367 1369 1 cattle 207

10 90403644 105084506 26b 1371 1383 1 cattle 208
5 80645742 80647195 26c 1374 1374 0 cattle 209 singleton_within
10 112184025 135121309 26d 1384 1396 1 cattle 210
10 51958757 51959007 26e 1397 1397 0 cattle 211 singleton
8 12716064 12909249 27a 1398 1398 0 cattle 212 singleton
4 186200082 188229437 27b 1399 1402 1 cattle 213
8 15740582 17751798 27c 1403 1406 1 cattle 214
8 30856388 32478311 27d 1407 1408 1 cattle 215
8 37697698 42492813 27e 1409 1413 -1 cattle 216
8 19070977 19269574 27f 1414 1414 0 cattle 217 singleton
3 21309788 21309938 27g 1415 1415 0 cattle 218 singleton
1 226676144 236369146 28a 1416 1419 1 cattle 219
10 38496809 43537563 28b 1420 1421 1 cattle 220
10 61680561 86150931 28c 1422 1428 1 cattle 221
10 44640810 51023128 28d 1429 1431 1 cattle 222
11 80850130 82834337 29a 1432 1435 1 cattle 223
11 83392819 88767869 29b 1436 1444 -1 cattle 224
11 77443084 79558251 29c 1445 1447 -1 cattle 225
11 18339499 23516713 29d 1448 1452 -1 cattle 226
11 124671025 134055118 29e 1453 1465 0 cattle 227
11 60909075 68708644 29f 1466 1489 1 cattle 228
11 945884 2972880 29g 1490 1493 -1 cattle 229
X 115836971 151591310 Xa 1494 1507 1 cattle 230
X 101064703 112920937 Xb 1508 1514 1 cattle 231
X 1081075 75082961 Xc 1515 1537 -1 cattle 232
18 49849424 75928348 1a 1 25 -1 dog 1
6 134636317 168415190 1b 26 47 1 dog 2
1 221714989 221715076 1c 31 31 0 dog 3 singleton_within
6 117205412 128158927 1d 48 56 1 dog 4
9 92830285 93252826 1e 57 58 -1 dog 5
9 64680154 83509089 1f 59 80 -1 dog 6
3 31617402 31617498 1g 62 62 0 dog 7 singleton_within
9 2793967 4985449 1h 81 83 1 dog 8
9 84197870 88325373 1i 84 86 1 dog 9
19 38430419 61241209 1j 87 101 -1 dog 10
4 41354762 41354810 1k 95 95 0 dog 11 singleton_within
10 35649939 35651375 2a 103 103 0 dog 12 singleton
10 19838598 27636075 2b 104 113 -1 dog 13
10 28756702 29768723 2c 114 117 1 dog 14
10 3919172 17303295 2d 118 128 -1 dog 15
5 139314332 147500024 2e 129 134 1 dog 16
5 54430782 72910625 2f 135 150 1 dog 17
16 50091358 57392278 2g 151 157 -1 dog 18
1 10694376 30604900 2h 158 173 -1 dog 19
16 48252736 48252827 2i 161 161 0 dog 20 singleton_within
17 69831823 71173431 2j 174 175 -1 dog 21
5 74693175 110655563 3a 176 192 -1 dog 22
19 1608275 1608318 3b 194 194 0 dog 23 singleton
15 25187362 27572896 3c 196 198 1 dog 24
15 78482259 99405817 3d 199 213 -1 dog 25
4 6636644 7205656 3e 214 215 -1 dog 26
4 11178133 16784991 3f 216 223 -1 dog 27
4 4843689 4843905 3g 224 224 0 dog 28 singleton
4 39163348 40800151 3h 225 229 -1 dog 29
4 27216787 29725563 3i 233 235 1 dog 30
4 18221948 25988681 3j 236 241 -1 dog 31
1 226090888 234291054 4a 242 250 -1 dog 32
10 62019327 67573743 4b 251 255 1 dog 33
10 83161758 88397950 4c 256 259 1 dog 34
5 158904033 176874405 4d 260 276 -1 dog 35
5 151618976 157149080 4e 277 281 1 dog 36
5 29154044 54244544 4f 283 312 -1 dog 37
5 17699137 25414507 4g 313 318 1 dog 38
11 102696000 130804158 5a 319 346 -1 dog 39
17 11482231 17061700 5b 347 350 1 dog 40
1 53734207 65993550 5c 351 363 -1 dog 41
1 1185924 8096359 5d 364 369 1 dog 42
16 75066442 89406653 5e 370 382 -1 dog 43
16 70099126 73722556 5f 383 387 1 dog 44
16 59729239 68604059 5g 388 391 -1 dog 45
7 64821028 64821377 6a 392 392 0 dog 46 singleton
7 67222876 71294252 6b 393 396 -1 dog 47
7 72795069 75652886 6c 398 400 0 dog 48
7 98503540 101165463 6d 401 406 -1 dog 49
7 3204167 5448203 6e 407 409 0 dog 50
16 20390050 30078874 6f 410 424 -1 dog 51
16 11928942 17938461 6g 425 428 1 dog 52
16 217613 7767297 6h 429 433 -1 dog 53
1 67528567 110593132 6i 434 462 -1 dog 54
1 193578479 198955520 7a 463 467 -1 dog 55
1 203584059 211436968 7b 468 475 1 dog 56