Conservation genomics of the highly endangered Red Siskin

Haw Chuan Lim, Dept of Vertebrate Zoology & Center for Conservation Genomics, Smithsonian Institution
Brian Coyle, Project Coordinator, Red Siskin Initiative, Smithsonian Institution
Paul Frandsen, Office of Chief Information Officer, Smithsonian Institution
Rebecca Dikow, Office of Chief Information Officer, Smithsonian Institution
Warren Johnson, Smithsonian Conservation Biology Institute, Smithsonian Institution
Michael Braun, Dept of Vertebrate Zoology, Smithsonian Institution
Red Siskin (Spinus cucullatus): male and female
Iconic Species
Threats
  Trapping and trafficking (Red Siskin x canary crosses produce the red factor canary)
  Natural hybridization (Hooded Siskin, Spinus magellanicus)
  Habitat destruction and degradation
Conservation Efforts
[Figure: exterior perspective of the Cardenalito Conservation Center Bararida, Parque Zoológico y Botánico Bararida, Barquisimeto, Venezuela; Ruhl Walker Architects, September 23, 2016]
Organization: Smithsonian leadership + many international partners
Project components:
  Field research (habitat modeling, ground truthing, GAP analysis, etc.)
  Captive breeding: the Cardenalito Conservation Center Bararida (SCBI, Bararida zoo)
  Shade coffee program: Smithsonian Bird Friendly certification
  Education and community-based conservation
  Genomic resources
  Mitigating threats
Genomic Research
  Demographic and divergence history (1687 bp mtDNA)
  Inform captive breeding and reintroduction programs: SCBI colony (Front Royal, VA)
  Adaptive changes related to habitat differences and domestication
  Habitats: Venezuela, cloud forest up to 1200 meters; Guyana, foothill/savannah ecotone
De novo Sequencing & Resequencing
De novo sequencing:
  12x PacBio data generated on the RS II
  100x Illumina paired-end data
Assembly: MaSuRCA (Zimin, A. et al. 2013)
  Uses Illumina reads to correct PacBio reads and create MegaReads, which are then assembled with the CABOG assembler (Celera Assembler with Best Overlap Graph)
Resequencing:
  4 individuals from Guyana + 5 from Venezuela
  Illumina PE reads, ~10x per individual
Genomics in a natural history institution
Challenges:
  Decentralization
  Lack of computational training
  Many different genomics interests (phylogenomics, metabarcoding/metagenomics, speciation, conservation, etc.)
  New IT needs (HPC, data management, etc.)
Hydra HPC at Herndon: 3300 CPUs, 2 x 1 TB nodes
Solutions:
  Create pan-institutional groups
  Provide computational training on core skills
  Create new IT initiatives to directly provide solutions to researchers (e.g., Galaxy, Amazon Web Services)
Genome Quality
Contig N50: 554 kb vs. Chicken: 45 kb (Sanger+BAC) & 12 kb (SOAPdenovo) (Ye et al. 2011)
Contig L50: 546 contigs (N50 = 554 kb)
% of zebra finch genome
Benchmarking Universal Single-Copy Orthologs (BUSCO):
  Category                  Red Siskin   Chicken
  Complete, single copy     92.5%        93.3%
  Complete and duplicated   1.9%         1.7%
  Fragmented                3.5%         2.9%
  Missing                   2%           2.1%
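The contig N50 and L50 statistics on this slide can be illustrated with a minimal sketch. It assumes the usual definitions: contigs are sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from the Red Siskin assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half = sum(ordered) / 2                   # half of the total assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50, 'count' is the L50
            return length, count


# Toy contig lengths (arbitrary units), for illustration only
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

A larger N50 together with a smaller L50 indicates a more contiguous assembly, which is the sense of the Red Siskin vs. Chicken comparison.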
Synteny Mapping
  Dot plot (Assemblytics): Red Siskin contigs aligned against the Zebra Finch reference genome
  Synteny mapping (SyMap v4)
  Smaller scaffolds due to lack of large jumping libraries and genetic mapping
Pairwise sequentially Markovian coalescent (PSMC) modeling
[Figure: PSMC plot; annotations: Pliocene, 700 ka, Last Glacial P]
  Map Illumina reads of the high-coverage individual back to the reference genome
  Identify heterozygous sites in 100 bp blocks
  Bootstrapping
  Generation time = 3 years; mutation rate = 3 x 10^-9 per site per year
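The "heterozygous sites in 100 bp blocks" step can be sketched as follows. This is a simplified illustration, not the pipeline used for the slide: it assumes heterozygous positions are already known as 0-based coordinates, it marks a block "K" if it contains at least one heterozygous site and "T" otherwise, and it ignores missing data (which PSMC input normally encodes separately). The function names are hypothetical. The second helper assumes the slide's mutation rate is per site per year, so that multiplying by the generation time gives a per-generation rate.

```python
def encode_blocks(seq_len, het_positions, block=100):
    """Encode a sequence as one symbol per block:
    'K' if the block holds at least one heterozygous site, 'T' otherwise."""
    n_blocks = (seq_len + block - 1) // block   # round up to cover the tail
    symbols = ["T"] * n_blocks
    for pos in het_positions:                   # 0-based coordinates
        if not 0 <= pos < seq_len:
            raise ValueError(f"position {pos} is outside the sequence")
        symbols[pos // block] = "K"
    return "".join(symbols)


def per_generation_rate(rate_per_year, generation_years):
    """Convert a per-site, per-year rate to a per-site, per-generation rate."""
    return rate_per_year * generation_years


# Toy example: a 500 bp sequence with four heterozygous sites
encoded = encode_blocks(500, [5, 17, 250, 499])
print(encoded)

# Values from the slide: 3 x 10^-9 per site per year, 3-year generations
print(per_generation_rate(3e-9, 3))
```

Two heterozygous sites in the same block collapse into a single "K", which is why the block size limits the resolution of the heterozygosity signal.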
Resequencing Data
  Mapped against the reference genome with bowtie2
  SNP discovery & genotype calling: GATK + stringent quality filtering; ANGSD (probabilistic)
  PCA based on 24 million probable SNPs, combined with pooled genomes of canary and domestic red siskin
[Figure: PCA plot; labels: Gibber, Wild Canary, Domestic Red Siskin, Venezuela, Guyana; image credits: philloflowers.com, en.wikipedia.org, aussiefinchbreeder.com]
Nucleotide Diversity
  Nucleotide diversity (π): 21,052 windows of 50 kb, in 50 kb steps
  Guyana: average π = 8.3 x 10^-4
  Venezuela: average π = 1.2 x 10^-3
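A windowed π calculation of this kind can be sketched as below. This is a minimal illustration under stated assumptions, not the tool used for the slide: variants are biallelic, every site in a window is assumed callable, per-site diversity is the unbiased estimate 2k(n-k) / (n(n-1)) for k alternate alleles among n sampled chromosomes, and windows without any variant are simply absent from the result. The function name and the toy variants are hypothetical.

```python
from collections import defaultdict


def windowed_pi(variants, n_chrom, window=50_000):
    """Return {window_start: pi} for non-overlapping windows.

    variants: iterable of (position, alt_allele_count), positions 0-based.
    n_chrom:  number of sampled chromosomes (2 x number of diploid birds).
    """
    if n_chrom < 2:
        raise ValueError("need at least two chromosomes")
    totals = defaultdict(float)
    for pos, alt in variants:
        if not 0 <= alt <= n_chrom:
            raise ValueError(f"bad allele count {alt} at position {pos}")
        start = (pos // window) * window
        # Unbiased per-site diversity (expected pairwise difference)
        totals[start] += 2 * alt * (n_chrom - alt) / (n_chrom * (n_chrom - 1))
    # Divide by window length, assuming all sites in the window are callable
    return {start: total / window for start, total in sorted(totals.items())}


# Toy example: 4 diploid birds (8 chromosomes), three variant sites
toy = [(100, 4), (200, 1), (60_000, 8)]
for start, pi in windowed_pi(toy, n_chrom=8).items():
    print(start, pi)
```

With a 50 kb step equal to the 50 kb window size, the windows do not overlap, which is why each variant falls into exactly one window here.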
Differentiation between Guyana and Venezuela
  Weir Fst: 21,224 windows of 50 kb, in 50 kb steps
  Average Fst = 0.221
  Comparison: inter-species and inter-population Fst across various taxa (Hey and Pinho 2011)
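To make the idea of a per-window Fst concrete, here is a small sketch. Note that it does not implement the Weir estimator named on the slide: it swaps in the simpler Hudson estimator (in the ratio-of-averages form), which is easier to show in a few lines. It assumes biallelic sites and alternate-allele counts for each population; the function name and the toy counts are hypothetical, and the chromosome numbers (8 and 10) simply mirror 4 and 5 diploid birds.

```python
def hudson_fst(sites, n1, n2):
    """Hudson's Fst for two populations, as a ratio of summed
    numerators and denominators over the sites of one window.

    sites: iterable of (alt_count_pop1, alt_count_pop2).
    n1, n2: number of sampled chromosomes in each population.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two chromosomes per population")
    num = den = 0.0
    for a1, a2 in sites:
        p1, p2 = a1 / n1, a2 / n2
        # Between-population difference, corrected for sampling variance
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ
        den += p1 * (1 - p2) + p2 * (1 - p1)
    if den == 0:
        raise ValueError("no informative sites in this window")
    return num / den


# Toy window: one fixed difference and two shared polymorphisms
toy_sites = [(8, 0), (4, 5), (2, 3)]
print(hudson_fst(toy_sites, n1=8, n2=10))
```

Estimates for individual windows can be slightly negative when allele frequencies are nearly identical; that is a property of the sampling correction, not an error.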
Next Steps
Estimate divergence and demographic parameters
  Site frequency spectrum
Design SNP probe sets for:
  Parentage analysis & genetic diversity/purity of captive populations
  Genotyping of historical samples
  Study of hybridization in the wild
Finish annotating genome
  Work Queue-MAKER v.2.31 on Amazon EC2, 10 workers
  Chicken as RepeatMasker & Augustus gene model species
Genes associated with habitat adaptation and domestication
Acknowledgements
Smithsonian Institution
  National Museum of Natural History: Mike Braun, Brian Coyle, Kathryn Clark-Rodriguez
  Conservation Biology Institute
  National Zoological Park: Brandie Smith
  Center for Cons and Ecol Genetics: Jesus Maldonado
  Center for Species Survival: Warren Lynch, Paul Marinari, Erica Royer
  Migratory Bird Center: Robert Rice, Scott Sillett
  Office of Chief Information Officer: Paul Frandsen, Rebecca Dikow
  Tropical Biology Institute: Sunshine VanBael
Funding
  Smithsonian Grand Challenges Consortium
  James Bond Fund
  NMNH Small Grants Program
www.redsiskin.org
Institutional Partners
  National Finch and Softbill Society, USA
  Ruhl Walker Architects, USA
  Instituto Venezolano de Investigaciones Cientificas
  Parque Zoologico y Botanico Bararida, VE
  Parque Zoologico El Pinar, VE
  Provita Conservation NGO, VE
  Universidad Central de Venezuela, VE
  South Rupununi Conservation Society, GY
  Environmental Protection Agency, GY
Collaborators
  American Bird Conservancy, USA
  Bird Life International
Sponsors
  Amazon Web Services
  Intel Corporation