[Boston March for Science 2017 photo Hendrik Strobelt]

[Boston March for Science 2017]

Object Detectors Emerge in Deep Scene CNNs
Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba
Massachusetts Institute of Technology
ICLR 2015

How Are Objects Represented in a CNN?
A CNN uses a distributed code to represent objects.
Layers shown: Conv1, Conv2, Conv3, Conv4, Pool5
References:
- Agrawal, et al. Analyzing the performance of multilayer neural networks for object recognition. ECCV, 2014.
- Szegedy, et al. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
- Zeiler, M. et al. Visualizing and Understanding Convolutional Networks. ECCV, 2014.

Estimating the Receptive Fields
Estimated receptive fields (panels: pool1, conv3, pool5)
The actual size of the RF is much smaller than the theoretical size.
Segmentation using the RF of units
More semantically meaningful
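The "theoretical size" mentioned above can be computed from kernel sizes and strides alone. A minimal sketch follows; the layer list is an assumed AlexNet-style configuration chosen for illustration and is not taken from the slides.

```python
def theoretical_receptive_fields(layers):
    """Return {name: (rf_size, jump)} for a chain of conv/pool layers.

    layers: list of (name, kernel_size, stride) tuples, ordered input to output.
    """
    rf, jump = 1, 1  # a single input pixel sees only itself
    out = {}
    for name, k, s in layers:
        rf = rf + (k - 1) * jump  # each extra kernel tap adds `jump` input pixels
        jump = jump * s           # spacing between neighbouring units, in input pixels
        out[name] = (rf, jump)
    return out


# Assumed AlexNet-style (name, kernel, stride) spec; hypothetical, for illustration only.
ALEXNET_LIKE = [
    ("conv1", 11, 4), ("pool1", 3, 2),
    ("conv2", 5, 1), ("pool2", 3, 2),
    ("conv3", 3, 1), ("conv4", 3, 1), ("conv5", 3, 1),
    ("pool5", 3, 2),
]

rfs = theoretical_receptive_fields(ALEXNET_LIKE)
for name, (rf, jump) in rfs.items():
    print(f"{name}: theoretical RF {rf}x{rf}, effective stride {jump}")
```

Under this assumed spec, pool1 sees 19x19 pixels and pool5 sees 195x195 pixels; these are the upper bounds that the empirically estimated receptive fields are compared against.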

Annotating the Semantics of Units Top-ranked segmented images are cropped and sent to Amazon Mechanical Turk for annotation.

Annotating the Semantics of Units Pool5, unit 76; Label: ocean; Type: scene; Precision: 93%

Annotating the Semantics of Units Pool5, unit 13; Label: Lamps; Type: object; Precision: 84%

Annotating the Semantics of Units Pool5, unit 77; Label: legs; Type: object part; Precision: 96%

Annotating the Semantics of Units Pool5, unit 112; Label: pool table; Type: object; Precision: 70%

Annotating the Semantics of Units Pool5, unit 22; Label: dinner table; Type: scene; Precision: 60%

Distribution of Semantic Types at Each Layer

Distribution of Semantic Types at Each Layer Object detectors emerge within CNN trained to classify scenes, without any object supervision!

ConvNets perform classification
< 1 millisecond; output: 1000-dim vector (e.g., "tabby cat"); end-to-end learning
[Slides from Long, Shelhamer, and Darrell]

R-CNN does detection
many seconds; example outputs: dog, cat
[Long et al.]

R-CNN: Region-based CNN
Figure: Girshick et al.

Fast R-CNN Multi-task loss RoI = Region of Interest Figure: Girshick et al.

Fast R-CNN
- Convolve the whole image into a feature map (many layers; abstracted)
- For each candidate RoI:
  - Squash feature map weights into a fixed-size RoI pool (adaptive subsampling!)
  - Divide the RoI into H x W subwindows, e.g., 7 x 7, and max pool
  - Learn classification on the RoI pool with its own fully connected layers (FCs)
  - Output classification (softmax) + bounds (regressor)
Figure: Girshick et al.
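The RoI pooling step can be sketched in NumPy as follows. This is an illustrative version under simple assumptions (integer RoI coordinates on the feature map, bins found by rounding outward), not the reference implementation; the function name and argument layout are hypothetical.

```python
import numpy as np


def roi_max_pool(feature_map, roi, out_h=7, out_w=7):
    """Max-pool one RoI of a (C, H, W) feature map into a (C, out_h, out_w) grid.

    roi = (y0, x0, y1, x1) in feature-map coordinates, end-exclusive.
    """
    y0, x0, y1, x1 = roi
    region = feature_map[:, y0:y1, x0:x1]
    c, h, w = region.shape
    # Bin edges scale with the RoI size, so every RoI maps to the same output size.
    ys = np.linspace(0, h, out_h + 1)
    xs = np.linspace(0, w, out_w + 1)
    pooled = np.empty((c, out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        ya = int(np.floor(ys[i]))
        yb = max(int(np.ceil(ys[i + 1])), ya + 1)  # every bin covers at least one row
        for j in range(out_w):
            xa = int(np.floor(xs[j]))
            xb = max(int(np.ceil(xs[j + 1])), xa + 1)  # and at least one column
            pooled[:, i, j] = region[:, ya:yb, xa:xb].max(axis=(1, 2))
    return pooled


fmap = np.arange(2 * 20 * 30, dtype=float).reshape(2, 20, 30)
pooled_big = roi_max_pool(fmap, (2, 3, 16, 24))   # 14 x 21 RoI
pooled_tiny = roi_max_pool(fmap, (0, 0, 3, 3))    # 3 x 3 RoI
print(pooled_big.shape, pooled_tiny.shape)
```

Both RoIs produce a 7 x 7 grid per channel, which is what lets the fully connected layers that follow accept regions of any size.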

What if we want pixels out?
- monocular depth estimation (Eigen & Fergus 2015)
- semantic segmentation
- optical flow (Fischer et al. 2015)
- boundary prediction (Xie & Tu 2015)
[Long et al.]

~1/10 second ??? end-to-end learning [Long et al.]

Fully Convolutional Networks for Semantic Segmentation
Jonathan Long*, Evan Shelhamer*, Trevor Darrell
UC Berkeley
[CVPR 2015] Slides from Long, Shelhamer, and Darrell

A classification network
Number of filters, e.g., 64. Number of perceptrons in MLP layer, e.g., 1024. Output: "tabby cat"
[Long et al.]

A classification network
Output: "tabby cat"
[Long et al.]

A classification network
Output: "tabby cat"
The responses of every kernel across all positions are attached densely to the array of perceptrons in the fully-connected layer.
[Long et al.]

A classification network
Output: "tabby cat"
The responses of every kernel across all positions are attached densely to the array of perceptrons in the fully-connected layer.
AlexNet: 256 filters over a 6x6 response map. Each of the 9,216 (256 x 6 x 6) responses is attached to every one of the 4096 perceptrons, leading to about 37 mil params.
[Long et al.]
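The AlexNet parameter count quoted above can be checked with a few lines of arithmetic. Variable names are illustrative, and biases are left out of the count.

```python
filters, height, width = 256, 6, 6   # final response map in the AlexNet example
perceptrons = 4096                   # units in the first fully-connected layer

responses = filters * height * width   # values feeding the fully-connected layer
weights = responses * perceptrons      # one weight per (response, perceptron) pair

print(responses)  # 9216
print(weights)    # 37748736
```

That is 9,216 responses and roughly 37.7 million weights in this one layer, which is where the "37 mil params" figure comes from.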

Problem
We want a label at every pixel.
The current network gives us a label for the whole image.
We want a matrix of labels.
Approach:
- Make a CNN for sub-image size.
- Convolutionalize all layers of the network, so that we can treat it as one (complex) filter and slide it around our full image.

Long, Shelhamer, and Darrell 2014

A classification network
Output: "tabby cat"
The responses of every kernel across all positions are attached densely to the array of perceptrons in the fully-connected layer.
AlexNet: 256 filters over a 6x6 response map. Each of the 9,216 (256 x 6 x 6) responses is attached to every one of the 4096 perceptrons, leading to about 37 mil params.
[Long et al.]

Convolutionalization
Number of filters (previous layer); number of filters (this layer)
A 1x1 convolution operates across all filters in the previous layer, and is slid across all positions.
[Long et al.]

Back to the fully-connected perceptron Perceptron is connected to every value in the previous layer (across all channels; 1 visible). [Long et al.]

100 1x1

Convolutionalization
# filters, e.g., 1024; # filters, e.g., 64
A 1x1 convolution operates across all filters in the previous layer, and is slid across all positions.
e.g., a 64x1x1 kernel, with shared weights over the 13x13 output, x 1024 filters = 11 mil params.
[Long et al.]

Becoming fully convolutional
Multiple outputs; arbitrary-sized image
When we turn these operations into a convolution, the 13x13 just becomes another parameter and our output size adjusts dynamically.
Now we have a vector/matrix output, and our network itself acts like a complex filter.
[Long et al.]
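A small NumPy sketch of convolutionalization: the weights of a fully-connected layer are reshaped into kernels that cover the whole fixed-size input, after which the same weights can slide over a larger input. Sizes are scaled-down stand-ins (8 channels and 5 units instead of 256 and 4096), and `conv2d_valid` is a hypothetical helper written for this example.

```python
import numpy as np


def conv2d_valid(x, kernels):
    """Plain 'valid' cross-correlation. x: (C, H, W); kernels: (F, C, kh, kw)."""
    f, c, kh, kw = kernels.shape
    _, h, w = x.shape
    out = np.empty((f, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw]
            # Dot every kernel with the patch over channel and spatial axes.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
C, H, W, UNITS = 8, 6, 6, 5  # small stand-ins for 256, 6, 6, 4096
fc_weights = rng.normal(size=(UNITS, C * H * W))

# Fixed-size input: the dense layer and the reshaped convolution agree.
x_small = rng.normal(size=(C, H, W))
fc_out = fc_weights @ x_small.reshape(-1)
kernels = fc_weights.reshape(UNITS, C, H, W)
conv_out = conv2d_valid(x_small, kernels)

# Larger input: same weights, but now a spatial map of outputs.
x_large = rng.normal(size=(C, 10, 9))
map_out = conv2d_valid(x_large, kernels)
print(conv_out.shape, map_out.shape)
```

On the 6x6 input the convolution yields a single 1x1 position that matches the dense layer's output; on the 10x9 input it yields a 5x4 map, one prediction per window position.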

Long, Shelhamer, and Darrell 2014

Upsampling the output
Some upsampling algorithm to return us to H x W
[Long et al.]

End-to-end, pixels-to-pixels network
[Long et al.]

End-to-end, pixels-to-pixels network
conv, pool, nonlinearity; upsampling; pixelwise output + loss
[Long et al.]

What is the upsampling layer?
This one. Hint: it's actually an upsampling _network_
[Long et al.]

Upsampling with convolution
Convolution
Transposed convolution = weighted kernel stamp
Often called deconvolution, but it is not the deconvolution that we previously saw in deblurring; that one is division in the Fourier domain.
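The "weighted kernel stamp" view of transposed convolution can be written out directly. This is a minimal single-channel sketch with a square kernel; in a network the kernel weights would be learned, and here they are fixed by hand for illustration.

```python
import numpy as np


def transposed_conv2d(x, kernel, stride):
    """Upsample a 2-D map by stamping a weighted copy of `kernel` per input value."""
    h, w = x.shape
    k = kernel.shape[0]  # square kernel assumed
    out = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    for i in range(h):
        for j in range(w):
            # Each input value scales the kernel; overlapping stamps add up.
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    return out


x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
ones = np.ones((2, 2))

up_no_overlap = transposed_conv2d(x, ones, stride=2)  # stamps sit side by side
up_overlap = transposed_conv2d(x, ones, stride=1)     # stamps overlap and sum
print(up_no_overlap)
print(up_overlap)
```

With stride 2 the 2x2 input becomes a 4x4 map of non-overlapping stamps; with stride 1 the stamps overlap, so the centre of the 3x3 output is the sum of all four inputs.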

Spectrum of deep features
Combine where (local, shallow) with what (global, deep)
Fuse features into deep jet (cf. Hariharan et al. CVPR15 "hypercolumn")
[Long et al.]

Learning upsampling kernels with skip layer refinement
interp + sum
interp + sum
End-to-end, joint learning of semantics and location
dense output
[Long et al.]

Skip layer refinement
Panels: input image; stride 32 (no skips); stride 16 (1 skip); stride 8 (2 skips); ground truth
[Long et al.]
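The "interp + sum" fusion can be sketched as follows. Nearest-neighbour upsampling is used here as an assumed stand-in for the interpolation step, the score maps are random placeholders, and the function names are hypothetical.

```python
import numpy as np


def upsample2x_nearest(scores):
    """Nearest-neighbour 2x upsampling of a (classes, H, W) score map (assumed stand-in)."""
    return scores.repeat(2, axis=1).repeat(2, axis=2)


def fuse_skip(coarse_scores, finer_scores):
    """Upsample the coarse prediction 2x and add the finer layer's prediction."""
    up = upsample2x_nearest(coarse_scores)
    assert up.shape == finer_scores.shape, "score maps must align after upsampling"
    return up + finer_scores


rng = np.random.default_rng(0)
n_classes = 3
s32 = rng.normal(size=(n_classes, 2, 2))  # stride-32 scores: coarse "what"
s16 = rng.normal(size=(n_classes, 4, 4))  # stride-16 scores
s8 = rng.normal(size=(n_classes, 8, 8))   # stride-8 scores: finer "where"

fused16 = fuse_skip(s32, s16)     # 1 skip
fused8 = fuse_skip(fused16, s8)   # 2 skips
labels = fused8.argmax(axis=0)    # per-location class at stride 8
print(fused8.shape, labels.shape)
```

Each fusion doubles the resolution of the prediction, which matches the progression from stride 32 to stride 16 to stride 8 in the comparison above.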

Results
Panels: FCN, SDS*, Truth, Input
Relative to prior state-of-the-art SDS:
- 30% relative improvement for mean IoU
- 286x faster
*Simultaneous Detection and Segmentation, Hariharan et al. ECCV14
[Long et al.]
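Mean IoU, the metric quoted in these results, can be computed for a single label map as in the sketch below. Benchmarks typically accumulate counts over a whole dataset; this per-image version with toy labels is a simplification for illustration.

```python
import numpy as np


def mean_iou(pred, truth, n_classes):
    """Mean intersection-over-union across classes present in pred or truth."""
    ious = []
    for c in range(n_classes):
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))


truth = np.array([[0, 0, 1],
                  [0, 1, 1],
                  [2, 2, 2]])
pred = np.array([[0, 0, 1],
                 [0, 0, 1],
                 [2, 2, 1]])

score = mean_iou(pred, truth, n_classes=3)
print(round(score, 4))
```

For this toy example the per-class IoUs are 3/4, 2/4 and 2/3. A "relative improvement" compares two such scores as (new - old) / old.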

What can we do with an FCN? Long, Shelhamer, and Darrell 2014

How much can an image tell about its geographic location?
6 million geo-tagged Flickr images
http://graphics.cs.cmu.edu/projects/im2gps/
im2gps (Hays & Efros, CVPR 2008)

Nearest Neighbors according to gist + bag of SIFT + color histogram + a few others

PlaNet - Photo Geolocation with Convolutional Neural Networks Tobias Weyand, Ilya Kostrikov, James Philbin ECCV 2016

Discretization of Globe
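Discretizing the globe turns geolocation into classification: each cell is a class, and a predicted class maps back to a location. The slides do not spell out the partitioning scheme, so the sketch below assumes a uniform latitude/longitude grid as a simplified stand-in; `cell_index`, `cell_center`, and the 10-degree cell size are hypothetical.

```python
def cell_index(lat, lon, cell_deg=10.0):
    """Map a latitude/longitude (degrees) to an integer class id on a uniform grid."""
    n_cols = int(360 / cell_deg)
    n_rows = int(180 / cell_deg)
    row = min(int((lat + 90.0) // cell_deg), n_rows - 1)   # clamp lat = 90
    col = min(int((lon + 180.0) // cell_deg), n_cols - 1)  # clamp lon = 180
    return row * n_cols + col


def cell_center(index, cell_deg=10.0):
    """Return the (lat, lon) centre of a grid cell, used as the predicted location."""
    n_cols = int(360 / cell_deg)
    row, col = divmod(index, n_cols)
    return (-90.0 + (row + 0.5) * cell_deg, -180.0 + (col + 0.5) * cell_deg)


boston = cell_index(42.36, -71.06)  # approximate coordinates of Boston
print(boston, cell_center(boston))
```

With 10-degree cells there are 648 classes, and a photo taken in Boston falls into the cell centred at (45.0, -75.0). Smaller cells give finer predictions but leave fewer training photos per class.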

Network and Training
- Network architecture: Inception with 97M parameters
- 26,263 categories (places in the world)
- 126 million Web photos
- 2.5 months of training on 200 CPU cores

PlaNet vs im2gps (2008, 2009)

Spatial support for decision

PlaNet vs Humans

PlaNet vs. Humans

PlaNet summary
- Very fast geolocalization method by categorization.
- Uses far more training data than previous work (im2gps).
- Better than humans!