Week 42: Siamese Network: Architecture and Applications in Visual Object Tracking. Yuanwei Wu

Week 42: Siamese Network: Architecture and Applications in Visual Object Tracking Yuanwei Wu 10-21-2016 1

Outline: Siamese Architecture; Siamese Applications in Computer Vision; Paper review: Visual Object Tracking using Siamese CNN; Future Work 2

What does Siamese mean? Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 3

Siamese Architecture Source: Learning Hierarchies of Invariant Features. Yann LeCun. helper.ipam.ucla.edu/publications/gss2012/gss2012_10739.pdf 4

Siamese Architecture and loss function Source: Learning Hierarchies of Invariant Features. Yann LeCun. helper.ipam.ucla.edu/publications/gss2012/gss2012_10739.pdf 5
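
A Siamese architecture passes two inputs through the same network with shared weights and compares the resulting embeddings. The sketch below is only an illustration of that idea: the one-layer `embed` function, the margin value, and the contrastive form of the loss are assumptions chosen for brevity, not details taken from the slide.

```python
import numpy as np

def embed(x, W):
    """Shared embedding: one linear layer followed by ReLU (illustrative only)."""
    return np.maximum(W @ x, 0.0)

def contrastive_loss(x1, x2, same, W, margin=1.0):
    """Contrastive loss for one pair; `same` is 1 for a matching pair, else 0."""
    # Both inputs go through the SAME weights W: this is the Siamese part.
    d = np.linalg.norm(embed(x1, W) - embed(x2, W))
    # Matching pairs are pulled together, non-matching pairs pushed past the margin.
    return same * d**2 + (1 - same) * max(0.0, margin - d)**2

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # hypothetical shared weights
a = rng.standard_normal(8)
print(contrastive_loss(a, a, same=1, W=W))  # identical inputs: distance 0, loss 0
```

Sharing `W` between both branches is the design choice that matters here: the two inputs are mapped into the same space, so their distance is meaningful.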

Siamese Applications in Computer Vision: 1. Signature Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 6

Siamese Applications in Computer Vision: 2. Dimensionality Reduction Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 7

Siamese Applications in Computer Vision: 3.1 Learning Image Descriptors CNN Model Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 8

Siamese Applications in Computer Vision: 3.2 Learning Image Descriptors Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 9

Siamese Applications in Computer Vision: 4.1 Face Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 10

Siamese Applications in Computer Vision: 4.2 Face Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 11

Siamese Applications in Computer Vision: 4.3 Face Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 12

Siamese Applications in Computer Vision: 4.4 Face Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 13

Siamese Applications in Computer Vision: 4.5 Face Verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 14

Paper Review: Fully-Convolutional Siamese Networks for Object Tracking @article{bertinetto2016fully, title={Fully-Convolutional Siamese Networks for Object Tracking}, author={Bertinetto, Luca and Valmadre, Jack and Henriques, Jo{\~a}o F and Vedaldi, Andrea and Torr, Philip HS}, journal={arXiv preprint arXiv:1606.09549}, year={2016} } 15

Architecture of Siamese CNN Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 16

Details of the Architecture of Siamese CNN 1. 2. Cross-correlation layer Source: 1: Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012. 2: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 18
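
A cross-correlation layer can be sketched as sliding the exemplar embedding over the search embedding and taking an inner product at each position. The NumPy code below is a minimal, unoptimized illustration. The function name and the embedding sizes (6 x 6 x 128 and 22 x 22 x 128) are assumptions made for the example; the slide does not state them.

```python
import numpy as np

def cross_correlate(exemplar_feat, search_feat):
    """Slide the exemplar embedding over the search embedding (valid positions only).

    exemplar_feat: (h, w, c) array; search_feat: (H, W, c) array.
    Returns a score map of shape (H - h + 1, W - w + 1).
    """
    h, w, _ = exemplar_feat.shape
    H, W, _ = search_feat.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = search_feat[i:i + h, j:j + w, :]
            # Inner product over all spatial positions and channels.
            scores[i, j] = np.sum(window * exemplar_feat)
    return scores

rng = np.random.default_rng(0)
z = rng.standard_normal((6, 6, 128))     # assumed exemplar embedding size
x = rng.standard_normal((22, 22, 128))   # assumed search embedding size
score_map = cross_correlate(z, x)
print(score_map.shape)  # (17, 17)
```

A high value in the score map marks a position in the search image whose features resemble the exemplar.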

Training: dataset
ImageNet Video dataset of 2015: contains ~4000 videos with ~1 million annotated frames
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 19

Training: preprocessing on the images
Preprocessing: 2820 videos, exemplar image: 127 x 127, search image: 255 x 255
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 20

Training: recap the steps
ImageNet Video dataset of 2015: contains ~4000 videos with ~1 million annotated frames
Preprocessing: 2820 videos, exemplar image: 127 x 127, search image: 255 x 255
Training with a standard Stochastic Gradient Descent (SGD) solver using MatConvNet
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 21

Training: loss function
Discriminative training on positive and negative pairs, adopting the logistic loss.
The loss of a score map is the mean of the individual losses.
SGD is applied to find the conv-net parameters θ.
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 24
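
The two loss steps on this slide can be sketched in a few lines of NumPy. This assumes the usual logistic loss log(1 + exp(-y v)) with labels y in {+1, -1} and real-valued scores v, and takes the loss of a score map as the mean over its positions. The function names and the 2 x 2 example maps are hypothetical.

```python
import numpy as np

def logistic_loss(y, v):
    """Element-wise logistic loss log(1 + exp(-y * v)), labels y in {+1, -1}."""
    # logaddexp(0, t) computes log(1 + exp(t)) in a numerically stable way.
    return np.logaddexp(0.0, -y * v)

def score_map_loss(labels, scores):
    """Mean of the individual losses over all positions of one score map."""
    return float(np.mean(logistic_loss(labels, scores)))

labels = np.array([[1.0, -1.0], [-1.0, -1.0]])  # hypothetical 2x2 label map
scores = np.zeros((2, 2))                        # an uninformative score map
print(score_map_loss(labels, scores))            # log(2) at every position
```

In training, SGD would then minimize the expected value of `score_map_loss` over sampled exemplar/search pairs with respect to the network parameters.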

Tracking algorithm
Use a search image centered at the previous position of the target.
Only search for the object within a region of approximately four times its previous size.
A cosine window is added to the score map to penalize large displacements.
The position of the maximum score relative to the center of the score map, multiplied by the stride of the network, gives the displacement of the target from frame to frame.
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 28
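
The last two steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the stride of 8, the `window_influence` weight, the score normalization, and the function name are all assumed values and choices for the example, and any upsampling of the score map is omitted.

```python
import numpy as np

def locate_target(score_map, stride=8, window_influence=0.3):
    """Return the (dy, dx) displacement in pixels implied by one score map."""
    h, w = score_map.shape
    # Normalise scores to [0, 1] so they are comparable with the window.
    s = score_map - score_map.min()
    if s.max() > 0:
        s = s / s.max()
    # 2-D cosine (Hann) window: favours positions near the centre,
    # which penalises large displacements.
    window = np.outer(np.hanning(h), np.hanning(w))
    blended = (1.0 - window_influence) * s + window_influence * window
    row, col = np.unravel_index(np.argmax(blended), blended.shape)
    centre = np.array([(h - 1) / 2.0, (w - 1) / 2.0])
    # Offset from the centre, in score-map cells, times the network stride.
    return (np.array([row, col]) - centre) * stride

score = np.zeros((17, 17))
score[10, 6] = 1.0               # peak 2 cells below and 2 cells left of centre
print(locate_target(score))      # displacement of 16 pixels down, 16 pixels left
```

The new target position is the previous position plus this displacement.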

Experiments: training dataset size
Accuracy: calculated as the average Intersection-over-Union (IoU)
Robustness: measured as the total number of failures
Using a larger video dataset could increase the performance even further.
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 30
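
For reference, the accuracy measure can be sketched as below. The (x, y, width, height) box format and the function names are assumptions made for the example; benchmark toolkits may represent boxes differently.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def average_iou(predicted, ground_truth):
    """Accuracy as the mean IoU over the frames of a sequence."""
    return sum(iou(p, g) for p, g in zip(predicted, ground_truth)) / len(predicted)

print(iou((0, 0, 2, 2), (1, 1, 2, 2)))  # overlap 1, union 7
```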

Experiments: OTB13 benchmark results Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 31

Experiments: VOT15 benchmark results Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 32

Experiments: VOT15 benchmark results Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 33

Experiments: VOT15 benchmark results
Estimates the new position of the target object by merely cross-correlating the embeddings of two patches over three scales.
Achieves real-time performance and state-of-the-art results.
Source: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr, Fully-Convolutional Siamese Networks for Object Tracking, arXiv preprint, 2016. 34
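
Searching over three scales can be sketched as computing one score map per rescaled search patch and keeping the scale with the strongest peak. The scale factors, the `scale_penalty` damping, and the function name below are hypothetical values chosen for illustration, not figures from the slide.

```python
import numpy as np

def best_scale(score_maps, scales, scale_penalty=0.97):
    """Pick the scale whose score map has the highest (penalised) peak.

    score_maps: list of 2-D arrays, one per candidate scale.
    scales: list of scale factors, where 1.0 means "no size change".
    """
    peaks = []
    for score_map, scale in zip(score_maps, scales):
        peak = float(score_map.max())
        # Damp peaks that imply a size change, so the scale only changes
        # when the evidence is clearly stronger.
        if scale != 1.0:
            peak *= scale_penalty
        peaks.append(peak)
    return scales[int(np.argmax(peaks))]

scales = [0.96, 1.0, 1.04]                       # hypothetical scale factors
maps = [np.full((3, 3), 0.5), np.full((3, 3), 0.6), np.full((3, 3), 0.9)]
print(best_scale(maps, scales))                  # the largest scale wins here
```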

Future work: How to improve the performance?
By augmenting the online tracking pipeline:
    online model updating (e.g. tracking-by-detection)
    bounding-box regression (e.g. YOLO, Faster R-CNN)
    fine-tuning (e.g. correlation filters + CNN features)
    memory (e.g. adding an RNN or LSTM) 35

Source: Guanghan Ning, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, Haohong Wang, Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking, arXiv preprint, 2016. 36

Future work: How to improve the performance?
By augmenting the online tracking pipeline:
    online model updating (e.g. tracking-by-detection)
    bounding-box regression (e.g. YOLO, Faster R-CNN)
    fine-tuning (e.g. correlation filters + CNN features)
    memory (e.g. adding an RNN or LSTM)
By introducing a new architecture within the Siamese CNN framework, which requires a closer look at the structure of the networks (e.g. regression network, triplet network). 37

Triplet Network Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 38

Future work: How to improve the performance?
By augmenting the online tracking pipeline:
    online model updating (e.g. tracking-by-detection)
    bounding-box regression (e.g. YOLO, Faster R-CNN)
    fine-tuning (e.g. correlation filters + CNN features)
    memory (e.g. adding an RNN or LSTM)
By introducing a new architecture within the Siamese CNN framework, which requires a closer look at the structure of the networks (e.g. regression network, triplet network).
By introducing a new loss function in the Siamese network. 39

Loss function used in face verification Source: http://vision.ia.ac.cn/zh/senimar/reports/siamese-network-architecture-and-applications-in-computer-vision.pdf 40

Thank you! 41