Deep Learning Based Object Detection Basics
Detection As Regression?
Detection As Classification
CAT? NO
DOG? NO
Detection As Classification
CAT? YES
DOG? NO
Detection As Classification
CAT? NO
DOG? NO
Detection As Classification
CAT? NO
DOG? YES
From Classification To Detection
Classification Head:
● C+1 scores: C classes + 1 background class
Localization Head:
● Class agnostic: (x,y,w,h)
● Class specific: (x,y,w,h) × C, one box per class
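As a rough illustration of the two heads' output sizes, here is a minimal sketch. The function name `head_output_sizes` is hypothetical, and it assumes one box of four numbers (x, y, w, h) per prediction.

```python
def head_output_sizes(num_classes, class_specific=False):
    """Return (classification outputs, localization outputs) for C classes."""
    cls_outputs = num_classes + 1  # C classes + 1 background class
    # Class agnostic: a single box. Class specific: one box per class.
    box_outputs = 4 * num_classes if class_specific else 4
    return cls_outputs, box_outputs


print(head_output_sizes(20))                       # (21, 4)
print(head_output_sizes(20, class_specific=True))  # (21, 80)
```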
From Classification To Detection
● Training
○ Crop random regions from images.
○ Scale to uniform size.
○ Label each region according to its overlap with the ground-truth boxes.
○ Optimize using Stochastic Gradient Descent.
○ Handle class imbalance by resampling.
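The labeling and resampling steps above might look roughly like the sketch below. The 0.5 IoU threshold, the (x1, y1, x2, y2) box format, and the helper names `label_region` and `balanced_sample` are assumptions made for illustration, not details given on the slide.

```python
import random


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_region(region, ground_truth, threshold=0.5):
    """Class of the best-overlapping ground-truth box, else 'background'."""
    best_label, best_iou = "background", 0.0
    for gt_box, gt_label in ground_truth:
        overlap = iou(region, gt_box)
        if overlap > best_iou:
            best_label, best_iou = gt_label, overlap
    return best_label if best_iou >= threshold else "background"


def balanced_sample(labeled_regions, per_class, rng):
    """Resample so that every label contributes `per_class` examples."""
    by_label = {}
    for region, label in labeled_regions:
        by_label.setdefault(label, []).append(region)
    batch = []
    for label, regions in by_label.items():
        # Sampling with replacement lets rare classes be repeated.
        batch += [(rng.choice(regions), label) for _ in range(per_class)]
    return batch


ground_truth = [((10, 10, 50, 50), "cat")]
print(label_region((12, 12, 50, 50), ground_truth))  # cat
print(label_region((40, 40, 90, 90), ground_truth))  # background
```

Random crops are mostly background, so sampling a fixed number of regions per label is one simple way to keep the rare object classes from being swamped.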
● Detection
○ Slide a window over the image.
○ Crop regions.
○ Scale to uniform size.
○ Apply network to all cropped images.
○ Repeat process for different image scales.
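To give a feel for how many crops the detection loop produces, here is a small sketch that enumerates the windows. The window size, stride, and scale values are arbitrary assumptions, and `sliding_windows` is a hypothetical helper.

```python
def sliding_windows(width, height, window=64, stride=32, scales=(1.0, 0.5)):
    """Yield (scale, x1, y1, x2, y2) crop boxes in original-image coordinates."""
    for scale in scales:
        # Scanning a fixed window over the image resized by `scale` equals
        # scanning a window of size window / scale over the original image.
        w_scaled, h_scaled = int(width * scale), int(height * scale)
        for y in range(0, h_scaled - window + 1, stride):
            for x in range(0, w_scaled - window + 1, stride):
                yield (scale, x / scale, y / scale,
                       (x + window) / scale, (y + window) / scale)


boxes = list(sliding_windows(256, 256))
print(len(boxes))  # 58
```

Even this tiny 256×256 example with a coarse stride yields 58 crops, each of which needs its own forward pass. Finer strides and more scales grow that count quickly.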
How To Handle So Many Detections?
● Problem:
○ Running this algorithm at many locations and many scales results in many detections.
● Solution:
○ We need some way to suppress the weaker detections.
Non-Maximum Suppression (NMS)
● Start with most confident detection D.
● Measure IoU with all other detections.
● Remove detections whose IoU with D is above 50%.
● Repeat with next most confident detection.
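The steps above can be sketched as a greedy loop. This is a minimal illustration that assumes detections are (box, score) pairs with boxes given as (x1, y1, x2, y2). In practice NMS is often run separately per class, which the sketch ignores.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy non-maximum suppression over (box, score) pairs."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # most confident remaining detection D
        kept.append(best)
        # Drop every detection that overlaps D by more than the threshold.
        remaining = [d for d in remaining
                     if iou(best[0], d[0]) <= iou_threshold]
    return kept


detections = [
    ((10, 10, 50, 50), 0.9),
    ((12, 12, 52, 52), 0.8),      # overlaps the first box heavily
    ((100, 100, 140, 140), 0.7),  # a separate object
]
for box, score in nms(detections):
    print(box, score)
```

Here the second detection is suppressed by the first, while the third survives because it does not overlap either of them.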
From Classification To Detection
● Problem:
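○ The previous method is too slow.
○ The network is applied over and over to overlapping crops.
● Solution:
○ Sliding window is inherently efficient in the case of CNNs: convolutions share computation between overlapping windows, so the network can be run once over the whole image.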
● OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks (2013)
○ Pierre Sermanet, David Eigen, Xiang Zhang, Michaël Mathieu, Rob Fergus, Yann LeCun
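A toy 1-D example can show why convolutional sliding windows are cheap. The two-layer `tiny_net` below is a made-up stand-in, not OverFeat's network. It only demonstrates that running a convolutional network once over the whole input gives the same outputs as cropping every window and running it on each crop.

```python
def conv1d(signal, kernel):
    """Valid 1-D correlation of a list with a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def tiny_net(signal):
    """Two stacked convolutions with a ReLU between; receptive field is 5."""
    hidden = [max(0, v) for v in conv1d(signal, [1, -2, 1])]
    return conv1d(hidden, [1, 0, -1])


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

# Naive sliding window: crop every 5-sample window, run the network on each.
per_window = [tiny_net(signal[i:i + 5])[0] for i in range(len(signal) - 4)]

# Convolutional evaluation: run the network once over the whole signal.
dense = tiny_net(signal)

print(per_window == dense)  # True
```

The dense pass computes each hidden activation once, whereas the per-window version recomputes the activations shared by neighbouring windows.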
From Classification To Detection
From Detection To Classification
CNNs Are Still Too Slow
● Problem:
○ We need to test many positions and scales, and each test uses a computationally demanding classifier (a CNN).
● Solution:
○ Only look at a tiny subset of possible positions.
● Rich feature hierarchies for accurate object detection and semantic segmentation (2014)
○ AKA R-CNN
○ Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
Region Proposals
● Find “blobby” (blob-like) image regions that are likely to contain objects.
● Acts as a “class-agnostic” object detector.
Region Proposals: Selective Search
Region Proposals: Many Other Choices
R-CNN
R-CNN: Training
1. Train a classification model on a large dataset (ImageNet)
2. Fine-tune model for detection on a smaller dataset (Pascal)
○ Instead of 1000 ImageNet classes, now use 20 classes + background class.
○ Extract region proposals for all images.
○ Use positive / negative regions from detection images.
■ If proposal has >50% IoU with any ground truth → Positive example.
■ Otherwise → Negative example.
■ Batch = 32 positives + 96 negatives.
3. Train final classifiers
○ Extract region proposals for all images.
○ For each region: crop and warp to CNN size, run forward pass, save features to disk (requires ~200GB for the Pascal dataset).
○ Train one binary SVM per class to classify region features.
○ Train one linear regression model per class to predict bounding-box regression offsets.
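The slide does not spell out what the regression offsets are. The sketch below uses the parametrization described in the R-CNN paper: center shifts scaled by the proposal size, and log-scale width and height ratios. The (cx, cy, w, h) box format and both function names are assumptions made for this example.

```python
import math


def regression_targets(proposal, ground_truth):
    """Offsets that map a proposal onto its ground-truth box.

    Boxes are (cx, cy, w, h): center x, center y, width, height.
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return ((gx - px) / pw,      # horizontal shift, in proposal widths
            (gy - py) / ph,      # vertical shift, in proposal heights
            math.log(gw / pw),   # log of the width ratio
            math.log(gh / ph))   # log of the height ratio


def apply_offsets(proposal, offsets):
    """Invert regression_targets: move and resize a proposal."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = offsets
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))


proposal = (50.0, 50.0, 40.0, 20.0)
ground_truth = (54.0, 49.0, 80.0, 20.0)
targets = regression_targets(proposal, ground_truth)
print(targets)
print(apply_offsets(proposal, targets))  # approximately the ground-truth box
```

The linear regressors are trained to predict these targets from the region's CNN features. At test time the predicted offsets are applied to the proposal to refine its box.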
R-CNN: 2014’s State Of The Art
Looking for brilliant researchers
cv@brodmann17.com
