Aerial Object Detection
HyeongJun Kwon
2019-2
Contents
1. EAST: An Efficient and Accurate Scene Text Detector
2. Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery
EAST: An Efficient and Accurate Scene Text Detector
Network Overview Pipeline
[Figure: pipeline overview. Input image -> multi-channel FCN -> multi-oriented, task-wise boxes]
Main Contributions:
1. A two-stage method: an FCN stage followed by an NMS merging stage
2. A flexible pipeline: predictions can be rotated boxes (RBOX) or quadrangles (QUAD)
Related work: PVANet
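Pipeline
[Figure: network pipeline. input -> conv stages -> merging1, merging2, merging3 -> output]
To reduce computation cost, the network merges feature maps gradually in a U-shape instead of merging all feature maps at once, as the HyperNet-style design in PVANet does.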
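The gradual U-shaped merging can be sketched as follows. This is a minimal NumPy illustration, not the network from the paper: the helper names (`unpool2x`, `conv1x1`, `merge_u_shape`), the random weights, and the channel counts are all assumptions, and the 3x3 convolutions of a real merging branch are left out. It shows only the idea that each step upsamples the coarser map, concatenates it with the next finer map, and reduces the channels before moving on.

```python
import numpy as np

def unpool2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, out_channels, rng):
    """1x1 convolution with random weights (stand-in for a learned layer)."""
    w = rng.standard_normal((out_channels, x.shape[0]))
    return np.einsum("oc,chw->ohw", w, x)

def merge_u_shape(features, out_channels, seed=0):
    """Merge feature maps gradually, coarsest first.

    features: list of (C, H, W) arrays, each 2x finer than the previous one.
    out_channels: channels kept after each merge (hypothetical values).
    """
    rng = np.random.default_rng(seed)
    h = features[0]
    for f, c in zip(features[1:], out_channels):
        g = unpool2x(h)                        # bring coarse map to finer size
        merged = np.concatenate([g, f], axis=0)  # stack along channels
        h = conv1x1(merged, c, rng)            # cut channels before next merge
    return h

# Usage: four maps at strides 32, 16, 8, 4 of a 64x64 input (assumed sizes).
rng = np.random.default_rng(1)
feats = [rng.standard_normal((8, 2, 2)), rng.standard_normal((6, 4, 4)),
         rng.standard_normal((4, 8, 8)), rng.standard_normal((2, 16, 16))]
out = merge_u_shape(feats, out_channels=(5, 4, 3))
print(out.shape)
```

Because channels are reduced after every merge, the largest feature maps only ever carry a small number of channels, which is where the saving over merging everything at once comes from.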
Label Generation
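Label Generation: Score Map Generation
Score map generation equation:
$$r_i = \min\bigl(D(p_i, p_{(i \bmod 4)+1}),\; D(p_i, p_{((i+2) \bmod 4)+1})\bigr)$$
where $D(p_i, p_j)$ is the L2 distance between vertices $p_i$ and $p_j$ of the quadrangle.
For each edge $\langle p_i, p_{(i \bmod 4)+1} \rangle$, we shrink it by moving its two endpoints inward along the edge by $0.3\,r_i$ and $0.3\,r_{(i \bmod 4)+1}$, respectively.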
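A small sketch of the reference length and the shrinking step, assuming the quadrangle is given as four vertices in order. The function names are hypothetical, and one simplification is made: all four edges are shrunk at once from the original vertex positions, without imposing any order on the edges. For a rectangle the order makes no difference.

```python
import numpy as np

def reference_lengths(quad):
    """r_i = min distance from vertex i to its two neighbouring vertices.

    quad: (4, 2) array of vertices p_1..p_4 in order (0-based here).
    """
    quad = np.asarray(quad, dtype=float)
    r = np.empty(4)
    for i in range(4):
        nxt = quad[(i + 1) % 4]  # p_{(i mod 4)+1} in the 1-based notation
        prv = quad[(i + 3) % 4]  # p_{((i+2) mod 4)+1} in the 1-based notation
        r[i] = min(np.linalg.norm(quad[i] - nxt), np.linalg.norm(quad[i] - prv))
    return r

def shrink_quad(quad, ratio=0.3):
    """Move every vertex inward along both of its edges by ratio * r_i."""
    quad = np.asarray(quad, dtype=float)
    r = reference_lengths(quad)
    out = quad.copy()
    for i in range(4):
        for j in ((i + 1) % 4, (i + 3) % 4):
            edge = quad[j] - quad[i]
            out[i] += ratio * r[i] * edge / np.linalg.norm(edge)
    return out

rect = np.array([[0, 0], [10, 0], [10, 4], [0, 4]])
shrunk = shrink_quad(rect)
print(shrunk)
```

For the 10 x 4 rectangle every reference length is 4, so each vertex moves 1.2 units along both of its edges.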
Label Generation: Loss
$$L = L_s + \lambda_g L_g$$
$L_s$: loss for the score map
$L_g$: loss for the geometry
Here $\hat{Y} = F_s$ is the prediction of the score map, and $Y^*$ is the ground truth.
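The reference list names balanced cross-entropy as the score-map loss. In its usual class-balanced form it reads $L_s = -\beta\, Y^* \log \hat{Y} - (1-\beta)(1 - Y^*) \log(1 - \hat{Y})$, with $\beta$ equal to the fraction of negative pixels in $Y^*$. The sketch below is one way to compute it; the function name, the clipping constant `eps`, and the averaging over pixels are assumptions made for the example.

```python
import numpy as np

def balanced_cross_entropy(y_pred, y_true, eps=1e-7):
    """Class-balanced cross-entropy between a predicted and a true score map.

    y_pred: predicted probabilities in [0, 1], same shape as y_true.
    y_true: binary ground-truth score map.
    """
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    beta = 1.0 - y_true.mean()                # fraction of negative pixels
    per_pixel = (-beta * y_true * np.log(y_pred)
                 - (1.0 - beta) * (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_pixel.mean()                   # averaging is an assumption

y_true = np.array([[1.0, 0.0], [0.0, 0.0]])
good = balanced_cross_entropy(y_true, y_true)
bad = balanced_cross_entropy(1.0 - y_true, y_true)
print(good, bad)
```

Since text pixels are rare, $\beta$ is close to 1 and the few positive pixels receive a large weight, which keeps the loss from being dominated by background.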
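Label Generation: Loss
RBOX: $$L_g = L_{\mathrm{AABB}} + \lambda_\theta L_\theta$$
QUAD: $$L_g = L_{\mathrm{QUAD}}(\hat{Q}, Q^*) = \min_{\tilde{Q} \in P_{Q^*}} \sum_{c_i \in C_{Q},\; \tilde{c}_i \in C_{\tilde{Q}}} \frac{\mathrm{smoothed}_{L1}(c_i - \tilde{c}_i)}{8 \times N_{Q^*}}$$
where $P_{Q^*}$ is the set of quadrangles equivalent to $Q^*$ under different vertex orderings and $N_{Q^*}$ is the shortest edge length of $Q^*$.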
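The QUAD loss can be sketched directly from the formula. Two assumptions are made here: the set of equivalent quadrangles is taken to be the four cyclic re-orderings of the ground-truth vertices, and the smoothed L1 function uses the common threshold of 1. The function names are hypothetical.

```python
import numpy as np

def smooth_l1(x):
    """Smoothed L1: quadratic near zero, linear elsewhere (threshold 1 assumed)."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def quad_loss(q_pred, q_true):
    """Scale-normalized smoothed-L1 loss between two quadrangles.

    q_pred, q_true: (4, 2) arrays of vertex coordinates.
    """
    q_pred = np.asarray(q_pred, dtype=float)
    q_true = np.asarray(q_true, dtype=float)
    # N_Q*: shortest edge length of the ground-truth quadrangle.
    edges = np.linalg.norm(q_true - np.roll(q_true, -1, axis=0), axis=1)
    n_q = edges.min()
    # Minimum over the cyclic vertex orderings of the ground truth.
    return min(
        smooth_l1(q_pred - np.roll(q_true, s, axis=0)).sum() / (8.0 * n_q)
        for s in range(4)
    )

q_true = np.array([[0, 0], [4, 0], [4, 2], [0, 2]], dtype=float)
same_shape = quad_loss(np.roll(q_true, 1, axis=0), q_true)  # re-ordered copy
shifted = quad_loss(q_true + 0.5, q_true)                   # offset by 0.5
print(same_shape, shifted)
```

The minimum over orderings makes the loss indifferent to which vertex the prediction lists first, and dividing by $8 \times N_{Q^*}$ keeps large and small quadrangles on a comparable scale.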
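Locality-Aware NMS
Problem: A naïve NMS algorithm runs in $O(n^2)$, where $n$ is the number of candidate geometries.
Observation: The geometries from nearby pixels tend to be highly correlated.
Solution: locality-aware NMS, which merges geometries row by row before running standard NMS on what remains.
If $a = \mathrm{WEIGHTEDMERGE}(g, p)$, then $a_i = V(g)\,g_i + V(p)\,p_i$ and $V(a) = V(g) + V(p)$, where $a_i$ is the $i$-th coordinate of $a$ and $V(a)$ is the score of geometry $a$.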
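A minimal sketch of locality-aware NMS under simplifying assumptions: geometries are axis-aligned boxes `[x1, y1, x2, y2, score]` instead of rotated boxes or quadrangles, they arrive in row-major pixel order, and the merged coordinates are divided by the summed score so they stay in image coordinates (a normalization that the slide formula does not show). All function names and the threshold are illustrative.

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes [x1, y1, x2, y2, ...]."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def weighted_merge(g, p):
    """Score-weighted average of coordinates; scores are summed."""
    score = g[4] + p[4]
    coords = (g[4] * g[:4] + p[4] * p[:4]) / score  # normalization assumed
    return np.append(coords, score)

def standard_nms(boxes, thresh):
    """Plain greedy NMS, highest score first."""
    keep = []
    for b in sorted(boxes, key=lambda x: x[4], reverse=True):
        if all(iou(b, k) <= thresh for k in keep):
            keep.append(b)
    return keep

def locality_aware_nms(boxes, thresh=0.3):
    """Merge each box with the last merged one if they overlap, then run NMS."""
    merged, last = [], None
    for g in boxes:
        g = np.asarray(g, dtype=float)
        if last is not None and iou(g, last) > thresh:
            last = weighted_merge(g, last)
        else:
            if last is not None:
                merged.append(last)
            last = g
    if last is not None:
        merged.append(last)
    return standard_nms(merged, thresh)

boxes = [[0, 0, 10, 10, 0.9], [1, 0, 11, 10, 0.8], [50, 50, 60, 60, 0.7]]
result = locality_aware_nms(boxes)
print(result)
```

Because neighbouring pixels usually predict almost the same geometry, the single pass collapses most candidates before the quadratic NMS step runs, so that step sees far fewer boxes.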
Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery
Network Overview
Main Contributions:
1. A new joint image cascade and feature pyramid network (ICN and FPN)
2. A deformable inception network (DIN) module designed as a domain adaptation module
3. A new loss function that shapes detections into rectangles by constraining the angles between the edges to 90 degrees
ICN, FPN and Deformable Inception Subnetworks
• Appropriate weight sharing
• Images are resized by bilinear interpolation
ICN
• Low-level semantic features from high resolution
• High-level semantic features from low resolution
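The bilinear resizing step of an image cascade can be sketched in plain NumPy. The scale factors, the function names, and the align-corners sampling convention are assumptions made for the example and are not taken from the slides.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize a (H, W) or (H, W, C) array with bilinear interpolation."""
    in_h, in_w = img.shape[:2]
    # Sample positions in input coordinates (corners map to corners).
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0).reshape(-1, 1)
    wx = (xs - x0).reshape(1, -1)
    if img.ndim == 3:  # broadcast the weights over the channel axis
        wy, wx = wy[..., None], wx[..., None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def image_cascade(img, scales=(0.5, 1.0, 2.0)):
    """Return the image resized to each scale (scales are hypothetical)."""
    h, w = img.shape[:2]
    return [bilinear_resize(img, int(round(h * s)), int(round(w * s)))
            for s in scales]

img = np.arange(16.0).reshape(4, 4)
cascade = image_cascade(img)
print([level.shape for level in cascade])
```

Each cascade level would then be fed through the same backbone, so that one set of shared weights sees the scene at several resolutions.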
R-RPN
Characteristics:
1. No distinction is made between the front and back of objects
2. Anchors are initialized by dimension clustering, as in YOLO v2
3. Smooth $\ell_1$ loss is used to regress the four coordinates
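Dimension clustering in the YOLO v2 sense is k-means over ground-truth box sizes with $1 - \mathrm{IoU}$ as the distance. The sketch below illustrates that idea on plain (width, height) pairs; it ignores rotation, and the function names, the iteration cap, and the toy data are assumptions.

```python
import numpy as np

def wh_iou(boxes, centroids):
    """IoU between (N, 2) box sizes and (K, 2) centroid sizes.

    Boxes are compared as if they shared the same top-left corner,
    so only width and height matter.
    """
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0])
             * np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centroids[:, 0] * centroids[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def dimension_clusters(boxes, k, iters=50, seed=0):
    """k-means on box sizes with distance d = 1 - IoU."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(1.0 - wh_iou(boxes, centroids), axis=1)
        new = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

sizes = [[10, 10], [11, 10], [10, 11], [50, 20], [52, 21], [51, 19]]
anchors = dimension_clusters(sizes, k=2)
anchors = anchors[np.argsort(anchors[:, 0])]  # sort by width for display
print(anchors)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, since the distance depends on relative overlap and not on absolute size.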
R-ROI
Characteristics:
1. Angles that are not 90 degrees are penalized
2. Anchors are initialized by dimension clustering, as in YOLO v2
3. Smooth $\ell_1$ loss is used to regress the four coordinates
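One way to penalize corners that deviate from 90 degrees is to use the squared cosine of each interior angle, which is zero exactly when the two edges meeting at the corner are perpendicular. This is an illustrative stand-in and not necessarily the formulation used in the paper; the function name and the averaging over corners are assumptions.

```python
import numpy as np

def right_angle_penalty(quad):
    """Mean squared cosine of the four interior angles of a quadrangle.

    quad: (4, 2) array of corner coordinates in order.
    Returns 0 for any rectangle, regardless of rotation.
    """
    quad = np.asarray(quad, dtype=float)
    penalty = 0.0
    for i in range(4):
        e_prev = quad[i - 1] - quad[i]        # edge to the previous corner
        e_next = quad[(i + 1) % 4] - quad[i]  # edge to the next corner
        cos = np.dot(e_prev, e_next) / (
            np.linalg.norm(e_prev) * np.linalg.norm(e_next))
        penalty += cos ** 2
    return penalty / 4.0

rectangle = [[0, 0], [4, 0], [4, 2], [0, 2]]
parallelogram = [[0, 0], [4, 0], [5, 2], [1, 2]]
print(right_angle_penalty(rectangle), right_angle_penalty(parallelogram))
```

A term like this can be added to the corner regression loss so that four independently regressed corners are pulled toward a valid rectangle.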
Objective
Result
Reference
EAST:
PVANET: Deep but lightweight neural networks for real-time object detection.
Balanced cross-entropy:
Holistically-nested edge detection.
Scene text detection via holistic, multi-channel prediction.
U-shape: U-Net: Convolutional networks for biomedical image segmentation.
Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery:
Soft-NMS: Improving object detection with one line of code.
IoU distance: YOLO9000: Better, faster, stronger.
DIN: Deformable convolutional networks.

Editor's Notes

  • #17 to #25: The deformable convolution inside DIN helps apply geometric transformations, and its offset regression property additionally helps localize objects that lie outside the kernel.