Structured Regression for
            Efficient Object Detection

                        Christoph Lampert
                   www.christoph-lampert.org

            Max Planck Institute for Biological Cybernetics, Tübingen


                          December 3rd, 2009




• [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008]
• [Matthew B. Blaschko, C.L. ECCV 2008]
• [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]
Category-Level Object Localization




             What objects are present? person, car
Category-Level Object Localization




                    Where are the objects?
Object Localization ⇒ Scene Interpretation




    A man inside of a car       A man outside of a car
    ⇒ He’s driving.             ⇒ He’s passing by.
Algorithmic Approach: Sliding Window




       f(y1) = 0.2         f(y2) = 0.8        f(y3) = 1.5

   Use a (pre-trained) classifier function f :
     • Place candidate window on the image.
     • Iterate:
            Evaluate f and store result.
            Shift candidate window by k pixels.
     • Return position where f was largest.
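The loop just described can be sketched in a few lines (a minimal single-scale sketch; the scoring function `f`, window size, and stride are illustrative placeholders):

```python
def sliding_window(image_width, image_height, f, window=(64, 64), stride=8):
    """Exhaustive single-scale sliding window: score every grid position
    and return the highest-scoring box (l, t, r, b)."""
    w, h = window
    best_score, best_box = float("-inf"), None
    for top in range(0, image_height - h + 1, stride):
        for left in range(0, image_width - w + 1, stride):
            box = (left, top, left + w, top + h)
            score = f(box)                 # evaluate the pre-trained classifier
            if score > best_score:
                best_score, best_box = score, box
    return best_box, best_score
```

Scanning additional scales and aspect ratios repeats the whole double loop, which is exactly the cost discussed next.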
Algorithmic Approach: Sliding Window




       f(y1) = 0.2       f(y2) = 0.8      f(y3) = 1.5

   Drawbacks:
     • single scale, single aspect ratio
       → repeat with different window sizes/shapes
     • search on grid
       → speed–accuracy tradeoff
     • computationally expensive
New view: Generalized Sliding Window




   Assumptions:
     • Objects are rectangular image regions of arbitrary size.
     • The score of f is largest at the correct object position.

   Mathematical Formulation:

                       yopt = argmax_{y ∈ Y} f(y)


   with Y = {all rectangular regions in the image}
New view: Generalized Sliding Window




   Mathematical Formulation:

                        yopt = argmax_{y ∈ Y} f(y)


   with Y = {all rectangular regions in the image}

   • How to choose/construct/learn the function f?
   • How to do the optimization efficiently and robustly?
     (exhaustive search is too slow: Y has O(w²h²) elements)
New view: Generalized Sliding Window


   Use the problem’s geometric structure:
                                             • Calculate scores for
                                               sets of boxes jointly.

                                             • If no element can
                                               contain the maximum,
                                               discard the box set.

                                             • Otherwise, split the
                                               box set and iterate.

                                            → Branch-and-bound
                                              optimization
   • finds global maximum yopt
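The scheme above can be written as a generic best-first branch-and-bound loop; `upper_bound`, `split`, and `is_single` are problem-specific callbacks (hypothetical names), and a priority queue always expands the candidate set with the largest bound:

```python
import heapq
from itertools import count

def branch_and_bound(root, upper_bound, split, is_single):
    """Best-first branch-and-bound: returns the global maximizer of f,
    provided upper_bound(S) >= max of f over S, with equality on singletons."""
    tie = count()                        # tie-breaker so states never compare
    heap = [(-upper_bound(root), next(tie), root)]
    while heap:
        neg_bound, _, state = heapq.heappop(heap)
        if is_single(state):
            return state, -neg_bound     # bound is exact here: global maximum
        for part in split(state):
            heapq.heappush(heap, (-upper_bound(part), next(tie), part))
```

The same loop works for any state representation; the box-set version only has to supply the three callbacks.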
Representing Sets of Boxes

     • Boxes: [l, t, r, b] ∈ R⁴. Box sets: [L, T, R, B] ∈ (R²)⁴, i.e. each coordinate is an interval of values.




   Splitting:
     • Identify largest interval. Split at center: R → R1 ∪R2 .
     • New box sets: [L, T, R1 , B] and [L, T, R2 , B].
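A sketch of this representation in code, with each coordinate stored as an inclusive integer interval `(lo, hi)`:

```python
def split_boxset(boxset):
    """Split a box set [L, T, R, B], each coordinate an interval (lo, hi),
    at the center of its largest interval."""
    widths = [hi - lo for (lo, hi) in boxset]
    k = widths.index(max(widths))        # coordinate with the widest interval
    lo, hi = boxset[k]
    mid = (lo + hi) // 2
    first, second = list(boxset), list(boxset)
    first[k] = (lo, mid)                 # e.g. R -> R1
    second[k] = (mid + 1, hi)            # e.g. R -> R2
    return tuple(first), tuple(second)
```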
Calculating Scores for Box Sets

   Example: Linear Support Vector Machine, f(y) := Σ_{pi ∈ y} wi.

        f_upper(Y) = Σ_{pi ∈ y∩} min(0, wi) + Σ_{pi ∈ y∪} max(0, wi)

   where y∩ is the intersection and y∪ the union of all boxes in the set Y.

   Can be computed in O(1) using integral images.
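A possible NumPy sketch of this bound. `weights` is the per-pixel SVM weight image; `y_inter` and `y_union` are the intersection and union boxes of the box set, in inclusive pixel coordinates (all names are illustrative):

```python
import numpy as np

def integral_image(a):
    """2-D prefix sums padded with a leading zero row/column,
    so any box sum is four lookups."""
    return np.pad(a.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))

def box_sum(ii, box):
    l, t, r, b = box                     # inclusive pixel coordinates
    return ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]

def svm_upper_bound(weights, y_inter, y_union):
    """f_upper(Y): negative weights summed over the intersection box y∩,
    positive weights summed over the union box y∪."""
    ii_pos = integral_image(np.maximum(weights, 0))
    ii_neg = integral_image(np.minimum(weights, 0))
    return box_sum(ii_neg, y_inter) + box_sum(ii_pos, y_union)
```

In practice the two integral images are computed once per image, so each bound evaluation is indeed O(1).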
Calculating Scores for Box Sets
   Histogram Intersection Similarity: f(y) := Σ_{j=1..J} min(hj, hj(y)).

        f_upper(Y) = Σ_{j=1..J} min(hj, hj(y∪))

   where h is the query histogram, h(y) the histogram of box y, and y∪ the
   union of all boxes in Y.

   As fast as for a single box: O(J) with integral histograms.
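A sketch with per-bin integral histograms; `bin_map` holds each pixel's quantized feature index and `h_query` is the query histogram (both illustrative). Because box histograms only grow with the box and min is monotone in its second argument, evaluating at the union box bounds every box in the set:

```python
import numpy as np

def integral_histograms(bin_map, n_bins):
    """Per-bin integral images over a map of quantized feature indices,
    so the histogram of any box costs O(J) lookups."""
    planes = np.stack([(bin_map == j).astype(int) for j in range(n_bins)])
    return np.pad(planes.cumsum(axis=1).cumsum(axis=2), ((0, 0), (1, 0), (1, 0)))

def box_histogram(ii, box):
    l, t, r, b = box                     # inclusive pixel coordinates
    return ii[:, b + 1, r + 1] - ii[:, t, r + 1] - ii[:, b + 1, l] + ii[:, t, l]

def histogram_upper_bound(h_query, ii, y_union):
    """Bound for a whole box set: evaluate the intersection at the union box."""
    return np.minimum(h_query, box_histogram(ii, y_union)).sum()
```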
Evaluation: Speed (on PASCAL VOC 2006)




   Sliding Window Runtime:
      • always: O(w²h²)
   Branch-and-Bound (ESS) Runtime:
      • worst case: O(w²h²)
      • empirical: no more than O(wh)
Extensions:


   Action classification: (y, t)opt = argmax_{(y,t) ∈ Y×T} fx(y, t)




   • J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.
Extensions:

   Localized image retrieval: (x, y)opt = argmax_{y ∈ Y, x ∈ D} fx(y)




   • C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009
Extensions:


   Hybrid – Branch-and-Bound with Implicit Shape Model




   • A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009
Generalized Sliding Window




                       yopt = argmax_{y ∈ Y} f(y)


   with Y = {all rectangular regions in the image}

   • How to choose/construct/learn f ?
   • How to do the optimization efficiently and robustly?
Traditional Approach: Binary Classifier

    Training images:
      • x₁⁺, . . . , xₙ⁺ show the object
      • x₁⁻, . . . , xₘ⁻ show something else

    Train a classifier, e.g.
      • support vector machine,
      • boosted cascade,
      • artificial neural network,. . .

    Decision function f : {images} → R
      • f > 0 means “image shows the object.”
      • f < 0 means “image does not show
        the object.”
Traditional Approach: Binary Classifier


   Drawbacks:


      • Training distribution ≠ test distribution

      • No control over partial
        detections.

      • No guarantee to even find
        training examples again.
Object Localization as Structured Output Regression

   Ideal setup:
     • function
                    g : {all images} → {all boxes}
       to predict object boxes from images
     • train and test in the same way, end-to-end



                                 

       gcar( [image of a street scene] ) = [bounding box of the car]
Object Localization as Structured Output Regression

   Ideal setup:
     • function
                        g : {all images} → {all boxes}
       to predict object boxes from images
     • train and test in the same way, end-to-end

   Regression problem:
     • training examples (x1 , y1 ), . . . , (xn , yn ) ∈ X × Y
             xi are images, yi are bounding boxes
     • Learn a mapping
                                      g : X →Y
        that generalizes from the given examples:
             g(xi) ≈ yi, for i = 1, . . . , n.
Structured Support Vector Machine

     SVM-like framework by Tsochantaridis et al.:
        • Positive definite kernel k : (X × Y) × (X × Y) → R;
          ϕ : X × Y → H: (implicit) feature map induced by k.
        • ∆ : Y × Y → R: loss function
        • Solve the convex optimization problem

                         min_{w,ξ}  ½ ‖w‖² + C Σ_{i=1..n} ξi

          subject to margin constraints for i = 1, . . . , n:

          ∀y ∈ Y \ {yi} : ∆(y, yi) + ⟨w, ϕ(xi, y)⟩ − ⟨w, ϕ(xi, yi)⟩ ≤ ξi

        • unique solution: w* ∈ H

• I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and Interdependent
  Output Variables, Journal of Machine Learning Research (JMLR), 2005.
Structured Support Vector Machine



      • w* defines the compatibility function

                         F(x, y) = ⟨w*, ϕ(x, y)⟩

     • best prediction for x is the most compatible y:

                        g(x) := argmax_{y ∈ Y} F(x, y).

     • evaluating g : X → Y is like generalized Sliding Window:
            for fixed x, evaluate quality function for every box y ∈ Y.
            for example, use previous branch-and-bound procedure!
Joint Image/Box-Kernel: Example


   Joint kernel: how to compare one (image, box) pair (x, y) with
   another (image, box) pair (x′, y′)?

   [Figure: kjoint of two (image, box) pairs equals k applied to the two box
   regions. It is large when both boxes show the object and small when they
   do not; an image-level term kimage(x, x′) could also be large.]
Loss Function: Example

   Loss function: how to compare two boxes y and y′?




           ∆(y, y′) := 1 − (area overlap between y and y′)
                     = 1 − area(y ∩ y′) / area(y ∪ y′)
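This overlap loss is straightforward to compute for axis-aligned boxes (l, t, r, b):

```python
def area(box):
    l, t, r, b = box
    return max(0, r - l) * max(0, b - t)

def delta(y, y2):
    """Overlap loss: 1 minus intersection-over-union of the two boxes."""
    l = max(y[0], y2[0]); t = max(y[1], y2[1])
    r = min(y[2], y2[2]); b = min(y[3], y2[3])
    inter = area((l, t, r, b))           # empty intersections clamp to 0
    union = area(y) + area(y2) - inter
    return 1.0 - inter / union if union > 0 else 1.0
```

The loss is 0 for identical boxes and 1 for disjoint ones, so the margin constraints demand larger margins for worse detections.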
Structured Support Vector Machine

      • S-SVM Optimization:   min_{w,ξ}  ½ ‖w‖² + C Σ_{i=1..n} ξi

        subject to, for i = 1, . . . , n:

    ∀y ∈ Y \ {yi} : ∆(y, yi) + ⟨w, ϕ(xi, y)⟩ − ⟨w, ϕ(xi, yi)⟩ ≤ ξi

      • Solve via constraint generation. Iterate:
             Solve the minimization over the current working set of constraints.
             Find the most violated constraint: argmax_{y∈Y} ∆(y, yi) + ⟨w, ϕ(xi, y)⟩.
             Add it to the working set.
      • Polynomial-time convergence to any precision ε.

     • Similar to bootstrap training, but with a margin.
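In outline, the constraint-generation loop might look as follows. `solve_qp` (a QP solver restricted to the working set) and `loss_aug_argmax` (the loss-augmented detection step, solvable with the branch-and-bound machinery) are assumed given, and neither name comes from the slides; the simple violation check below ignores the current slack, so it simplifies the exact stopping rule:

```python
def ssvm_train(examples, phi, delta, loss_aug_argmax, solve_qp, n_rounds=10):
    """Constraint-generation (cutting-plane) training sketch.

    examples: list of (x_i, y_i); phi(x, y): joint feature vector (numpy);
    delta(y, y_i): loss; loss_aug_argmax(w, x_i, y_i): most violated output;
    solve_qp(working_set): solves the S-SVM QP over the working set."""
    working_set = []
    w = solve_qp(working_set)
    for _ in range(n_rounds):
        new_constraints = 0
        for (x_i, y_i) in examples:
            y_hat = loss_aug_argmax(w, x_i, y_i)   # argmax Δ(y, yi) + <w, φ(xi, y)>
            violation = delta(y_hat, y_i) + w @ phi(x_i, y_hat) - w @ phi(x_i, y_i)
            if violation > 1e-6:                   # constraint violated
                working_set.append((x_i, y_i, y_hat))
                new_constraints += 1
        if new_constraints == 0:
            break                                  # no violated constraints left
        w = solve_qp(working_set)
    return w
```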
Evaluation: PASCAL VOC 2006




           Example detections for VOC 2006 bicycle, bus and cat.




          Precision–recall curves for VOC 2006 bicycle, bus and cat.

    • Structured regression improves detection accuracy.
    • New best scores (at that time) in 6 of 10 classes.
Why does it work?




        Learned weights from binary (center) and structured training (right).



     • Both methods assign positive weights to object region.
     • Structured training also assigns negative weights to
       features surrounding the bounding box position.
     • Posterior distribution over box coordinates becomes more
       peaked.
More Recent Results (PASCAL VOC 2009)

   [Example detections, one slide per class, for all 20 VOC 2009 classes:
   aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow,
   diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa,
   train, tvmonitor.]
Extensions:

      Image segmentation with connectedness constraint:




               CRF segmentation                             connected CRF segmentation

 • S. Nowozin, C.L.: Global Connectivity Potentials for Random Field Models, CVPR 2009.
Summary


  Object Localization is a step towards image interpretation.

  Conceptual approach instead of algorithmic:
    • Branch-and-bound evaluation:
           don’t slide a window, but solve an argmax problem,
           ⇒ higher efficiency

    • Structured regression training:
           solve the prediction problem, not a classification proxy.
           ⇒ higher localization accuracy

    • Modular and kernelized:
           easily adapted to other problems/representations, e.g.
           image segmentations
Safe Software
 
The Future of Product Management in AI ERA.pdf
The Future of Product Management in AI ERA.pdf
Alyona Owens
 
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Saikat Basu
 
AI vs Human Writing: Can You Tell the Difference?
AI vs Human Writing: Can You Tell the Difference?
Shashi Sathyanarayana, Ph.D
 
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Nilesh Gule
 
Quantum AI: Where Impossible Becomes Probable
Quantum AI: Where Impossible Becomes Probable
Saikat Basu
 

Structured regression for efficient object detection

  • 1. Structured Regression for Efficient Object Detection Christoph Lampert www.christoph-lampert.org Max Planck Institute for Biological Cybernetics, Tübingen December 3rd, 2009 • [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008] • [Matthew B. Blaschko, C.L. ECCV 2008] • [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]
  • 3. Category-Level Object Localization What objects are present? person, car
  • 4. Category-Level Object Localization Where are the objects?
  • 5. Object Localization ⇒ Scene Interpretation A man inside of a car A man outside of a car ⇒ He’s driving. ⇒ He’s passing by.
  • 6. Algorithmic Approach: Sliding Window f (y1 ) = 0.2 f (y2 ) = 0.8 f (y3 ) = 1.5 Use a (pre-trained) classifier function f : • Place candidate window on the image. • Iterate: Evaluate f and store result. Shift candidate window by k pixels. • Return position where f was largest.
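The sliding-window loop on this slide can be sketched in Python (the scoring function `f`, window size, and stride `k` are illustrative placeholders, not the deck's actual classifier):

```python
import numpy as np

def sliding_window(image, f, win_w, win_h, k):
    """Return (best score, (left, top)) over all placements of a
    win_w x win_h window, shifted in steps of k pixels."""
    H, W = image.shape[:2]
    best_score, best_pos = float("-inf"), None
    for top in range(0, H - win_h + 1, k):        # shift window by k pixels
        for left in range(0, W - win_w + 1, k):
            score = f(image[top:top + win_h, left:left + win_w])
            if score > best_score:                # store the best result
                best_score, best_pos = score, (left, top)
    return best_score, best_pos
```

This makes the drawbacks on the next slide concrete: one fixed window shape, a grid controlled by k, and one evaluation of f per grid cell.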
  • 7. Algorithmic Approach: Sliding Window f (y1 ) = 0.2 f (y2 ) = 0.8 f (y3 ) = 1.5 Drawbacks: • single scale, single aspect ratio → repeat with different window sizes/shapes • search on grid → speed–accuracy tradeoff • computationally expensive
  • 8. New view: Generalized Sliding Window Assumptions: • Objects are rectangular image regions of arbitrary size. • The score of f is largest at the correct object position. Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image}
  • 9. New view: Generalized Sliding Window Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn the function f ? • How to do the optimization efficiently and robustly? (exhaustive search is too slow, O(w2 h2 ) elements).
  • 10. New view: Generalized Sliding Window Mathematical Formulation: yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn the function f ? • How to do the optimization efficiently and robustly? (exhaustive search is too slow, O(w2 h2 ) elements).
  • 11. New view: Generalized Sliding Window Use the problem’s geometric structure:
  • 12. New view: Generalized Sliding Window Use the problem’s geometric structure: • Calculate scores for sets of boxes jointly. • If no element can contain the maximum, discard the box set. • Otherwise, split the box set and iterate. → Branch-and-bound optimization • finds global maximum yopt
  • 13. New view: Generalized Sliding Window Use the problem’s geometric structure: • Calculate scores for sets of boxes jointly. • If no element can contain the maximum, discard the box set. • Otherwise, split the box set and iterate. → Branch-and-bound optimization • finds global maximum yopt
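The branch-and-bound iteration above can be sketched generically with a best-first priority queue; the `upper_bound`, `split`, and `is_single` callbacks are assumptions standing in for the box-set machinery of the following slides:

```python
import heapq
from itertools import count

def branch_and_bound(root, upper_bound, split, is_single):
    """Best-first branch-and-bound: always expand the candidate set with the
    highest upper bound; when a single element reaches the front of the
    queue, its bound equals its exact score, so it is the global maximizer."""
    tie = count()  # tie-breaker so the heap never compares candidate sets
    heap = [(-upper_bound(root), next(tie), root)]
    while heap:
        neg_bound, _, cand = heapq.heappop(heap)
        if is_single(cand):            # bound is tight for a single element
            return cand, -neg_bound
        for part in split(cand):       # otherwise: branch and re-queue
            heapq.heappush(heap, (-upper_bound(part), next(tie), part))
```

Because candidate sets are expanded in order of decreasing upper bound, the first single element popped is guaranteed to be the global maximum y_opt, and box sets whose bound never reaches the front are implicitly discarded.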
  • 14. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 .
  • 15. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4
  • 16. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval.
  • 17. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 .
  • 18. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 . • New box sets: [L, T, R1 , B]
  • 19. Representing Sets of Boxes • Boxes: [l, t, r, b] ∈ R4 . Boxsets: [L, T, R, B] ∈ (R2 )4 Splitting: • Identify largest interval. Split at center: R → R1 ∪R2 . • New box sets: [L, T, R1 , B] and [L, T, R2 , B].
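The splitting rule from these slides can be sketched as follows, with a box set represented as four `(lo, hi)` coordinate intervals [L, T, R, B]:

```python
def split_box_set(box_set):
    """Split a box set [L, T, R, B] (each entry an interval (lo, hi))
    along its largest coordinate interval, at the interval's center."""
    widths = [hi - lo for lo, hi in box_set]
    i = widths.index(max(widths))                  # largest interval, e.g. R
    lo, hi = box_set[i]
    mid = (lo + hi) // 2
    left, right = list(box_set), list(box_set)
    left[i], right[i] = (lo, mid), (mid + 1, hi)   # R -> R1 ∪ R2
    return tuple(left), tuple(right)
```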
  • 20. Calculating Scores for Box Sets Example: Linear Support-Vector-Machine f (y) := Σ_{pi ∈ y} wi . Upper bound: f_upper(Y) = Σ_{pi ∈ y∩} min(0, wi ) + Σ_{pi ∈ y∪} max(0, wi ), where y∪ / y∩ are the union / intersection of all boxes in Y. Can be computed in O(1) using integral images.
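A sketch of this bound with integral images (summed-area tables): positive weights are summed over the union box y∪, negative weights over the intersection box y∩, so the bound for any box set costs O(1) lookups. Function names and the interval representation are illustrative:

```python
import numpy as np

def integral(img):
    """Summed-area table with a zero border, so box sums are O(1)."""
    s = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    s[1:, 1:] = img.cumsum(0).cumsum(1)
    return s

def box_sum(s, l, t, r, b):
    """Sum over pixels (t..b, l..r), inclusive, from integral image s."""
    return s[b + 1, r + 1] - s[t, r + 1] - s[b + 1, l] + s[t, l]

def upper_bound(w_pos, w_neg, box_set):
    """f_upper(Y): positive weights over the union box y∪,
    negative weights over the intersection box y∩ (if non-empty)."""
    L, T, R, B = box_set
    u = box_sum(w_pos, L[0], T[0], R[1], B[1])     # union box
    if L[1] <= R[0] and T[1] <= B[0]:              # intersection box exists
        u += box_sum(w_neg, L[1], T[1], R[0], B[0])
    return u
```

For a box set containing a single box, union and intersection coincide and the bound equals the exact score, as branch-and-bound requires.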
  • 21. Calculating Scores for Box Sets Histogram Intersection Similarity: f (y) := Σ_{j=1..J} min(hj , hj^y ). Upper bound: f_upper(Y) = Σ_{j=1..J} min(hj , hj^{y∪} ). As fast as for a single box: O(J) with integral histograms.
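The histogram-intersection bound amounts to evaluating the same similarity on the union-box histogram, which dominates every per-box histogram entrywise. A minimal sketch (histograms as plain arrays; building them from integral histograms is omitted):

```python
import numpy as np

def hist_intersection(h, h_y):
    """f(y) = Σ_j min(h_j, h_j^y): query histogram h vs. box histogram h^y."""
    return float(np.minimum(h, h_y).sum())

def f_upper(h, h_union):
    """Upper bound over a box set: evaluate on the union-box histogram h^{y∪},
    which is >= h^y entrywise for every box y in the set."""
    return hist_intersection(h, h_union)
```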
  • 22. Evaluation: Speed (on PASCAL VOC 2006) Sliding Window Runtime: • always: O(w2 h2 ) Branch-and-Bound (ESS) Runtime: • worst-case: O(w2 h2 ) • empirical: not more than O(wh)
  • 23. Extensions: Action classification: (y, t)opt = argmax(y,t)∈Y×T fx (y, t) • J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.
  • 24. Extensions: Localized image retrieval: (x, y)opt = argmaxy∈Y, x∈D fx (y) • C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009
  • 25. Extensions: Hybrid – Branch-and-Bound with Implicit Shape Model • A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009
  • 27. Generalized Sliding Window yopt = argmax f (y) y∈Y with Y = {all rectangular regions in image} • How to choose/construct/learn f ? • How to do the optimization efficiently and robustly?
  • 28. Traditional Approach: Binary Classifier Training images: + + • x1 , . . . , xn show the object − − • x1 , . . . , xm show something else Train a classifier, e.g. • support vector machine, • boosted cascade, • artificial neural network,. . . Decision function f : {images} → R • f > 0 means “image shows the object.” • f < 0 means “image does not show the object.”
  • 29. Traditional Approach: Binary Classifier Drawbacks: • Train distribution ≠ test distribution • No control over partial detections. • No guarantee to even find the training examples again.
  • 30. Object Localization as Structured Output Regression Ideal setup: • function g : {all images} → {all boxes} to predict object boxes from images • train and test in the same way, end-to-end (illustration: gcar (image) = bounding box; figure not preserved)
  • 31. Object Localization as Structured Output Regression Ideal setup: • function g : {all images} → {all boxes} to predict object boxes from images • train and test in the same way, end-to-end Regression problem: • training examples (x1 , y1 ), . . . , (xn , yn ) ∈ X × Y xi are images, yi are bounding boxes • Learn a mapping g : X →Y that generalizes from the given examples: g(xi ) ≈ yi , for i = 1, . . . , n,
  • 32. Structured Support Vector Machine SVM-like framework by Tsochantaridis et al.: • Positive definite kernel k : (X × Y) × (X × Y) → R; ϕ : X × Y → H: (implicit) feature map induced by k. • ∆ : Y × Y → R: loss function • Solve the convex optimization problem min_{w,ξ} ½‖w‖² + C Σ_{i=1..n} ξi subject to margin constraints for i = 1, . . . , n : ∀y ∈ Y ∖ {yi } : ∆(y, yi ) + ⟨w, ϕ(xi , y)⟩ − ⟨w, ϕ(xi , yi )⟩ ≤ ξi • unique solution: w∗ ∈ H • I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and Interdependent Output Variables, Journal of Machine Learning Research (JMLR), 2005.
  • 33. Structured Support Vector Machine • w∗ defines the compatibility function F (x, y) = ⟨w∗, ϕ(x, y)⟩ • best prediction for x is the most compatible y: g(x) := argmax_{y∈Y} F (x, y). • evaluating g : X → Y is like generalized Sliding Window: for fixed x, evaluate the quality function for every box y ∈ Y; for example, use the previous branch-and-bound procedure!
  • 34. Joint Image/Box-Kernel: Example Joint kernel: how to compare one (image,box)-pair (x, y) with another (image,box)-pair (x′, y′)? (Illustrations not preserved: kjoint is large for pairs with matching objects and boxes, small for pairs with mismatched boxes, yet kimage on the images alone could still be large in the mismatched case.)
  • 35. Loss Function: Example Loss function: how to compare two boxes y and y′ ? ∆(y, y′ ) := 1 − area overlap between y and y′ = 1 − area(y ∩ y′ ) / area(y ∪ y′ )
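This box-overlap loss is directly computable; a minimal sketch with boxes as (l, t, r, b) tuples:

```python
def iou_loss(y, yp):
    """∆(y, y') = 1 − area(y ∩ y') / area(y ∪ y') for boxes (l, t, r, b)."""
    l, t = max(y[0], yp[0]), max(y[1], yp[1])     # intersection rectangle
    r, b = min(y[2], yp[2]), min(y[3], yp[3])
    inter = max(0, r - l) * max(0, b - t)
    area = lambda z: (z[2] - z[0]) * (z[3] - z[1])
    union = area(y) + area(yp) - inter
    return 1.0 - inter / union
```

The loss is 0 for identical boxes and 1 for disjoint ones, which is exactly the behavior the margin constraints of the S-SVM exploit.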
  • 36. Structured Support Vector Machine • S-SVM Optimization: min_{w,ξ} ½‖w‖² + C Σ_{i=1..n} ξi subject to for i = 1, . . . , n : ∀y ∈ Y ∖ {yi } : ∆(y, yi ) + ⟨w, ϕ(xi , y)⟩ − ⟨w, ϕ(xi , yi )⟩ ≤ ξi
  • 37. Structured Support Vector Machine • S-SVM Optimization: min_{w,ξ} ½‖w‖² + C Σ_{i=1..n} ξi subject to for i = 1, . . . , n : ∀y ∈ Y ∖ {yi } : ∆(y, yi ) + ⟨w, ϕ(xi , y)⟩ − ⟨w, ϕ(xi , yi )⟩ ≤ ξi • Solve via constraint generation: • Iterate: Solve minimization with working set of constraints Identify argmax_{y∈Y} ∆(y, yi ) + ⟨w, ϕ(xi , y)⟩ Add violated constraints to working set and iterate • Polynomial-time convergence to any precision ε • Similar to bootstrap training, but with a margin.
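The key step of the constraint-generation loop, loss-augmented inference, can be sketched over a small finite label set (the toy `phi` and `delta` below are illustrative stand-ins, not the deck's features or loss):

```python
import numpy as np

def most_violated(w, phi, delta, x, y_true, Y):
    """Loss-augmented inference: return argmax_{y in Y} of
    ∆(y, y_i) + <w, ϕ(x_i, y)> — the constraint to add to the working set."""
    return max(Y, key=lambda y: delta(y, y_true) + float(w @ phi(x, y)))
```

In the localization setting, Y is the set of all boxes, so this argmax is itself solved with the branch-and-bound procedure rather than by enumeration.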
  • 38. Evaluation: PASCAL VOC 2006 Example detections for VOC 2006 bicycle, bus and cat. Precision–recall curves for VOC 2006 bicycle, bus and cat. • Structured regression improves detection accuracy. • New best scores (at that time) in 6 of 10 classes.
  • 39. Why does it work? Learned weights from binary (center) and structured training (right). • Both methods assign positive weights to object region. • Structured training also assigns negative weights to features surrounding the bounding box position. • Posterior distribution over box coordinates becomes more peaked.
  • 40.–59. More Recent Results (PASCAL VOC 2009): one slide of example detections per class — aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor (detection figures not preserved).
  • 60. Extensions: Image segmentation with connectedness constraint: CRF segmentation connected CRF segmentation • S. Nowozin, C.L.: Global Connectivity Potentials for Random Field Models, CVPR 2009.
  • 61. Summary Object Localization is a step towards image interpretation. Conceptual approach instead of algorithmic: • Branch-and-bound evaluation: don’t slide a window, but solve an argmax problem, ⇒ higher efficiency • Structured regression training: solve the prediction problem, not a classification proxy. ⇒ higher localization accuracy • Modular and kernelized: easily adapted to other problems/representations, e.g. image segmentations