Python OpenCV: Object Tracking using Homography
Last Updated : 03 Jan, 2023
In this article, we track an object in a video using a reference image of that object. The same approach also locates the object in a still image. Before looking at object tracking with homography, let us cover some basics.
What is Homography?
A homography is a transformation that maps points in one image to the corresponding points in another image. It is represented by a 3x3 matrix.
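Written out, the matrix and the mapping it defines look as follows. The symbols H, h_ij, (x, y), (x', y') and s are notation chosen here for illustration; (x, y) is a point in the first image, (x', y') is the corresponding point in the second image, and s is a non-zero scale factor, since the mapping is defined only up to scale.

```latex
H = \begin{bmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33}
\end{bmatrix},
\qquad
s \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
= H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
```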
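A single homography only relates points that lie on the same plane. If 2 points lie on different planes, we need 2 homographies; similarly, for n planes we need n homographies. Every additional homography has to be estimated and handled separately, which is why we use feature matching to find corresponding points on the object and estimate its homography from them.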
Importing Image Data : We will read the reference image, the cover page of a book, which is stored as 'img.jpg'.
Python
# importing the required libraries
import cv2
import numpy as np
# reading image in grayscale
img = cv2.imread("img.jpg", cv2.IMREAD_GRAYSCALE)
# initializing web cam
cap = cv2.VideoCapture(0)
Feature Matching : Feature matching means finding corresponding features in two similar datasets based on a search distance. Here we use the SIFT algorithm to extract the features and a FLANN-based matcher to match them.
Python
# creating the SIFT algorithm
# (OpenCV >= 4.4; older contrib builds use cv2.xfeatures2d.SIFT_create())
sift = cv2.SIFT_create()
# find the keypoints and descriptors with SIFT
kp_image, desc_image = sift.detectAndCompute(img, None)
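# FLANN index parameters: algorithm = 1 selects the KD-tree index,
# which is the index type that uses the trees parameter
index_params = dict(algorithm = 1, trees = 5)
# default search parameters
search_params = dict()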
# by using Flann Matcher
flann = cv2.FlannBasedMatcher(index_params, search_params)
Next, we convert each captured video frame to grayscale and use the matcher to match points from the reference image to the frame.
Not every match is reliable: both images contain many keypoints, and many of the raw matches will be wrong, which causes problems when we draw the matches or compute the homography. To handle this we keep only the good matches, and we can vary the distance ratio threshold (0.6 in the code that follows) to make the filter stricter or looser.
Python
# reading the frame
_, frame = cap.read()
# converting the frame into grayscale
grayframe = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# find the keypoints and descriptors with SIFT
kp_grayframe, desc_grayframe = sift.detectAndCompute(grayframe, None)
# finding nearest match with KNN algorithm
matches= flann.knnMatch(desc_image, desc_grayframe, k=2)
# initialize list to keep track of only good points
good_points=[]
for m, n in matches:
    # append the match only if its best distance is clearly
    # smaller than the distance to the second-best candidate
    if m.distance < 0.6 * n.distance:
        good_points.append(m)
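To see what the ratio check does without a camera, here is a minimal sketch that repeats it on toy data with plain NumPy. The function name ratio_test_matches and the 2-D "descriptors" are made up for illustration (real SIFT descriptors have 128 dimensions), and the brute-force search only stands in for what the FLANN matcher approximates. It assumes the train set has at least two descriptors.

```python
import numpy as np

def ratio_test_matches(query_desc, train_desc, ratio=0.6):
    """Return (query_idx, train_idx) pairs that pass the ratio test."""
    good = []
    for qi, q in enumerate(query_desc):
        # Euclidean distance from this query descriptor to every train descriptor
        dists = np.linalg.norm(train_desc - q, axis=1)
        # indices of the two nearest train descriptors
        best, second = np.argsort(dists)[:2]
        # keep the match only if the best is clearly closer than the runner-up
        if dists[best] < ratio * dists[second]:
            good.append((qi, int(best)))
    return good

# toy descriptors: query 0 has one clear neighbour,
# query 1 sits exactly between two candidates (ambiguous)
query = np.array([[0.0, 0.0], [5.0, 5.0]])
train = np.array([[0.1, 0.0], [4.0, 4.0], [6.0, 6.0], [10.0, 0.0]])
good = ratio_test_matches(query, train)
```

Only the unambiguous match survives; the second query point is dropped because its two nearest candidates are equally far away.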
Homography : To find the homography of the object, we collect the coordinates of the matched keypoints in both images and pass them to the function findHomography(), which returns the homography matrix.
Python
# coordinates of the matched keypoints
# in the reference (query) image
query_pts = np.float32(
    [kp_image[m.queryIdx].pt for m in good_points]).reshape(-1, 1, 2)
# coordinates of the matched keypoints
# in the video frame (train image)
train_pts = np.float32(
    [kp_grayframe[m.trainIdx].pt for m in good_points]).reshape(-1, 1, 2)
# finding perspective transformation between two planes
# (findHomography needs at least 4 good matches)
matrix, mask = cv2.findHomography(query_pts, train_pts, cv2.RANSAC, 5.0)
# ravel function returns a contiguous flattened array;
# mask marks which matches RANSAC kept as inliers
matches_mask = mask.ravel().tolist()
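For intuition about what is being estimated, the following is a simplified sketch of the direct linear transform (DLT), which recovers a homography from four or more point pairs. It is not OpenCV's implementation: it has no RANSAC step, so it assumes every pair is a correct match, and it assumes the bottom-right entry of the result is non-zero. The function name homography_dlt and the toy points are illustrative only.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from >= 4 point pairs (no outlier handling)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # each pair contributes two linear equations in the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # the solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # normalise so that H[2, 2] == 1 (assumes it is non-zero)
    return H / H[2, 2]

# toy data: unit square scaled by 2 and shifted by (3, 1)
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(3, 1), (5, 1), (5, 3), (3, 3)]
H_est = homography_dlt(src, dst)
```

For this toy input the estimate is the scale-and-shift matrix that generated the points. With real matches some pairs are wrong, which is why the tutorial passes cv2.RANSAC to findHomography().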
So far we have the homography matrix, but when the object moves or turns, its outline in the frame changes shape, so a fixed rectangle no longer fits it. To deal with this we use a perspective transform, which maps the corners of the reference image through the homography into the frame. For example, humans see near objects as larger than far objects; this change in apparent size and shape is the effect of perspective.
Python
# initializing height and width of the image
h, w = img.shape
# saving the four corners of the reference image in pts
pts = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
# applying perspective algorithm
dst = cv2.perspectiveTransform(pts, matrix)
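The arithmetic behind this step can be sketched in plain NumPy: lift each corner to homogeneous coordinates, multiply by the 3x3 matrix, and divide by the third coordinate. The helper name project_points and the demo matrix H_demo (a pure shift by 10 and 20 pixels) are hypothetical; this is an illustration of the mapping, not a replacement for cv2.perspectiveTransform().

```python
import numpy as np

def project_points(pts, H):
    """Apply a 3x3 homography H to points of shape (N, 1, 2); returns the same shape."""
    xy = pts.reshape(-1, 2).astype(float)
    # lift to homogeneous coordinates (x, y, 1)
    homog = np.hstack([xy, np.ones((xy.shape[0], 1))])
    mapped = homog @ H.T
    # divide by the third coordinate to return to pixel coordinates
    mapped = mapped[:, :2] / mapped[:, 2:3]
    return mapped.reshape(-1, 1, 2)

# corners of a hypothetical 6x4 reference image, in the tutorial's order
h, w = 4, 6
corners = np.float32([[0, 0], [0, h], [w, h], [w, 0]]).reshape(-1, 1, 2)
# demo homography: shift by 10 pixels in x and 20 pixels in y
H_demo = np.array([[1.0, 0.0, 10.0],
                   [0.0, 1.0, 20.0],
                   [0.0, 0.0, 1.0]])
projected = project_points(corners, H_demo)
```

With a pure shift the third coordinate stays 1, so the division has no effect; for a homography with a non-trivial bottom row, that division is what produces the perspective distortion.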
Finally, let's see the output.
Python
# using drawing function for the frame
homography = cv2.polylines(frame, [np.int32(dst)], True, (255, 0, 0), 3)
# showing the final output
# with homography
cv2.imshow("Homography", homography)
# the window is only drawn once waitKey runs; wait for a key press
cv2.waitKey(0)
Output : A window named "Homography" shows the webcam frame with the detected book cover outlined in blue.