I’ve met a lot of amazing, uplifting people over the years. My PhD advisor who helped get me through graduate school. My father who was always there for me as a kid — and still is now. And my girlfriend who has always been positive, helpful, and supportive (even when I probably didn’t deserve it).
I’ve also met some demoralizing, discouraging ones. Family members who have gone out of their way to deter me from being an entrepreneur and working for myself. Colleagues who either disliked me or my work and chose to express their disdain in a public fashion. And then there are those who have said some pretty disheartening things over email, Twitter, and other internet outlets.
We’re all familiar with these types of people. Yet regardless of their demeanor (whether positive or negative), we’re all built from the same genetic material of four nucleobases: cytosine, guanine, adenine, and thymine.
These base pairs are combined in such a way that our bodies all have the same basic structure regardless of gender, race, or ethnicity. At the most structural level we all have a head, two arms, a torso, and two legs.
We can use computer vision to exploit this semi-rigid structure and extract features to quantify the human body. These features can be passed on to machine learning models that, once trained, can detect and track humans in images and video streams. This is especially useful for the task of pedestrian detection, which is the topic we’ll be talking about in today’s blog post.
Read on to find out how you can use OpenCV and Python to perform pedestrian detection.
Pedestrian Detection OpenCV
Did you know that OpenCV has built-in methods to perform pedestrian detection?
OpenCV ships with a pre-trained HOG + Linear SVM model that can be used to perform pedestrian detection in both images and video streams. If you’re not familiar with the Histogram of Oriented Gradients and Linear SVM method, I suggest you read this blog post where I discuss the 6-step framework.
If you’re already familiar with the process (or if you just want to see some code on how pedestrian detection with OpenCV is done), just open up a new file, name it detect.py, and we’ll get coding:
# import the necessary packages
from __future__ import print_function
from imutils.object_detection import non_max_suppression
from imutils import paths
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
Lines 2-8 start by importing our necessary packages. We’ll import print_function to ensure our code is compatible with both Python 2.7 and Python 3 (this code will also work for OpenCV 2.4.X and OpenCV 3). From there, we’ll import the non_max_suppression function from my imutils package.
If you do not have imutils installed, let pip install it for you:
$ pip install imutils
If you do have imutils installed, you’ll need to upgrade to the latest version (v0.3.1), which includes the implementation of the non_max_suppression function, along with a few other minor updates:
$ pip install --upgrade imutils
I’ve talked about non-maxima suppression twice on the PyImageSearch blog, once in this introductory post, and again in this post on implementing a faster NMS algorithm. In either case, the gist of the non-maxima suppression algorithm is to take multiple, overlapping bounding boxes and reduce them to only a single bounding box:
This helps reduce the number of false-positives reported by the final object detector.
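To make the idea concrete, here is a minimal pure-Python sketch of greedy non-maxima suppression. It is not the imutils implementation; the function name nms_sketch, the choice to process boxes by their bottom edge, and the example boxes are assumptions made for illustration. Boxes are given as (x1, y1, x2, y2) corners.

```python
def nms_sketch(boxes, overlap_thresh=0.3):
    """Greedy non-maxima suppression over (x1, y1, x2, y2) boxes.

    Illustrative sketch only: boxes are visited from the largest
    bottom-edge y-coordinate to the smallest, and any remaining box
    that is mostly covered by the box just kept is discarded.
    """
    remaining = sorted(boxes, key=lambda b: b[3])
    kept = []
    while remaining:
        # take the box with the largest y2 among those still in play
        current = remaining.pop()
        kept.append(current)
        survivors = []
        for other in remaining:
            # width and height of the intersection (0 if the boxes are disjoint)
            w = max(0, min(current[2], other[2]) - max(current[0], other[0]) + 1)
            h = max(0, min(current[3], other[3]) - max(current[1], other[1]) + 1)
            other_area = (other[2] - other[0] + 1) * (other[3] - other[1] + 1)
            # keep the other box only if the covered fraction is small enough
            if (w * h) / float(other_area) <= overlap_thresh:
                survivors.append(other)
        remaining = survivors
    return kept


# two nearly identical boxes around one person, plus one separate detection
example_boxes = [(10, 10, 60, 110), (12, 12, 62, 112), (200, 20, 250, 120)]
print(nms_sketch(example_boxes))
```

The two nearly identical boxes collapse into one, while the separate detection is left alone.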
Lines 11-13 handle parsing our command line arguments. We only need a single switch here, --images, which is the path to the directory that contains the list of images we are going to perform pedestrian detection on.
Finally, Lines 16 and 17 initialize our pedestrian detector. First, we make a call to hog = cv2.HOGDescriptor(), which initializes the Histogram of Oriented Gradients descriptor. Then, we call the setSVMDetector method to set the Support Vector Machine to be the pre-trained pedestrian detector, loaded via the cv2.HOGDescriptor_getDefaultPeopleDetector() function.
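As a rough sanity check on what the descriptor computes, the sketch below works out the length of the HOG feature vector for a single detection window. It assumes the default parameters of cv2.HOGDescriptor() (64 x 128 window, 16 x 16 blocks, 8 x 8 block stride, 8 x 8 cells, 9 orientation bins); the helper name hog_feature_length is made up for this example.

```python
def hog_feature_length(win=(64, 128), block=(16, 16), stride=(8, 8),
                       cell=(8, 8), bins=9):
    """Number of HOG features extracted from one detection window."""
    # how many block positions fit across and down the window
    blocks_x = (win[0] - block[0]) // stride[0] + 1
    blocks_y = (win[1] - block[1]) // stride[1] + 1
    # each block is split into cells, and each cell yields one histogram
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])
    return blocks_x * blocks_y * cells_per_block * bins


print(hog_feature_length())  # 7 * 15 blocks * 4 cells * 9 bins = 3780
```

The Linear SVM then scores each window from this 3,780-dimensional vector. As far as I know, the coefficient vector returned by the default people detector holds one extra value for the bias term.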
At this point our OpenCV pedestrian detector is fully loaded. We just need to apply it to some images:
# import the necessary packages
from __future__ import print_function
from imutils.object_detection import non_max_suppression
from imutils import paths
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True, help="path to images directory")
args = vars(ap.parse_args())

# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# loop over the image paths
for imagePath in paths.list_images(args["images"]):
    # load the image and resize it to (1) reduce detection time
    # and (2) improve detection accuracy
    image = cv2.imread(imagePath)
    image = imutils.resize(image, width=min(400, image.shape[1]))
    orig = image.copy()

    # detect people in the image
    (rects, weights) = hog.detectMultiScale(image, winStride=(4, 4),
        padding=(8, 8), scale=1.05)

    # draw the original bounding boxes
    for (x, y, w, h) in rects:
        cv2.rectangle(orig, (x, y), (x + w, y + h), (0, 0, 255), 2)

    # apply non-maxima suppression to the bounding boxes using a
    # fairly large overlap threshold to try to maintain overlapping
    # boxes that are still people
    rects = np.array([[x, y, x + w, y + h] for (x, y, w, h) in rects])
    pick = non_max_suppression(rects, probs=None, overlapThresh=0.65)

    # draw the final bounding boxes
    for (xA, yA, xB, yB) in pick:
        cv2.rectangle(image, (xA, yA), (xB, yB), (0, 255, 0), 2)

    # show some information on the number of bounding boxes
    filename = imagePath[imagePath.rfind("/") + 1:]
    print("[INFO] {}: {} original boxes, {} after suppression".format(
        filename, len(rects), len(pick)))

    # show the output images
    cv2.imshow("Before NMS", orig)
    cv2.imshow("After NMS", image)
    cv2.waitKey(0)
On Line 20 we start looping over the images in our --images directory. The examples in this blog post (and the additional images included in the source code download of this article) are samples from the popular INRIA Person Dataset (specifically, from the GRAZ-01 subset).
From there, Lines 23-25 handle loading our image off disk and resizing it to have a maximum width of 400 pixels. The reason we attempt to reduce our image dimensions is two-fold:
- Reducing image size ensures that fewer sliding windows in the image pyramid need to be evaluated (i.e., have HOG features extracted from and then passed on to the Linear SVM), thus reducing detection time (and increasing overall detection throughput).
- Resizing our image also improves the overall accuracy of our pedestrian detection (i.e., fewer false-positives).
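The first point is easy to put numbers on. The sketch below estimates how many sliding-window positions exist at a single pyramid layer before and after resizing. It is a back-of-the-envelope count, not what OpenCV does internally: it ignores padding, assumes a 64 x 128 window with a (4, 4) step, and uses a hypothetical 1280 x 960 input image. The helper names are made up for this example.

```python
def resized_dims(width, height, max_width=400):
    """Aspect-preserving resize to at most max_width pixels wide."""
    new_width = min(max_width, width)
    new_height = int(height * new_width / float(width))
    return new_width, new_height


def count_windows(width, height, win=(64, 128), stride=(4, 4)):
    """Sliding-window positions in one pyramid layer (padding ignored)."""
    if width < win[0] or height < win[1]:
        return 0
    across = (width - win[0]) // stride[0] + 1
    down = (height - win[1]) // stride[1] + 1
    return across * down


original = (1280, 960)  # hypothetical input size
small = resized_dims(*original)

print(small)                      # (400, 300)
print(count_windows(*original))   # 63745 windows at full size
print(count_windows(*small))      # 3740 windows after resizing
```

For this hypothetical image, resizing cuts the number of windows in the first layer by roughly a factor of 17, and every further pyramid layer shrinks accordingly. The accuracy claim in the second point is an empirical observation and is not something this count demonstrates.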
Actually detecting pedestrians in images is handled by Lines 28 and 29 by making a call to the detectMultiScale method of the hog descriptor. The detectMultiScale method constructs an image pyramid with scale=1.05 and a sliding window step size of (4, 4) pixels in both the x and y direction, respectively.
The size of the sliding window is fixed at 64 x 128 pixels, as suggested by the seminal Dalal and Triggs paper, Histograms of Oriented Gradients for Human Detection. The detectMultiScale function returns a 2-tuple of rects, the bounding box (x, y)-coordinates, width, and height of each person in the image, and weights, the confidence value returned by the SVM for each detection.
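The script in this post does not use weights, but one possible use is to discard low-confidence detections before drawing them. The sketch below shows the idea on plain Python lists. The helper name filter_by_weight, the 0.5 cutoff, and the sample values are all assumptions for illustration, and the sketch assumes the weights have already been flattened to one float per box.

```python
def filter_by_weight(rects, weights, min_weight=0.5):
    """Keep only the (x, y, w, h) boxes whose SVM confidence is high enough.

    min_weight is a hypothetical cutoff; a sensible value depends on
    the images and would need to be chosen by experiment.
    """
    return [tuple(r) for r, wt in zip(rects, weights) if wt >= min_weight]


# made-up detections: two confident boxes and one weak one
sample_rects = [(30, 40, 64, 128), (200, 35, 70, 140), (90, 10, 64, 128)]
sample_weights = [1.2, 0.8, 0.1]

print(filter_by_weight(sample_rects, sample_weights))
```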
A larger scale size will evaluate fewer layers in the image pyramid, which can make the algorithm faster to run. However, having too large of a scale (i.e., fewer layers in the image pyramid) can lead to pedestrians not being detected. Similarly, having too small of a scale size dramatically increases the number of image pyramid layers that need to be evaluated. Not only can this be computationally wasteful, it can also dramatically increase the number of false-positives detected by the pedestrian detector. For these reasons, scale is one of the most important parameters to tune when performing pedestrian detection. I’ll perform a more thorough review of each of the parameters to detectMultiScale in a future blog post.
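To get a feel for how scale controls the amount of work, the sketch below counts how many pyramid layers fit before the image becomes smaller than the detection window. It is a simplified model rather than OpenCV’s exact logic: it assumes a 64 x 128 window, divides the image size by scale at each layer, and uses a hypothetical 400 x 300 image. The helper name pyramid_layers is made up for this example.

```python
def pyramid_layers(width, height, scale, win=(64, 128)):
    """Count pyramid layers that are still at least as large as the window."""
    layers = 0
    w, h = float(width), float(height)
    while w >= win[0] and h >= win[1]:
        layers += 1
        # each successive layer is smaller by a factor of `scale`
        w /= scale
        h /= scale
    return layers


for scale in (1.05, 1.5):
    print(scale, pyramid_layers(400, 300, scale))
```

Under these assumptions, scale=1.05 produces 18 layers while scale=1.5 produces only 3, which is why a larger scale runs faster but can skip the layer at which a pedestrian would have matched the window.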
Lines 32 and 33 take our initial bounding boxes and draw them on our image.
However, for some images you’ll notice that there are multiple, overlapping bounding boxes detected for each person (as demonstrated by Figure 1 above).
In this case, we have two options. We can detect if one bounding box is fully contained within another (as one of the OpenCV examples implements). Or we can apply non-maxima suppression and suppress bounding boxes whose overlap exceeds a given threshold, and that’s exactly what Lines 38 and 39 do.
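For comparison, the first option can be sketched in a few lines. This is an illustration of the containment idea, not a copy of the OpenCV example; the helper names is_inside and drop_contained are made up, and boxes are plain (x, y, w, h) tuples.

```python
def is_inside(inner, outer):
    """True if the (x, y, w, h) box `inner` lies fully within `outer`."""
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return (ix >= ox and iy >= oy and
            ix + iw <= ox + ow and iy + ih <= oy + oh)


def drop_contained(rects):
    """Remove every box that is fully contained in another box."""
    kept = []
    for i, r in enumerate(rects):
        contained = False
        for j, q in enumerate(rects):
            # for exact duplicates, keep only the first occurrence
            if i != j and is_inside(r, q) and (r != q or j < i):
                contained = True
                break
        if not contained:
            kept.append(r)
    return kept


boxes = [(10, 10, 100, 200), (20, 30, 50, 100), (300, 10, 60, 120)]
print(drop_contained(boxes))  # the second box sits inside the first
```

Containment only handles nested boxes. Two boxes that overlap heavily without one enclosing the other both survive, which is the case non-maxima suppression is designed to handle.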
Note: If you’re interested in learning more about the HOG framework and non-maxima suppression, I would start by reading this introductory post on the 6-step framework. From there, check out this post on simple non-maxima suppression followed by an updated post that implements the optimized Malisiewicz method.
After applying non-maxima suppression, we draw the finalized bounding boxes on Lines 42 and 43, display some basic information about the image and number of bounding boxes on Lines 46-48, and finally display our output images to our screen on Lines 51-53.
Results of pedestrian detection in images
To see our pedestrian detection script in action, just issue the following command:
$ python detect.py --images images
Below I have provided a sample of results from the detection script:
Here we have detected a single person standing next to a police car.
In the above example we can see a man detected in the foreground of the image, while a woman pushing a baby stroller is detected in the background.
The above image serves as an example of why applying non-maxima suppression is important. The detectMultiScale function falsely detected two bounding boxes (along with the correct bounding box), both overlapping the true person in the image. By applying non-maxima suppression we were able to suppress the extraneous bounding boxes, leaving us with the true detection.
Again, we see that multiple false bounding boxes are detected, but by applying NMS we can remove them, leaving us with the true detection in the image.
Here we are detecting pedestrians in a shopping mall. Notice two people are walking away from the camera while another is walking towards the camera. In either case, our HOG method is able to detect the people. The larger overlapThresh in the non_max_suppression function ensures that the bounding boxes are not suppressed, even though they do partially overlap.
I was particularly surprised by the results of the above image. Normally the HOG descriptor does not perform well in the presence of motion blur, yet we are still able to detect the pedestrians in this image.
This is another example of multiple, overlapping bounding boxes, but due to the larger overlapThresh they are not suppressed, leaving us with the correct person detections.
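A small worked example shows why the threshold matters for people standing close together. The sketch below computes the fraction of one box that is covered by another, which is the kind of quantity an overlap threshold is compared against, and checks it against 0.65 and a stricter 0.3. The two boxes are made-up (x1, y1, x2, y2) coordinates for two neighboring pedestrians, and overlap_ratio is a hypothetical helper, not part of imutils.

```python
def overlap_ratio(kept_box, other_box):
    """Fraction of other_box's area covered by kept_box ((x1, y1, x2, y2) boxes)."""
    w = max(0, min(kept_box[2], other_box[2]) - max(kept_box[0], other_box[0]) + 1)
    h = max(0, min(kept_box[3], other_box[3]) - max(kept_box[1], other_box[1]) + 1)
    other_area = (other_box[2] - other_box[0] + 1) * (other_box[3] - other_box[1] + 1)
    return (w * h) / float(other_area)


# two made-up boxes for people walking side by side
person_a = (110, 60, 210, 260)
person_b = (50, 50, 150, 250)

ratio = overlap_ratio(person_a, person_b)
print(round(ratio, 3))

for thresh in (0.3, 0.65):
    verdict = "suppressed" if ratio > thresh else "kept"
    print("overlapThresh={}: second box {}".format(thresh, verdict))
```

With these boxes a little over a third of the second box is covered, so a threshold of 0.3 would throw away a real pedestrian while 0.65 keeps both.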
The above image shows the versatility of our HOG + SVM pedestrian detector. We are able to detect not only the adult male, but also the three small children. (Note that the detector is not able to find the other child hiding behind his [presumed to be] father.)
I include this image last simply because I find it amusing. We are clearly viewing a road sign, likely used to indicate a pedestrian crossing. However, our HOG + SVM detector marks the two people in this image as positive classifications!
Summary
In this blog post we learned how to perform pedestrian detection using the OpenCV library and the Python programming language.
The OpenCV library actually ships with a pre-trained HOG + Linear SVM detector based on the Dalal and Triggs method to automatically detect pedestrians in images.
While the HOG method tends to be more accurate than its Haar counterpart, it still requires that the parameters to detectMultiScale be set properly. In future blog posts, I’ll review each of the parameters to detectMultiScale, detail how to tune each of them, and describe the trade-offs between accuracy and performance.
Anyway, I hope you enjoyed this article! I’m planning on doing more object detection tutorials in the future, so if you want to be notified when these posts go live, please consider subscribing to the newsletter using the form below.
I also cover object detection using the HOG + Linear SVM method in detail inside the PyImageSearch Gurus course, so be sure to take a look!
Downloads:
The post Pedestrian Detection OpenCV appeared first on PyImageSearch.
from PyImageSearch http://ift.tt/1iNY3E6