Detecting Humans in an Image Using Python

Detecting humans in images is a common computer vision task, usually handled with machine learning models. This post gives a high-level overview of the theory, followed by a practical Python implementation using OpenCV and a pretrained YOLOv3 model.

Object Detection Basics:

Object detection means identifying and localizing objects within an image. It combines classification (deciding what the object is) with localization (deciding where it is, usually as a bounding box).
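To make the two halves concrete, here is a minimal sketch of how a detector's output can be represented. The `Detection` class and the `keep_people` helper are hypothetical names made up for this illustration, not part of any library, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # classification: what the object is
    confidence: float  # how sure the model is, between 0 and 1
    box: tuple         # localization: (x, y, width, height) in pixels


def keep_people(detections, min_confidence=0.5):
    """Return only confident detections labeled 'person'."""
    return [
        d for d in detections
        if d.label == "person" and d.confidence >= min_confidence
    ]


# Invented sample output from some detector
detections = [
    Detection("person", 0.91, (34, 20, 80, 200)),
    Detection("dog", 0.88, (150, 120, 90, 60)),
    Detection("person", 0.32, (300, 40, 70, 180)),
]

people = keep_people(detections)
print(people)
```

Only the first entry survives the filter: the dog has the wrong label and the second person falls below the confidence threshold.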

Machine Learning Models for Detection:

  • Haar Cascades: An early method built on hand-crafted Haar-like features and a cascade of boosted classifiers, most often used for face detection.
  • HOG + SVM: Histogram of Oriented Gradients (HOG) features combined with a Support Vector Machine (SVM) classifier, a classic approach to pedestrian detection.
  • Deep Learning Approaches:
      • CNNs: Convolutional Neural Networks learn features directly from the data and serve as the backbone for feature extraction and classification.
      • R-CNN, Fast R-CNN, and Faster R-CNN: Two-stage, region-based detectors that first propose candidate regions and then classify them. Each version improves on the speed and accuracy of the one before.
      • YOLO (You Only Look Once): A single-stage detector that predicts boxes and classes in one pass over the image, which makes it fast while staying accurate.
      • SSD (Single Shot MultiBox Detector): Another single-stage detector that balances speed and accuracy.
      • Mask R-CNN: Extends Faster R-CNN with a branch that predicts a segmentation mask for each detected object.
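As a point of comparison with the deep learning route, the HOG + SVM approach from the list can be tried with the pretrained people detector that ships with OpenCV. This is a rough sketch, not a tuned implementation: the function names `detect_people_hog` and `xywh_to_corners` are made up here, and the `winStride`, `padding`, and `scale` values are common starting points rather than recommended settings.

```python
def xywh_to_corners(rects):
    """Convert (x, y, w, h) boxes to (x1, y1, x2, y2) corner tuples."""
    return [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in rects]


def detect_people_hog(image_path):
    """Detect people with OpenCV's built-in HOG + linear SVM detector."""
    # Imported here so the helper above stays usable without OpenCV installed
    import cv2

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read {image_path}")

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    # Slide the detection window over the image at several scales
    rects, _weights = hog.detectMultiScale(
        image, winStride=(8, 8), padding=(8, 8), scale=1.05
    )
    return xywh_to_corners(rects)
```

Calling `detect_people_hog("path_to_image.jpg")` returns a list of corner boxes, or an empty list if nobody is found. This method needs no separate weight files, but it tends to miss people who are partly hidden or in unusual poses.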

Training and Inference:

  • Training: The model is fed labeled data (images annotated with bounding boxes) so that it learns the visual features of humans.
  • Inference: The trained model is applied to new images to detect humans. The code below uses pretrained YOLOv3 weights, so no training is needed.
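To give an idea of what "images with bounding boxes" looks like on disk, here is a small sketch that writes one annotation in the text format commonly used for YOLO training: one line per object, holding the class id followed by the box center, width, and height, all normalized by the image size. The post does not say which label format is used, so treat the format as an assumption; `to_yolo_label` is a hypothetical helper and the numbers are invented.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Turn a pixel box (x, y, w, h) into a normalized YOLO-style label line."""
    x, y, w, h = box
    center_x = (x + w / 2) / image_width
    center_y = (y + h / 2) / image_height
    norm_w = w / image_width
    norm_h = h / image_height
    return f"{class_id} {center_x:.6f} {center_y:.6f} {norm_w:.6f} {norm_h:.6f}"


# A person (class 0) occupying a 200x300 box in a 400x400 image
label_line = to_yolo_label(0, (100, 50, 200, 300), 400, 400)
print(label_line)
```

The detection code later in this post performs the reverse step: it multiplies the normalized values by the image width and height to get pixel coordinates back.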

Code Implementation:

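import cv2
import numpy as np

# Load the pretrained YOLOv3 network (both files must exist in the working directory)
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
# Output layer names; avoids indexing getUnconnectedOutLayers(), whose shape differs across OpenCV versions
output_layers = net.getUnconnectedOutLayersNames()

# Load the COCO class names, one per line ('person' is the first entry, index 0)
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f]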
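# Read the image; cv2.imread returns None if the file is missing or unreadable
image = cv2.imread("path_to_image.jpg")
if image is None:
    raise FileNotFoundError("Could not read path_to_image.jpg")
height, width = image.shape[:2]

# Scale pixels to [0, 1] (0.00392 is about 1/255), resize to 416x416, and swap BGR to RGB
blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)
outs = net.forward(output_layers)

class_ids = []
confidences = []
boxes = []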

for out in outs:
    for detection in out:
        # Each row is [center_x, center_y, width, height, objectness, class scores...]
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        # Keep confident detections of class 0, which is 'person' in coco.names
        if confidence > 0.5 and class_id == 0:
            # Box values are normalized to [0, 1]; scale them to pixels
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            # Convert from box center to top-left corner
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(int(class_id))
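# Non-maximum suppression: drop boxes scoring below 0.5 and overlapping duplicates (IoU above 0.4)
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
# Flatten the result, since its shape differs across OpenCV versions and it is empty when nothing is kept
for i in np.array(indexes).flatten():
    x, y, w, h = boxes[i]
    label = str(classes[class_ids[i]])
    color = (0, 255, 0)  # Green box (BGR order)
    cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
    cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

cv2.imshow("Image", image)
cv2.waitKey(0)  # Wait for a key press before closing the window
cv2.destroyAllWindows()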
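
The call to cv2.dnn.NMSBoxes above is what stops the same person from being boxed several times. For readers curious about what that step does, here is a simplified pure-Python sketch of greedy non-maximum suppression. It illustrates the idea only: the `iou` and `nms` functions are written for this post, and OpenCV's implementation may differ in details such as tie-breaking. The sample boxes and scores are invented.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint)
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    # Discard low-scoring boxes, then visit the rest from best to worst
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


# Two near-duplicate boxes around one person, plus a separate person
boxes = [[100, 100, 50, 100], [105, 102, 50, 100], [300, 80, 60, 120]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

The second box overlaps the first with an IoU of roughly 0.79, well above the 0.4 threshold, so it is suppressed and only the first and third boxes remain.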

 
