Object Detection

Object detection algorithms locate and classify objects within digital images or video frames. YOLO (You Only Look Once) is a fast detection algorithm known for real-time performance. It has several versions, including YOLOv8, released in 2023. YOLO detects multiple objects in an image or video frame in a single forward pass of the neural network. It divides the input image into a grid and predicts bounding boxes and class probabilities for each grid cell.
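
The grid idea can be illustrated with a minimal sketch. In the original YOLO formulation, the grid cell that contains the center of an object is responsible for predicting that object. The function name grid_cell, the 7x7 grid, and the example box are assumptions for illustration only. They are not taken from the YOLOv8 code, which uses a more elaborate multi-scale prediction head.

```python
def grid_cell(box, img_w, img_h, S=7):
    """Return (row, col) of the grid cell containing the box center.

    box is (x1, y1, x2, y2) in pixels; S is the number of cells per side.
    """
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    # Clamp so a center on the right or bottom edge stays in the last cell
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col

# Hypothetical box in a 640x480 image
print(grid_cell((100, 200, 300, 400), 640, 480))
```

The box center is at (200, 300) pixels, which falls in row 4 and column 2 of the 7x7 grid when counting from zero.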

YOLO is popular for real-time applications like self-driving cars, surveillance, and robotics. Install YOLOv8 from ultralytics using pip:

pip install ultralytics

Below is an example code snippet to perform object detection from an image file:

from ultralytics import YOLO
import matplotlib.pyplot as plt

# URL of the sample image (fetched when the model is called)
file = 'street.png'
url = 'http://apmonitor.com/dde/uploads/Main/'+file

# Load pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Prediction
results = model(url)

# Plot results (returns the annotated image as a BGR array)
fr = results[0].plot()
# Convert BGR to RGB and show
plt.imshow(fr[:,:,[2,1,0]])
plt.show()

The results variable is a list with one entry per image. Each entry contains boxes with the bounding box coordinates, class, and confidence for each detection.
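
A minimal sketch of reading those values follows. With ultralytics, the box corners, confidences, and class indices are available as results[0].boxes.xyxy, results[0].boxes.conf, and results[0].boxes.cls, and the class names as results[0].names. The helper list_detections, the min_conf threshold, and the sample values are hypothetical so that the snippet runs without the model.

```python
def list_detections(xyxy, conf, cls, names, min_conf=0.25):
    """Return (name, confidence, box) tuples sorted by confidence.

    xyxy:  list of [x1, y1, x2, y2] corner coordinates in pixels
    conf:  list of confidence scores between 0 and 1
    cls:   list of class indices
    names: dict mapping class index to class name
    """
    found = []
    for box, c, k in zip(xyxy, conf, cls):
        if c >= min_conf:
            found.append((names[int(k)], round(float(c), 2),
                          [round(float(v)) for v in box]))
    # Highest confidence first
    return sorted(found, key=lambda d: d[1], reverse=True)

# With a YOLOv8 result r = results[0], the inputs would be:
#   r.boxes.xyxy.tolist(), r.boxes.conf.tolist(), r.boxes.cls.tolist(), r.names
# Made-up sample values (not actual model output):
names = {0: 'person', 2: 'car'}
xyxy = [[300.0, 150.0, 620.6, 400.1], [10.2, 20.7, 110.0, 220.4], [5, 5, 9, 9]]
conf = [0.78, 0.91, 0.10]
cls = [2.0, 0.0, 0.0]
for name, c, box in list_detections(xyxy, conf, cls, names):
    print(name, c, box)
```

The third sample box is dropped because its confidence is below the assumed threshold of 0.25.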

The detection misses the airplane, the bus, and some handbags or briefcases, but it correctly identifies objects such as the car and the people. YOLOv8 can be used with a pretrained model or trained on a custom labeled image set.

Activity: Identify Objects from Video

Extend the single image object detection to the example from Video Data Analysis. Test the object detection with the runner video sample.

import cv2
from ultralytics import YOLO
import time

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Open the video file
url = 'http://apmonitor.com/dde/uploads/Main/runner.mp4'
cap = cv2.VideoCapture(url)

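# Save results video file at 5 fps with the source frame size
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('results.mp4',
                      cv2.VideoWriter_fourcc(*'mp4v'),
                      5,(w,h))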

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)

        # Visualize the results on the frame
        fr = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8", fr)

        # Save frame to video
        out.write(fr)

        # Pause for 0.2 sec
        time.sleep(0.2)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture and writer objects and close the display window
cap.release()
out.release()
cv2.destroyAllWindows()

Detect Objects from Webcam Video

Modify the script to read from a webcam instead of a video file. Use cv2.VideoCapture(0) for the first webcam or cv2.VideoCapture(1) for the second if there are multiple webcams. Remove the 0.2 sec time delay, but add a time limit of 30 seconds. Pressing q also stops the video capture. There is no need to save the video. If the video is saved, match the frame resolution to the video output dimensions to avoid an error.

import cv2
from ultralytics import YOLO
import time

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Open the webcam stream
cap = cv2.VideoCapture(0)

# Record start time
start = time.time()

# Loop through the video frames
while (cap.isOpened()) and ((time.time()-start)<=30.0):
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)

        # Visualize the results on the frame
        fr = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8", fr)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if no frame is read from the webcam
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
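
If the webcam video is saved, one way to avoid a size mismatch is to scale each frame to fit the writer dimensions before it is written. The sketch below only computes the target size and padding. The helper name fit_to_output and the example sizes are assumptions. The actual resize could then be done with cv2.resize, with the remaining border filled by cv2.copyMakeBorder.

```python
def fit_to_output(frame_w, frame_h, out_w, out_h):
    """Scale a frame to fit inside the output size, keeping aspect ratio.

    Returns (new_w, new_h, pad_x, pad_y) where pad_x and pad_y are the
    total padding in pixels needed to reach (out_w, out_h).
    """
    scale = min(out_w / frame_w, out_h / frame_h)
    # Guard against rounding one pixel past the output size
    new_w = min(out_w, round(frame_w * scale))
    new_h = min(out_h, round(frame_h * scale))
    return new_w, new_h, out_w - new_w, out_h - new_h

# Hypothetical 640x480 webcam frame written to a 1920x1080 video
print(fit_to_output(640, 480, 1920, 1080))
```

A simpler alternative is to create the video writer with the frame size reported by the capture object, so that no resizing is needed.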