Object Detection
Computer vision detection algorithms locate and identify objects within digital images or video frames. YOLO (You Only Look Once) is a fast detection algorithm known for real-time performance. It has several versions; YOLOv8 was released in 2023. YOLO detects multiple objects in an image or video frame in a single forward pass of the neural network. It divides the input image into a grid and predicts bounding boxes and class probabilities for each grid cell.
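The grid idea can be illustrated with a minimal sketch that finds which grid cell contains the center of an object. The function name grid_cell, the 7 by 7 grid, and the image size are illustrative assumptions; none of them are part of the ultralytics package, and the actual network handles this step internally.

```python
def grid_cell(cx, cy, img_w, img_h, grid=7):
    """Return (row, col) of the grid cell that contains the point (cx, cy)."""
    # Scale the pixel position to grid units and truncate to an integer index
    col = min(int(cx * grid / img_w), grid - 1)
    row = min(int(cy * grid / img_h), grid - 1)
    return row, col

# Hypothetical object centered at pixel (320, 120) in a 640x480 image
print(grid_cell(320, 120, 640, 480))
```

The min() call keeps a point on the right or bottom edge of the image inside the last cell instead of producing an index one past the end of the grid.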
YOLO is popular for real-time applications such as self-driving cars, surveillance, and robotics. Install YOLOv8 from ultralytics using pip:
pip install ultralytics
The example below runs object detection on a sample image file:
from ultralytics import YOLO
import matplotlib.pyplot as plt
# Sample image URL
file = 'street.png'
url = 'http://apmonitor.com/dde/uploads/Main/'+file
# Load pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Prediction (the image is read from the URL)
results = model(url)
# Annotate the image with the detections
fr = results[0].plot()
# Convert BGR to RGB and show
plt.imshow(fr[:,:,[2,1,0]])
plt.show()
The results variable contains the boxes for each detection, with the bounding box coordinates, the predicted class, and the confidence.
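One way to inspect the detections is to convert the boxes to plain lists and print a label, confidence, and box for each one. The sketch below uses a hypothetical helper, summarize_detections, and made-up numbers so that it runs without a model; the 0.5 confidence cutoff is also an assumption. The commented lines at the end show how the same lists can be taken from results[0].boxes.

```python
def summarize_detections(xyxy, conf, cls, names, min_conf=0.5):
    """Return (label, confidence, box) tuples at or above min_conf,
    sorted from highest to lowest confidence."""
    rows = []
    for box, c, k in zip(xyxy, conf, cls):
        if c >= min_conf:
            label = names[int(k)]                  # class index -> class name
            corners = [round(float(v)) for v in box]  # x1, y1, x2, y2 in pixels
            rows.append((label, round(float(c), 2), corners))
    return sorted(rows, key=lambda r: r[1], reverse=True)

# Made-up values standing in for the detections of one image
xyxy = [[10.2, 20.7, 110.0, 220.4],
        [300.0, 50.0, 380.6, 210.0],
        [5.0, 5.0, 30.0, 40.0]]
conf = [0.91, 0.48, 0.77]
cls = [0.0, 2.0, 0.0]
names = {0: 'person', 2: 'car'}

for label, c, box in summarize_detections(xyxy, conf, cls, names):
    print(label, c, box)

# With ultralytics installed, the lists come from the first result:
# b = results[0].boxes
# summarize_detections(b.xyxy.tolist(), b.conf.tolist(),
#                      b.cls.tolist(), results[0].names)
```

Lowering min_conf shows weaker detections that the cutoff would otherwise hide, at the cost of more false positives.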
The pretrained model misses the airplane, the bus, and some of the handbags or briefcases. It correctly identifies objects such as the car and the people. YOLOv8 can be used with a pretrained model or trained with a custom set of labeled images.
Activity: Identify Objects from Video
Extend the single image object detection to the example from Video Data Analysis. Test the object detection with the runner video sample.
from ultralytics import YOLO
import cv2
import time
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
# Open the video file
url = 'http://apmonitor.com/dde/uploads/Main/runner.mp4'
cap = cv2.VideoCapture(url)
# Save results video file at 5 fps
out = cv2.VideoWriter('results.mp4',
                      cv2.VideoWriter_fourcc(*'mp4v'),
                      5,(1920,1080))
# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()
    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        fr = results[0].plot()
        # Display the annotated frame
        cv2.imshow("YOLOv8", fr)
        # Save frame to video
        out.write(fr)
        # Pause for 0.2 sec
        time.sleep(0.2)
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break
# Release the video capture and writer objects and close the display window
cap.release()
out.release()
cv2.destroyAllWindows()
Detect Objects from Webcam Video
Modify the script to read from a webcam instead of a video file source. Change to cv2.VideoCapture(0) (1st webcam) or cv2.VideoCapture(1) (2nd webcam) if there are multiple webcams. Remove the 0.2 sec time delay, but add a time limit of 30 seconds. Pressing q also quits the video collection. There is no need to save the video. If the video is saved, set the video output dimensions to match the frame resolution to avoid an error.
from ultralytics import YOLO
import cv2
import time
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
# Open the webcam stream
cap = cv2.VideoCapture(0)
# Record start time
start = time.time()
# Loop through the video frames for up to 30 sec
while (cap.isOpened()) and ((time.time()-start)<=30.0):
    # Read a frame from the webcam
    success, frame = cap.read()
    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        fr = results[0].plot()
        # Display the annotated frame
        cv2.imshow("YOLOv8", fr)
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if no frame is returned
        break
# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
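If the annotated webcam frames are saved, the writer dimensions need to match the frame. A frame array has shape (height, width, channels), while cv2.VideoWriter takes the size as (width, height), so the order is easy to swap by mistake. The sketch below uses a hypothetical helper, writer_size, and an assumed 640x480 frame to show the conversion; the commented lines suggest how it could be used with the script above.

```python
def writer_size(frame_shape):
    """Convert a frame shape (height, width, channels) into the
    (width, height) order used for the video writer size."""
    height, width = frame_shape[:2]
    return (width, height)

# Assumed shape of a 640x480 color frame
print(writer_size((480, 640, 3)))

# Possible use after the first annotated frame is available:
# fr = results[0].plot()
# out = cv2.VideoWriter('results.mp4',
#                       cv2.VideoWriter_fourcc(*'mp4v'),
#                       5, writer_size(fr.shape))
```

Creating the writer from the first frame avoids hard-coding a resolution such as (1920,1080) that may not match the webcam.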