Scan for an image on the screen using Python

In this tutorial, we will learn how to automatically detect an image on the screen using Python. We will build a function that searches the screen for a given image and returns its location, using PyAutoGUI to capture screenshots and OpenCV for image processing and template matching.

Image detection on the screen

Key Libraries and Tools:

PyAutoGUI: Mainly used to automate mouse and keyboard actions and to take screenshots. In the code below, it captures the current screen as an image.
OpenCV: A powerful library for computer vision tasks. In the code below, it loads and processes the images and performs the template matching that searches for the image on the screen.
NumPy: Mainly used for efficient array manipulation and general operations on image data. In the code below, it converts the screenshot into an array that OpenCV can work with.

NumPy is short for Numerical Python. It is a library for handling large, multi-dimensional arrays and matrices, together with a large collection of high-level mathematical functions that operate on them. It underlies most scientific computing in Python and is widely used in data analysis, machine learning, and image processing.

OpenCV is a widely used open-source library for computer vision, image processing, and machine learning. It is broadly used in robotics, artificial intelligence, and video analysis. Its tools and algorithms let developers work with images, videos, and real-time data for tasks such as object detection, image segmentation, and motion tracking.
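To make the role of NumPy concrete, here is a minimal sketch of how an image looks as an array and what the RGB to BGR conversion does. It uses a hand-made two-pixel array (an assumption for illustration, not a real screenshot) and reverses the channel axis with plain NumPy, which for a 3-channel image gives the same result as cv2.cvtColor with cv2.COLOR_RGB2BGR.

```python
import numpy as np

# A tiny 1x2 "image" in RGB order: one pure-red pixel and one pure-blue pixel.
# (Illustrative data only; a real screenshot would be much larger.)
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Image arrays are indexed as (height, width, channels).
print(rgb.shape)

# Reversing the last axis swaps the R and B channels, giving BGR order,
# which is the channel order OpenCV expects.
bgr = rgb[..., ::-1]

# The red pixel now stores its 255 in the last channel.
print(bgr[0, 0].tolist())
```

Note that the shape is (height, width, channels), which is why the screen dimensions in this tutorial are printed as (1080, 1920, 3) for a 1920x1080 display.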

Python code

import pyautogui
import cv2
import numpy as np

def locate_image_on_screen(image_path, confidence=0.7):  # confidence: minimum match score required to accept a match
    # Capture the screen
    screen = pyautogui.screenshot()

    # Convert the screenshot to a numpy array
    screen_np = np.array(screen)

    # Convert RGB to BGR (which OpenCV uses)
    screen_np = cv2.cvtColor(screen_np, cv2.COLOR_RGB2BGR)

    # Load the template image
    template = cv2.imread(image_path)

    # Debugging: Check if the image is loaded and print its dimensions
    if template is None:
        print("Error: Could not load image. Check the file path and image format.")
        return None
    else:
        print(f"Template image loaded with dimensions: {template.shape}")

    # Print screen dimensions for debugging
    print(f"Screen dimensions: {screen_np.shape}")

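    # Resize the template if necessary. cv2.resize expects (width, height).
    # These values match the template used in this example; adjust them so the
    # template has the same size as the image appears on your screen.
    desired_width, desired_height = 188, 180
    template = cv2.resize(template, (desired_width, desired_height))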

    # Perform template matching
    result = cv2.matchTemplate(screen_np, template, cv2.TM_CCOEFF_NORMED)

    # Get the location of the best match with the given confidence level
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

    # Debugging: Print matching values
    print(f"Max value: {max_val}, Max location: {max_loc}")

    if max_val >= confidence:
        # Calculate the center of the detected region
        h, w, _ = template.shape
        center_x = max_loc[0] + w // 2
        center_y = max_loc[1] + h // 2

        # Draw rectangle around the match for visualization
        top_left = max_loc
        bottom_right = (top_left[0] + w, top_left[1] + h)
        cv2.rectangle(screen_np, top_left, bottom_right, (0, 255, 0), 2)

        # Show the result
        cv2.imshow('Detected', screen_np)
        cv2.waitKey(0)  # Press any key to close the window
        cv2.destroyAllWindows()

        return (center_x, center_y)
    else:
        print("Image not found on the screen.")
        return None

# Example usage
image_path = r"C:\Users\pavan\OneDrive\Desktop\image scanning\Screenshot 2024-05-05 192559.png"  # Use the correct absolute path to the image
location = locate_image_on_screen(image_path)

# The function already prints a message when the image is not found,
# so only the success case is reported here.
if location is not None:
    print(f"Image found at location: {location}")

Output


Template image loaded with dimensions: (180, 188, 3)
Screen dimensions: (1080, 1920, 3)
Max value: 1.0, Max location: (861, 452)
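
The center that the function returns can be worked out by hand from the values printed above. The short sketch below repeats the arithmetic from the function using those printed numbers; the variable name template_shape is only for illustration.

```python
# Values taken from the output above.
max_loc = (861, 452)            # top-left corner of the best match, as (x, y)
template_shape = (180, 188, 3)  # (height, width, channels)

h, w, _ = template_shape

# Same calculation as in locate_image_on_screen():
# move half the template width right and half its height down.
center = (max_loc[0] + w // 2, max_loc[1] + h // 2)
print(center)  # (955, 542)
```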

How the code works

Screen Capture: We capture the current screen using pyautogui.screenshot(). The screenshot is converted to a NumPy array and then from RGB to BGR channel order so that OpenCV can process it.
Image Loading: We load the image we want to find using cv2.imread(). If the image fails to load, typically because of a wrong file path or an unsupported format, cv2.imread() returns None; the function then prints an error message and returns None.
Template Matching: Template matching is an OpenCV feature that slides a small image, the template, over a larger image (here, the screen capture) and scores how similar each region is. The code uses cv2.matchTemplate() with the cv2.TM_CCOEFF_NORMED method, then cv2.minMaxLoc() to find the highest-scoring position.
Confidence Threshold: After matching, the code takes the score of the best match and compares it with the confidence threshold, which defaults to 0.7. If the score meets or exceeds the threshold, the image is considered to be present on the screen.
Rectangle and Visual Feedback: For visual verification, the code draws a green rectangle around the detected area and displays the result with OpenCV’s cv2.imshow(). The window stays open until a key is pressed.
Center Coordinates: On a successful match, the function calculates and returns the center coordinates of the matched region. These can be passed to PyAutoGUI, for example to pyautogui.click(x, y), to click on the image.
