This tutorial shows how to detect text on the screen using Python and Tesseract. The script captures an image of all or part of your computer screen, processes the image, and extracts the text with OCR (optical character recognition).
Text detection on the screen
Key Libraries and Tools:
- PyAutoGUI: A Python library for controlling the mouse and keyboard. It can also take screenshots.
- Pillow: A Python imaging library for opening, manipulating, and saving images.
- Tesseract-OCR: The OCR engine that performs the text recognition. It is installed separately from the Python packages.
- pytesseract: A Python wrapper for Tesseract that makes it easy to call the engine from a Python script.
- OpenCV: A computer vision library, used here for image processing.
- NumPy: An array library, used here to convert the screenshot into an array that OpenCV can process.
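PyAutoGUI's `screenshot()` accepts a `region` given as (left, top, width, height), which is easy to confuse with a pair of corner coordinates. The sketch below is one way to build that tuple from two corner points. The helper names `region_from_corners` and `capture_region` are hypothetical, introduced here for illustration only.

```python
def region_from_corners(x1, y1, x2, y2):
    """Convert two opposite corner points into a (left, top, width, height) tuple."""
    left, top = min(x1, x2), min(y1, y2)
    width, height = abs(x2 - x1), abs(y2 - y1)
    if width == 0 or height == 0:
        raise ValueError("region must have a non-zero width and height")
    return (left, top, width, height)


def capture_region(x1, y1, x2, y2):
    """Grab the rectangle between two corners and return it as a Pillow image."""
    # Imported lazily because PyAutoGUI needs a graphical display to load.
    import pyautogui
    return pyautogui.screenshot(region=region_from_corners(x1, y1, x2, y2))


# The corners can be given in either order.
print(region_from_corners(100, 200, 500, 350))  # (100, 200, 400, 150)
print(region_from_corners(500, 350, 100, 200))  # (100, 200, 400, 150)
```

Capturing only the area that contains the text of interest keeps window borders and menus out of the OCR input.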
Tesseract-OCR is an open-source optical character recognition (OCR) engine, originally developed at Hewlett-Packard and later sponsored by Google. It recognizes text in images and converts it into editable digital text. It supports many languages and runs on several platforms.
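As a minimal sketch of the multi-language support, pytesseract's `image_to_string` takes a `lang` argument in which several language codes are joined with `+`. The helper names below are hypothetical, and the example assumes the matching language data files (here `eng` and `hin`) are installed alongside Tesseract.

```python
def build_lang_arg(languages):
    """Join Tesseract language codes into the 'eng+hin' form used by the lang argument."""
    codes = [code.strip() for code in languages if code.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_with_languages(image, languages=("eng",)):
    """Run OCR on an image using one or more languages."""
    # Imported lazily so the helper above can be used without Tesseract installed.
    import pytesseract
    return pytesseract.image_to_string(image, lang=build_lang_arg(languages))


print(build_lang_arg(["eng", "hin"]))  # eng+hin
```

If a requested language file is missing, Tesseract reports an error, so it is worth checking the installed languages before relying on this.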
Python Script
import cv2
import numpy as np
import pyautogui
import pytesseract

# Set the path to the Tesseract executable (default Windows install location)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Capture a screenshot of a specific region: (left, top, width, height)
region = (0, 0, 1920, 1080)
screenshot = pyautogui.screenshot(region=region)
screenshot.save('form_filled.png')

# Convert the screenshot (a Pillow image) to a NumPy array
screenshot_np = np.array(screenshot)

# Convert to grayscale (the array is in RGB channel order)
gray_screenshot = cv2.cvtColor(screenshot_np, cv2.COLOR_RGB2GRAY)

# Extract text
text = pytesseract.image_to_string(gray_screenshot)
print("Detected text:")
print(text)
Output
India.txt x + File Edit View India, located in South Asia, is the world’s seventh-largest nation by area and its most populous, with over 1.4 billion people. It has a history going back thousands of years, but India is the birthplace of the biggest religion systems, including Hinduism, Buddhism, Jainism, and Sikhism, Indian culture is extremely diverse because of centuries-long influence by indigenous kingdoms and foreign invasions and colonization. The country finally emerged from the shackles of British rule in 1947. Mahatma Gandhi's non-violent resistance was of great importance to the freedom movement. Itis one of the fastest growing economies in the world today, which is by information technology, agriculture, and manufacturing services. India is also an innovation hub and home to many startups. The political system of India is that of the world's biggest democracy, consisting of a federal system composed of 28 states and 8 union territories. New Delhi is the capital India's diversity goes even into its geography: from the Himalayas to the beaches in southern India, everything in between is included. Linguistically, the landscape is just as vibrant with a space for 22 officially recognized languages, including the many usage of Hindi and English. Festivals like Diwali, Eid, and Christmas reflect the unity in India's diversity, something that makes India one of the most unique melting pots of cultures and beliefs.
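The plain-string output mixes everything Tesseract found on screen, including the window title and menu text. When more control is needed, `pytesseract.image_to_data` with `output_type=pytesseract.Output.DICT` returns per-word text, confidence, and position as parallel lists. The sketch below filters such a result by confidence. The helper names, the threshold of 60, and the sample values are assumptions made for illustration, not measurements from the screenshot above.

```python
def confident_words(data, min_conf=60.0):
    """Keep words whose confidence is at least min_conf.

    `data` is a dict of parallel lists in the shape returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    words = []
    for text, conf, left, top in zip(data["text"], data["conf"], data["left"], data["top"]):
        # Layout entries without text carry a confidence of -1.
        # float() accepts both string and numeric confidence values.
        if text.strip() and float(conf) >= min_conf:
            words.append({"text": text, "conf": float(conf), "left": left, "top": top})
    return words


def ocr_confident_words(image, min_conf=60.0):
    """Run OCR on an image and return only the confidently recognized words."""
    # Imported lazily so confident_words can be used without Tesseract installed.
    import pytesseract
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    return confident_words(data, min_conf)


# Hand-made sample in the same shape as the DICT output (values are invented).
sample = {
    "text": ["", "India.txt", "x", "India,", "located"],
    "conf": ["-1", "91", "38", "96", "95.5"],
    "left": [0, 12, 140, 10, 70],
    "top": [0, 8, 8, 60, 60],
}
print([w["text"] for w in confident_words(sample)])  # ['India.txt', 'India,', 'located']
```

The `left` and `top` values kept for each word could also be used to discard text from a known area, such as a title bar, though a tighter capture region is usually the simpler fix.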