Text detection and extraction involve finding and reading text in images. In Python, we can use OpenCV for image processing and an OCR (optical character recognition) tool such as Tesseract for reading the text.
The steps below show how to detect and extract text using OpenCV and OCR in Python.
Step 1: Installation
- Install Tesseract OCR
1. Download and install tesseract from https://github.com/tesseract-ocr/tesseract
2. Install pytesseract: pip install pytesseract
- Install OpenCV:
pip install opencv-python
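pytesseract is only a wrapper: it calls the tesseract executable installed in the first step, so it is worth confirming that the executable can be found before running any OCR. The following is a minimal sketch using only the standard library. `find_tesseract` is a hypothetical helper name (it is not part of pytesseract), and the Windows path passed to it is just the default install location used later in this article.

```python
import os
import shutil


def find_tesseract(extra_candidates=()):
    """Return a path to the tesseract executable, or None if not found.

    Hypothetical helper: checks PATH first, then any extra candidate paths.
    """
    found = shutil.which("tesseract")
    if found:
        return found
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Default Windows install location; adjust or extend for your system
    path = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
    print(path or "tesseract not found; check the installation")
```

If a path is returned, it can be assigned to pytesseract.pytesseract.tesseract_cmd instead of hard-coding it.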
Step 2: Example code
import cv2
import pytesseract

# Path to the Tesseract executable (Windows default; adjust for your system)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Convert image to grayscale
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply Otsu thresholding to binarize the image
def thresholding(image):
    return cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Remove noise with a 5x5 median filter
def remove_noise(image):
    return cv2.medianBlur(image, 5)

# Load image
img = cv2.imread("untitled.png")
if img is None:
    raise FileNotFoundError("untitled.png could not be read")

# Preprocess: grayscale, denoise, then threshold
processed = thresholding(remove_noise(get_grayscale(img)))

# OCR on the full preprocessed image
text = pytesseract.image_to_string(processed)

# Display results
print(text)
cv2.imshow("Img", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Input image: https://photos.app.goo.gl/V1BBDG3RJP5ijWZHA
Output:
don't stare too long you'll miss the train
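The example above reads all the text in one pass but does not say where in the image it was found. For the detection part, pytesseract also provides image_to_data, which returns each recognized word with its bounding box and a confidence score. The sketch below is one possible way to use it: `words_with_boxes` and `detect_words` are hypothetical helper names, and the confidence cut-off of 60 is an arbitrary assumption to tune per image.

```python
def words_with_boxes(data, min_conf=60.0):
    """Turn the dict from pytesseract.image_to_data into (text, (x, y, w, h)) tuples.

    Entries with empty text or a confidence below min_conf are skipped.
    """
    results = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue
        if float(data["conf"][i]) < min_conf:
            continue
        box = (data["left"][i], data["top"][i], data["width"][i], data["height"][i])
        results.append((text, box))
    return results


def detect_words(image_path, min_conf=60.0):
    """Run Tesseract on an image file and draw a rectangle around each word."""
    # Imported here so the helper above can be used without these packages
    import cv2
    import pytesseract
    from pytesseract import Output

    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    data = pytesseract.image_to_data(img, output_type=Output.DICT)
    words = words_with_boxes(data, min_conf)
    for _, (x, y, w, h) in words:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return words, img
```

Calling detect_words("untitled.png") would return the list of words with their boxes, plus a copy of the image with the boxes drawn, which can be shown with cv2.imshow.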
Step 3: Explanation
- Grayscale: image is the original color image loaded using cv2.imread(). cv2.COLOR_BGR2GRAY converts it from Blue-Green-Red (BGR), OpenCV's default channel order, to grayscale.
- Thresholding: thresholding makes the text more distinct, aiding OCR. 0 is the threshold argument, which is ignored here because cv2.THRESH_OTSU computes the threshold automatically. 255 is the maximum value, assigned to pixels above the threshold. cv2.THRESH_BINARY is the type of thresholding: pixels above the threshold become 255 and all others become 0.
- Noise removal: the median filter (cv2.medianBlur) is well suited to salt-and-pepper noise; 5 is the kernel size.
- Text extraction: Tesseract OCR (pytesseract.image_to_string) extracts text from the processed image.