Extract Text From an Image Using Python

Welcome! In this tutorial, we will explore how to extract text from images using Python. Sounds interesting, right? You can easily do this using Optical Character Recognition (OCR). We will use Python, Pillow, and the Tesseract OCR engine.

Step 1: Install Python Libraries

Pillow: A fork of the Python Imaging Library (PIL) that provides image processing capabilities.
Pytesseract: A Python wrapper for Google’s Tesseract-OCR Engine.
Tesseract-OCR: The actual OCR engine. It is not a Python package, so it is installed separately in Step 2.

pip install Pillow pytesseract
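If you want to confirm that both packages landed in the Python environment you are actually using, a small check like the one below can help. It is only a sketch using the standard library, and the helper name missing_packages is made up for this example. Note that Pillow is imported under the name PIL.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names from the list that Python cannot locate."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Pillow is imported as "PIL"; pytesseract is imported under its own name.
    missing = missing_packages(["PIL", "pytesseract"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Pillow and pytesseract are both installed.")
```

If a name is reported as not installed, rerun the pip command above with the same Python interpreter you use to run your scripts.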

Step 2: Install Tesseract-OCR

For Windows:

Download the Tesseract installer from UB Mannheim’s website.
Run the installer and complete the installation.
Add the Tesseract installation folder (e.g. C:\Program Files\Tesseract-OCR) to your system’s PATH environment variable.

For macOS (using Homebrew):

brew install tesseract

For Linux (Ubuntu):

sudo apt install tesseract-ocr

Verify the Tesseract Installation

To ensure Tesseract-OCR is installed correctly, run the following command in your terminal. If the installation was successful, it prints the installed version.

tesseract --version
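The same check can also be done from Python before you run any OCR code. The following is a minimal sketch that uses only the standard library. The function names find_tesseract and tesseract_version are made up for this example, and it assumes the executable is called tesseract and is reachable through PATH.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Run `<executable> --version` and return the first line of its output."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Depending on the Tesseract version, the banner goes to stdout or stderr.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH. Check Step 2.")
    else:
        print("Found Tesseract at:", path)
        print(tesseract_version(path))
```

If this reports that Tesseract was not found even though the installer finished, the PATH entry from Step 2 is the first thing to recheck. On Windows you may need to open a new terminal for the change to take effect.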

Step 3: Write the Python Script/Code

Now let’s write a Python script to load an image and extract its text.

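3.1 Import Required Libraries

Import the Image class from Pillow (the PIL package) and the pytesseract module.

3.3 Load the Image

Open the image file with Image.open().

3.4 Extract Text Using OCR

Pass the loaded image to pytesseract.image_to_string(), which returns the recognized text as a string.

3.5 Print the Extracted Text

Print the returned string to the console.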

3.6 Full Code Example

Here’s the complete script:

# Import necessary libraries
from PIL import Image
import pytesseract

# Set the Tesseract path on Windows (comment out this line on other operating systems)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the image
image_path = 'path_to_your_image.jpg' # Replace with your image file path
image = Image.open(image_path)

# Perform OCR, i.e. extract the text from the image
extracted_text = pytesseract.image_to_string(image)

# Print the result
print("Extracted Text:\n", extracted_text)
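The script above hard-codes the Windows path, so that line has to be commented out by hand on macOS and Linux. One possible variation, shown here only as a sketch, sets the path automatically when the script runs on Windows and the file exists. The names windows_tesseract_path and extract_text are made up for this example, and the default location is the one used earlier in this tutorial, which may differ on your machine.

```python
import os
import platform

# Default install location used earlier in this tutorial; adjust if yours differs.
WINDOWS_TESSERACT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def windows_tesseract_path(system=None, candidate=WINDOWS_TESSERACT):
    """Return the path to configure on Windows, or None when no override is needed."""
    system = system or platform.system()
    if system == "Windows" and os.path.isfile(candidate):
        return candidate
    return None


def extract_text(image_path):
    """Open an image and return the text Tesseract recognizes in it."""
    # Imported here so the helper above can be used without the OCR packages.
    from PIL import Image
    import pytesseract

    override = windows_tesseract_path()
    if override is not None:
        pytesseract.pytesseract.tesseract_cmd = override

    # The context manager closes the image file once OCR is done.
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


if __name__ == "__main__":
    try:
        print(extract_text("path_to_your_image.jpg"))
    except FileNotFoundError:
        print("Image not found. Check the file path.")
    except Exception as error:  # for example, Tesseract missing from PATH
        print("OCR failed:", error)
```

Wrapping the OCR call in a function also makes it easier to reuse for several images, and the error handling turns a missing file or a missing Tesseract installation into a readable message instead of a traceback.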

Step 4: Run Your Script

Save the script as extract_text_from_img.py and run it in your terminal or command prompt:

Navigate to the script directory:

cd path\to\your\script

Run the script with the following command:

python extract_text_from_img.py
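If you would rather not edit the script every time the image changes, the path can be read from the command line instead. This is a minimal sketch of that idea, not part of the script above. The helper name image_path_from_args is made up for this example, and it assumes the first argument after the script name is the image path.

```python
import sys


def image_path_from_args(argv):
    """Return the image path given on the command line, or None if it is missing."""
    return argv[1] if len(argv) > 1 else None


def main(argv):
    image_path = image_path_from_args(argv)
    if image_path is None:
        print("Usage: python extract_text_from_img.py <image_path>")
        return

    # Imported here so the usage message works even before the packages are installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        print(pytesseract.image_to_string(image))


if __name__ == "__main__":
    main(sys.argv)
```

With this version the call becomes python extract_text_from_img.py path_to_your_image.jpg. On Windows, the tesseract_cmd line from the full example is still needed unless Tesseract is on your PATH.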
