
Image Text to Speech using Python

By Rohit Mantri

This is a simple Python project that extracts the text from an image and stores it in a text file. It then reads the text back from the file and converts it into speech.


In this project, we need to import four modules:

1) pytesseract

Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and “read” the text embedded in images. Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine.

2) Image (from PIL)

The Image module provides a class with the same name which is used to represent a PIL image. The module also provides a number of factory functions, including functions to load images from files, and to create new images.

3) gTTS

gTTS (Google Text-to-Speech) is a Python library and CLI tool that interfaces with Google Translate's text-to-speech API.

4) os

The OS module in Python provides functions for interacting with the operating system.
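Before running the project, it can help to confirm that the four modules are importable. The following is a minimal sketch, not part of the original code: the helper name `missing_packages` and the `REQUIRED` mapping are made up for illustration. It relies on the fact that Pillow is imported as `PIL` and gTTS is imported as `gtts`.

```python
from importlib.util import find_spec

# Import name -> pip package name (None for standard-library modules).
# This mapping is an illustrative assumption, not taken from the project code.
REQUIRED = {
    "pytesseract": "pytesseract",
    "PIL": "Pillow",
    "gtts": "gTTS",
    "os": None,
}


def missing_packages(required=REQUIRED):
    """Return the pip names (or import names) of modules that cannot be found."""
    return [
        pip_name or import_name
        for import_name, pip_name in required.items()
        if find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("All four modules are available.")
```

Note that this only checks the Python packages. The Tesseract-OCR engine itself is a separate program and has to be installed on the system as well.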



The steps are as follows:

1) We install tesseract-ocr, an optical character recognition engine, so that pytesseract can use it to extract text from the image.

2) We select the image from which the text is to be extracted.

3) We extract the text and store it in a text file.

4) We read the text back from the text file.

5) We create a gTTS object from this text, select English as the language, and set slow=False so that the audio is read at normal speed rather than slowly.

6) To open the audio file automatically, we import os and call os.system().
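The steps above can be sketched roughly as follows. This is a hedged illustration rather than the author's downloadable code: the function names, the file names `sample.png`, `output.txt`, and `speech.mp3`, and the per-platform opener commands are all assumptions. The third-party imports sit inside the functions so that the file-handling helpers can be tried without OCR or gTTS installed.

```python
import os
import sys


def extract_text(image_path):
    """Steps 2-3: run OCR on the image and return the recognized text."""
    import pytesseract          # requires the Tesseract-OCR engine on the system
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def save_text(text, text_path):
    """Step 3: store the extracted text in a text file."""
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(text)


def load_text(text_path):
    """Step 4: read the text back from the text file."""
    with open(text_path, encoding="utf-8") as f:
        return f.read()


def text_to_speech(text, audio_path):
    """Step 5: convert the text to English speech at normal speed."""
    from gtts import gTTS       # needs an internet connection at run time

    speech = gTTS(text=text, lang="en", slow=False)
    speech.save(audio_path)


def open_command(audio_path, platform=sys.platform):
    """Step 6: build a shell command that opens the file in the default player.

    The commands chosen here are assumptions about typical systems.
    """
    quoted = '"' + audio_path + '"'
    if platform.startswith("win"):
        return 'start "" ' + quoted
    if platform == "darwin":
        return "open " + quoted
    return "xdg-open " + quoted


def main(image_path="sample.png", text_path="output.txt", audio_path="speech.mp3"):
    """Run the whole pipeline; all three file names are placeholders."""
    save_text(extract_text(image_path), text_path)
    text = load_text(text_path)
    text_to_speech(text, audio_path)
    os.system(open_command(audio_path))
```

Calling `main("photo.png")` would run all the steps on an image named `photo.png` (a hypothetical file). Keeping the opener command in its own function makes it easy to adjust for a particular system.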




