Audio Book using Python and Speech Synthesis API

This web app uses Flask as the backend and HTML and JavaScript as the frontend. It takes a PDF and page numbers from the user and converts the specified pages into audio, which is generated with the Speech Synthesis API.

The Flask (Python) backend handles the PDF document, while the Speech Synthesis API is driven by JavaScript inside the HTML page.

When a PDF book is uploaded, the file is saved to a randomly generated location and deleted once processing is finished. The specified pages are converted into images and processed with Tesseract-OCR, which can read all legible text on a page, whereas a regular PDF-reading module can read only the typed text in the PDF.
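The upload-and-OCR flow described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `ocr_pages` and `process_upload` and the `audiobook_` prefix are hypothetical, and it assumes the `pdf2image` and `pytesseract` packages (which need Poppler and Tesseract-OCR installed) for the rendering and OCR steps.

```python
import os
import shutil
import tempfile
import uuid
from typing import Callable, List


def ocr_pages(pdf_path: str, first_page: int, last_page: int) -> List[str]:
    """Render the requested pages as images and run OCR on each one."""
    # Imported lazily so the rest of the module loads without the OCR stack.
    from pdf2image import convert_from_path  # needs Poppler
    import pytesseract  # needs the Tesseract-OCR binary

    images = convert_from_path(pdf_path, first_page=first_page, last_page=last_page)
    return [pytesseract.image_to_string(image) for image in images]


def process_upload(
    pdf_bytes: bytes,
    first_page: int,
    last_page: int,
    ocr: Callable[[str, int, int], List[str]] = ocr_pages,
) -> List[str]:
    """Save the upload to a random location, OCR it, then delete it."""
    # A fresh temporary directory plus a random file name (hypothetical scheme).
    upload_dir = tempfile.mkdtemp(prefix="audiobook_")
    pdf_path = os.path.join(upload_dir, f"{uuid.uuid4().hex}.pdf")
    try:
        with open(pdf_path, "wb") as handle:
            handle.write(pdf_bytes)
        return ocr(pdf_path, first_page, last_page)
    finally:
        # Remove the file even if OCR fails.
        shutil.rmtree(upload_dir, ignore_errors=True)
```

Passing the OCR step in as a parameter keeps the file handling testable without Tesseract or Poppler; the `finally` clause ensures the uploaded file does not linger if OCR raises an error.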

The extracted text is formatted into a proper paragraph (extra whitespace is removed) and sent to the HTML page (upload.html). The page, in turn, passes the text to JavaScript, where the Speech Synthesis API converts it to audio.
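One simple way to do the whitespace cleanup is shown below. It is a sketch under the assumption that "removing the white spaces" means collapsing line breaks, tabs, and repeated spaces into single spaces; the function name `to_paragraph` is hypothetical.

```python
def to_paragraph(raw_text: str) -> str:
    """Collapse every run of whitespace (newlines, tabs, spaces) into one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(raw_text.split())


if __name__ == "__main__":
    print(to_paragraph("Chapter 1\n\nIt was a   bright\tcold day.\n"))
```

In a Flask view, the resulting string could then be handed to the template, for example with `render_template("upload.html", text=paragraph)`, where the variable name `text` is an assumption about how upload.html reads it.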

Installation:

1) Install all modules in requirements.txt

2) Download Tesseract-OCR and Poppler and add their bin directories in the code.

3) Run app.py with your local machine as the host.

4) Upload the PDF and let the web app read it for you. Pause, play, and stop options are available to control the audio.
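For step 2, the bin directories might be wired in along these lines. This is only a sketch: the environment variable names `TESSERACT_BIN` and `POPPLER_BIN` and both function names are hypothetical, and the project may simply hard-code the paths instead.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_tool_paths(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (tesseract executable, Poppler bin directory) from hypothetical env vars."""
    tesseract_bin = env.get("TESSERACT_BIN")
    poppler_bin = env.get("POPPLER_BIN")
    # The executable carries an .exe suffix on Windows only.
    exe_name = "tesseract.exe" if os.name == "nt" else "tesseract"
    tesseract_cmd = os.path.join(tesseract_bin, exe_name) if tesseract_bin else None
    return tesseract_cmd, poppler_bin or None


def apply_tesseract_path(tesseract_cmd: Optional[str]) -> None:
    """Point pytesseract at the Tesseract executable, if one was configured."""
    import pytesseract  # imported lazily; requires the pytesseract package

    if tesseract_cmd:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
```

The Poppler directory would then be passed as the `poppler_path` argument of pdf2image's `convert_from_path`. For step 3, a Flask app is typically started on the local machine with something like `app.run(host="127.0.0.1")` in app.py.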
