Voice Recording and Transcribing using Python

Automated voice recording and transcription using the sounddevice and SpeechRecognition modules in Python.

In this project, we first record our voice with the sounddevice Python module and save the recording locally as a '.wav' file. We then pass that audio to the SpeechRecognition module, whose recognize_google() method sends it to Google's Web Speech API and returns the transcript of the recording as a string. Because the recognition happens on Google's servers, this step needs an internet connection.

sounddevice provides functions to play and record audio signals as NumPy arrays, which can then be written to a ".wav" audio file using the wavio module.

The requirements.txt file lists the packages needed to run this application. To install them, open a command prompt in the project directory and execute the following command.

pip install -r requirements.txt

Once the packages are installed, the application is ready to run. To start it, type

python app.py

Implementation Steps:

1) Import necessary packages

import sounddevice as sd
import wavio as wv
import speech_recognition as sr

2) Set the sampling frequency and the duration of the recording. Common sampling frequencies are 44100 Hz and 48000 Hz; the duration is given in seconds.

frequency=44100
duration=7
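
As a quick sanity check on these two values, the sketch below works out how many frames the recorder has to capture and roughly how large the raw audio data will be. The helper name recording_size is hypothetical; the defaults of 2 channels and 2 bytes per sample are taken from the channels=2 and sampwidth=2 settings used in the later steps.

```python
def recording_size(duration_s, sample_rate_hz, channels=2, sample_width_bytes=2):
    """Return (frames, data_bytes) for an uncompressed PCM recording.

    recording_size is a hypothetical helper used only for illustration.
    """
    # One frame holds one sample per channel.
    frames = int(duration_s * sample_rate_hz)
    # Each sample occupies sample_width_bytes in the output file.
    data_bytes = frames * channels * sample_width_bytes
    return frames, data_bytes


frames, data_bytes = recording_size(7, 44100)
print(frames, data_bytes)  # 308700 1234800
```

So a 7-second stereo recording at 44100 Hz is a little over 1.2 MB of audio data, before the small WAV header is added.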

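3) Start the recording with sounddevice's rec() function, passing the number of frames to capture (duration * frequency), the sampling frequency, and the number of channels. rec() returns immediately and records in the background, so wait for it to finish before using the array.

recording = sd.rec(int(duration * frequency),
                   samplerate=frequency, channels=2)
# Block until the recording is complete
sd.wait()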

4) Write the recorded NumPy array to a ".wav" audio file, then load that file with the SpeechRecognition module and transcribe it with the recognize_google() method of the Recognizer class, which returns the transcript as a string.

wv.write("recording.wav", recording, frequency, sampwidth=2)
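
Before transcribing, it can be useful to confirm that the file was written with the expected settings. The sketch below is one way to do that with Python's standard wave module. The helper names describe_wav and write_silence are hypothetical, and the demo writes a short silent file to a temporary directory so that it runs without a microphone; on a real run, describe_wav("recording.wav") would be called instead.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "frame_rate": rate,
            "duration_seconds": frames / rate,
        }


def write_silence(path, seconds=1, rate=44100, channels=2, sampwidth=2):
    """Write a silent PCM WAV file as a stand-in for a real recording."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sampwidth)
        wf.setframerate(rate)
        # All-zero bytes are silence for 16-bit signed PCM.
        wf.writeframes(b"\x00" * (seconds * rate * channels * sampwidth))


# Demo on a throwaway file; replace demo_path with "recording.wav" in the app.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path)
info = describe_wav(demo_path)
print(info)
```

For the recording made in this project, the expected values would be 2 channels, a sample width of 2 bytes, a frame rate of 44100, and a duration of about 7 seconds.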

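# Create the recognizer used to read and transcribe the audio
r = sr.Recognizer()

def transcribe():
    # Load the whole WAV file into an AudioData object
    with sr.AudioFile('recording.wav') as source:
        audio = r.record(source)
    # Send the audio to Google's speech recognition service; returns a string
    return r.recognize_google(audio)

print(transcribe())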
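
Since recognize_google() depends on a remote service, the call can fail even when the recording itself is fine. The SpeechRecognition package raises UnknownValueError when no speech could be understood and RequestError when the service could not be reached. The sketch below shows one possible way to handle both cases; the helper name transcribe_or_none and the choice to return None for unintelligible audio are assumptions made for illustration, not part of the original app.

```python
def transcribe_or_none(path="recording.wav"):
    """Return the transcript of a WAV file, or None if no speech was understood.

    transcribe_or_none is a hypothetical helper name, not part of the project.
    """
    # Imported inside the function so this file can be loaded even before
    # the SpeechRecognition package has been installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The service responded but could not make out any words.
        return None
    except sr.RequestError as exc:
        # The service could not be reached or rejected the request.
        raise RuntimeError(f"Speech service request failed: {exc}") from exc
```

Calling transcribe_or_none("recording.wav") after the file has been written returns the transcript, or None for a silent or unclear recording, so the caller can decide whether to prompt for a new recording.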
