Speech is first converted from physical sound to an electrical signal by a microphone, and then to digital data by an analog-to-digital converter. This digital data can then be converted into text using various algorithms.
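The analog-to-digital step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a pure sine tone stands in for the microphone signal, the function name digitize is hypothetical, and the 16 kHz sample rate and 16-bit depth are common choices for speech audio rather than values taken from this article.

```python
import math

def digitize(frequency_hz, duration_s, sample_rate=16000, bits=16):
    """Sample a sine tone and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                  # sampling: measure at discrete times
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))          # quantization to an integer
    return samples

# 10 ms of a 440 Hz tone becomes a list of 160 integers.
samples = digitize(440, 0.01)
print(len(samples), samples[:5])
```

A speech recognizer works on a sequence of integers like this one, not on the sound wave itself.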
Several speech recognition packages are available in Python, each offering different functionality. One of them is the SpeechRecognition package, which can be installed by running the following command in the terminal:
pip install SpeechRecognition
After installing this package, we can implement speech recognition in Python, as shown below:
import speech_recognition as sr

def takecommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('listening....')
        # Seconds of silence that mark the end of a phrase
        r.pause_threshold = 1
        audio = r.listen(source, timeout=3, phrase_time_limit=5)
    try:
        print("Recognizing....")
        # Send the recorded audio to Google's speech recognition service
        query = r.recognize_google(audio, language='en-in')
        print("Let's talk about {}.".format(query))
    except Exception as e:
        print("voice not recognized")
The pause_threshold value is the number of seconds of silence after which the recognizer treats the phrase as complete. Here it is 1, so recording stops once the user has been quiet for one second.
The timeout value is the maximum number of seconds the system will wait for the user to start speaking. If no speech starts within that time (3 seconds here), listen() raises a WaitTimeoutError exception.
The phrase_time_limit value is the maximum number of seconds a single phrase may last. In this case, it is 5. This means that if the user speaks for more than 5 seconds, the recording is cut off at that point, and only the audio captured up to then is sent for recognition.
The code above is set up to recognize English speech only, since the language is set to 'en-in' (English as spoken in India). If the speech cannot be recognized, control passes to the except block, and "voice not recognized" is printed because an exception was encountered.
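To make the interplay of the three timing values easier to see, here is a toy model in plain Python. It is not how the SpeechRecognition package is implemented: the function capture_phrase and its input format (a list of booleans, one per half-second chunk, True meaning "speech") are assumptions made for this sketch, and the built-in TimeoutError stands in for the package's own timeout exception.

```python
def capture_phrase(chunks, chunk_seconds=0.5, timeout=3,
                   pause_threshold=1, phrase_time_limit=5):
    """Return (start, end) in seconds of the phrase a listener would record.

    chunks: list of booleans, one per audio chunk; True means speech.
    """
    elapsed = 0.0
    index = 0

    # Phase 1: wait for speech to begin, giving up after `timeout` seconds.
    while index < len(chunks) and not chunks[index]:
        elapsed += chunk_seconds
        index += 1
        if elapsed >= timeout:
            raise TimeoutError("no speech started within the timeout")
    if index == len(chunks):
        raise TimeoutError("audio ended before any speech started")

    # Phase 2: record until a long enough pause or the phrase time limit.
    start = elapsed
    silence = 0.0
    while index < len(chunks):
        elapsed += chunk_seconds
        silence = 0.0 if chunks[index] else silence + chunk_seconds
        index += 1
        if silence >= pause_threshold:
            break  # the speaker paused long enough: phrase is complete
        if elapsed - start >= phrase_time_limit:
            break  # the speaker went on too long: recording is cut off
    return start, elapsed

# One second of silence, 1.5 s of speech, then a one-second pause ends the phrase.
print(capture_phrase([False, False, True, True, True, False, False, True]))
```

With the default values, continuous speech is cut off after 5 seconds, and 3 seconds of initial silence raises the timeout error instead of returning a phrase.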