React is one of the most popular JavaScript libraries for building web applications. It is a powerful library for building user interfaces on the web. React enables developers to create large-scale web applications that can update the data in the DOM without having to reload the page. The core objective of React is to make web application development simple, fast, and scalable.
In this tutorial, we are going to learn how to implement a voice note application in the React ecosystem using the SpeechRecognition JavaScript interface. This interface enables us to listen to speech through the browser microphone, recognize it, and transcribe it to text, and that is exactly what we are going to do. The idea is to listen to the speech, recognize it, convert it to text, and display it in the UI as a note. This will also involve starting and stopping the recording of speech as well as saving the voice notes.
Let’s get started!
First, we are going to create a new React App project. For that, we need to run the following command in the required local directory:
npx create-react-app voice-note
After the successful setup of the project, we can run the project by running the following command:
npm run start
// or
yarn start
After the successful build, a browser window will open up showing the following result:
Now, we are going to set up our starter UI template. It will include two sections: one for recording a note and one for displaying the saved notes.
The overall code for this is provided in the code snippet below:
import React, { useState, useEffect } from "react";
import "./index.css";

function App() {
  return (
    <>
      <h1>Record Voice Notes</h1>
      <div>
        <div className="noteContainer">
          <h2>Record Note Here</h2>
          <button className="button">Save</button>
          <button>Start/Stop</button>
        </div>
        <div className="noteContainer">
          <h2>Notes Store</h2>
        </div>
      </div>
    </>
  );
}

export default App;
The required CSS styles are provided in the code snippet below:
.noteContainer {
  border: 1px solid grey;
  min-height: 15rem;
  margin: 10px;
  padding: 10px;
  border-radius: 5px;
}
button {
  margin: 5px;
}
Hence, we will get the result as displayed in the screenshot below:
As you may notice, there are two sections: the upper box to record the voice note and the lower box to display the saved notes after a note has been recorded.
Now, we are going to define the state variables required for this task. For that, we are going to use the useState hook. The useState hook returns two values: the state itself and a function to update that state. The initial value is passed as an argument to the useState hook. The required state variables are provided in the code snippet below:
const [isRecording, setisRecording] = useState(false);
const [note, setNote] = useState(null);
const [notesStore, setnotesStore] = useState([]);
Here, we have defined three state variables:
isRecording: Tracks whether the microphone is recording (on or off). The initial value is the boolean false.
note: Holds the recorded voice note so it can be displayed as we record. The initial value is null.
notesStore: Holds the notes that have been saved. It is initialized as an empty array.
Each of these variables has its own function for updating it.
Now, we are going to initialize the microphone that will listen to the speech and convert it into text notes. For that, we are going to use the SpeechRecognition interface of the Web Speech API, which the browser exposes on the window object. Browser support is limited, and this tutorial uses Chrome, where the interface is available under the webkit prefix. SpeechRecognition is the controller interface for the recognition service; it also handles the SpeechRecognitionEvent sent from the recognition service. We are going to initialize this service in the microphone constant as shown in the code snippet below:
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;
const microphone = new SpeechRecognition();
microphone.continuous = true;
microphone.interimResults = true;
microphone.lang = "en-US";
Then, we have configured some properties of the microphone:
microphone.continuous: Controls whether continuous results are returned for each recognition or only a single result. The value defaults to false, which returns only a single result.
microphone.interimResults: Controls whether interim results should be returned. Interim results are results that are not yet final.
microphone.lang: Gets or sets the language of the current SpeechRecognition instance.
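Because the interface is not present in every browser, it can help to check for it before constructing the recognizer. The following is a minimal sketch of that idea, not part of the original app: the helper names getSpeechRecognition and createMicrophone are hypothetical, and the window-like object is passed in as a parameter so the logic can also be exercised outside a browser.

```javascript
// Hypothetical helper: return the SpeechRecognition constructor from a
// window-like object, or null when the browser does not expose it.
function getSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// Hypothetical helper: create a recognizer configured like the one in this
// tutorial, or return null when speech recognition is unavailable.
function createMicrophone(globalObject, lang = "en-US") {
  const Recognition = getSpeechRecognition(globalObject);
  if (!Recognition) {
    return null;
  }
  const microphone = new Recognition();
  microphone.continuous = true; // keep listening across pauses
  microphone.interimResults = true; // report results that are not yet final
  microphone.lang = lang;
  return microphone;
}

// In the browser this could be used as:
//   const microphone = createMicrophone(window);
//   if (!microphone) { /* show an "unsupported browser" message */ }
```

Returning null instead of throwing leaves it to the component to decide how to tell the user that recording is unavailable.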
Now, we are going to define two functions that will be used to record and save a voice note. One is startRecordController, which will listen to the speech, convert the result to text, and set the note state so it is displayed on the screen. The other is the storeNote function, which will store the recorded voice notes. To store the note, we use the setnotesStore function provided by the corresponding useState hook. The code snippet below only shows the implementation of the storeNote function, as we are going to implement the startRecordController function later:
const startRecordController = () => {
};
const storeNote = () => {
  setnotesStore([...notesStore, note]);
  setNote("");
};
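One possible refinement of storeNote is to keep the list logic in a small pure function, so that empty or whitespace-only notes are never saved. This is only a sketch under that assumption: appendNote is a hypothetical helper and is not part of the original code.

```javascript
// Hypothetical helper: return a new array with the note appended, or the
// same array unchanged when the note is empty or only whitespace.
function appendNote(notesStore, note) {
  const text = (note || "").trim();
  if (!text) {
    return notesStore;
  }
  return [...notesStore, text];
}

// Inside the component it could be combined with a functional state update:
//   setnotesStore((previousNotes) => appendNote(previousNotes, note));
//   setNote("");
```

The functional update reads the latest notesStore value at the time the update is applied, rather than the value captured when the component last rendered.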
Now, we need to wire up the buttons: the Save button calls storeNote, and the Start/Stop button toggles the isRecording state when clicked:
return (
  <>
    <h1>Record Voice Notes</h1>
    <div>
      <div className="noteContainer">
        <h2>Record Note Here</h2>
        {isRecording ? <span>Recording... </span> : <span>Stopped </span>}
        <button className="button" onClick={storeNote} disabled={!note}>
          Save
        </button>
        <button onClick={() => setisRecording((prevState) => !prevState)}>
          Start/Stop
        </button>
        <p>{note}</p>
      </div>
      <div className="noteContainer">
        <h2>Notes Store</h2>
      </div>
    </div>
  </>
);
Here, we have also shown the Recording... or Stopped text conditionally based on the isRecording state. This way, users will know whether the recording has started or stopped.
Now, we come to the main function, startRecordController. First, we need to configure the starting and stopping of recording based on the isRecording state.
If the isRecording state is true, we start the recording using the start() method provided by the microphone instance. If the isRecording state is false, we stop the recording using the stop() method. We can track the conversion of voice to text using the onresult handler, which receives an event parameter. The event parameter holds the results of continuous recognition in its results property, an array-like list that can be converted to an array with Array.from. The array can then be traversed using the map function to get the resulting text of each recognition. Then, using the join method, we can concatenate the results together and store them in the note state variable. Hence, the note state will be updated and displayed in the upper box. We can use the onerror handler to log the errors (if any). The overall implementation of this controller function is provided in the code snippet below:
const startRecordController = () => {
  if (isRecording) {
    // Start listening; restart automatically if the service ends on its own.
    microphone.start();
    microphone.onend = () => {
      console.log("continue..");
      microphone.start();
    };
  } else {
    // Stop listening and do not restart.
    microphone.stop();
    microphone.onend = () => {
      console.log("Stopped microphone on Click");
    };
  }
  microphone.onstart = () => {
    console.log("microphones on");
  };
  microphone.onresult = (event) => {
    // Join the top alternative of every result into one string.
    const recordingResult = Array.from(event.results)
      .map((result) => result[0])
      .map((result) => result.transcript)
      .join("");
    console.log(recordingResult);
    setNote(recordingResult);
  };
  // Registered outside onresult so errors are logged even before the first result.
  microphone.onerror = (event) => {
    console.log(event.error);
  };
};
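To make the onresult step easier to follow, the transcript-building logic can be pulled out into a standalone function and tried with plain arrays. The sketch below is illustrative only: splitTranscript is a hypothetical helper, and it assumes each result is an array-like whose first entry carries a transcript string and which has an isFinal flag, mirroring the shape of the results delivered by the recognition service.

```javascript
// Hypothetical helper: walk the results list and collect the text of the
// top alternative of each result, also separating final from interim text.
function splitTranscript(results) {
  let fullText = "";
  let finalText = "";
  let interimText = "";
  for (const result of Array.from(results)) {
    const text = result[0].transcript; // top alternative of this result
    fullText += text;
    if (result.isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { fullText, finalText, interimText };
}

// Plain-array stand-ins for recognition results (assumed shape).
const sampleResults = [
  Object.assign([{ transcript: "hello" }], { isFinal: true }),
  Object.assign([{ transcript: " wor" }], { isFinal: false }),
];
console.log(splitTranscript(sampleResults).fullText);
```

Here fullText matches what the tutorial stores in the note state, while finalText and interimText could be used, for example, to style the still-changing part of the note differently.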
Now, we need to run this function when the app mounts as well as each time the value of the isRecording state changes. For that, we can use the useEffect hook. Hence, we need to call our startRecordController function inside the useEffect hook and pass isRecording in the dependency array. This is shown in the code snippet below:
useEffect(() => {
  startRecordController();
}, [isRecording]);
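An effect can also return a cleanup function, which React calls before the effect runs again and when the component unmounts. As a rough sketch of how that could be used here, the start/stop logic is written below as a plain function that returns such a cleanup. The name syncMicrophone is hypothetical, and the function only assumes that the microphone object has start(), stop(), and an onend property.

```javascript
// Hypothetical helper: start or stop the microphone to match isRecording
// and return a cleanup function that always leaves the microphone stopped.
function syncMicrophone(microphone, isRecording) {
  if (isRecording) {
    // Restart if the recognition service ends while we still want to record.
    microphone.onend = () => microphone.start();
    microphone.start();
  } else {
    microphone.onend = null;
    microphone.stop();
  }
  return () => {
    microphone.onend = null; // prevent an automatic restart after cleanup
    microphone.stop();
  };
}

// Inside the component it could be used as:
//   useEffect(() => syncMicrophone(microphone, isRecording), [isRecording]);
```

With this shape, the microphone is not left listening if the component is removed from the page while a recording is in progress.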
Everything is ready now; we just have to display the stored notes. For that, we can map through the notesStore array state and display each stored note in the lower box. The template code for this is provided in the code snippet below:
return (
  <>
    <h1>Record Voice Notes</h1>
    <div>
      <div className="noteContainer">
        <h2>Record Note Here</h2>
        {isRecording ? <span>Recording... </span> : <span>Stopped </span>}
        <button className="button" onClick={storeNote} disabled={!note}>
          Save
        </button>
        <button onClick={() => setisRecording((prevState) => !prevState)}>
          Start/Stop
        </button>
        <p>{note}</p>
      </div>
      <div className="noteContainer">
        <h2>Notes Store</h2>
        {/* The index is used as the key so two identical notes do not collide */}
        {notesStore.map((note, index) => (
          <p key={index}>{note}</p>
        ))}
      </div>
    </div>
  </>
);
Now we can try and record the voice notes.
Note: the SpeechRecognition interface is not available in every browser, and this implementation targets Chrome. Hence, we need to run the project as well as the demo in the Chrome browser for it to work properly.
The main objective of this tutorial was to implement a voice note app in React. The overall implementation, along with the UI, was kept simple. The tutorial covers the basic use of the SpeechRecognition interface to initialize and use the microphone service from the Chrome browser. It also demonstrates two essential React hooks, useState and useEffect. We learned how to set up and configure the microphone properties as well as how to use the basic microphone methods to make voice recognition and conversion to text possible. This step-by-step implementation should make it easy for React beginners to grasp the concept and use it to implement their own speech recognition React applications.