How to Record Audio in JavaScript
This post demonstrates how to create and use an audio recording API in JavaScript. Audio recording can be a pretty neat feature to add to your website. So, buckle up!
This post draws on the MediaDevices API and MediaRecorder API documentation on MDN, as well as Bryan Jennings’s post found here.
I’ll walk you through the implementation step by step so that you understand what each part of the source code does. Let’s do this! 🎙😀
1. Create an object that represents the audio recording API
To start, we must decide on the basic structure of the audio recording API.
The API object exposes 3 functions, which are:
- start(): The function that will start the audio recording.
- stop(): The function that will stop the audio recording.
- cancel(): The function that will cancel the audio recording.
Note that the start() and stop() functions return a promise as both functions will be dealing with asynchronous logic.
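As a rough sketch of that structure, the API object could look like the following. The property names audioBlobs, mediaRecorder and streamBeingCaptured are assumptions made for illustration, and the function bodies are placeholders that the later steps would fill in.

```javascript
// Skeleton of a hypothetical audio recording API object.
// The function bodies are placeholders; later steps replace them.
const audioRecorder = {
  audioBlobs: [],            // recorded audio Blob(s)
  mediaRecorder: null,       // MediaRecorder instance, set by start()
  streamBeingCaptured: null, // MediaStream being recorded, set by start()

  // Starts the recording. Returns a promise because the work is asynchronous.
  start() {
    return Promise.resolve(); // placeholder
  },

  // Stops the recording. Returns a promise meant to resolve to the audio Blob.
  stop() {
    return Promise.resolve(null); // placeholder
  },

  // Cancels the recording and discards the audio.
  cancel() {
    // placeholder
  },
};
```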
2. Detect the support of the MediaDevices API and the getUserMedia method in the browser
Before using the MediaDevices API, which gives you access to connected media devices such as the microphone, and its getUserMedia() method, you should check that the browser supports them.
The following code is the feature detection code for the MediaDevices API and getUserMedia() method.
The detection code should be added before the first line of code that uses the API and its method. In this case, it will be the first line of the start() function.
If the feature is not supported by the browser running the code, the start() function will return a promise that rejects with a custom error. The rejection can then be caught by the catch() method in the calling application when the API is consumed. At that point, you can alert the user to let them know.
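A minimal sketch of that check at the top of start() might look like this. The error message text is an assumption, and the extra typeof guard is only there so the sketch also behaves outside a browser.

```javascript
// Sketch: feature detection as the first statement of start().
const audioRecorder = {
  start() {
    // Both the MediaDevices API and its getUserMedia() method must exist.
    const isSupported =
      typeof navigator !== "undefined" &&
      navigator.mediaDevices &&
      navigator.mediaDevices.getUserMedia;

    if (!isSupported) {
      // Custom error (wording is an assumption) for the calling application to catch.
      return Promise.reject(
        new Error("mediaDevices API or getUserMedia method is not supported in this browser.")
      );
    }

    // The stream creation and recording steps would follow here.
    return Promise.resolve();
  },
};
```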
The following code demonstrates how your application can detect and handle the lack of browser support for the audio recording feature.
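On the application side, one way to handle it is sketched below. Here startAudioRecording, its recorder parameter and the notify callback are hypothetical names, and the message constant is assumed to match whatever text the API rejects with.

```javascript
// Assumed to match the custom error message used by the API's start().
const UNSUPPORTED_MESSAGE =
  "mediaDevices API or getUserMedia method is not supported in this browser.";

// recorder: any object whose start() returns a promise.
// notify: hypothetical callback that shows a message to the user (e.g. alert).
function startAudioRecording(recorder, notify) {
  return recorder.start().catch((error) => {
    if (error.message.includes(UNSUPPORTED_MESSAGE)) {
      notify("Audio recording is not supported in this browser.");
    } else {
      // Not the case handled here; pass it on to the next handler.
      throw error;
    }
  });
}
```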
3. Create the Audio Stream
In order to start the audio recording, an audio stream must be created. This is done by calling the getUserMedia() method on the navigator.mediaDevices interface.
To use the getUserMedia() method, you must pass a MediaStreamConstraints object to it. This object determines the type of media to request. In our case, we want the audio media, so the constraint object will look like the following:
{ audio: true }
getUserMedia() returns a promise. When that promise resolves, the audio stream has been created and is passed to the .then callback function of that promise as an object of type MediaStream.
A MediaStream object is a stream of tracks of the requested media type, where each track is of type MediaStreamTrack.
The following code demonstrates creating the audio stream in the start() function, where the constraint parameter requests audio only.
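A sketch of this step, assuming the stream is kept on a property named streamBeingCaptured (an assumed name) and that the feature detection from Step 2 has already passed:

```javascript
// Sketch: creating the audio stream inside start().
const audioRecorder = {
  streamBeingCaptured: null, // MediaStream, kept for later steps

  start() {
    // Feature detection from Step 2 would run before this line.
    return navigator.mediaDevices
      .getUserMedia({ audio: true }) // constraints: request audio only
      .then((stream) => {
        // stream is a MediaStream containing the audio track(s).
        audioRecorder.streamBeingCaptured = stream;
      });
  },
};
```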
The user will be asked to grant access to the media device at this point if permission was not previously granted, as permission is required for media device access. If the user denies access, the promise is rejected with a NotAllowedError.
The audio stream creation promise could be rejected for multiple reasons, where the error object is of type DOMException. The following table explains the possible rejection errors as explained in MDN Web Docs.
Note: The audio recording API itself does not catch any errors. If it did, the promise returned by that .catch would resolve, so any .then chained after it in the calling application would still be executed. In other words, catching an error inside the API would keep it from reaching the calling application.
The following code demonstrates error handling structure to add to your application based on what cases you want to handle.
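One possible shape for that handling is a small function that maps the error name to a message. describeStartError is a hypothetical helper; the case labels are standard DOMException names that getUserMedia() can reject with, while the message texts are only examples.

```javascript
// Maps the name of a rejected start() error to a message for the user.
// Handle only the cases your application cares about.
function describeStartError(error) {
  switch (error.name) {
    case "NotAllowedError": // the user or the browser denied microphone access
      return "Microphone access was denied.";
    case "NotFoundError": // no audio input device satisfies the request
      return "No microphone was found.";
    case "NotReadableError": // the device exists but could not be accessed
      return "The microphone could not be accessed.";
    case "SecurityError": // media access is disabled for this document
      return "Audio recording is blocked for this page.";
    case "AbortError": // something else prevented the device from being used
      return "The microphone could not be used because of an unexpected problem.";
    default:
      return "Recording could not start: " + error.message;
  }
}

// Usage in the calling application (audioRecorder is the API object):
// audioRecorder.start().catch((error) => alert(describeStartError(error)));
```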
4. Start Recording
Once the MediaStream is created successfully, the stream is passed to the MediaRecorder constructor to create a MediaRecorder instance for that stream. The MediaRecorder API provides the media recording functionality.
To start the recording, the start() function on the MediaRecorder must be called. This function records the media from the stream into one or more Blob objects.
The start() function on MediaRecorder can take an optional parameter known as the timeslice, which is explained in MDN Web Docs as:
The number of milliseconds to record into each Blob.
When a timeslice value is passed, the MediaRecorder.dataavailable event delivers an audio Blob each time the timeslice duration passes, until the recording ends.
In the audio recording API, timeslice won't be passed, so the media is recorded into a single Blob and the event fires once, when the recording stops. The exception is when the requestData method is called, which delivers the Blob saved so far and starts a new Blob that the media continues to record into.
However, if you decide to pass the timeslice argument and want a single Blob at the end, store every Blob the event delivers in a list and combine them once the recording is over.
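If you do go the timeslice route, collecting and merging the chunks could be sketched as follows. createChunkCollector is a hypothetical helper and not part of the MediaRecorder API; it relies only on the standard Blob constructor.

```javascript
// Hypothetical helper: gathers the Blobs delivered by dataavailable events
// and merges them into a single Blob on request.
function createChunkCollector(mimeType) {
  const chunks = [];
  return {
    // Pass this function as the "dataavailable" listener.
    onDataAvailable(event) {
      if (event.data && event.data.size > 0) {
        chunks.push(event.data);
      }
    },
    // Call this once the recording is over.
    toBlob() {
      return new Blob(chunks, { type: mimeType });
    },
  };
}

// Usage with a MediaRecorder instance named mediaRecorder (browser only):
// const collector = createChunkCollector(mediaRecorder.mimeType);
// mediaRecorder.addEventListener("dataavailable", collector.onDataAvailable);
// mediaRecorder.start(1000); // deliver a Blob roughly every second
```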
Note: The start function sets the MediaRecorder.state from inactive to recording when the recording successfully starts. The following table explains the different states of the MediaRecorder based on MDN Web Docs.
The following code demonstrates starting the audio recording, where the audio blob(s), the MediaRecorder instance and the stream being captured are properties maintained throughout the API for usage in other functions when needed.
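Putting the steps so far together, a start() implementation might be sketched like this. The property names and the error message are assumptions; getUserMedia(), the MediaRecorder constructor, addEventListener() and start() are the standard browser calls.

```javascript
const audioRecorder = {
  audioBlobs: [],            // Blob(s) delivered by dataavailable events
  mediaRecorder: null,       // MediaRecorder instance
  streamBeingCaptured: null, // MediaStream being recorded

  start() {
    // Step 2: feature detection.
    if (!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia)) {
      return Promise.reject(
        new Error("mediaDevices API or getUserMedia method is not supported in this browser.")
      );
    }

    // Step 3: create the audio stream.
    return navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      audioRecorder.streamBeingCaptured = stream;

      // Step 4: create the recorder and collect the audio it produces.
      audioRecorder.mediaRecorder = new MediaRecorder(stream);
      audioRecorder.audioBlobs = [];
      audioRecorder.mediaRecorder.addEventListener("dataavailable", (event) => {
        audioRecorder.audioBlobs.push(event.data);
      });

      // No timeslice: the audio is delivered as one Blob when recording stops.
      audioRecorder.mediaRecorder.start();
    });
  },
};
```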
The start() function in the audio recording API returns a promise to the calling application. When that promise resolves, the .then() callback function runs, which tells the calling application that the audio recording has successfully started. At that point, the calling application can reflect that information in the user interface.
In the following code, when the start() function successfully starts recording, the text “Recording Audio…” will be logged in the console by the calling application.
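A small sketch of that call, where startAudioRecording and its recorder parameter are hypothetical names for the application's handler and the API object:

```javascript
// recorder: the audio recording API object (anything with a promise-returning start()).
function startAudioRecording(recorder) {
  return recorder.start().then(() => {
    // The recording has started; update the user interface here.
    console.log("Recording Audio...");
  });
}
```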
If something goes wrong with the audio recording API’s start() function, an error will be thrown for the calling application to catch. The following table explains the possible errors as explained in MDN Web Docs.
Note: If the browser could not start or continue the recording, an error event will be fired, followed by a MediaRecorder.dataavailable event that delivers the Blob gathered so far. After that, the MediaRecorder.stop event will be fired.
5. Stop Recording
In order to stop the audio recording, the stop() method on the MediaRecorder must be called. When the stop() method is called, the following takes place:
- The MediaRecorder.state is set to inactive and the media capturing stops.
- A MediaRecorder.dataavailable event fires the Blob that represents the recorded audio.
- A MediaRecorder.stop event is fired.
Note: If the stop() method is called while the MediaRecorder.state is inactive, an InvalidStateError will be raised as a result of trying to stop a media capture that’s already stopped.
In order to provide the stop recording feature in our audio recording API, a stop() function will be implemented. This function will return a promise that resolves to the recorded audio as a blob once the recording has stopped. A promise will be expected from this method as the process of stopping the recording is asynchronous.
The stop() function has two main parts. It calls the stop() method on the MediaRecorder, and it listens for the MediaRecorder.stop event so that it can return the recorded audio as a Blob once the recording has successfully stopped and the full audio Blob is available.
While this may seem like enough, it will not stop the red recording dot from flashing in your website’s browser tab. That is because the stream is still active after the MediaRecorder has stopped, so the stream needs to be stopped as well. To stop an active stream, all tracks on it must be stopped. In our case, it’s a single media track.
The following code demonstrates stopping the audio recording, which includes resetting the API properties once the recording has stopped.
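A sketch of stop() along those lines is shown below. It assumes start() has already set the mediaRecorder, streamBeingCaptured and audioBlobs properties (assumed names) and registered the dataavailable listener that fills audioBlobs.

```javascript
const audioRecorder = {
  audioBlobs: [],            // filled by the dataavailable listener set up in start()
  mediaRecorder: null,       // set by start()
  streamBeingCaptured: null, // set by start()

  stop() {
    return new Promise((resolve) => {
      const recorder = audioRecorder.mediaRecorder;
      const mimeType = recorder.mimeType; // read before the properties are reset

      // Resolve with the full audio Blob once the recorder has stopped.
      recorder.addEventListener("stop", () => {
        resolve(new Blob(audioRecorder.audioBlobs, { type: mimeType }));
      });

      // Stop the recording.
      recorder.stop();

      // Stop every track so the stream is no longer active.
      audioRecorder.streamBeingCaptured.getTracks().forEach((track) => track.stop());

      // Reset the API properties for the next recording.
      audioRecorder.mediaRecorder = null;
      audioRecorder.streamBeingCaptured = null;
    });
  },
};
```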
In the calling application, you can simply call the stop() function on the API and wait for it to resolve.
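For example, the application could wrap the call as in this sketch, where stopAudioRecording and its recorder parameter are hypothetical names:

```javascript
// recorder: the audio recording API object (anything whose stop() resolves to a Blob).
function stopAudioRecording(recorder) {
  return recorder.stop().then((audioBlob) => {
    console.log("Stopped with audio Blob of size", audioBlob.size, "and type", audioBlob.type);
    // In a browser, the Blob could now be played back, for instance:
    // new Audio(URL.createObjectURL(audioBlob)).play();
    return audioBlob;
  });
}
```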
6. Cancel Recording
To implement the cancel functionality in the audio recording API, you just need to repeat everything we implemented in the stop() function in Step 5. The only difference is that the MediaRecorder.stop event will not be handled, because a cancelled recording does not return an audio Blob to the calling application.
The following code demonstrates cancelling the audio recording. To avoid code redundancy, we extract everything from the stop() function into the cancel() function, excluding the MediaRecorder.stop event handler. We will then simply call the cancel() function from the stop() function.
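That refactoring could be sketched as follows, again assuming start() has populated the mediaRecorder, streamBeingCaptured and audioBlobs properties (assumed names).

```javascript
const audioRecorder = {
  audioBlobs: [],            // filled by the dataavailable listener set up in start()
  mediaRecorder: null,       // set by start()
  streamBeingCaptured: null, // set by start()

  stop() {
    return new Promise((resolve) => {
      const mimeType = audioRecorder.mediaRecorder.mimeType;

      // The only part specific to stop(): hand back the audio Blob.
      audioRecorder.mediaRecorder.addEventListener("stop", () => {
        resolve(new Blob(audioRecorder.audioBlobs, { type: mimeType }));
      });

      // Everything else is shared with cancel().
      audioRecorder.cancel();
    });
  },

  cancel() {
    // Stop the recording.
    audioRecorder.mediaRecorder.stop();
    // Stop every track so the stream is no longer active.
    audioRecorder.streamBeingCaptured.getTracks().forEach((track) => track.stop());
    // Reset the API properties for the next recording.
    audioRecorder.mediaRecorder = null;
    audioRecorder.streamBeingCaptured = null;
  },
};
```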
In the calling application, you can simply call the cancel() function on the API. Since no audio Blob is returned, there is no result to wait for.
Code Demo
This code demo shows the audio recording API described above in use on a website.
In the code demo, you will be able to:
- Start the recording
- Stop or cancel the recording
- Hear the recording played back on stop
Note: The embedded CodePen might not work in some browsers because microphone access is blocked inside the embed. In such a case, just open the demo using this link and it will work like a charm.
You can also clone the source code from my GitHub using the following link.