- What Can I Do With the Web Audio API?
- Is it Well Supported by Browsers?
- Using the API
- Generating Audio
- How Does OscillatorNode Benefit the Web?
- Using OscillatorNode for Notification Sounds
- Introducing GainNode
- Mixing Things up with AudioParam
- Replaying Source Nodes
- Conclusion
- Frequently Asked Questions (FAQs) about Web Audio API
This article was peer reviewed by Mark Brown and Josh Wedekind. Thanks to all of SitePoint’s peer reviewers for making SitePoint content the best it can be!
The Web Audio API allows developers to leverage powerful audio processing techniques in the browser using JavaScript, without the need for plugins. As well as decoding and processing file-based audio sources in real time, it can synthesize sounds from various waveforms; this can be useful for web apps that are often consumed over low-bandwidth networks.
In this tutorial, I’m going to introduce you to the Web Audio API by presenting some of its more useful methods. I’ll demonstrate how it can be used to load and play an mp3 file, as well as to add notification sounds to a user interface (demo).
If you like this article and want to go into this topic in more depth, I’m producing a 5-part screencast series for SitePoint Premium named You Ain’t Heard Nothing Yet!
What Can I Do With the Web Audio API?
The use cases for the API in production are diverse, but some of the most common include:
- Real-time audio processing e.g. adding reverb to a user’s voice
- Generating sound effects for games
- Adding notification sounds to user interfaces
In this article, we’ll ultimately write some code to implement the third use case.
Is it Well Supported by Browsers?
Web Audio is supported by Chrome, Edge, Firefox, Opera, and Safari. That said, at the time of writing Safari considers this browser feature experimental and requires a webkit prefix.
Using the API
The entry point of the Web Audio API is a global constructor called AudioContext. When instantiated, it provides methods for defining various nodes that conform to the AudioNode interface. These can be split into three groups:
- Source nodes – e.g. MP3 source, synthesized source
- Effect nodes – e.g. Panning
- Destination nodes – exposed by an AudioContext instance as destination; this represents a user’s default output device, such as speakers or headphones
These nodes can be chained in a variety of combinations using the connect method. Here’s the general idea of an audio graph built with the Web Audio API.
Source: MDN
Here’s an example of converting an MP3 file to an AudioBufferSourceNode and playing it via the AudioContext instance’s destination node:
See the Pen Playing an MP3 file with the Web Audio API by SitePoint (@SitePoint) on CodePen.
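If you can’t open the Pen, the general shape of that code is something like the following sketch. The file URL is a placeholder, and decodeAudioData is used in its promise-returning form:
const context = new AudioContext();
fetch('track.mp3')                                // placeholder URL for the MP3 file
  .then(response => response.arrayBuffer())       // read the raw bytes
  .then(bytes => context.decodeAudioData(bytes))  // decode them into an AudioBuffer
  .then(audioBuffer => {
    const source = context.createBufferSource();  // a source node backed by the decoded buffer
    source.buffer = audioBuffer;
    source.connect(context.destination);          // route straight to the default output device
    source.start();
  });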
Generating Audio
As well as supporting recorded audio via AudioBufferSourceNode, the Web Audio API provides another source node called OscillatorNode. It allows frequencies to be generated against a specified waveform. But what does that actually mean?
At a high level, frequency, measured in Hz, determines the pitch of the sound: the higher the frequency, the higher the pitch. As well as custom waves, OscillatorNode provides some predefined waveforms, which can be specified via an instance’s type property:
Source: Omegatron/Wikipedia
- 'sine' – sounds similar to whistling
- 'square' – often used for synthesizing sounds with old video game consoles
- 'triangle' – almost a hybrid of a sine and a square wave
- 'sawtooth' – generates a strong, buzzing sound
Here’s an example of how OscillatorNode can be used to synthesize sound in real-time:
See the Pen Generating sound with OscillatorNode by SitePoint (@SitePoint) on CodePen.
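If you’d rather not open the Pen, here’s a minimal sketch (not the Pen’s exact code) that plays a short burst of each predefined waveform, staggered a second apart:
const context = new AudioContext();
const waveforms = ['sine', 'square', 'triangle', 'sawtooth'];
waveforms.forEach((type, i) => {
  const oscillator = context.createOscillator();
  oscillator.type = type;                          // one of the predefined waveforms
  oscillator.frequency.value = 220;                // 220 Hz – the A below middle C
  oscillator.connect(context.destination);
  oscillator.start(context.currentTime + i);       // start each burst one second after the last
  oscillator.stop(context.currentTime + i + 0.5);  // each burst lasts half a second
});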
How Does OscillatorNode Benefit the Web?
Synthesizing sounds with code results in a much smaller payload than shipping audio files. This matters if your application is to behave consistently across all sorts of bandwidths, from 2G to 4G; the speed of a mobile data connection can never be guaranteed, especially in emerging markets.
48 percent of those using mobile internet on 2G or 3G are unable to perceive any difference between 2G and 3G services.
Ericsson, The Changing Mobile Broadband Landscape
To demonstrate this, I recorded the above OscillatorNode example and encoded it as an MP3 file, using a bit rate that preserved the same sound quality. The resulting file is 10 KB and, according to Chrome Dev Tools’ network throttling feature, would take 2.15 seconds to load over a regular 2G connection. In this case, the programmatic approach is the clear winner.
Using OscillatorNode for Notification Sounds
Let’s use OscillatorNode within a real-world example. I mentioned at the beginning of the article that we would add notification sounds to a user interface. If we open this CodePen, we’ll see a messaging app UI. Upon clicking the Send button, a notification will appear to inform us that the message was sent. This boilerplate contains two parts that are of interest to us: an AudioContext instance called context, and a function named playSound.
Before starting, click the Fork button. This will create a copy of the boilerplate to which you can save your changes.
It’s worth mentioning that I have tested this in both Chrome and Firefox, so you should use one of these browsers.
In playSound, declare a variable named oscillatorNode and assign it the return value of context.createOscillator():
const oscillatorNode = context.createOscillator();
Next, let’s configure our node. Set its type property to 'sine', and its frequency.value property to 150:
oscillatorNode.type = 'sine';
oscillatorNode.frequency.value = 150;
To play our sine wave through our speakers or headphones, call oscillatorNode.connect, passing it a reference to the context.destination node. Finally, call oscillatorNode.start, followed by oscillatorNode.stop with a parameter of context.currentTime + 0.5; this will stop the sound after 500 milliseconds have passed, according to our AudioContext’s hardware scheduling timestamp. Our playSound function now looks like this:
function playSound() {
const oscillatorNode = context.createOscillator();
oscillatorNode.type = 'sine';
oscillatorNode.frequency.value = 150;
oscillatorNode.connect(context.destination);
oscillatorNode.start();
oscillatorNode.stop(context.currentTime + 0.5);
}
Upon saving our changes and hitting Send, we’ll hear our notification sound.
Introducing GainNode
Needless to say, this is pretty garish. Why not use an effect node to make this sound more pleasing? GainNode is one example of an effect node. Gain is a means of altering the amplitude of an input signal, and in our case, it enables us to control the volume of an audio source.
Below the declaration of oscillatorNode, declare another variable called gainNode and assign it the return value of context.createGain():
const gainNode = context.createGain();
Under the configuration of oscillatorNode, set gainNode’s gain.value property to 0.3. This will play the sound at 30% of its original volume:
gainNode.gain.value = 0.3;
Finally, to add the GainNode to our audio graph, pass gainNode to oscillatorNode.connect, then call gainNode.connect, passing it context.destination:
function playSound() {
const oscillatorNode = context.createOscillator();
const gainNode = context.createGain();
oscillatorNode.type = 'sine';
oscillatorNode.frequency.value = 150;
gainNode.gain.value = 0.3;
oscillatorNode.connect(gainNode);
gainNode.connect(context.destination);
oscillatorNode.start();
oscillatorNode.stop(context.currentTime + 0.5);
}
Upon saving our changes and hitting Send, we’ll hear that our sound plays more quietly.
Mixing Things up with AudioParam
You may have observed that, in order to set the frequency of our OscillatorNode and the gain of our GainNode, we had to set a property called value. The reason for this is that gain and frequency are both AudioParams. This interface can be used not only to set specific values, but also to schedule gradually changing ones. AudioParam exposes a number of methods and properties, but three important methods, compared in the sketch after this list, are:
- setValueAtTime – immediately sets the value at the given time
- linearRampToValueAtTime – schedules a gradual, linear change of the value, finishing at a given end time
- exponentialRampToValueAtTime – schedules a gradual, exponential change of the value. As opposed to a linear change, which is constant, an exponential change increases or decreases by larger increments as the scheduler approaches the end time. This can be preferable as it sounds more natural to the human ear
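Here’s a rough sketch, separate from our notification demo, that sweeps an oscillator’s frequency with each method in turn; the node and the values are purely illustrative:
const oscillator = context.createOscillator();
const now = context.currentTime;
oscillator.frequency.setValueAtTime(150, now);                     // jump straight to 150 Hz
oscillator.frequency.linearRampToValueAtTime(300, now + 1);        // climb steadily to 300 Hz over 1 second
oscillator.frequency.exponentialRampToValueAtTime(1200, now + 2);  // then sweep to 1200 Hz, accelerating towards the end
oscillator.connect(context.destination);
oscillator.start(now);
oscillator.stop(now + 2);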
We’re now going to exponentially ramp both the frequency and the gain. In order to use the exponentialRampToValueAtTime method, we need to schedule a prior event. Replace the setting of oscillatorNode.frequency.value with a call to oscillatorNode.frequency.setValueAtTime. Pass the same frequency of 150 Hz, and schedule it immediately by passing context.currentTime as the second parameter:
oscillatorNode.frequency.setValueAtTime(150, context.currentTime);
Below the invocation of setValueAtTime, call oscillatorNode.frequency.exponentialRampToValueAtTime with a value of 500 Hz. Schedule this 0.5 seconds after the scheduled start time:
oscillatorNode.frequency.exponentialRampToValueAtTime(500, context.currentTime + 0.5);
Upon saving and clicking Send, you’ll hear that the frequency increases as playback progresses.
To wrap things up, replace the setting of gainNode.gain.value with an invocation of gainNode.gain.setValueAtTime, in the same vein as our OscillatorNode’s frequency:
gainNode.gain.setValueAtTime(0.3, context.currentTime);
To fade out the sound, exponentially ramp the gain down to 0.01 over the same 0.5 seconds:
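gainNode.gain.exponentialRampToValueAtTime(0.01, context.currentTime + 0.5);
With both ramps in place, our completed playSound function looks like this: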
function playSound() {
const oscillatorNode = context.createOscillator();
const gainNode = context.createGain();
oscillatorNode.type = 'sine';
oscillatorNode.frequency.setValueAtTime(150, context.currentTime);
oscillatorNode.frequency.exponentialRampToValueAtTime(500, context.currentTime + 0.5);
gainNode.gain.setValueAtTime(0.3, context.currentTime);
gainNode.gain.exponentialRampToValueAtTime(0.01, context.currentTime + 0.5);
oscillatorNode.connect(gainNode);
gainNode.connect(context.destination);
oscillatorNode.start();
oscillatorNode.stop(context.currentTime + 0.5);
}
Upon hitting Save and Send, you’ll hear that our notification sound gets quieter over time. Now we’re sounding more human.
Here’s the completed demo.
See the Pen Notification Sounds with OscillatorNode by SitePoint (@SitePoint) on CodePen.
Replaying Source Nodes
Before concluding this article, it’s important to note a point of confusion for those who are getting to grips with the API. To play our sound again after it has finished, it would seemingly make sense to write this:
oscillatorNode.start();
oscillatorNode.stop(context.currentTime + 0.5);
oscillatorNode.start(context.currentTime + 0.5);
oscillatorNode.stop(context.currentTime + 1);
Upon doing this, we’ll observe that an InvalidStateError is thrown, with the message cannot call start more than once.
AudioNodes are cheap to create, so the design of the Web Audio API encourages developers to recreate audio nodes as and when they’re needed. In our case, we would simply call the playSound function again.
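Since playSound creates a fresh OscillatorNode and GainNode on every invocation, replaying the notification is simply a matter of calling it once per click. A minimal sketch, where the button selector is a placeholder for whatever the boilerplate actually uses:
// The '.send' selector is hypothetical – substitute the real Send button element.
document.querySelector('.send').addEventListener('click', () => {
  playSound(); // fresh nodes are created inside, so this can run any number of times
});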
Conclusion
I hope you have enjoyed this introduction to sound synthesis with the Web Audio API. We’ve demonstrated one of its many use cases, although the rise of notification sounds on websites and web apps is an interesting UX question that will only be answered over time.
If you want to learn more about the Web Audio API, I’m producing a 5-part screencast series for SitePoint Premium named You Ain’t Heard Nothing Yet! The first episode is available to watch now.
Are you using the Web Audio API in your web pages and apps? I’d love to hear about your experiences and use cases in the comments below.
Frequently Asked Questions (FAQs) about Web Audio API
What is the Web Audio API and how does it work?
The Web Audio API is a high-level JavaScript API for processing and synthesizing audio in web applications. It works by creating audio nodes that process and generate sound, which are then connected together to form an audio routing graph. This graph allows for complex audio operations to be performed, such as mixing, processing, and filtering. The API also provides a number of built-in nodes for common audio operations, such as gain control, panning, and convolution.
How can I start using the Web Audio API in my web application?
To start using the Web Audio API, you first need to create an instance of the AudioContext interface, which represents the overall audio environment and provides the functionality for creating nodes and controlling their connections. Once you have an AudioContext, you can create and connect nodes to process and generate sound. The API also provides a number of built-in nodes for common audio operations, such as gain control, panning, and convolution.
What are the main components of the Web Audio API?
The main components of the Web Audio API are the AudioContext, AudioNodes, and AudioParams. The AudioContext is the main interface for creating and controlling the audio environment. AudioNodes are the building blocks of the audio routing graph, and they process and generate sound. AudioParams are used to control the parameters of the audio nodes, such as volume and frequency.
How can I control the volume of the audio with the Web Audio API?
You can control the volume of the audio with the Web Audio API by using the GainNode. The GainNode is an AudioNode that can be used to control the overall gain, or volume, of the audio. You can create a GainNode by calling the createGain method on the AudioContext, and then connect it to the audio source and destination.
Can I use the Web Audio API to play sound files?
Yes, you can use the Web Audio API to play sound files. The API provides the AudioBufferSourceNode, which can be used to play audio data stored in an AudioBuffer. You can load sound files into an AudioBuffer by using the fetch API to retrieve the file, and then decoding the audio data with the decodeAudioData method of the AudioContext.
How can I apply effects to the audio with the Web Audio API?
You can apply effects to the audio with the Web Audio API by using ConvolverNodes and BiquadFilterNodes. ConvolverNodes can be used to apply convolution effects, such as reverb, to the audio. BiquadFilterNodes can be used to apply a variety of filter effects, such as low-pass and high-pass filters.
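As a rough illustration, not taken from this article’s demos, a low-pass BiquadFilterNode could be wired in like this, assuming an existing context and a source node:
const filter = context.createBiquadFilter();
filter.type = 'lowpass';               // attenuate frequencies above the cutoff
filter.frequency.value = 1000;         // cutoff at 1 kHz
source.connect(filter);                // source → filter → destination
filter.connect(context.destination);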
Can I synthesize sound with the Web Audio API?
Yes, you can synthesize sound with the Web Audio API. The API provides the OscillatorNode, which can be used to generate periodic waveforms. You can control the frequency and type of the waveform with the frequency and type attributes of the OscillatorNode.
How can I visualize the audio with the Web Audio API?
You can visualize the audio with the Web Audio API by using the AnalyserNode. The AnalyserNode can be used to capture real-time frequency and time-domain data, which can then be used to create visualizations of the audio.
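As a sketch of the idea, assuming an existing context and source node, an AnalyserNode can be polled for frequency data on each animation frame; the actual rendering is left out:
const analyser = context.createAnalyser();
analyser.fftSize = 256;
source.connect(analyser);                // tap the signal for analysis
analyser.connect(context.destination);   // and still send it to the speakers
const data = new Uint8Array(analyser.frequencyBinCount);
function draw() {
  analyser.getByteFrequencyData(data);   // fill the array with the current frequency magnitudes
  // ...render `data` to a canvas here...
  requestAnimationFrame(draw);
}
draw();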
Can I use the Web Audio API in all browsers?
The Web Audio API is widely supported in modern browsers, including Chrome, Firefox, Safari, and Edge. However, it may not be fully supported in older browsers or some mobile browsers. You can check the compatibility of the Web Audio API on websites like Can I use.
Where can I learn more about the Web Audio API?
You can learn more about the Web Audio API from the official specification on the W3C website, as well as from various online tutorials and guides. The Mozilla Developer Network (MDN) also provides a comprehensive guide to the Web Audio API, including detailed explanations of the API’s interfaces and methods, as well as examples and demos.
James is a full-stack software developer who has a passion for web technologies. He is currently working with a variety of languages, and has engineered solutions for the likes of Sky, Channel 4, Trainline, and NET-A-PORTER.