Video Accessibility: Captions, Transcripts, and Audio Descriptions
Video is everywhere. Product demos, tutorials, webinars, social media clips, training content. If your website includes video and it's not accessible, you're shutting out millions of people. We're talking about deaf and hard-of-hearing users who can't hear the audio, blind and low-vision users who can't see what's happening on screen, and people watching in noisy environments or in a language that isn't their first.
I've scanned a lot of sites with our free accessibility checker, and video accessibility is one of the most commonly overlooked areas. People spend weeks getting their color contrast and alt text right, then embed a YouTube video with no captions and call it a day. That's a WCAG failure, and in some cases, it's an ADA liability.
This guide covers everything you need to know about making video content accessible: captions, transcripts, audio descriptions, accessible players, and the code to make it all work.
In This Article
- Why Video Accessibility Matters
- Captions vs. Subtitles
- How to Add Captions to YouTube Videos
- Vimeo Caption Options
- Self-Hosted Video Captions (WebVTT)
- When You Need Transcripts
- Audio Descriptions Explained
- Accessible Video Players
- Auto-Play and Accessibility
- Embedding Video Accessibly
- Testing Video Accessibility
Why Video Accessibility Matters
About 466 million people worldwide have disabling hearing loss, according to the World Health Organization. In the US alone, roughly 15% of adults report some degree of hearing difficulty. These users rely on captions to understand video content. Without captions, your video might as well be a blank screen for them.
But it's not just about hearing disabilities. Captions benefit a surprisingly wide audience:
- Non-native speakers who understand written English better than spoken English
- People in noisy environments like offices, public transit, or gyms
- People in quiet environments who can't turn on audio (libraries, late at night with a sleeping partner)
- People with cognitive or learning disabilities who process information better when they can read along
- Anyone searching for specific content within a video (captions make video content searchable)
From a legal standpoint, WCAG 2.2 AA has specific requirements for video content. Success Criterion 1.2.2 requires captions for prerecorded audio content in synchronized media. Success Criterion 1.2.3 requires audio descriptions or a media alternative for prerecorded video, and Success Criterion 1.2.5 goes further at Level AA by requiring audio description itself. If you want to understand how these fit into the broader standard, our WCAG overview breaks it all down.
And then there's SEO. Search engines can't watch your videos, but they can read your captions and transcripts. Adding text alternatives to your video content gives search engines more content to index, which can drive organic traffic to your pages.
Captions vs. Subtitles: What's the Difference?
People use these terms interchangeably, but they're different things, and the distinction matters for accessibility.
Subtitles are a text version of the dialogue. They assume the viewer can hear the audio but might need a translation or a text version of what's being said. Subtitles typically only include spoken words.
Captions include everything a deaf or hard-of-hearing viewer needs to understand the full audio experience. That means dialogue, yes, but also sound effects ("[door slams]"), music descriptions ("[upbeat jazz music]"), speaker identification ("SARAH:"), and other non-speech audio cues ("[phone ringing in background]").
There are two types of captions:
- Closed captions (CC) can be turned on and off by the viewer. This is what you see on YouTube when you click the CC button. These are the standard for web video.
- Open captions are burned into the video itself and can't be turned off. These are less common on the web but sometimes used for social media videos where the platform doesn't support closed captions.
For WCAG compliance, you need captions, not just subtitles. Your captions should include non-speech audio information that's relevant to understanding the content.
How to Add Captions to YouTube Videos
YouTube offers several ways to add captions, and the quality varies wildly between them.
Auto-Generated Captions
YouTube automatically generates captions using speech recognition. This is better than nothing, but "better than nothing" is a low bar. Auto-captions typically have an accuracy rate of around 60-80%, depending on audio quality, accents, background noise, and technical vocabulary. That might sound acceptable until you realize that a 70% accuracy rate means roughly three out of every ten words could be wrong.
Auto-captions also tend to miss most non-speech audio cues, don't identify speakers, and handle punctuation inconsistently. They're a starting point, not a finished product. If you're relying solely on YouTube's auto-captions, you're not meeting WCAG requirements.
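To put those accuracy percentages in perspective, here's a rough back-of-the-envelope calculation. The 150 words-per-minute speaking rate is my own assumption for illustration, and `estimateCaptionErrors` is a hypothetical helper, not part of any YouTube tooling.

```javascript
// Rough estimate of how many caption words are wrong at a given accuracy.
// The default of 150 words per minute is an assumed speaking rate.
function estimateCaptionErrors(minutes, accuracy, wordsPerMinute = 150) {
  const totalWords = Math.round(minutes * wordsPerMinute);
  const wrongWords = Math.round(totalWords * (1 - accuracy));
  return { totalWords, wrongWords };
}

// A 5-minute video captioned at 70% accuracy
const estimate = estimateCaptionErrors(5, 0.7);
console.log(estimate); // { totalWords: 750, wrongWords: 225 }
```

Under those assumptions, a short 5-minute tutorial would contain a couple hundred wrong words, which is why manual review matters.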
Editing Auto-Captions
The fastest approach is to let YouTube generate captions, then manually edit them. In YouTube Studio, go to your video, click "Subtitles" in the left sidebar, then click the auto-generated captions and select "Duplicate and edit." This gives you a timestamped transcript you can fix without starting from scratch. Go through it line by line, correcting errors, adding punctuation, and inserting non-speech audio descriptions where needed.
Uploading an SRT or VTT File
If you've created your captions externally (using a tool like Descript, Otter.ai, or a professional captioning service), you can upload them directly. YouTube accepts SRT, VTT, SBV, and several other formats. In YouTube Studio, go to Subtitles, click "Add language," then "Add" next to your language, and choose "Upload file."
Here's what a basic SRT file looks like:
1
00:00:01,000 --> 00:00:04,500
Welcome to our tutorial on form accessibility.

2
00:00:05,000 --> 00:00:09,200
Today we'll cover label elements,
error handling, and keyboard navigation.

3
00:00:10,000 --> 00:00:14,000
[upbeat intro music]

4
00:00:15,000 --> 00:00:19,500
SARAH: Let's start with the most common
mistake I see on contact forms.
Notice the non-speech sound description in caption 3 and the speaker identification in caption 4. Those details are what separate real captions from basic subtitles.
Vimeo Caption Options
Vimeo's caption support is solid, though the workflow differs from YouTube.
On paid Vimeo plans, you can upload SRT or VTT caption files directly to your videos. Go to your video settings, click "Distribution," then "Subtitles," and upload your file. Vimeo also supports auto-captioning on Pro plans and above, which (like YouTube) should be treated as a starting point that needs manual review.
One thing Vimeo does well is that it preserves your caption formatting when viewers turn on CC. The player renders captions cleanly without obscuring too much of the video content.
If you're embedding Vimeo videos on your site, the captions carry over to the embedded player. Viewers can toggle them on and off just like they would on vimeo.com. No extra work needed on your end for this part.
Self-Hosted Video Captions with WebVTT
If you're hosting video files directly on your server (using the HTML5 <video> element), you'll use WebVTT (Web Video Text Tracks) for captions. WebVTT is the standard caption format for the web, and it's supported by all modern browsers.
Here's a WebVTT file:
WEBVTT

00:00:01.000 --> 00:00:04.500
Welcome to our tutorial on form accessibility.

00:00:05.000 --> 00:00:09.200
Today we'll cover label elements,
error handling, and keyboard navigation.

00:00:10.000 --> 00:00:14.000
[upbeat intro music]

00:00:15.000 --> 00:00:19.500
<v Sarah>Let's start with the most common
mistake I see on contact forms.
The format is similar to SRT but with a few differences. WebVTT files start with "WEBVTT" on the first line. Timestamps use periods instead of commas for milliseconds. And you can use voice tags (<v Speaker>) for speaker identification instead of plain text.
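If you already have SRT files, the differences listed above are small enough to convert mechanically. Here's a minimal sketch; `srtToVtt` is a name I made up for illustration, and it only handles the basics (header, timestamp separators, cue numbers). It does not turn "SARAH:" prefixes into voice tags, so you'd still do that by hand.

```javascript
// Convert SRT caption text to WebVTT (basic cases only).
function srtToVtt(srt) {
  const cues = srt
    .replace(/\r\n?/g, "\n") // normalize Windows/Mac line endings
    .trim()
    .split(/\n{2,}/); // cues are separated by blank lines

  const converted = cues.map((cue) => {
    const lines = cue.split("\n");
    // Drop the numeric cue counter (optional in WebVTT)
    if (/^\d+$/.test(lines[0].trim())) lines.shift();
    // On the timing line, swap the comma before milliseconds for a period
    if (lines.length > 0) {
      lines[0] = lines[0].replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, "$1.$2");
    }
    return lines.join("\n");
  });

  // WebVTT files must start with the WEBVTT header followed by a blank line
  return "WEBVTT\n\n" + converted.join("\n\n") + "\n";
}

const sampleSrt = [
  "1",
  "00:00:01,000 --> 00:00:04,500",
  "Welcome to our tutorial on form accessibility.",
  "",
  "2",
  "00:00:10,000 --> 00:00:14,000",
  "[upbeat intro music]",
].join("\n");

const convertedVtt = srtToVtt(sampleSrt);
console.log(convertedVtt);
```

Treat the output as a draft and open it in a player before publishing, since real-world SRT files sometimes contain formatting this sketch doesn't account for.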
To link the caption file to your video, use the <track> element:
<video controls width="800">
  <source src="/videos/form-tutorial.mp4" type="video/mp4">
  <track
    kind="captions"
    src="/videos/form-tutorial.vtt"
    srclang="en"
    label="English"
    default>
  Your browser does not support the video element.
</video>
A few things to note here:
- kind="captions" tells the browser (and assistive technology) that this track contains captions, not just subtitles. Use kind="subtitles" only for translations.
- srclang="en" specifies the language of the captions.
- label="English" is what users see in the caption menu.
- The default attribute makes these captions display automatically. Without it, users have to manually enable them.
You can include multiple <track> elements for different languages:
<video controls width="800">
  <source src="/videos/form-tutorial.mp4" type="video/mp4">
  <track kind="captions" src="/videos/captions-en.vtt"
         srclang="en" label="English" default>
  <track kind="subtitles" src="/videos/subtitles-es.vtt"
         srclang="es" label="Spanish">
  <track kind="subtitles" src="/videos/subtitles-fr.vtt"
         srclang="fr" label="French">
</video>
When You Need Transcripts (and How to Create Them)
Captions and transcripts serve different purposes. Captions are synchronized with the video and appear in real time. Transcripts are a separate, standalone text document of everything in the video.
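WCAG 2.2 requires captions for prerecorded video with audio (Success Criterion 1.2.2, Level A). It also requires an audio description or a full text alternative for prerecorded video (Success Criterion 1.2.3, Level A), and a transcript that also describes the important visuals can satisfy that requirement. At Level AA, though, a transcript alone isn't enough when the video carries important visual information: Success Criterion 1.2.5 calls for audio description, which is covered in the next section.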
Beyond compliance, transcripts have real practical value:
- Deafblind users who use refreshable braille displays can read a transcript but can't use captions (which are visual)
- Search engines index transcript text, giving your video content more SEO value
- Users who prefer reading can skim a transcript much faster than watching a video
- Users with slow connections can read the transcript without downloading the video
A good transcript includes:
- All spoken dialogue, identified by speaker
- Relevant non-speech audio (sound effects, music)
- Descriptions of important visual information that isn't conveyed through audio
The easiest way to create a transcript is to start with your caption file and expand it into a readable document. Remove the timestamps, format it as paragraphs with speaker labels, and add descriptions of any visual-only content. Place the transcript on the same page as the video, or link to it directly below the video player.
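The mechanical part of that process (dropping timestamps and turning voice tags into speaker labels) is easy to script. Here's a minimal sketch, assuming your captions are in WebVTT; `vttToTranscript` is a hypothetical helper name. It merges consecutive cues from the same speaker into one paragraph, and you'd still add the visual descriptions by hand afterward.

```javascript
// Turn WebVTT caption text into a plain-text transcript draft.
function vttToTranscript(vtt) {
  const blocks = vtt.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  const paragraphs = [];

  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // skip the WEBVTT header and NOTE/STYLE blocks

    // Everything after the timing line is the cue text
    let text = lines.slice(timingIndex + 1).join(" ").trim();
    if (!text) continue;

    // Convert a leading <v Speaker> voice tag into a speaker label
    let speaker = null;
    const voice = text.match(/^<v(?:\.[^\s>]+)*\s+([^>]+)>/);
    if (voice) {
      speaker = voice[1].trim();
      text = text.slice(voice[0].length);
    }
    text = text.replace(/<[^>]+>/g, "").trim(); // strip any remaining cue tags

    const last = paragraphs[paragraphs.length - 1];
    if (last && last.speaker === speaker) {
      last.text += " " + text; // same speaker: extend the current paragraph
    } else {
      paragraphs.push({ speaker, text });
    }
  }

  return paragraphs
    .map((p) => (p.speaker ? `${p.speaker}: ${p.text}` : p.text))
    .join("\n\n");
}

const sampleVtt = [
  "WEBVTT",
  "",
  "00:00:01.000 --> 00:00:04.500",
  "Welcome to our tutorial on form accessibility.",
  "",
  "00:00:10.000 --> 00:00:14.000",
  "[upbeat intro music]",
  "",
  "00:00:15.000 --> 00:00:19.500",
  "<v Sarah>Let's start with the most common",
  "mistake I see on contact forms.",
].join("\n");

const transcriptDraft = vttToTranscript(sampleVtt);
console.log(transcriptDraft);
```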
Tools like Descript, Otter.ai, and Rev can generate transcripts from audio with decent accuracy, but you'll still need to review and edit the output. For professional content (training videos, marketing material, legal content), consider hiring a professional transcription service.
Audio Descriptions Explained
Audio descriptions are the piece of video accessibility that most people have never heard of. They're a narrated track that describes important visual information for blind and low-vision users.
Think about a tutorial video where someone demonstrates how to fill out a form. The presenter says, "Click this button here," while pointing to a specific area on screen. A sighted viewer sees exactly what's happening. A blind user hears "click this button here" and has no idea which button or where it is. An audio description would add something like, "The presenter clicks the blue Submit button in the lower right corner of the form."
WCAG 2.2 AA (Success Criterion 1.2.5) requires audio descriptions for prerecorded video content. This applies when there's important visual information that isn't already conveyed through the existing audio track.
There are two approaches to audio descriptions:
- Standard audio descriptions are inserted during natural pauses in the dialogue. A narrator describes what's happening on screen in the gaps between spoken content. This works well when there's enough silence to fit descriptions in.
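- Extended audio descriptions pause the video to allow time for the description when there aren't enough natural pauses. This is less common and usually reserved for content where the dialogue or narration is nearly continuous. WCAG treats extended audio description as a Level AAA criterion (1.2.7).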
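For the HTML5 video element, you can provide a description track using the <track> element with kind="descriptions". Despite the name, this is a WebVTT file of text descriptions that the browser or assistive technology is meant to voice, not a recorded audio file: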
<video controls width="800">
  <source src="/videos/form-tutorial.mp4" type="video/mp4">
  <track kind="captions" src="/videos/captions-en.vtt"
         srclang="en" label="English" default>
  <track kind="descriptions" src="/videos/descriptions-en.vtt"
         srclang="en" label="Audio Descriptions">
</video>
Browser support for the descriptions track kind is still limited, so many organizations create a separate version of the video with the audio descriptions mixed into the main audio track. It's more work, but it guarantees compatibility.
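Another workaround is to read the description cues aloud yourself with the browser's speech synthesis. The sketch below is one possible approach, not a recommendation over a professionally mixed description track. `speakDescriptions` is a hypothetical helper, and the `tutorial` element ID in the usage comment is an assumption. The `speak` callback is passed in so the logic stays independent of any one speech engine.

```javascript
// Speak cues from a kind="descriptions" text track as they become active.
// Returns a function that stops listening.
function speakDescriptions(track, speak) {
  track.mode = "hidden"; // load cues and fire cuechange without rendering them
  let previous = new Set();

  const onCueChange = () => {
    const current = new Set(Array.from(track.activeCues || []));
    for (const cue of current) {
      // Only speak cues that just became active, not ones still on screen
      if (!previous.has(cue)) speak(cue.text);
    }
    previous = current;
  };

  track.addEventListener("cuechange", onCueChange);
  return () => track.removeEventListener("cuechange", onCueChange);
}

// Browser usage (assumes a <video id="tutorial"> with a descriptions track):
// const video = document.getElementById("tutorial");
// const track = Array.from(video.textTracks)
//   .find((t) => t.kind === "descriptions");
// if (track && "speechSynthesis" in window) {
//   speakDescriptions(track, (text) =>
//     window.speechSynthesis.speak(new SpeechSynthesisUtterance(text)));
// }
```

Synthesized speech can overlap with the video's own audio, so test this with real content (and real screen reader users, if you can) before relying on it.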
For content where the audio already describes everything visual (like a podcast recording or a talking-head video with no on-screen demos), you may not need audio descriptions at all. The key question is: does the video contain visual information that a listener would miss? If yes, you need audio descriptions.
Accessible Video Players
The video player itself needs to be accessible. Users should be able to operate all controls with a keyboard, and screen readers should be able to announce what each control does.
Native HTML5 Video
The browser's built-in <video> element with the controls attribute is generally accessible. Browser-native controls are keyboard operable and work with screen readers. The downside is that they look different across browsers and offer limited customization.
<video controls width="800" preload="metadata">
  <source src="/videos/demo.mp4" type="video/mp4">
  <source src="/videos/demo.webm" type="video/webm">
  <track kind="captions" src="/videos/captions.vtt"
         srclang="en" label="English" default>
  <p>Your browser doesn't support HTML5 video.
  <a href="/videos/demo.mp4">Download the video</a>.</p>
</video>
Third-Party Players
If you need a custom-styled player, choose one with accessibility built in. Able Player is an open-source player specifically designed for accessibility. It supports captions, audio descriptions, transcripts, sign language video, and chapter markers. It's fully keyboard accessible with well-labeled ARIA controls.
Plyr is another lightweight option that supports keyboard navigation and captions, though it doesn't have the dedicated accessibility features of Able Player.
Video.js is widely used and keyboard accessible with proper ARIA labels on controls. It supports WebVTT captions and has a large plugin ecosystem.
Whatever player you choose, test it with keyboard-only navigation and a screen reader. Make sure you can play, pause, adjust volume, toggle captions, and seek through the video without a mouse. For more on testing with assistive technology, see our screen reader testing guide.
Auto-Play and Accessibility
Auto-playing video is one of the most common accessibility complaints. Here's why it's a problem.
Screen reader interference. Screen readers use audio to convey page content. When a video starts playing automatically, its audio competes with or drowns out the screen reader. The user has to find the video and figure out how to pause it before they can continue using the page. This is disorienting and frustrating.
Cognitive overload. Unexpected audio and motion can be overwhelming for users with cognitive disabilities, anxiety disorders, or attention difficulties.
Motion sensitivity. Auto-playing video with motion can trigger symptoms such as dizziness or nausea in users with vestibular disorders.
WCAG Success Criterion 1.4.2 says that if audio plays automatically for more than 3 seconds, there must be a mechanism to pause or stop it, or to control the audio volume independently from the system volume. The safest approach is simply to not auto-play video at all.
If you absolutely must auto-play (for background hero videos, for example), follow these rules:
- Use the muted attribute so there's no audio
- Provide a visible pause/stop button
- Respect the prefers-reduced-motion media query and disable auto-play for users who have that preference set
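<video autoplay muted loop playsinline id="hero-video">
  <source src="/videos/hero-bg.mp4" type="video/mp4">
</video>
<button type="button" id="hero-video-toggle">
  Pause background video
</button>
<script>
  const video = document.getElementById('hero-video');
  const toggle = document.getElementById('hero-video-toggle');
  const motionQuery = window.matchMedia('(prefers-reduced-motion: reduce)');

  // Keep the button text in sync with the actual playback state
  function updateLabel() {
    toggle.textContent = video.paused
      ? 'Play background video'
      : 'Pause background video';
  }

  // Respect reduced motion preference
  if (motionQuery.matches) {
    video.removeAttribute('autoplay');
    video.pause();
  }

  toggle.addEventListener('click', () => {
    if (video.paused) {
      video.play();
    } else {
      video.pause();
    }
  });

  // The play/pause events also fire when autoplay starts or is blocked
  video.addEventListener('play', updateLabel);
  video.addEventListener('pause', updateLabel);
  updateLabel();
</script>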
Embedding Video Accessibly
Most websites don't host their own videos. They embed YouTube, Vimeo, or other third-party players using iframes. The embed code from these platforms works fine, but there are a few accessibility details you should handle yourself.
Always Add a Title to Your Iframes
This is one of the most common WCAG mistakes I see. When you embed a video with an <iframe>, screen readers announce it as "frame" with no description unless you add a title attribute. That title should describe what the video is about.
<!-- Bad: no title attribute -->
<iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ"
        width="800" height="450"
        allow="accelerometer; autoplay; clipboard-write;
               encrypted-media; gyroscope; picture-in-picture"
        allowfullscreen>
</iframe>

<!-- Good: descriptive title -->
<iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ"
        title="Tutorial: How to make web forms accessible"
        width="800" height="450"
        allow="accelerometer; autoplay; clipboard-write;
               encrypted-media; gyroscope; picture-in-picture"
        allowfullscreen>
</iframe>
The title should be specific. "Video" or "YouTube video" doesn't help. "Tutorial: How to make web forms accessible" tells the user exactly what they'll find.
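If you want a quick way to spot these in a page's markup, here's a rough sketch. `auditIframeTitles` and the list of "generic" titles are my own assumptions for illustration. It uses regular expressions on an HTML string, so treat it as an approximate check rather than a real parser.

```javascript
// Flag <iframe> tags whose title is missing, empty, or too generic.
// The generic-title list is an assumed starting point; extend it as needed.
const GENERIC_TITLES = new Set([
  "video", "youtube video", "vimeo video", "embed", "iframe",
]);

function auditIframeTitles(html) {
  const problems = [];
  const iframeTags = html.match(/<iframe\b[^>]*>/gi) || [];

  for (const tag of iframeTags) {
    // Require whitespace before "title" so data-title="..." doesn't count
    const match = tag.match(/\stitle\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
    const value = match ? (match[1] ?? match[2]).trim() : null;

    if (value === null) {
      problems.push({ tag, issue: "missing title" });
    } else if (value === "") {
      problems.push({ tag, issue: "empty title" });
    } else if (GENERIC_TITLES.has(value.toLowerCase())) {
      problems.push({ tag, issue: "generic title" });
    }
  }
  return problems;
}

const samplePage = `
<iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ" width="800"></iframe>
<iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ" title="Video"></iframe>
<iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ"
        title="Tutorial: How to make web forms accessible"></iframe>`;

const iframeProblems = auditIframeTitles(samplePage);
console.log(iframeProblems.map((p) => p.issue)); // [ 'missing title', 'generic title' ]
```

In a live page, running `document.querySelectorAll('iframe:not([title])')` in the browser console gives you the missing-title cases directly.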
Responsive Video Containers
Iframes with fixed width and height don't scale well on mobile. Use a responsive container to maintain the aspect ratio while filling the available width:
<div style="position: relative; padding-bottom: 56.25%;
            height: 0; overflow: hidden;">
  <iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ"
          title="Tutorial: How to make web forms accessible"
          style="position: absolute; top: 0; left: 0;
                 width: 100%; height: 100%;"
          allow="accelerometer; autoplay; clipboard-write;
                 encrypted-media; gyroscope; picture-in-picture"
          allowfullscreen>
  </iframe>
</div>
The padding-bottom: 56.25% creates a 16:9 aspect ratio (9 / 16 = 0.5625). This is a standard pattern that works across all browsers.
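The same arithmetic works for any aspect ratio: divide height by width and express it as a percentage. A tiny helper makes that explicit (`paddingForAspectRatio` is just an illustrative name):

```javascript
// padding-bottom percentage for a given aspect ratio: height / width * 100
function paddingForAspectRatio(width, height) {
  const percent = (height / width) * 100;
  // Round to 4 decimals and drop trailing zeros
  return `${Number(percent.toFixed(4))}%`;
}

console.log(paddingForAspectRatio(16, 9)); // 56.25%
console.log(paddingForAspectRatio(4, 3));  // 75%
console.log(paddingForAspectRatio(21, 9)); // 42.8571%
```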
ARIA Labels for Context
If you have multiple videos on a single page, consider wrapping each in a <figure> with a <figcaption> or using aria-labelledby to connect the video to a nearby heading. This gives screen reader users more context about each video. For a deeper look at how ARIA attributes work, check our ARIA labels guide.
<figure>
  <iframe src="https://www.youtube.com/embed/dQw4w9WgXcQ"
          title="Tutorial: How to make web forms accessible"
          width="800" height="450"
          allowfullscreen>
  </iframe>
  <figcaption>
    Video tutorial demonstrating accessible form patterns
    with live code examples.
  </figcaption>
</figure>
Testing Video Accessibility
Here's a practical checklist you can work through for every video on your site.
Caption Quality Check
- Watch the video with captions on and the sound off. Can you follow the content?
- Are speaker changes identified?
- Are relevant sound effects and music described?
- Is the timing accurate (captions appear and disappear with the audio)?
- Are there spelling or grammar errors?
- Do captions use proper punctuation?
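A few of these checks can be partly automated before you sit down to watch. Here's a minimal sketch that scans a WebVTT file for timing problems and cues that go by too fast to read. `checkCues` is a hypothetical helper, and the 180 words-per-minute threshold is an assumption you should tune for your audience. Overlapping cues are sometimes intentional, so treat the output as warnings, not errors.

```javascript
// Parse "HH:MM:SS.mmm" or "MM:SS.mmm" into seconds
function toSeconds(stamp) {
  return stamp.split(":").map(Number).reduce((total, part) => total * 60 + part, 0);
}

// Return a list of human-readable warnings for a WebVTT string
function checkCues(vtt, maxWordsPerMinute = 180) {
  const warnings = [];
  const blocks = vtt.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  let previousEnd = 0;

  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header or NOTE block

    const label = lines[timingIndex].trim();
    const [startStamp, endPart] = label.split("-->");
    const start = toSeconds(startStamp.trim());
    const end = toSeconds(endPart.trim().split(/\s+/)[0]); // ignore cue settings
    const text = lines.slice(timingIndex + 1).join(" ").replace(/<[^>]+>/g, "").trim();

    if (end <= start) warnings.push(`${label}: end time is not after start time`);
    if (start < previousEnd) warnings.push(`${label}: overlaps the previous cue`);
    if (!text) warnings.push(`${label}: cue has no text`);

    // Estimate reading speed for this cue
    const words = text.split(/\s+/).filter(Boolean).length;
    const minutes = (end - start) / 60;
    if (minutes > 0 && words / minutes > maxWordsPerMinute) {
      warnings.push(`${label}: about ${Math.round(words / minutes)} words per minute`);
    }
    previousEnd = Math.max(previousEnd, end);
  }
  return warnings;
}

const captionsToCheck = [
  "WEBVTT",
  "",
  "00:00:01.000 --> 00:00:04.500",
  "Welcome to our tutorial on form accessibility.",
  "",
  "00:00:04.000 --> 00:00:05.000",
  "Today we'll cover label elements, error handling, and keyboard navigation.",
].join("\n");

const cueWarnings = checkCues(captionsToCheck);
console.log(cueWarnings);
```

A script like this can't tell you whether the words are right or whether the sound effects are described. That part still takes a human with the sound off.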
Player Accessibility Check
- Can you play and pause the video using only a keyboard?
- Can you adjust the volume with keyboard controls?
- Can you toggle captions on and off with the keyboard?
- Can you enter and exit fullscreen with the keyboard?
- Is there a visible focus indicator on all player controls?
- Do screen readers announce the player controls correctly?
Embed Check
- Does every <iframe> have a descriptive title attribute?
- Do embedded videos resize properly on mobile?
- Are there no auto-playing videos with audio?
Transcript Check
- Is there a transcript available on the same page or linked directly below the video?
- Does the transcript include speaker identification, sound effects, and visual descriptions?
- Is the transcript in an accessible format (HTML text, not an image of text)?
Start by running an automated accessibility scan on your page. It'll catch missing iframe titles, auto-play issues, and other machine-detectable problems. Then work through the manual checks above for each video. Automated tools catch about 30-40% of accessibility issues. The rest requires human review, especially for caption quality and audio description completeness.
Start Small
If you have a lot of video content and none of it has captions, don't try to caption everything at once. Start with your highest-traffic videos and work your way down. Prioritize videos that are essential to understanding your product or service (tutorials, onboarding content, product demos). Even partial coverage is better than none. Every captioned video removes a barrier for real people.
Scan Your Site for Video Issues
Missing iframe titles, auto-playing media, and other video accessibility problems show up in our free accessibility scan. Get a detailed report with every issue listed and specific fixes for each one. Takes under 60 seconds.