WebVTT API
Web Video Text Tracks (WebVTT) are text tracks providing specific text "cues" that are time-aligned with other media, such as video or audio tracks. The WebVTT API provides functionality to define and manipulate these text tracks. It is primarily used for displaying subtitles or captions that overlay video content, but it has other uses: providing chapter information for easier navigation, and carrying generic metadata that needs to be time-aligned with audio or video content.
Concepts and usage
A text track is a container for time-aligned text data that can be played in parallel with a video or audio track to provide a translation, transcription, or overview of the content. A video or audio media element may define tracks of different kinds or in different languages, allowing users to display appropriate tracks based on their preferences or needs.
The different kinds of text data that can be specified are listed below. Note that browsers do not necessarily support all kinds of text tracks.
subtitles
provide a textual translation of spoken dialog.
This is the default type of text track, and if used, the source language must be specified.
captions
provide a transcription of spoken text, and may include information about other audio such as music or background noise.
They are intended for users who are deaf or hard of hearing.
chapters
provide high level navigation information, allowing users to more easily switch to relevant content.
metadata
is used for any other kinds of time-aligned information.
The individual time-aligned units of text data within a track are referred to as "cues". Each cue has a start time, end time, and textual payload. It may also have "cue settings", which affect its display region, position, alignment, and/or size. Lastly, a cue may have a label, which can be used to select it for CSS styling.
A text track and cues can be defined in a file using the WebVTT File Format, and then associated with a particular <video> element using the <track> element.
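To make the file-based route more concrete, the following is a minimal sketch of how cue data might be serialized into the text of a WebVTT file. The function names (`formatTimestamp`, `buildWebVTT`) and the shape of the cue objects are assumptions made for this illustration and are not part of any API. The sketch does not validate or escape the cue text, so it assumes the text contains no blank lines and no `-->` sequence.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Serialize plain cue descriptions into WebVTT file text.
// Hypothetical cue shape: { id?, start, end, text, settings? }
function buildWebVTT(cues) {
  const blocks = cues.map((cue) => {
    const lines = [];
    // The optional identifier goes on its own line before the timings.
    if (cue.id) lines.push(cue.id);
    const timing = `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}`;
    // Cue settings, if any, follow the timings on the same line.
    lines.push(cue.settings ? `${timing} ${cue.settings}` : timing);
    lines.push(cue.text);
    return lines.join("\n");
  });
  // The file starts with the "WEBVTT" signature; blocks are separated by blank lines.
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}
```

The resulting text could be saved as a `.vtt` file and referenced from a <track> element nested inside the <video> element.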
Alternatively you can add a TextTrack to a media element in JavaScript using HTMLMediaElement.addTextTrack(), and then add individual VTTCue objects to the track with TextTrack.addCue().
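The JavaScript route could look roughly like the sketch below. The helper name `addSubtitles`, the track label, and the shape of the `cueData` entries are assumptions for illustration. The `CueCtor` parameter is also an assumption: it defaults to the browser's `VTTCue` constructor and exists only so the function can be exercised outside a browser.

```javascript
// Create a subtitles track on a media element and fill it with cues.
// `media` is expected to behave like an HTMLMediaElement.
// `cueData` is a hypothetical array of { start, end, text } objects.
function addSubtitles(media, cueData, language = "en", CueCtor = globalThis.VTTCue) {
  // addTextTrack(kind, label, language) returns the new TextTrack.
  const track = media.addTextTrack("subtitles", "Generated subtitles", language);
  for (const { start, end, text } of cueData) {
    // VTTCue takes a start time, an end time (both in seconds), and the cue text.
    track.addCue(new CueCtor(start, end, text));
  }
  // Tracks created with addTextTrack() start in "hidden" mode,
  // so switch to "showing" to have the cues rendered.
  track.mode = "showing";
  return track;
}
```

In a page, this might be called as `addSubtitles(document.querySelector("video"), [{ start: 0, end: 2, text: "Hello" }])`.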
The ::cue CSS pseudo-element can be used both in HTML and in a WebVTT file to style the cues for a particular element, for a particular tag within a cue, for a VTT class, or for a cue with a particular label.
The ::cue-region pseudo-element is intended for styling cues in a particular region, but is not supported in any browser.
Most important WebVTT features can be accessed using either the file format or Web API.
Interfaces
VTTCue
Represents a cue, the text displayed in a particular timeslice of the text track associated with a media element.
VTTRegion
Represents a portion of a video element onto which a VTTCue can be rendered.
TextTrack
Represents a text track, which holds the list of cues to display along with an associated media element at various points while it plays.
TextTrackCue
An abstract base class for various cue types, such as VTTCue.
TextTrackCueList
An array-like object that represents a dynamically updating list of TextTrackCue objects. An instance of this type is obtained from TextTrack.cues in order to get all the cues in the TextTrack object.
TextTrackList
Represents a list of the text tracks defined for a media element, with each track represented by a separate TextTrack instance in the list.
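As an illustration of how these interfaces relate, the sketch below walks the TextTrackList of a media element and flattens the cues of every track into plain objects. The helper name `collectCues` and the shape of the returned objects are assumptions made for this example.

```javascript
// Gather the cues of every text track on a media element.
// `media` is expected to behave like an HTMLMediaElement.
function collectCues(media) {
  const result = [];
  // media.textTracks is a TextTrackList (array-like).
  for (const track of Array.from(media.textTracks)) {
    // TextTrack.cues is a TextTrackCueList, or null while the
    // track's mode is "disabled".
    const cues = track.cues ? Array.from(track.cues) : [];
    for (const cue of cues) {
      result.push({
        kind: track.kind,
        language: track.language,
        startTime: cue.startTime,
        endTime: cue.endTime,
        // `text` is defined on VTTCue, not on the TextTrackCue base class.
        text: cue.text ?? null,
      });
    }
  }
  return result;
}
```

Because disabled tracks expose no cue list, a track may need its mode set to "hidden" or "showing" before its cues appear in the result.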
Related interfaces
TrackEvent
Part of the HTML DOM API, this is the interface for the addtrack and removetrack events that are fired when a track is added or removed from TextTrackList (or more generally, when a track is added/removed from an HTML media element).
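A minimal sketch of reacting to these events is shown below. The helper name `watchTextTracks` and the `onChange` callback signature are assumptions for illustration. The sketch relies only on the `addtrack` and `removetrack` events and on the `track` property of TrackEvent.

```javascript
// Call `onChange(action, track)` whenever a text track is added to or
// removed from a media element. Returns a function that stops watching.
function watchTextTracks(media, onChange) {
  const tracks = media.textTracks; // TextTrackList
  // TrackEvent.track is the track that was added or removed.
  const onAdd = (event) => onChange("added", event.track);
  const onRemove = (event) => onChange("removed", event.track);
  tracks.addEventListener("addtrack", onAdd);
  tracks.addEventListener("removetrack", onRemove);
  return () => {
    tracks.removeEventListener("addtrack", onAdd);
    tracks.removeEventListener("removetrack", onRemove);
  };
}
```

Returning a cleanup function is a design choice of this sketch; it keeps the listeners easy to detach when the media element is no longer needed.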
Related CSS extensions
These CSS pseudo-elements are used to style cues in media with VTT tracks.
::cue
Matches cues within a selected element in media with VTT tracks.
Note: The specification defines another pseudo-element, ::cue-region, but this is not supported by any browsers.