Media Source Extensions
W3C Editor's Draft 04 January 2013
- This version:
- https://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html
- Latest published version:
- https://www.w3.org/TR/media-source/
- Latest editor's draft:
- https://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html
- Editors:
- Aaron Colwell, Google Inc.
- Adrian Bateman, Microsoft Corporation
- Mark Watson, Netflix Inc.
Copyright © 2013 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Abstract
This specification extends HTMLMediaElement to allow JavaScript to generate media streams for playback. Allowing JavaScript to generate streams facilitates a variety of use cases like adaptive streaming and time shifting live streams.
Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
The working group maintains a list of all bug reports that the editors have not yet tried to address. This draft highlights some of the pending issues that are still to be discussed in the working group. No decision has been taken on the outcome of these issues, including whether they are valid.
Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the mailing list mentioned below and take part in the discussions.
This document was published by the HTML Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-html-media@w3.org (subscribe, archives). All feedback is welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Table of Contents
- 1. Introduction
- 2. Source Buffer Model
- 3. MediaSource Object
- 4. SourceBuffer Object
- 5. SourceBufferList Object
- 6. URL Object
- 7. HTMLMediaElement attributes
- 8. Byte Stream Formats
- 9. Examples
- 10. Revision History
1. Introduction
This specification allows JavaScript to dynamically construct media streams for <audio> and <video>. It defines objects that allow JavaScript to pass media segments to an HTMLMediaElement. A buffering model is also included to describe how the user agent should act when different media segments are appended at different times. Byte stream specifications for WebM, ISO Base Media File Format, and MPEG-2 Transport Streams are given to specify the expected format of byte streams used with these extensions.

1.1 Goals
This specification was designed with the following goals in mind:
- Allow JavaScript to construct media streams independent of how the media is fetched.
- Define a splicing and buffering model that facilitates use cases like adaptive streaming, ad-insertion, time-shifting, and video editing.
- Minimize the need for media parsing in JavaScript.
- Leverage the browser cache as much as possible.
- Provide byte stream definitions for WebM, the ISO Base Media File Format, and MPEG-2 Transport Streams.
- Not require support for any particular media format or codec.
1.2 Definitions
- Initialization Segment
-
A sequence of bytes that contains all of the initialization information required to decode a sequence of media segments. This includes codec initialization data, Track ID mappings for multiplexed segments, and timestamp offsets (e.g. edit lists).
Note: The byte stream format specifications contain format-specific examples.
- Media Segment
-
A sequence of bytes that contains packetized and timestamped media data for a portion of the presentation timeline. Media segments are always associated with the most recently appended initialization segment.
Note: The byte stream format specifications contain format-specific examples.
- Decoder Buffer
A buffer that holds initialization data and coded frames that will be decoded and rendered. This buffer may not exist in actual implementations, but it is intended to represent media data that will be decoded no matter what media segments are appended to update the SourceBuffer. This distinction is important when considering appends that happen close to the current playback position. See Track Buffer to Decoder Buffer transfer for details.
- Random Access Point
A position in a media segment where decoding and continuous playback can begin without relying on any previous data in the segment. For video this tends to be the location of I-frames. In the case of audio, most audio frames can be treated as a random access point. Since video tracks tend to have a sparser distribution of random access points, the locations of these points are usually considered the random access points for multiplexed streams.
- Presentation Start Time
The presentation start time is the earliest time point in the presentation and specifies the initial playback position and earliest possible position. All presentations created using this specification have a presentation start time of 0.
- MediaSource object URL
-
A MediaSource object URL is a unique Blob URI created by createObjectURL(). It is used to attach a MediaSource object to an HTMLMediaElement.
These URLs are the same as what the File API specification calls a Blob URI, except that anything in the definition of that feature that refers to File and Blob objects is hereby extended to also apply to MediaSource objects.
- Track ID
A Track ID is a byte stream format specific identifier that marks sections of the byte stream as being part of a specific track. The Track ID in a track description identifies which sections of a media segment belong to that track.
- Track Description
A byte stream format specific structure that provides the Track ID, codec configuration, and other metadata for a single track. Each track description inside a single initialization segment must have a unique Track ID.
- Coded Frame
A unit of compressed media data that has a presentation timestamp and decode timestamp. The presentation timestamp indicates when the frame should be rendered. The decode timestamp indicates when the frame needs to be decoded. If frames can be decoded out of order, then the decode timestamp must be present in the bytestream. If frames cannot be decoded out of order and a decode timestamp is not present in the bytestream, then the decode timestamp is equal to the presentation timestamp.
- Parent Media Source
- The parent media source of a SourceBuffer object is the MediaSource object that created it.
- Append Sequence
- A series of appendArrayBuffer() or appendStream() calls on a SourceBuffer without any intervening abort() calls. The media segments in an append sequence must be adjacent and monotonically increasing in time without any gaps. An abort() call starts a new append sequence which allows media segments to be appended in non-monotonically increasing order.
2. Source Buffer Model
The subsections below outline the buffering model for this specification. They describe the various rules and behaviors associated with appending data to an individual SourceBuffer. At the highest level, the web application creates SourceBuffer objects and appends sequences of initialization segments and media segments to update their state. The media element pulls media data out of the MediaSource object, plays it, and fires events just like it would if a normal URL were passed to the src attribute.
The web application is expected to monitor media element events to determine when it needs to append more media segments.
2.1 Appending a Media Segment over a buffered region
There are several ways that media segments can overlap segments in the SourceBuffer. Behavior for the different overlap situations is described below. If more than one overlap applies, then the start overlap must be resolved first, followed by any complete overlaps, and finally the end overlap. If a segment contains multiple tracks then the overlap is resolved independently for each track.
Bug 19673 - Seamless audio signal transitions at splice points
Bug 19784 - timestampOffset with multiplexed Media Segments
2.1.1 Complete Overlap

The figure above shows how the SourceBuffer is updated when a new media segment completely overlaps a segment in the buffer. In this case, the new segment completely replaces the old segment.
2.1.2 Start Overlap

The figure above shows how the SourceBuffer is updated when the beginning of a new media segment overlaps a segment in the buffer. In this case, the new segment replaces all the old media data in the overlapping region. Since media segments are constrained to starting with random access points, this provides a seamless transition between segments.
When an audio frame in the SourceBuffer overlaps with the start of the new media segment, special behavior is required. At a minimum, implementations must support dropping the old audio frame that overlaps the start of the new segment and inserting silence for the small gap that is created. Higher quality implementations may support crossfading or crosslapping between the overlapping audio frames. No matter which strategy is implemented, no gaps are created in the ranges reported by buffered and playback must never stall at the overlap.
2.1.3 End Overlap

The figure above shows how the SourceBuffer is updated when the end of a new media segment overlaps the beginning of a segment in the buffer. In this case, the SourceBuffer tries to keep as much of the old segment as possible. The amount saved depends on where the closest random access point, in the old segment, is to the end of the new segment. In the case of audio, if the gap is smaller than the size of an audio frame, then the SourceBuffer may render silence for this gap. This gap must not be reflected in buffered. The entire new segment must be added to the SourceBuffer, but it is up to the implementation to determine how much of the old segment data is retained.
An implementation may keep old segment data before the end of the new segment to avoid creating a gap if it wishes. Doing this though can significantly increase implementation complexity and could cause delays at the splice point.
The web application can use buffered to determine how much of the old segment was preserved.
2.1.4 Middle Overlap

The figure above shows how the SourceBuffer is updated when the new media segment is in the middle of the old segment. This condition is handled by first resolving the start overlap and then resolving the end overlap.
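The overlap rules in sections 2.1.1 through 2.1.4 can be illustrated with a small model. This is only a sketch under simplifying assumptions: it handles a single track, represents segments as hypothetical { start, end, raps } records (raps being the timestamps of random access points), and keeps old data only from the first random access point at or after the end of the new segment, which is one of the choices an implementation is allowed to make. The name appendSegment is illustrative and is not defined by this specification.

```javascript
// Hypothetical model of overlap resolution for one track.
// A segment is { start, end, raps }; raps is a sorted list of the
// timestamps of its random access points.
function appendSegment(buffer, seg) {
  const result = [];
  for (const old of buffer) {
    if (old.end <= seg.start || old.start >= seg.end) {
      result.push(old); // no overlap: keep the old segment untouched
      continue;
    }
    // Start overlap: keep the old data that precedes the new segment.
    if (old.start < seg.start) {
      result.push({
        start: old.start,
        end: seg.start,
        raps: old.raps.filter((t) => t < seg.start),
      });
    }
    // End overlap: keep old data starting at the first random access
    // point at or after the end of the new segment (assumed policy).
    if (old.end > seg.end) {
      const rap = old.raps.find((t) => t >= seg.end);
      if (rap !== undefined) {
        result.push({
          start: rap,
          end: old.end,
          raps: old.raps.filter((t) => t >= rap),
        });
      }
    }
    // Complete overlap: neither branch runs, so the old segment is dropped.
  }
  result.push(seg); // the entire new segment is always added
  return result.sort((a, b) => a.start - b.start);
}

// Middle overlap: resolved as a start overlap followed by an end overlap.
const updated = appendSegment(
  [{ start: 0, end: 10, raps: [0, 5] }],
  { start: 3, end: 4, raps: [3] }
);
console.log(updated.map((s) => [s.start, s.end]));
```

In this model the region between the end of the new segment and the next random access point of the old segment is dropped, which corresponds to the gap an application could observe through buffered.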
2.2 Track Buffer to Decoder Buffer transfer
The track buffer represents the media that the web application would like the media element to play. The decoder buffer contains the data that will actually get decoded and rendered. In most cases the decoder buffer will simply contain a subset of the track buffer near the current playback position. These two buffers start to diverge when media segments that overlap or are very close to the current playback position are appended. Depending on the contents of the new media segment it may not be possible to switch to the new data immediately because there isn't a random access point close enough to the current playback position. The quality of the implementation determines how much data is considered "in the decoder buffer." It should transfer data to the decoder buffer as late as possible whilst maintaining seamless playback. Some implementations may be able to instantiate multiple decoders or decode the new data significantly faster than real-time to achieve a seamless splice immediately. Other implementations may delay until the next random access point before switching to the newly appended data. Notice that this difference in behavior is only observable when appending close to the current playback position. The decoder buffer represents a media subsegment, like a group of pictures or something with similar decode dependencies, that the media element commits to playing. This commitment may be influenced by a variety of things like limited decoding resources, hardware decode buffers, a jitter buffer, or the desire to limit implementation complexity.
Here is an example to help clarify the role of the decoder buffer. Say the current playback position has a timestamp of 8 and the media element has pulled frames with timestamps 9 and 10 into the decoder buffer. The web application then appends a higher quality media segment that starts with a random access point at timestamp 9. The track buffer will get updated with the higher quality data, but the media element won't be able to switch to this higher quality data until the next random access point at timestamp 20. This is because a frame for timestamp 9 is already in the decoder buffer. The decoder buffer represents the "point of no return" for decoding. If a seek occurs, the media element may choose to use the higher quality data since a seek might imply flushing the decoder buffer and the user expects a break in playback.
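The example above can be expressed as a short calculation. The sketch below assumes the conservative strategy of waiting for the next random access point beyond the data already committed to the decoder buffer; the function name and parameters are hypothetical and only model the reasoning, not any required implementation.

```javascript
// committedUntil: highest timestamp already pulled into the decoder buffer.
// randomAccessPoints: sorted timestamps of random access points in the
// newly appended data.
// Returns the first timestamp at which playback can switch to the new
// data, or null if no usable random access point exists yet.
function earliestSwitchPoint(committedUntil, randomAccessPoints) {
  const point = randomAccessPoints.find((t) => t > committedUntil);
  return point === undefined ? null : point;
}

// Frames 9 and 10 are committed; the new data has random access points
// at 9 and 20, so the switch happens at 20.
console.log(earliestSwitchPoint(10, [9, 20])); // 20
```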
3. MediaSource Object
The MediaSource object represents a source of media data for an HTMLMediaElement. It keeps track of the readyState for this source as well as a list of SourceBuffer objects that can be used to add media data to the presentation. MediaSource objects are created by the web application and then attached to an HTMLMediaElement. The application uses the SourceBuffer objects in sourceBuffers to add media data to this source. The HTMLMediaElement fetches this media data from the MediaSource object when it is needed during playback.
enum ReadyState {
"closed",
"open",
"ended"
};
Enumeration description | |
---|---|
closed | Indicates the source is not currently attached to a media element. |
open | The source has been opened by a media element and is ready for data to be appended to the SourceBuffer objects in sourceBuffers. |
ended | The source is still attached to a media element, but endOfStream() has been called. |
enum EndOfStreamError {
"network",
"decode"
};
Enumeration description | |
---|---|
network | Terminates playback and signals that a network error has occurred. Note: If the JavaScript fetching media data encounters a network error it should use this status code to terminate playback. |
decode | Terminates playback and signals that a decoding error has occurred. Note: If the JavaScript code fetching media data has problems parsing the data it should use this status code to terminate playback. |
[Constructor]
interface MediaSource : EventTarget {
readonly attribute SourceBufferList sourceBuffers;
readonly attribute SourceBufferList activeSourceBuffers;
readonly attribute ReadyState readyState;
attribute unrestricted double duration;
SourceBuffer addSourceBuffer (DOMString type);
void removeSourceBuffer (SourceBuffer sourceBuffer);
SourceBuffer? getSourceBuffer (VideoTrack videoTrack);
SourceBuffer? getSourceBuffer (AudioTrack audioTrack);
SourceBuffer? getSourceBuffer (TextTrack textTrack);
void endOfStream (optional EndOfStreamError error);
void setTrackInfo (VideoTrack track, DOMString kind, DOMString language);
void setTrackInfo (AudioTrack track, DOMString kind, DOMString language);
void setTrackInfo (TextTrack track, DOMString kind, DOMString language);
static boolean isTypeSupported (DOMString type);
};
3.1 Attributes
activeSourceBuffers of type SourceBufferList, readonly
-
Contains the subset of sourceBuffers that are providing the selected video track, the enabled audio tracks, and the "showing" or "hidden" text tracks.
Note: The Changes to selected/enabled track state section describes how this attribute gets updated.
duration of type unrestricted double
-
Allows the web application to set the presentation duration. The duration is initially set to NaN when the MediaSource object is created.
On getting, run the following steps:
- If the readyState attribute is "closed" then return NaN and abort these steps.
- Return the current value of the attribute.
On setting, run the following steps:
- If the value being set is negative or NaN then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If the readyState attribute is not "open" then throw an INVALID_STATE_ERR exception and abort these steps.
- Run the duration change algorithm with new duration set to the value being assigned to this attribute.
Note: appendArrayBuffer(), appendStream() and endOfStream() can update the duration under certain circumstances.
readyState of type ReadyState, readonly
-
Indicates the current state of the MediaSource object. When the MediaSource is created readyState must be set to "closed".
sourceBuffers of type SourceBufferList, readonly
-
Contains the list of SourceBuffer objects associated with this MediaSource. When readyState equals "closed" this list will be empty. Once readyState transitions to "open", SourceBuffer objects can be added to this list by using addSourceBuffer().
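The getting and setting steps for the duration attribute can be sketched as a small stand-alone model. The class below is hypothetical and is not the MediaSource interface itself: it uses plain Error objects whose messages carry the exception names, and it replaces the duration change algorithm with a direct assignment.

```javascript
// Hypothetical model of the duration attribute's getter and setter steps.
class DurationModel {
  constructor() {
    this.readyState = "closed"; // initial state of a new MediaSource
    this.storedDuration = NaN;  // duration is initially NaN
  }

  get duration() {
    // Getting step 1: a closed source always reports NaN.
    if (this.readyState === "closed") return NaN;
    // Getting step 2: otherwise return the current value.
    return this.storedDuration;
  }

  set duration(value) {
    // Setting step 1: negative or NaN values are rejected first.
    if (Number.isNaN(value) || value < 0) {
      throw new Error("INVALID_ACCESS_ERR");
    }
    // Setting step 2: the source must be open.
    if (this.readyState !== "open") {
      throw new Error("INVALID_STATE_ERR");
    }
    // Setting step 3: stand-in for the duration change algorithm.
    this.storedDuration = value;
  }
}

const model = new DurationModel();
model.readyState = "open";
model.duration = 5;
console.log(model.duration); // 5
```

Note that the order of the checks matters: assigning a negative value to a closed source reports INVALID_ACCESS_ERR rather than INVALID_STATE_ERR, because the value check comes first.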
3.2 Methods
addSourceBuffer
-
Adds a new SourceBuffer to sourceBuffers.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
Return type: SourceBuffer
When this method is invoked, the user agent must run the following steps:
- If type is null or an empty string then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If type contains a MIME type that is not supported or contains a MIME type that is not supported with the types specified for the other SourceBuffer objects in sourceBuffers, then throw a NOT_SUPPORTED_ERR exception and abort these steps.
- If the user agent can't handle any more SourceBuffer objects then throw a QUOTA_EXCEEDED_ERR exception and abort these steps.
- If the readyState attribute is not in the "open" state then throw an INVALID_STATE_ERR exception and abort these steps.
- Create a new SourceBuffer object and associated resources.
- Add the new object to sourceBuffers and queue a task to fire a simple event named addsourcebuffer at sourceBuffers.
- Return the new object.
endOfStream
-
Signals the end of the stream.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
error | EndOfStreamError | ✘ | ✔ | |
Return type: void
When this method is invoked, the user agent must run the following steps:
- If the readyState attribute is not in the "open" state then throw an INVALID_STATE_ERR exception and abort these steps.
- Change the readyState attribute value to "ended".
- Queue a task to fire a simple event named sourceended at the MediaSource.
- If error is not set, null, or an empty string
  - Run the duration change algorithm with new duration set to the highest end timestamp across all SourceBuffer objects in sourceBuffers.
    Note: This allows the duration to properly reflect the end of the appended media segments. For example, if the duration was explicitly set to 10 seconds and only media segments for 0 to 5 seconds were appended before endOfStream() was called, then the duration will get updated to 5 seconds.
  - Notify the media element that it now has all of the media data. Playback should continue until all the media passed in via appendArrayBuffer() and appendStream() has been played.
- If error is set to "network"
  - If the HTMLMediaElement.readyState attribute equals HAVE_NOTHING
    Run the steps of the resource fetch algorithm.
  - If the HTMLMediaElement.readyState attribute is greater than HAVE_NOTHING
    Run the "If the connection is interrupted after some media data has been received, causing the user agent to give up trying to fetch the resource" steps of the resource fetch algorithm.
- If error is set to "decode"
  - If the HTMLMediaElement.readyState attribute equals HAVE_NOTHING
    Run the "If the media data can be fetched but is found by inspection to be in an unsupported format, or can otherwise not be rendered at all" steps of the resource fetch algorithm.
  - If the HTMLMediaElement.readyState attribute is greater than HAVE_NOTHING
    Run the media data is corrupted steps of the resource fetch algorithm.
- Otherwise
  Throw an INVALID_ACCESS_ERR exception.
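The branching in the endOfStream steps can be summarized with a stand-alone model. This is a sketch rather than an implementation: the source object, its mediaReadyState and highestEnd fields, and the returned action strings are all hypothetical, the duration change algorithm is replaced by a direct assignment, and at least one SourceBuffer is assumed to exist.

```javascript
const HAVE_NOTHING = 0; // numeric value of HTMLMediaElement.HAVE_NOTHING

// Hypothetical model: mutates `source` and returns a list of strings
// describing what the user agent would do next.
function endOfStream(source, error) {
  if (source.readyState !== "open") throw new Error("INVALID_STATE_ERR");
  source.readyState = "ended";
  const actions = ["fire sourceended"];
  const noData = source.mediaReadyState === HAVE_NOTHING;

  if (error === undefined || error === null || error === "") {
    // Duration becomes the highest end timestamp across all buffers.
    const ends = source.sourceBuffers.map((b) => b.highestEnd);
    source.duration = Math.max(...ends);
    actions.push("notify media element: all media data appended");
  } else if (error === "network") {
    actions.push(noData
      ? "network error before any media data"
      : "connection interrupted after some media data");
  } else if (error === "decode") {
    actions.push(noData
      ? "unsupported format"
      : "media data is corrupted");
  } else {
    throw new Error("INVALID_ACCESS_ERR");
  }
  return actions;
}

// Duration was set to 10 but only 0 to 5 seconds were appended.
const source = {
  readyState: "open",
  duration: 10,
  mediaReadyState: 1,
  sourceBuffers: [{ highestEnd: 5 }, { highestEnd: 4 }],
};
endOfStream(source);
console.log(source.duration); // 5
```

Because the state change happens before the error value is examined, an unrecognized error value in this model throws only after readyState has already become "ended", which mirrors the order of the steps as written.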
getSourceBuffer
-
Gets the SourceBuffer object that created a specific VideoTrack.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
videoTrack | VideoTrack | ✘ | ✘ | |
Return type: SourceBuffer, nullable
When this method is invoked, the user agent must run the following steps:
- If videoTrack was not created by any SourceBuffer object in sourceBuffers then return null.
- Return the SourceBuffer object in sourceBuffers that created videoTrack.
getSourceBuffer
-
Gets the SourceBuffer object that created a specific AudioTrack.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
audioTrack | AudioTrack | ✘ | ✘ | |
Return type: SourceBuffer, nullable
When this method is invoked, the user agent must run the following steps:
- If audioTrack was not created by any SourceBuffer object in sourceBuffers then return null.
- Return the SourceBuffer object in sourceBuffers that created audioTrack.
getSourceBuffer
-
Gets the SourceBuffer object that created a specific TextTrack.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
textTrack | TextTrack | ✘ | ✘ | |
Return type: SourceBuffer, nullable
When this method is invoked, the user agent must run the following steps:
- If textTrack was not created by any SourceBuffer object in sourceBuffers then return null.
- Return the SourceBuffer object in sourceBuffers that created textTrack.
isTypeSupported, static
-
Check to see whether the MediaSource is capable of creating SourceBuffer objects for the specified MIME type.
Note: If true is returned from this method, it only indicates that the MediaSource implementation is capable of creating SourceBuffer objects for the specified MIME type. An addSourceBuffer() call may still fail if sufficient resources are not available to support the addition of a new SourceBuffer.
Note: This method returning true implies that HTMLMediaElement.canPlayType() will return "maybe" or "probably" since it does not make sense for a MediaSource to support a type the HTMLMediaElement knows it cannot play.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
Return type: boolean
When this method is invoked, the user agent must run the following steps:
- If type is an empty string, then return false.
- If type does not contain a valid MIME type string, then return false.
- If type contains a media type or media subtype that the MediaSource does not support, then return false.
- If type contains a codec that the MediaSource does not support, then return false.
- If the MediaSource does not support the specified combination of media type, media subtype, and codecs then return false.
- Return true.
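The isTypeSupported steps can be approximated by the sketch below. Everything in it is an assumption made for illustration: the SUPPORTED table is a made-up capability list (a real user agent decides this internally), the regular expression accepts only a simplified MIME syntax with an optional quoted codecs parameter, and the combination check of the fifth step is folded into the per-type codec list.

```javascript
// Hypothetical capability table: MIME type -> codecs usable with it.
const SUPPORTED = {
  "video/webm": ["vp8", "vorbis"],
  "audio/webm": ["vorbis"],
};

// Simplified pattern: type/subtype, optionally followed by codecs="a,b".
const MIME_PATTERN =
  /^\s*([a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+)\s*(?:;\s*codecs\s*=\s*"([^"]*)"\s*)?$/i;

function isTypeSupported(type) {
  if (type === "") return false;             // step 1: empty string
  const match = MIME_PATTERN.exec(type);
  if (!match) return false;                  // step 2: not a valid MIME type
  const codecs = SUPPORTED[match[1].toLowerCase()];
  if (!codecs) return false;                 // step 3: unsupported type/subtype
  const requested = match[2]
    ? match[2].split(",").map((c) => c.trim())
    : [];
  // steps 4 and 5: every requested codec must be usable with this type.
  return requested.every((c) => codecs.includes(c));
}

console.log(isTypeSupported('video/webm; codecs="vorbis,vp8"')); // true
console.log(isTypeSupported('video/webm; codecs="avc1"'));       // false
```

An application would typically call the real static method before addSourceBuffer() and still be prepared for addSourceBuffer() to fail, as the first note above explains.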
removeSourceBuffer
-
Removes a SourceBuffer from sourceBuffers.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
sourceBuffer | SourceBuffer | ✘ | ✘ | |
Return type: void
When this method is invoked, the user agent must run the following steps:
- If sourceBuffer is null then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If sourceBuffer specifies an object that is not in sourceBuffers then throw a NOT_FOUND_ERR exception and abort these steps.
- Remove track information from audioTracks, videoTracks, and textTracks for all tracks associated with sourceBuffer and queue a task to fire a simple event named change at the modified lists.
- If sourceBuffer is in activeSourceBuffers, then remove it from activeSourceBuffers and queue a task to fire a simple event named removesourcebuffer at activeSourceBuffers.
- Remove sourceBuffer from sourceBuffers and queue a task to fire a simple event named removesourcebuffer at sourceBuffers.
- Destroy all resources for sourceBuffer.
setTrackInfo
-
Set the kind and language of the VideoTrack track.
Note: This method would be unnecessary if the kind and language attributes of VideoTrack were not read-only.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | VideoTrack | ✘ | ✘ | |
kind | DOMString | ✘ | ✘ | |
language | DOMString | ✘ | ✘ | |
Return type: void
setTrackInfo
-
Set the kind and language of the AudioTrack track.
Note: This method would be unnecessary if the kind and language attributes of AudioTrack were not read-only.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | AudioTrack | ✘ | ✘ | |
kind | DOMString | ✘ | ✘ | |
language | DOMString | ✘ | ✘ | |
Return type: void
setTrackInfo
-
Set the kind and language of the TextTrack track.
Note: This method would be unnecessary if the kind and language attributes of TextTrack were not read-only.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | TextTrack | ✘ | ✘ | |
kind | DOMString | ✘ | ✘ | |
language | DOMString | ✘ | ✘ | |
Return type: void
3.3 Event Summary
Event name | Interface | Dispatched when... |
---|---|---|
sourceopen | Event | readyState transitions from "closed" to "open" or from "ended" to "open". |
sourceended | Event | readyState transitions from "open" to "ended". |
sourceclose | Event | readyState transitions from "open" to "closed" or from "ended" to "closed". |
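The event summary amounts to a lookup from a readyState transition to an event name. The sketch below encodes exactly the transitions listed in the table; the helper name eventForTransition and the "from>to" key format are hypothetical.

```javascript
// Transitions listed in the event summary, keyed as "from>to".
const TRANSITION_EVENTS = {
  "closed>open": "sourceopen",
  "ended>open": "sourceopen",
  "open>ended": "sourceended",
  "open>closed": "sourceclose",
  "ended>closed": "sourceclose",
};

// Returns the event dispatched for a readyState transition, or null when
// the table lists no event for that transition.
function eventForTransition(from, to) {
  return TRANSITION_EVENTS[from + ">" + to] || null;
}

console.log(eventForTransition("closed", "open")); // "sourceopen"
console.log(eventForTransition("open", "ended"));  // "sourceended"
```

Since MediaSource is an EventTarget, a web application would normally observe these transitions by registering listeners for the three event names on the MediaSource object.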
3.4 Algorithms
3.4.1 Attaching to a media element
A MediaSource object can be attached to a media element by assigning a MediaSource object URL to the media element src attribute or the src attribute of a <source> inside a media element. A MediaSource object URL is created by passing a MediaSource object to createObjectURL().
If the resource fetch algorithm absolute URL matches the MediaSource object URL, run the following steps right before the "Perform a potentially CORS-enabled fetch" step in the resource fetch algorithm.
- If readyState is NOT set to "closed"
  Run the steps of the resource fetch algorithm.
- Otherwise
  - Set the readyState attribute to "open".
  - Queue a task to fire a simple event named sourceopen at the MediaSource.
  - Allow the resource fetch algorithm to progress based on data passed in via appendArrayBuffer() and appendStream().
3.4.2 Detaching from a media element
The following steps are run in any case where the media element is going to transition to NETWORK_EMPTY and queue a task to fire a simple event named emptied at the media element. These steps should be run right before the transition.
- Set the readyState attribute to "closed".
- Set the duration attribute to NaN.
- Remove all the SourceBuffer objects from activeSourceBuffers.
- Queue a task to fire a simple event named removesourcebuffer at activeSourceBuffers.
- Remove all the SourceBuffer objects from sourceBuffers.
- Queue a task to fire a simple event named removesourcebuffer at sourceBuffers.
- Queue a task to fire a simple event named sourceclose at the MediaSource.
3.4.3 Seeking
Run the following steps as part of the "Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position" step of the seek algorithm:
- The media element looks for media segments containing the new playback position in each SourceBuffer object in activeSourceBuffers.
  - If one or more of the objects in activeSourceBuffers is missing media segments for the new playback position
    - Set the HTMLMediaElement.readyState attribute to HAVE_METADATA.
    - The media element waits for the necessary media segments to be passed to appendArrayBuffer() or appendStream().
      Note: The web application can use buffered to determine what the media element needs to resume playback.
  - Otherwise
    Continue
- The media element resets all decoders and initializes each one with data from the appropriate initialization segment.
- The media element feeds data from the media segments into the decoders until the new playback position is reached.
- Resume the seek algorithm at the "Await a stable state" step.
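The first seeking step, deciding whether every active SourceBuffer has data for the new playback position, can be sketched as follows. The model assumes each buffer is described by a list of [start, end] pairs like those reported by buffered, and it treats ranges as half-open (the end time itself is not covered); both the helper names and that boundary choice are assumptions of this sketch.

```javascript
// ranges: array of [start, end] pairs for one SourceBuffer.
function containsPosition(ranges, position) {
  return ranges.some(([start, end]) => start <= position && position < end);
}

// activeBuffers: one list of ranges per object in activeSourceBuffers.
// Returns true when the seek can continue immediately, false when the
// media element has to wait for more media segments to be appended.
function seekCanProceed(activeBuffers, position) {
  return activeBuffers.every((ranges) => containsPosition(ranges, position));
}

const audio = [[0, 30]];
const video = [[0, 10], [20, 30]];
console.log(seekCanProceed([audio, video], 25)); // true
console.log(seekCanProceed([audio, video], 15)); // false: video has a gap
```

When the function returns false, a web application could inspect buffered on each SourceBuffer to work out which segments it still needs to append.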
3.4.4 SourceBuffer Monitoring
The following steps are periodically run during playback to make sure that all of the SourceBuffer objects in activeSourceBuffers have enough data to ensure uninterrupted playback. Appending new segments and changes to activeSourceBuffers also cause these steps to run because they affect the conditions that trigger state transitions.
The web application can monitor changes in HTMLMediaElement.readyState to drive media segment appending.
Bug 18592 - How much is "enough data to ensure uninterrupted playback"
- If buffered for all objects in activeSourceBuffers do not contain TimeRanges for the current playback position:
  - Set the HTMLMediaElement.readyState attribute to HAVE_METADATA.
  - If this is the first transition to HAVE_METADATA, then queue a task to fire a simple event named loadedmetadata at the media element.
  - Abort these steps.
- If buffered for all objects in activeSourceBuffers contain TimeRanges that include the current playback position and enough data to ensure uninterrupted playback:
  - Set the HTMLMediaElement.readyState attribute to HAVE_ENOUGH_DATA.
  - Queue a task to fire a simple event named canplaythrough at the media element.
  - Playback may resume at this point if it was previously suspended by a transition to HAVE_CURRENT_DATA.
  - Abort these steps.
- If buffered for at least one object in activeSourceBuffers contains a TimeRange that includes the current playback position but not enough data to ensure uninterrupted playback:
  - Set the HTMLMediaElement.readyState attribute to HAVE_FUTURE_DATA.
  - If the previous value of HTMLMediaElement.readyState was less than HAVE_FUTURE_DATA, then queue a task to fire a simple event named canplay at the media element.
  - Playback may resume at this point if it was previously suspended by a transition to HAVE_CURRENT_DATA.
  - Abort these steps.
- If buffered for at least one object in activeSourceBuffers contains a TimeRange that ends at the current playback position and does not have a range covering the time immediately after the current position:
  - Set the HTMLMediaElement.readyState attribute to HAVE_CURRENT_DATA.
  - If this is the first transition to HAVE_CURRENT_DATA, then queue a task to fire a simple event named loadeddata at the media element.
  - Playback is suspended at this point since the media element doesn't have enough data to advance the timeline.
  - Abort these steps.
3.4.5 Changes to selected/enabled track state
During playback, activeSourceBuffers needs to be updated if the selected video track, the enabled audio tracks, or a text track mode changes. When one or more of these changes occur, the following steps need to be followed.
- If the selected video track changes, then run the following steps:
  - If the SourceBuffer associated with the previously selected video track is not associated with any other enabled tracks, run the following steps:
    - Remove the SourceBuffer from activeSourceBuffers.
    - Queue a task to fire a simple event named removesourcebuffer at activeSourceBuffers.
  - If the SourceBuffer associated with the newly selected video track is not already in activeSourceBuffers, run the following steps:
    - Add the SourceBuffer to activeSourceBuffers.
    - Queue a task to fire a simple event named addsourcebuffer at activeSourceBuffers.
- If an audio track becomes disabled and the SourceBuffer associated with this track is not associated with any other enabled or selected track, then run the following steps:
  - Remove the SourceBuffer associated with the audio track from activeSourceBuffers.
  - Queue a task to fire a simple event named removesourcebuffer at activeSourceBuffers.
- If an audio track becomes enabled and the SourceBuffer associated with this track is not already in activeSourceBuffers, then run the following steps:
  - Add the SourceBuffer associated with the audio track to activeSourceBuffers.
  - Queue a task to fire a simple event named addsourcebuffer at activeSourceBuffers.
- If a text track mode becomes "disabled" and the SourceBuffer associated with this track is not associated with any other enabled or selected track, then run the following steps:
  - Remove the SourceBuffer associated with the text track from activeSourceBuffers.
  - Queue a task to fire a simple event named removesourcebuffer at activeSourceBuffers.
- If a text track mode becomes "showing" or "hidden" and the SourceBuffer associated with this track is not already in activeSourceBuffers, then run the following steps:
  - Add the SourceBuffer associated with the text track to activeSourceBuffers.
  - Queue a task to fire a simple event named addsourcebuffer at activeSourceBuffers.
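The track state rules in section 3.4.5 reduce to one membership test: a SourceBuffer belongs in activeSourceBuffers exactly when at least one of its tracks is in use. The following is a minimal sketch of that idea, not the normative algorithm. The names `TrackModel`, `ActiveListModel`, and `sync` are hypothetical, source buffers are represented by string IDs, and queued events are recorded in an array instead of being dispatched.

```typescript
// Hypothetical model: `active` is true for an enabled audio track, a selected
// video track, or a text track whose mode is "showing" or "hidden".
interface TrackModel {
  active: boolean;
}

class ActiveListModel {
  ids: string[] = [];         // stand-in for activeSourceBuffers
  firedEvents: string[] = []; // stand-in for queued simple events

  // Re-evaluates membership of one source buffer after a track state change.
  sync(bufferId: string, tracks: TrackModel[]): void {
    const shouldBeActive = tracks.some((t) => t.active);
    const index = this.ids.indexOf(bufferId);
    if (shouldBeActive && index === -1) {
      this.ids.push(bufferId);
      this.firedEvents.push("addsourcebuffer");
    } else if (!shouldBeActive && index !== -1) {
      this.ids.splice(index, 1);
      this.firedEvents.push("removesourcebuffer");
    }
    // Otherwise membership is unchanged and no event is recorded.
  }
}
```

Calling `sync` after every track change keeps the list consistent without needing a separate code path per track type.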
3.4.6 Duration change
Follow these steps when duration needs to change to a new duration.
- If the current value of duration is equal to new duration, then abort these steps.
- Set old duration to the current value of duration.
- Update duration to new duration.
- If the new duration is less than old duration, then call remove(new duration, old duration) on all objects in sourceBuffers.
- Update the media controller duration to new duration and run the HTMLMediaElement duration change algorithm.
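The duration change steps can be sketched as a small model. This is an illustration under stated assumptions, not an implementation of the real API: `MediaSourceModel`, `RemovableBuffer`, and `onDurationChange` are hypothetical names, and the callback stands in for updating the media controller duration and running the HTMLMediaElement duration change algorithm.

```typescript
// Hypothetical stand-in for a SourceBuffer that only exposes remove().
interface RemovableBuffer {
  remove(start: number, end: number): void;
}

class MediaSourceModel {
  duration: number = NaN;
  sourceBuffers: RemovableBuffer[] = [];
  // Stand-in for the media controller and media element updates.
  onDurationChange: (newDuration: number) => void = () => {};

  changeDuration(newDuration: number): void {
    // Nothing to do if the value is unchanged.
    if (this.duration === newDuration) return;
    const oldDuration = this.duration;
    this.duration = newDuration;
    // Shrinking the duration truncates buffered media past the new end.
    // The comparison is false when oldDuration is NaN, so nothing is removed.
    if (newDuration < oldDuration) {
      for (const buffer of this.sourceBuffers) {
        buffer.remove(newDuration, oldDuration);
      }
    }
    this.onDurationChange(newDuration);
  }
}
```

Note that the truncation only happens when the duration shrinks; growing the duration leaves buffered data untouched.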
4. SourceBuffer Object
Bug 20327 - Continuous splice flag
interface SourceBuffer : EventTarget {
readonly attribute boolean appending;
readonly attribute TimeRanges buffered;
attribute double timestampOffset;
void appendArrayBuffer (ArrayBuffer data);
void appendStream (Stream stream, optional unsigned long long maxSize);
void abort ();
void remove (double start, double end);
};
4.1 Attributes
appending
of type boolean, readonly
Indicates whether an appendArrayBuffer() or appendStream() operation is still being processed.
buffered
of type TimeRanges, readonly
Indicates what TimeRanges are buffered in the SourceBuffer.
When the attribute is read the following steps must occur:
- If this object has been removed from the sourceBuffers attribute of the parent media source then throw an INVALID_STATE_ERR exception and abort these steps.
- Return a new static normalized TimeRanges object for the media segments buffered.
timestampOffset
of type double
Controls the offset applied to timestamps inside subsequent media segments that are appended to this SourceBuffer. The timestampOffset is initially set to 0 which indicates that no offset is being applied.
On getting, return the initial value or the last value that was successfully set.
On setting, run the following steps:
- If this object has been removed from the sourceBuffers attribute of the parent media source, then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is not in the "open" state, then throw an INVALID_STATE_ERR exception and abort these steps.
- If this object is waiting for the end of a media segment to be appended, then throw an INVALID_STATE_ERR exception and abort these steps.
- Update the attribute to the new value.
Issue 5: Bug 19676 - timestampOffset accuracy
4.2 Methods
abort
Aborts the current segment and resets the segment parser.
No parameters.
Return type: void
When this method is invoked, the user agent must run the following steps:
- If this object has been removed from the sourceBuffers attribute of the parent media source then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is not in the "open" state then throw an INVALID_STATE_ERR exception and abort these steps.
- Run the reset parser state algorithm.
- If the appending attribute equals true, then run the following steps:
  - Set the appending attribute to false.
  - Queue a task to fire a simple event named abort at this SourceBuffer object.
  - Queue a task to fire a simple event named appendend at this SourceBuffer object.
appendArrayBuffer
Appends the segment data in an ArrayBuffer to the source buffer.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
data | ArrayBuffer | ✘ | ✘ | |
Return type: void
When this method is invoked, the user agent must run the following steps:
- If data is null then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If this object has been removed from the sourceBuffers attribute of the parent media source then throw an INVALID_STATE_ERR exception and abort these steps.
- If the appending attribute equals true, then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is in the "closed" state then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is in the "ended" state then run the following steps:
  - Set the readyState attribute of the parent media source to "open".
  - Queue a task to fire a simple event named sourceopen at the parent media source.
- If data.byteLength is 0, then abort these steps.
- If the buffer full flag equals true, then throw a QUOTA_EXCEEDED_ERR exception and abort these steps.
  Note: The web application must use remove() to free up space in the SourceBuffer.
- Add data to the end of the input buffer.
- Set the appending attribute to true.
- Queue a task to fire a simple event named appendstart at this SourceBuffer object.
- Asynchronously run the segment parser loop algorithm.
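The order of the checks in appendArrayBuffer() matters, because an "ended" media source is reopened before the empty-data and quota checks run. The sketch below models only that ordering. `AppendContext`, `AppendOutcome`, and `checkAppend` are hypothetical names, exceptions are returned as values instead of thrown, and event dispatch and the segment parser loop are left out.

```typescript
type AppendOutcome =
  | { kind: "throw"; error: "INVALID_ACCESS_ERR" | "INVALID_STATE_ERR" | "QUOTA_EXCEEDED_ERR" }
  | { kind: "noop" }
  | { kind: "append" };

// Hypothetical snapshot of the state the steps consult.
interface AppendContext {
  removedFromParent: boolean;
  appending: boolean;
  parentReadyState: "closed" | "open" | "ended";
  bufferFull: boolean;
}

function checkAppend(ctx: AppendContext, data: Uint8Array | null): AppendOutcome {
  if (data === null) return { kind: "throw", error: "INVALID_ACCESS_ERR" };
  if (ctx.removedFromParent) return { kind: "throw", error: "INVALID_STATE_ERR" };
  if (ctx.appending) return { kind: "throw", error: "INVALID_STATE_ERR" };
  if (ctx.parentReadyState === "closed") return { kind: "throw", error: "INVALID_STATE_ERR" };
  if (ctx.parentReadyState === "ended") {
    // The real algorithm also queues a sourceopen event here.
    ctx.parentReadyState = "open";
  }
  if (data.byteLength === 0) return { kind: "noop" };
  if (ctx.bufferFull) return { kind: "throw", error: "QUOTA_EXCEEDED_ERR" };
  // The data would be added to the input buffer at this point.
  ctx.appending = true;
  return { kind: "append" };
}
```

One consequence visible in the sketch: appending an empty buffer to an "ended" media source still moves it back to "open".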
appendStream
Appends segment data to the source buffer from a Stream.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | Stream | ✘ | ✘ | |
maxSize | unsigned long long | ✘ | ✔ | |
Return type: void
When this method is invoked, the user agent must run the following steps:
- If stream is null then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If this object has been removed from the sourceBuffers attribute of the parent media source then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is in the "closed" state then throw an INVALID_STATE_ERR exception and abort these steps.
- If the appending attribute equals true, then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is in the "ended" state then run the following steps:
  - Set the readyState attribute of the parent media source to "open".
  - Queue a task to fire a simple event named sourceopen at the parent media source.
- If maxSize equals 0, then abort these steps.
- If the buffer full flag equals true, then throw a QUOTA_EXCEEDED_ERR exception and abort these steps.
  Note: The web application must use remove() to free up space in the SourceBuffer.
- Set the appending attribute to true.
- Queue a task to fire a simple event named appendstart at this SourceBuffer object.
- Asynchronously run the stream append loop algorithm with stream and maxSize.
remove
Removes media for a specific time range.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
start | double | ✘ | ✘ | |
end | double | ✘ | ✘ | |
Return type: void
When this method is invoked, the user agent must run the following steps:
- If start is negative or greater than duration, then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If end is less than or equal to start, then throw an INVALID_ACCESS_ERR exception and abort these steps.
- If this object has been removed from the sourceBuffers attribute of the parent media source then throw an INVALID_STATE_ERR exception and abort these steps.
- If the readyState attribute of the parent media source is not in the "open" state then throw an INVALID_STATE_ERR exception and abort these steps.
- For each track buffer in this source buffer, run the following steps:
  - Let remove end timestamp be the current value of duration.
  - If this track buffer has a random access point timestamp that is greater than or equal to end, then update remove end timestamp to that timestamp.
    Note: Random access point timestamps can be different across tracks because the dependencies between coded frames within a track are usually different than the dependencies in another track.
  - Remove all media data, from this track buffer, that contain starting timestamps greater than or equal to start and less than the remove end timestamp.
  - If this object is in activeSourceBuffers, the current playback position is greater than or equal to start and less than the remove end timestamp, and HTMLMediaElement.readyState is greater than HAVE_METADATA, then set the HTMLMediaElement.readyState attribute to HAVE_METADATA and stall playback.
    Note: This transition occurs because media data for the current position has been removed. Playback cannot progress until media for the current playback position is appended or the selected/enabled tracks change.
- If buffer full flag equals true and this object is ready to accept more bytes, then set the buffer full flag to false.
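The per-track part of remove() extends the removal range to the next random access point so that no undecodable frames are left behind. The sketch below illustrates that for a single track buffer. `CodedFrameModel` and `removeFromTrackBuffer` are hypothetical names, and it assumes that when several random access points lie at or after end, the earliest one is used, which the steps above do not state explicitly.

```typescript
interface CodedFrameModel {
  timestamp: number;          // starting timestamp in seconds
  randomAccessPoint: boolean; // true if decoding can start at this frame
}

// Returns the frames that remain in one track buffer after remove(start, end).
function removeFromTrackBuffer(
  trackFrames: CodedFrameModel[],
  start: number,
  end: number,
  duration: number
): CodedFrameModel[] {
  // Default: remove everything from start to the end of the presentation.
  let removeEnd = duration;
  // Assumption: pick the earliest random access point at or after `end`.
  const candidates = trackFrames
    .filter((f) => f.randomAccessPoint && f.timestamp >= end)
    .map((f) => f.timestamp);
  if (candidates.length > 0) removeEnd = Math.min(...candidates);
  // Keep frames that start before `start` or at/after the remove end timestamp.
  return trackFrames.filter((f) => f.timestamp < start || f.timestamp >= removeEnd);
}
```

For example, with random access points at 0 and 4 seconds, removing the range from 1 to 3 also drops the frame at 3 seconds, because it depends on frames that were removed.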
4.3 Track Buffers
A track buffer stores the track descriptions and coded frames for an individual track. The track buffer is updated as initialization segments and media segments are appended to the SourceBuffer.
Each track buffer has a last decode timestamp variable that stores the decode timestamp of the last coded frame appended in the current append sequence. The variable is initially unset to indicate that no coded frames have been appended yet.
4.4 Event Summary
Event name | Interface | Dispatched when... |
---|---|---|
appendstart | Event | appending transitions from false to true. |
appendend | Event | appending transitions from true to false. |
error | Event | An error occurred during the append. |
abort | Event | The append was aborted by an abort() call. |
4.5 Algorithms
4.5.1 Segment Parser Loop
All SourceBuffer objects have an internal append state variable that keeps track of the high-level segment parsing state. It is initially set to WAITING_FOR_SEGMENT and can transition to the following states as data is appended.
Append state name | Description |
---|---|
WAITING_FOR_SEGMENT | Waiting for the start of an initialization segment or media segment to be appended. |
PARSING_INIT_SEGMENT | Currently parsing an initialization segment. |
PARSING_MEDIA_SEGMENT | Currently parsing a media segment. |
The input buffer is a byte buffer that is used to hold unparsed bytes across appendArrayBuffer() and appendStream() calls. The buffer is empty when the SourceBuffer object is created.
The buffer full flag keeps track of whether appendArrayBuffer() or appendStream() is allowed to accept more bytes. It is set to false when the SourceBuffer object is created and gets updated as data is appended and removed.
When this algorithm is invoked, run the following steps:
- Loop Top: If the input buffer is empty, then jump to the need more data step below.
- If the input buffer starts with bytes that violate the byte stream format specifications, then run the append error algorithm and abort this algorithm.
- Remove any bytes that the byte stream format specifications say should be ignored from the start of the input buffer.
- If the append state equals WAITING_FOR_SEGMENT, then run the following steps:
  - If the beginning of the input buffer indicates the start of an initialization segment, set the append state to PARSING_INIT_SEGMENT.
  - If the beginning of the input buffer indicates the start of a media segment, set the append state to PARSING_MEDIA_SEGMENT.
  - Jump to the loop top step above.
- If the append state equals PARSING_INIT_SEGMENT, then run the following steps:
  - If the input buffer does not contain a complete initialization segment yet, then jump to the need more data step below.
  - Run the initialization segment received algorithm.
  - Remove the initialization segment bytes from the beginning of the input buffer.
  - Set append state to WAITING_FOR_SEGMENT.
  - Jump to the loop top step above.
- If the append state equals PARSING_MEDIA_SEGMENT, then run the following steps:
  - If the first initialization segment flag is false, then run the append error algorithm and abort this algorithm.
  - If the input buffer does not contain a complete media segment header yet, then jump to the need more data step below.
    Note: Implementations may choose to implement this state as an incremental parser so that it is not necessary to have the entire media segment before running the coded frame processing algorithm.
  - Run the coded frame processing algorithm.
  - Remove the media segment bytes from the beginning of the input buffer.
  - If this SourceBuffer is full and cannot accept more media data, then set the buffer full flag to true.
  - Set append state to WAITING_FOR_SEGMENT.
    Note: Incremental parsers should only do this transition after the entire media segment has been received.
  - Jump to the loop top step above.
- Need more data: If the stream append loop algorithm is running and still has data to read, then abort these steps.
- Set the appending attribute to false.
- Queue a task to fire a simple event named appendend at this SourceBuffer object.
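The segment parser loop is a three-state machine driven by the contents of the input buffer. The sketch below shows the state transitions over a toy byte format invented for this example: each segment is `[tag, payloadLength, ...payload]`, with tag 0x49 for an initialization segment and 0x4d for a media segment. The class name `SegmentParserModel` and its members are hypothetical, it is non-incremental (it waits for a whole segment), and it omits the buffer full flag and event dispatch.

```typescript
type AppendState =
  | "WAITING_FOR_SEGMENT"
  | "PARSING_INIT_SEGMENT"
  | "PARSING_MEDIA_SEGMENT";

class SegmentParserModel {
  appendState: AppendState = "WAITING_FOR_SEGMENT";
  inputBuffer: number[] = [];
  firstInitSegmentFlag = false;
  parsed: string[] = []; // records which segment types were processed

  // Returns true on "need more data", false if the append error path ran.
  run(): boolean {
    for (;;) {
      if (this.inputBuffer.length === 0) return true;
      if (this.appendState === "WAITING_FOR_SEGMENT") {
        const tag = this.inputBuffer[0];
        if (tag === 0x49) this.appendState = "PARSING_INIT_SEGMENT";
        else if (tag === 0x4d) this.appendState = "PARSING_MEDIA_SEGMENT";
        else return this.appendError(); // bytes violate the toy format
        continue;
      }
      const isMedia = this.appendState === "PARSING_MEDIA_SEGMENT";
      // A media segment is only valid after an initialization segment.
      if (isMedia && !this.firstInitSegmentFlag) return this.appendError();
      const payloadLength = this.inputBuffer[1];
      if (payloadLength === undefined || this.inputBuffer.length < 2 + payloadLength) {
        return true; // segment incomplete: need more data
      }
      this.inputBuffer.splice(0, 2 + payloadLength);
      if (isMedia) {
        this.parsed.push("media");
      } else {
        this.firstInitSegmentFlag = true;
        this.parsed.push("init");
      }
      this.appendState = "WAITING_FOR_SEGMENT";
    }
  }

  // Simplified append error: reset the parser state and report failure.
  appendError(): boolean {
    this.inputBuffer = [];
    this.appendState = "WAITING_FOR_SEGMENT";
    this.parsed.push("error");
    return false;
  }
}
```

Because unparsed bytes stay in `inputBuffer` between calls, a segment split across two appends is parsed once the second append arrives.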
4.5.2 Reset Parser State
When the parser state needs to be reset, run the following steps:
- If the append state equals PARSING_MEDIA_SEGMENT and the input buffer contains some complete coded frames, then run the coded frame processing algorithm as if the media segment only contained these frames.
- Unset the last decode timestamp on all track buffers.
- Remove all bytes from the input buffer.
- Set append state to WAITING_FOR_SEGMENT.
4.5.3 Append Error
When an error occurs during an append, run the following steps:
- Run the reset parser state algorithm.
- Abort the stream append loop algorithm if it is running.
- Set the appending attribute to false.
- Queue a task to fire a simple event named error at this SourceBuffer object.
  Issue 6: Need a way to convey error information.
- Queue a task to fire a simple event named appendend at this SourceBuffer object.
4.5.4 Stream Append Loop
When a Stream is passed to appendStream(), the following steps are run to transfer data from the Stream to the SourceBuffer. This algorithm is initialized with the stream and maxSize parameters from the appendStream() call.
- If maxSize is set, then let bytesLeft equal maxSize.
- Loop Top: If maxSize is set and bytesLeft equals 0, then jump to the loop done step below.
- If stream has been aborted, then run the append error algorithm and abort this algorithm.
- If stream has been closed, then jump to the loop done step below.
- If the buffer full flag equals true, then run the append error algorithm and abort this algorithm.
  Note: The web application must use remove() to free up space in the SourceBuffer.
- Read data from stream into data:
  - If maxSize is set:
    - Read up to bytesLeft bytes from stream into data.
    - Subtract the number of bytes in data from bytesLeft.
  - Otherwise:
    - Read all available bytes in stream into data.
- If an error occurred while reading from stream, then run the append error algorithm and abort this algorithm.
- Add data to the end of the input buffer.
- Run the segment parser loop algorithm.
- Jump to the loop top step above.
- Loop Done: Set the appending attribute to false.
- Queue a task to fire a simple event named appendend at this SourceBuffer object.
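The bytesLeft bookkeeping in the stream append loop can be sketched with a synchronous stand-in for the Stream object. Everything here is an assumption made for illustration: `StreamModel`, `makeStream`, and `streamAppend` are hypothetical, the real Stream is asynchronous, and the buffer full check, read errors, and the call into the segment parser loop are omitted. The `sink` array plays the role of the input buffer.

```typescript
// Hypothetical synchronous stand-in for the Stream object.
interface StreamModel {
  aborted: boolean;
  closed: boolean; // true once all data has been read
  read(maxBytes?: number): number[];
}

// Builds a stream that hands out `bytes` in chunks of at most `chunkSize`.
function makeStream(bytes: number[], chunkSize: number): StreamModel {
  let position = 0;
  const model: StreamModel = {
    aborted: false,
    closed: bytes.length === 0,
    read(maxBytes?: number): number[] {
      const count = Math.min(chunkSize, maxBytes ?? chunkSize);
      const chunk = bytes.slice(position, position + count);
      position += chunk.length;
      if (position >= bytes.length) model.closed = true;
      return chunk;
    },
  };
  return model;
}

function streamAppend(stream: StreamModel, sink: number[], maxSize?: number): "done" | "error" {
  let bytesLeft = maxSize; // undefined means maxSize was not set
  for (;;) {
    if (bytesLeft === 0) return "done";
    if (stream.aborted) return "error";
    if (stream.closed) return "done";
    const data = stream.read(bytesLeft);
    if (bytesLeft !== undefined) bytesLeft -= data.length;
    sink.push(...data); // add data to the end of the input buffer
  }
}
```

With a 10-byte stream and maxSize set to 6, the loop stops after exactly 6 bytes even though the stream is still open.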
4.5.5 Initialization Segment Received
Each SourceBuffer object has an internal first initialization segment flag that tracks whether the first initialization segment has been appended. This flag is set to false when the SourceBuffer is created and updated by the algorithm below.
The following steps are run when the segment parser loop successfully parses a complete initialization segment:
- Update the duration attribute if it currently equals NaN:
  - If the initialization segment contains a duration:
    - Run the duration change algorithm with new duration set to the duration in the initialization segment.
  - Otherwise:
    - Run the duration change algorithm with new duration set to positive Infinity.
- If the initialization segment has no audio, video, or text tracks, then call endOfStream("decode") and abort these steps.
- If the first initialization segment flag is true, then run the following steps:
  - Verify the following properties. If any of the checks fail then call endOfStream("decode") and abort these steps.
    - The number of audio, video, and text tracks match what was in the first initialization segment.
    - The codecs for each track match what was specified in the first initialization segment.
    - If more than one track for a single type is present (e.g. 2 audio tracks), then the Track IDs match the ones in the first initialization segment.
  - Add the appropriate track descriptions from this initialization segment to each of the track buffers.
- Let active track flag equal false.
- If the first initialization segment flag is false, then run the following steps:
  - For each audio track in the initialization segment, run the following steps:
    - Let new audio track be a new AudioTrack object.
    - Generate a unique ID and assign it to the id property on new audio track.
    - If audioTracks.length equals 0, then run the following steps:
      - Set the enabled property on new audio track to true.
      - Set active track flag to true.
    - Add new audio track to audioTracks.
    - Create a new track buffer to store coded frames for this track.
    - Add the track description for this track to the track buffer.
  - For each video track in the initialization segment, run the following steps:
    - Let new video track be a new VideoTrack object.
    - Generate a unique ID and assign it to the id property on new video track.
    - If videoTracks.length equals 0, then run the following steps:
      - Set the selected property on new video track to true.
      - Set active track flag to true.
    - Add new video track to videoTracks.
    - Create a new track buffer to store coded frames for this track.
    - Add the track description for this track to the track buffer.
  - For each text track in the initialization segment, run the following steps:
    - Let new text track be a new TextTrack object with its properties populated with the appropriate information from the initialization segment.
    - If the mode property on new text track equals "showing" or "hidden", then set active track flag to true.
    - Add new text track to textTracks.
    - Create a new track buffer to store coded frames for this track.
    - Add the track description for this track to the track buffer.
  - If active track flag equals true, then run the following steps:
    - Add this SourceBuffer to activeSourceBuffers.
    - Queue a task to fire a simple event named addsourcebuffer at activeSourceBuffers.
  - Set first initialization segment flag to true.
- If the HTMLMediaElement.readyState attribute is HAVE_NOTHING, then run the following steps:
  - If one or more objects in sourceBuffers have first initialization segment flag set to false, then abort these steps.
  - Set the HTMLMediaElement.readyState attribute to HAVE_METADATA.
  - Queue a task to fire a simple event named loadedmetadata at the media element.
- If the active track flag equals true and the HTMLMediaElement.readyState attribute is greater than HAVE_CURRENT_DATA, then set the HTMLMediaElement.readyState attribute to HAVE_METADATA.
4.5.6 Coded Frame Processing
When complete coded frames have been parsed by the segment parser loop then the following steps are run:
- For each coded frame in the media segment run the following steps:
  - Let presentation timestamp be a double precision floating point representation of the coded frame's presentation timestamp.
  - Let decode timestamp be a double precision floating point representation of the coded frame's decode timestamp.
  - If timestampOffset is not 0, then run the following steps:
    - Add timestampOffset to the presentation timestamp.
    - Add timestampOffset to the decode timestamp.
    - If the presentation timestamp or decode timestamp is less than the presentation start time, then call endOfStream("decode"), and abort these steps.
  - Let track buffer equal the track buffer that the coded frame should be added to.
  - If last decode timestamp for track buffer is set and decode timestamp is less than last decode timestamp, then call endOfStream("decode") and abort these steps.
  - Add the coded frame with the presentation timestamp and decode timestamp, to the track buffer.
  - Set last decode timestamp for track buffer to decode timestamp.
- If the HTMLMediaElement.readyState attribute is HAVE_METADATA and the new coded frames cause all objects in activeSourceBuffers to have media data for the current playback position, then run the following steps:
  - Set the HTMLMediaElement.readyState attribute to HAVE_CURRENT_DATA.
  - If this is the first transition to HAVE_CURRENT_DATA, then queue a task to fire a simple event named loadeddata at the media element.
- If the HTMLMediaElement.readyState attribute is HAVE_CURRENT_DATA and the new coded frames cause all objects in activeSourceBuffers to have media data beyond the current playback position, then run the following steps:
  - Set the HTMLMediaElement.readyState attribute to HAVE_FUTURE_DATA.
  - Queue a task to fire a simple event named canplay at the media element.
- If the HTMLMediaElement.readyState attribute is HAVE_FUTURE_DATA and the new coded frames cause all objects in activeSourceBuffers to have enough data to start playback, then run the following steps:
  - Set the HTMLMediaElement.readyState attribute to HAVE_ENOUGH_DATA.
  - Queue a task to fire a simple event named canplaythrough at the media element.
- If the media segment contains data beyond the current duration, then run the duration change algorithm with new duration set to the maximum of the current duration and the highest end timestamp reported by HTMLMediaElement.buffered.
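The per-frame timestamp handling in coded frame processing can be sketched as follows. This is a simplified model, not the normative algorithm: `FrameTimes`, `TrackBufferModel`, and `processFrame` are hypothetical names, the presentation start time is assumed to default to 0, and a `false` return value stands in for calling endOfStream("decode"). The readyState transitions and the duration update are not modeled.

```typescript
interface FrameTimes {
  pts: number; // presentation timestamp in seconds
  dts: number; // decode timestamp in seconds
}

class TrackBufferModel {
  // Unset until the first coded frame of the append sequence is added.
  lastDecodeTimestamp: number | undefined = undefined;
  stored: FrameTimes[] = [];
}

// Returns false where the algorithm would call endOfStream("decode").
function processFrame(
  track: TrackBufferModel,
  frame: FrameTimes,
  timestampOffset: number,
  presentationStart: number = 0
): boolean {
  let pts = frame.pts;
  let dts = frame.dts;
  if (timestampOffset !== 0) {
    pts += timestampOffset;
    dts += timestampOffset;
    // An offset must not push a frame before the start of the presentation.
    if (pts < presentationStart || dts < presentationStart) return false;
  }
  // Decode timestamps must not go backwards within an append sequence.
  if (track.lastDecodeTimestamp !== undefined && dts < track.lastDecodeTimestamp) {
    return false;
  }
  track.stored.push({ pts, dts });
  track.lastDecodeTimestamp = dts;
  return true;
}
```

Unsetting `lastDecodeTimestamp`, as the reset parser state algorithm does, is what allows a later append to start at an earlier time.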
5. SourceBufferList Object
SourceBufferList is a simple container object for SourceBuffer
objects. It provides read-only array access and fires events when the list is modified.
interface SourceBufferList : EventTarget {
readonly attribute unsigned long length;
getter SourceBuffer (unsigned long index);
};
5.1 Attributes
length
of type unsigned long, readonly
Indicates the number of SourceBuffer objects in the list.
5.2 Methods
SourceBuffer
Allows the SourceBuffer objects in the list to be accessed with an array operator (i.e. []).
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | unsigned long | ✘ | ✘ | |
Return type: getter
When this method is invoked, the user agent must run the following steps:
- If index is greater than or equal to the length attribute then return undefined and abort these steps.
- Return the index'th SourceBuffer object in the list.
5.3 Event Summary
Event name | Interface | Dispatched when... |
---|---|---|
addsourcebuffer | Event | When a SourceBuffer is added to the list. |
removesourcebuffer | Event | When a SourceBuffer is removed from the list. |
6. URL Object
partial interface URL {
static DOMString createObjectURL (MediaSource mediaSource);
};
6.1 Methods
createObjectURL, static
Creates URLs for MediaSource objects.
Note: This algorithm is intended to mirror the behavior of the File API createObjectURL() method with autoRevoke set to true.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
mediaSource | MediaSource | ✘ | ✘ | |
Return type: DOMString
When this method is invoked, the user agent must run the following steps:
- If mediaSource is null, then return null.
- Return a unique MediaSource object URL that can be used to dereference the mediaSource argument, and run the rest of the algorithm asynchronously.
- Provide a stable state.
- Revoke the MediaSource object URL by calling revokeObjectURL() on it.
7. HTMLMediaElement attributes
This section specifies what existing attributes on the HTMLMediaElement should return when a MediaSource is attached to the element.
The HTMLMediaElement.seekable attribute returns a new static normalized TimeRanges object created based on the following steps:
- If duration equals NaN:
  - Return an empty TimeRanges object.
- If duration equals positive Infinity:
  - Return a single range with a start time of 0 and an end time equal to the highest end time reported by the HTMLMediaElement.buffered attribute.
- Otherwise:
  - Return a single range with a start time of 0 and an end time equal to duration.
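The three seekable cases can be written as one small function. This is a sketch with hypothetical names (`TimeSpan`, `seekableRanges`), using `[start, end]` pairs in place of a TimeRanges object. The steps above do not say what to return when duration is positive Infinity and nothing is buffered, so returning an empty list in that case is an assumption.

```typescript
type TimeSpan = [number, number]; // [start, end] in seconds

function seekableRanges(duration: number, buffered: TimeSpan[]): TimeSpan[] {
  // Duration unknown: nothing is seekable.
  if (Number.isNaN(duration)) return [];
  if (duration === Infinity) {
    // Assumption: with no buffered data there is no highest end time to use.
    if (buffered.length === 0) return [];
    let highestEnd = 0;
    for (const [, end] of buffered) highestEnd = Math.max(highestEnd, end);
    return [[0, highestEnd]];
  }
  return [[0, duration]];
}
```

For a live stream with unbounded duration, the seekable range therefore grows as more media is buffered.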
The HTMLMediaElement.buffered attribute returns a new static normalized TimeRanges object created based on the following steps:
- Let active ranges be the ranges returned by buffered for each SourceBuffer object in activeSourceBuffers.
- Let intersection range be the intersection of the active ranges.
- If readyState is "ended", then run the following steps:
  - Let highest end time be the largest end time in the active ranges.
  - Let highest intersection end time be the highest end time in the intersection range.
  - If the highest intersection end time is less than the highest end time, then update the intersection range so that the highest intersection end time equals the highest end time.
  Issue 7: Bug 18615 - Define how SourceBuffer.buffered maps to HTMLMediaElement.buffered
  Issue 8: Bug 18400 - Define and document timestamp heuristics
- Return the intersection range.
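The intersection steps for HTMLMediaElement.buffered can be sketched with plain arrays. `TimeSpan`, `intersect`, and `mediaElementBuffered` are hypothetical names, each input list is assumed to be sorted and non-overlapping (as a normalized TimeRanges object would be), and the case where the intersection is empty in the "ended" state is left untouched because the steps do not define a highest intersection end time for it.

```typescript
type TimeSpan = [number, number]; // [start, end] in seconds

// Intersection of two sorted, non-overlapping range lists.
function intersect(a: TimeSpan[], b: TimeSpan[]): TimeSpan[] {
  const out: TimeSpan[] = [];
  for (const [aStart, aEnd] of a) {
    for (const [bStart, bEnd] of b) {
      const start = Math.max(aStart, bStart);
      const end = Math.min(aEnd, bEnd);
      if (start < end) out.push([start, end]);
    }
  }
  return out;
}

// `active` holds the buffered ranges of each active source buffer.
function mediaElementBuffered(active: TimeSpan[][], ended: boolean): TimeSpan[] {
  const [first, ...rest] = active;
  if (first === undefined) return [];
  let result: TimeSpan[] = first;
  for (const ranges of rest) result = intersect(result, ranges);
  if (ended) {
    let highestEnd = -Infinity;
    for (const ranges of active) {
      for (const [, end] of ranges) highestEnd = Math.max(highestEnd, end);
    }
    const last = result[result.length - 1];
    // Once the stream has ended, stretch the final range to the longest track.
    if (last !== undefined && last[1] < highestEnd) {
      const extended: TimeSpan = [last[0], highestEnd];
      result = [...result.slice(0, -1), extended];
    }
  }
  return result;
}
```

The extension in the "ended" case lets playback reach the end of the longest track when, for example, audio runs slightly longer than video.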
8. Byte Stream Formats
The bytes provided through appendArrayBuffer() and appendStream() for a SourceBuffer form a logical byte stream. The format of this byte stream depends on the media container format in use and is defined in a byte stream format specification. Byte stream format specifications based on WebM, the ISO Base Media File Format, and MPEG-2 Transport Streams are provided below. These format specifications are intended to be the authoritative source for how data from these containers is formatted and passed to a SourceBuffer. If a MediaSource implementation claims to support any of these container formats, then it must implement the corresponding byte stream format specification described below.
This section provides general requirements for all byte stream formats:
- A byte stream format specification must define initialization segments and media segments.
- It must be possible to identify segment boundaries and segment type (initialization or media) by examining the byte stream alone.
- The following rules apply to all initialization segments within a byte stream:
  - The number and type of tracks must be consistent.
    For example, if the first initialization segment has 2 audio tracks and 1 video track, then all initialization segments that follow it in the byte stream must describe 2 audio tracks and 1 video track.
  - Track IDs do not need to be the same across initialization segments if the segment describes only one track of each type.
    For example, if an initialization segment describes a single audio track and a single video track, the internal Track IDs do not need to be the same.
  - Track IDs must be the same across initialization segments if the segment describes multiple tracks of a single type (e.g. 2 audio tracks).
  - Codec changes are not allowed.
    For example, a byte stream that starts with an initialization segment that specifies a single AAC track and later contains an initialization segment that specifies a single AMR-WB track is not allowed. Support for multiple codecs is handled with multiple SourceBuffer objects.
  - Video frame size changes are allowed and must be supported seamlessly.
    Note: This will cause the <video> display region to change size if the web application does not use CSS or HTML attributes (width/height) to constrain the element size.
  - Audio channel count changes are allowed, but they may not be seamless and could trigger downmixing.
    Note: This is a quality of implementation issue because changing the channel count may require reinitializing the audio device, resamplers, and channel mixers which tends to be audible.
- The following rules apply to all media segments within a byte stream:
  - All timestamps must be mapped to the same presentation timeline.
  - Segments must start with a random access point to facilitate seamless splicing at the segment boundary.
  - Gaps between media segments that are smaller than the audio frame size are allowed and must not cause playback to stall. Such gaps must not be reflected by buffered.
    Note: This is intended to simplify switching between audio streams where the frame boundaries don't always line up across encodings (e.g. Vorbis).
- The combination of an initialization segment and any contiguous sequence of media segments associated with it must:
  - Identify the number and type (audio, video, text, etc.) of tracks in the Segments.
  - Identify the decoding capabilities needed to decode each track (i.e. codec and codec parameters).
  - If a track is encrypted, provide any encryption parameters necessary to decrypt the content (except the encryption key itself).
  - For each track, provide all information necessary to decode and render the earliest random access point in the sequence of Media Segments and all subsequent samples in the sequence (in presentation time). This includes, in particular:
    - Information that determines the intrinsic width and height of the video (specifically, this requires either the picture or pixel aspect ratio, together with the encoded resolution).
    - Information necessary to convert the video decoder output to a format suitable for display.
  - Identify the global presentation timestamp of every sample in the sequence of Media Segments.
  For example, if I1 is associated with M1, M2, M3 then the above must hold for all the combinations I1+M1, I1+M2, I1+M1+M2, I1+M2+M3, etc.
Byte stream specifications must at a minimum define constraints which ensure that the above requirements hold. Additional constraints may be defined, for example to simplify implementation.
8.1 WebM Byte Streams
This section defines segment formats for implementations that choose to support WebM.
8.1.1 Initialization Segments
A WebM initialization segment must contain a subset of the elements at the start of a typical WebM file.
The following rules apply to WebM initialization segments:
- The initialization segment must start with an EBML Header element, followed by a Segment header.
- The size value in the Segment header must signal an "unknown size" or contain a value large enough to include the Segment Information and Tracks elements that follow.
- A Segment Information element and a Tracks element must appear, in that order, after the Segment header and before any further EBML Header or Cluster elements.
- Any elements other than an EBML Header or a Cluster that occur before, in between, or after the Segment Information and Tracks elements are ignored.
8.1.2 Media Segments
A WebM media segment is a single Cluster element.
The following rules apply to WebM media segments:
- The Timecode element in the Cluster contains a presentation timestamp in TimecodeScale units.
- The TimecodeScale in the WebM initialization segment most recently appended applies to all timestamps in the Cluster.
- The Cluster header may contain an "unknown" size value. If it does then the end of the cluster is reached when another Cluster header or an element header that indicates the start of a WebM initialization segment is encountered.
- Block & SimpleBlock elements must be in time increasing order consistent with the WebM spec.
- If the most recent WebM initialization segment describes multiple tracks, then blocks from all the tracks must be interleaved in time increasing order. At least one block from each audio and video track must be present.
- Cues or Chapters elements may follow a Cluster element. These elements must be accepted and ignored by the user agent.
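The timestamp rules above can be sketched as follows. This assumes, per the WebM/Matroska container format rather than this specification, that TimecodeScale is expressed in nanoseconds per timecode unit and that each block carries a timecode relative to its Cluster's Timecode. The function names are hypothetical, and treating "time increasing order" as non-decreasing is an assumption made for illustration.

```javascript
// Converts a block's timestamp to seconds.
// clusterTimecode: value of the Cluster's Timecode element (TimecodeScale units).
// blockRelativeTimecode: the block's timecode relative to the Cluster (assumed).
// timecodeScale: nanoseconds per timecode unit (assumed), taken from the most
//   recently appended initialization segment.
function blockPresentationTimeSeconds(clusterTimecode, blockRelativeTimecode, timecodeScale) {
  return (clusterTimecode + blockRelativeTimecode) * timecodeScale / 1e9;
}

// Checks that a list of block timecodes, in the order the blocks appear in the
// Cluster, never goes backwards (assumed reading of "time increasing order").
function timecodesAreNonDecreasing(timecodes) {
  for (var i = 1; i < timecodes.length; i++) {
    if (timecodes[i] < timecodes[i - 1]) return false;
  }
  return true;
}
```

For instance, with a TimecodeScale of 1000000 each timecode unit is one millisecond, so a Cluster Timecode of 5000 and a relative block timecode of 40 map to 5.04 seconds.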
8.1.3 Random Access Points
A SimpleBlock element with its Keyframe flag set signals the location of a random access point for that track. Media segments containing multiple tracks are only considered a random access point if the first SimpleBlock for each track has its Keyframe flag set. The order of the multiplexed blocks must conform to the WebM Muxer Guidelines.
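The multi-track condition above can be sketched as follows. The sketch assumes the SimpleBlock payload layout from the WebM/Matroska container format (a variable-length track number, a 16-bit signed relative timecode, then a flags byte whose most significant bit is the Keyframe flag); that layout is not defined by this specification. The function names are hypothetical, and the input is assumed to be the SimpleBlock payloads of one media segment in the order they appear.

```javascript
// Parses the fixed fields at the start of a SimpleBlock payload (Uint8Array).
// Returns null if the payload is too short or the track number is malformed.
function parseSimpleBlockHeader(payload) {
  var first = payload[0];
  var len = 1, mask = 0x80;
  while (len <= 8 && !(first & mask)) { len++; mask >>= 1; }
  if (len > 8 || payload.length < len + 3) return null;

  // Track number: variable-length integer with the marker bit stripped.
  var track = first & (mask - 1);
  for (var i = 1; i < len; i++) track = track * 256 + payload[i];

  var view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  var relativeTimecode = view.getInt16(len, false); // big-endian, signed
  var flags = payload[len + 2];
  return {
    trackNumber: track,
    relativeTimecode: relativeTimecode,
    keyframe: (flags & 0x80) !== 0
  };
}

// True if the first SimpleBlock seen for every track has its Keyframe flag set.
function segmentIsRandomAccessPoint(simpleBlockPayloads) {
  var seenTracks = {};
  var trackCount = 0;
  for (var i = 0; i < simpleBlockPayloads.length; i++) {
    var header = parseSimpleBlockHeader(simpleBlockPayloads[i]);
    if (!header) return false;
    if (seenTracks[header.trackNumber]) continue;
    seenTracks[header.trackNumber] = true;
    trackCount++;
    if (!header.keyframe) return false;
  }
  return trackCount > 0;
}
```

Only the first block of each track is inspected, so later non-keyframe blocks do not affect the result.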
8.2 ISO Base Media File Format Byte Streams
This section defines segment formats for implementations that choose to support the ISO Base Media File Format ISO/IEC 14496-12 (ISO BMFF).
Bug 18933 - Segment byte boundaries are not defined
8.2.1 Initialization Segments
An ISO BMFF initialization segment must contain a single Movie Box (moov). The tracks in the Movie Box must not contain any samples (i.e. the entry_count in the stts, stsc and stco boxes must be set to zero). A Movie Extends (mvex) box must be contained in the Movie Box to indicate that Movie Fragments are to be expected.
The initialization segment may contain Edit Boxes (edts) which provide a mapping of composition times for each track to the global presentation time.
8.2.2 Media Segments
An ISO BMFF media segment must contain a single Movie Fragment Box (moof) followed by one or more Media Data Boxes (mdat).
The following rules apply to ISO BMFF media segments:
- The Movie Fragment Box must contain at least one Track Fragment Box (traf).
- The Movie Fragment Box must use movie-fragment relative addressing and the flag default-base-is-moof must be set; absolute byte-offsets must not be used.
- External data references must not be used.
- If the Movie Fragment contains multiple tracks, the duration by which each track extends should be as close to equal as practical.
- Each Track Fragment Box must contain a Track Fragment Decode Time Box (tfdt).
- The first sample in each Track Fragment Run Box (trun) must indicate that the sample is a random access point.
- The Media Data Boxes must contain all the samples referenced by the Track Fragment Run Boxes (trun) of the Movie Fragment Box.
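The top-level layout required above (one moof followed by one or more mdat boxes) can be checked with a minimal sketch. It assumes the box header layout from ISO/IEC 14496-12 (a 32-bit big-endian size followed by a four-character type, with size 1 meaning a 64-bit size follows and size 0 meaning the box runs to the end of the data). It also assumes the ArrayBuffer holds exactly one media segment, which this draft does not yet guarantee. The function names are hypothetical, and the contents of the boxes (traf, tfdt, trun) are not inspected.

```javascript
// Lists the top-level boxes in an ArrayBuffer.
// Returns null if a box header is truncated or a box size is inconsistent.
function listTopLevelBoxes(buffer) {
  var view = new DataView(buffer);
  var boxes = [];
  var offset = 0;
  while (offset < view.byteLength) {
    if (offset + 8 > view.byteLength) return null; // truncated header
    var size = view.getUint32(offset, false);
    var type = String.fromCharCode(
        view.getUint8(offset + 4), view.getUint8(offset + 5),
        view.getUint8(offset + 6), view.getUint8(offset + 7));
    var headerSize = 8;
    if (size === 1) {
      // 64-bit "largesize" follows the type field.
      if (offset + 16 > view.byteLength) return null;
      size = view.getUint32(offset + 8, false) * 4294967296 +
             view.getUint32(offset + 12, false);
      headerSize = 16;
    } else if (size === 0) {
      // Box extends to the end of the data.
      size = view.byteLength - offset;
    }
    if (size < headerSize || offset + size > view.byteLength) return null;
    boxes.push({ type: type, offset: offset, size: size });
    offset += size;
  }
  return boxes;
}

// True if the buffer is a single moof followed by one or more mdat boxes.
function looksLikeIsoBmffMediaSegment(buffer) {
  var boxes = listTopLevelBoxes(buffer);
  if (!boxes || boxes.length < 2 || boxes[0].type !== 'moof') return false;
  for (var i = 1; i < boxes.length; i++) {
    if (boxes[i].type !== 'mdat') return false;
  }
  return true;
}
```

A stricter checker would also descend into the moof to confirm that each traf contains a tfdt and that default-base-is-moof is set; that is left out here to keep the sketch short.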
8.2.3 Random Access Points
A random access point as defined in this specification corresponds to a Stream Access Point of type 1 or 2 as defined in Annex I of ISO/IEC 14496-12.
8.3 MPEG-2 Transport Stream Byte Streams
This section defines segment formats for implementations that choose to support MPEG-2 Transport Streams (MPEG-2 TS) specified in ISO/IEC 13818-1.
8.3.1 General
MPEG-2 TS media and initialization segments must conform to the MPEG-2 TS Adaptive Profile (ISO/IEC 13818-1:2012 Amd. 2).
The following rules must apply to all MPEG-2 TS segments:
- Segments must contain complete MPEG-2 TS packets.
- Segments must contain only complete PES packets and sections.
- Segments must contain exactly one program.
- All MPEG-2 TS packets must have the transport_error_indicator set to 0.
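Two of the rules above (complete TS packets, and transport_error_indicator set to 0) lend themselves to a simple byte-level check. The following is a minimal sketch that assumes the packet layout from ISO/IEC 13818-1: 188-byte packets, a sync byte of 0x47, and transport_error_indicator as the most significant bit of the second byte. The function name is hypothetical. The PES, section, and single-program rules are not checked.

```javascript
var TS_PACKET_SIZE = 188;
var TS_SYNC_BYTE = 0x47;

// True if `bytes` (a Uint8Array) is a whole number of TS packets, each
// starting with the sync byte and with transport_error_indicator cleared.
function containsOnlyCleanTsPackets(bytes) {
  if (bytes.length === 0 || bytes.length % TS_PACKET_SIZE !== 0) return false;
  for (var offset = 0; offset < bytes.length; offset += TS_PACKET_SIZE) {
    if (bytes[offset] !== TS_SYNC_BYTE) return false;
    // transport_error_indicator is the top bit of the byte after the sync byte.
    if (bytes[offset + 1] & 0x80) return false;
  }
  return true;
}
```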
8.3.2 Initialization Segments
An MPEG-2 TS initialization segment must contain a single PAT and a single PMT. Other SI, such as CAT, that are invariant for all subsequent media segments, may be present.
8.3.3 Media Segments
The following rules apply to all MPEG-2 TS media segments:
- PSI that is identical to the information in the initialization segment may appear repeatedly throughout the segment.
- The media segment must not rely on initialization information in another media segment.
- Media Segments must contain only complete PES packets and sections.
- Each PES packet must consist of one or more complete access units.
- Each PES packet must have a PTS timestamp.
- PCR must be present in the Segment prior to the first byte of a TS packet payload containing media data.
- The presentation duration of each media component within the Media Segment should be as close to equal as practical.
8.3.4 Random Access Points
A random access point as defined in this specification corresponds to an Elementary Stream Random Access Point as defined in ISO/IEC 13818-1.
8.3.5 Timestamp Rollover & Discontinuities
Timestamp rollovers and discontinuities must be handled by the UA. The UA's MPEG-2 TS implementation must maintain an internal offset variable, MPEG2TS_timestampOffset, to keep track of the offset that needs to be applied to timestamps that have rolled over or are part of a discontinuity. MPEG2TS_timestampOffset is initially set to 0 when the SourceBuffer is created. This offset must be applied to the timestamps as part of the conversion process from MPEG-2 TS packets into coded frames for the coded frame processing algorithm. This results in the coded frame timestamps for a packet being computed by the following equations:
Coded Frame Presentation Timestamp = (MPEG-2 TS presentation timestamp) + MPEG2TS_timestampOffset
Coded Frame Decode Timestamp = (MPEG-2 TS decode timestamp) + MPEG2TS_timestampOffset
MPEG2TS_timestampOffset is updated in the following ways:
- Each time a timestamp rollover is detected, 2^33 must be added to MPEG2TS_timestampOffset.
- When a discontinuity is detected, MPEG2TS_timestampOffset must be adjusted to make the timestamps after the discontinuity appear to come immediately after the timestamps before the discontinuity.
- When abort() is called, MPEG2TS_timestampOffset must be set to 0.
- When timestampOffset is successfully set, MPEG2TS_timestampOffset must be set to 0.
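The rollover bookkeeping described in this section can be sketched as follows. This is an illustration under stated assumptions, not the UA algorithm: the specification does not say how a rollover is detected, so the sketch uses a hypothetical heuristic (a backwards jump of more than half the 33-bit range) and tracks a single timestamp sequence, such as the decode timestamps of one stream. Discontinuity handling is omitted. The Mpeg2TsTimestampMapper name and its methods are hypothetical, and clearing the last seen timestamp on reset is an assumption.

```javascript
// MPEG-2 TS timestamps are 33 bits wide, so they wrap after 2^33 ticks.
var TS_ROLLOVER = Math.pow(2, 33);

function Mpeg2TsTimestampMapper() {
  this.timestampOffset = 0;     // plays the role of MPEG2TS_timestampOffset
  this.lastRawTimestamp = null; // last timestamp seen, before the offset is applied
}

// Applies the offset to a raw timestamp, adding 2^33 first if a rollover is
// detected. The detection heuristic is an assumption, not part of the spec.
Mpeg2TsTimestampMapper.prototype.map = function(rawTimestamp) {
  if (this.lastRawTimestamp !== null &&
      this.lastRawTimestamp - rawTimestamp > TS_ROLLOVER / 2) {
    this.timestampOffset += TS_ROLLOVER;
  }
  this.lastRawTimestamp = rawTimestamp;
  return rawTimestamp + this.timestampOffset;
};

// Intended to be called where the text above requires the offset to be set
// to 0, i.e. on abort() or after timestampOffset is successfully set.
Mpeg2TsTimestampMapper.prototype.reset = function() {
  this.timestampOffset = 0;
  this.lastRawTimestamp = null;
};
```

Because 2^33 is well below the largest integer a JavaScript number can represent exactly, many rollovers can accumulate in timestampOffset without loss of precision.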
9. Examples
Example use of the Media Source Extensions
<script>
  function onSourceOpen(videoTag, e) {
    var mediaSource = e.target;
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');

    videoTag.addEventListener('seeking', onSeeking.bind(videoTag, mediaSource));
    videoTag.addEventListener('progress', onProgress.bind(videoTag, mediaSource));

    var initSegment = GetInitializationSegment();

    if (initSegment == null) {
      // Error fetching the initialization segment. Signal end of stream with an error.
      mediaSource.endOfStream("network");
      return;
    }

    // Append the initialization segment.
    var firstAppendHandler = function(e) {
      var sourceBuffer = e.target;
      sourceBuffer.removeEventListener('appendend', firstAppendHandler);

      // Append some initial media data.
      appendNextMediaSegment(mediaSource);
    };
    sourceBuffer.addEventListener('appendend', firstAppendHandler);
    sourceBuffer.appendArrayBuffer(initSegment);
  }

  function appendNextMediaSegment(mediaSource) {
    if (mediaSource.readyState == "ended")
      return;

    // If we have run out of stream data, then signal end of stream.
    if (!HaveMoreMediaSegments()) {
      mediaSource.endOfStream();
      return;
    }

    // Make sure the previous append is not still pending.
    if (mediaSource.sourceBuffers[0].appending)
      return;

    var mediaSegment = GetNextMediaSegment();

    if (!mediaSegment) {
      // Error fetching the next media segment.
      mediaSource.endOfStream("network");
      return;
    }

    mediaSource.sourceBuffers[0].appendArrayBuffer(mediaSegment);
  }

  function onSeeking(mediaSource, e) {
    var video = e.target;

    // Abort current segment append.
    mediaSource.sourceBuffers[0].abort();

    // Notify the media segment loading code to start fetching data at the
    // new playback position.
    SeekToMediaSegmentAt(video.currentTime);

    // Append a media segment from the new playback position.
    appendNextMediaSegment(mediaSource);
  }

  function onProgress(mediaSource, e) {
    appendNextMediaSegment(mediaSource);
  }
</script>

<video id="v" autoplay> </video>

<script>
  var video = document.getElementById('v');
  var mediaSource = new MediaSource();
  mediaSource.addEventListener('sourceopen', onSourceOpen.bind(this, video));
  video.src = window.URL.createObjectURL(mediaSource);
</script>
10. Revision History
Version | Comment |
---|---|
04 January 2013 | |
14 December 2012 | Pubrules, Link Checker, and Markup Validation fixes. |
13 December 2012 | |
08 December 2012 | |
06 December 2012 | |
28 November 2012 | |
09 November 2012 | Converted document to ReSpec. |
18 October 2012 | Refactored SourceBuffer.append() & added SourceBuffer.remove(). |
8 October 2012 | |
1 October 2012 | Fixed various addsourcebuffer & removesourcebuffer bugs and allow append() in ended state. |
13 September 2012 | Updated endOfStream() behavior to change based on the value of HTMLMediaElement.readyState. |
24 August 2012 | |
22 August 2012 | |
17 August 2012 | Minor editorial fixes. |
09 August 2012 | Change presentation start time to always be 0 instead of using format specific rules about the first media segment appended. |
30 July 2012 | Added SourceBuffer.timestampOffset and MediaSource.duration. |
17 July 2012 | Replaced SourceBufferList.remove() with MediaSource.removeSourceBuffer(). |
02 July 2012 | Converted to the object-oriented API |
26 June 2012 | Converted to Editor's draft. |
0.5 | Minor updates before proposing to W3C HTML-WG. |
0.4 | Major revision. Adding source IDs, defining buffer model, and clarifying byte stream formats. |
0.3 | Minor text updates. |
0.2 | Updates to reflect initial WebKit implementation. |
0.1 | Initial Proposal |