Media Capture from DOM Elements
W3C Working Draft
- This version:
- https://www.w3.org/TR/2021/WD-mediacapture-fromelement-20210119/
- Latest published version:
- https://www.w3.org/TR/mediacapture-fromelement/
- Latest editor's draft:
- https://w3c.github.io/mediacapture-fromelement/
- Test suite:
- https://github.com/web-platform-tests/wpt/tree/master/mediacapture-fromelement
- Previous version:
- https://www.w3.org/TR/2020/WD-mediacapture-fromelement-20201202/
- Editors:
- Martin Thomson (Mozilla)
- Miguel Casas-Sanchez (Google, Inc.) (Media Element parts.)
- Emircan Uysaler (Google, Inc.) (Canvas Element parts.)
- Participate:
- GitHub w3c/mediacapture-fromelement
- File a bug
- Commit history
- Pull requests
- Mailing list
Copyright © 2015-2021 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
Abstract
This document defines how a stream of media can be captured from a DOM element, such as a video, audio, or canvas element, in the form of a MediaStream [GETUSERMEDIA].
Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentations are encouraged, it is therefore not intended for implementation.
This document was published by the Web Real-Time Communications Working Group as a Working Draft. This document is intended to become a W3C Recommendation.
GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-webrtc@w3.org (archives).
Publication as a Working Draft does not imply endorsement by the W3C Membership.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 15 September 2020 W3C Process Document.
1. Introduction
This section is non-normative.
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
The captured media is formed into a MediaStream [GETUSERMEDIA], which can then be consumed by the various APIs that process streams of media, such as WebRTC [WEBRTC], or Web Audio [WEBAUDIO].
2. Conformance
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST and MUST NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
3. HTML Media Element Media Capture Extensions
The captureStream method is added on HTML [HTML5] media elements. Methods for capture are added to both HTMLMediaElement and HTMLCanvasElement.
Both MediaStream and HTMLMediaElement expose the concept of a track. Since there is no common type used for HTMLMediaElement, this document uses the term track to refer to either VideoTrack or AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.
WebIDL
partial interface HTMLMediaElement {
  MediaStream captureStream();
};
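As a minimal sketch of how a page might consume this method, the following captures a media element and tracks which kinds of captured tracks are currently in the stream. The `*Like` interfaces and the `watchCapturedKinds` helper are assumptions made for illustration and are not defined by this specification; they describe only the members the sketch uses, so that it does not depend on any particular DOM typings.

```typescript
// Minimal structural types so this sketch does not depend on DOM typings.
// They mirror only the members used below and are hypothetical names.
interface CapturedTrackLike {
  kind: string;
}
interface TrackEventLike {
  track: CapturedTrackLike;
}
type TrackListener = (event: TrackEventLike) => void;
interface CapturedStreamLike {
  getTracks(): CapturedTrackLike[];
  addEventListener(type: "addtrack" | "removetrack", listener: TrackListener): void;
}
interface CapturableMediaLike {
  captureStream(): CapturedStreamLike;
}

// Hypothetical helper (not part of the specification): captures the element
// and keeps `kinds` in step with the tracks currently in the captured stream.
function watchCapturedKinds(
  media: CapturableMediaLike
): { stream: CapturedStreamLike; kinds: string[] } {
  const stream = media.captureStream();
  // The stream can start out empty, for example while no source is assigned.
  const kinds: string[] = stream.getTracks().map((track) => track.kind);
  stream.addEventListener("addtrack", (event) => {
    kinds.push(event.track.kind);
  });
  stream.addEventListener("removetrack", (event) => {
    // Drop one entry of the same kind as the removed track.
    const index = kinds.indexOf(event.track.kind);
    if (index >= 0) kinds.splice(index, 1);
  });
  return { stream, kinds };
}
```

In a browser, a video or audio element from the page would be passed as `media`. Depending on the DOM typings in use, a cast may be needed because `captureStream()` might not be declared on media elements.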
Methods
- captureStream
The captureStream() method produces a real-time capture of the media that is rendered to the media element.
The captured MediaStream comprises MediaStreamTracks that render the content from the set of selected (for VideoTracks, or other exclusively selected track types) or enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled track of a given type, then no MediaStreamTrack of that type is present in the captured stream.
A video element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks. An audio element can capture any number of audio MediaStreamTracks. In both cases, the set of captured MediaStreamTracks could be empty.
Unless and until there is a track of a given type that is selected or enabled, no MediaStreamTrack of that type is present in the captured stream. In particular, if the media element does not have a source assigned, then the captured MediaStream has no tracks. Consequently, a media element with a ready state of HAVE_NOTHING produces no captured MediaStreamTrack instances. Once metadata is available and the selected or enabled tracks are determined, new captured MediaStreamTrack instances are created and added to the MediaStream.
A captured MediaStreamTrack ends when playback ends (and the ended event fires) or when the track that it captures is no longer selected or enabled for playback. A track is no longer selected or enabled if the source is changed by setting the src or srcObject attributes of the media element.
The set of captured MediaStreamTracks changes if the source of the media element changes. If the source for the media element ends, a different source is selected.
If the selected VideoTrack or enabled AudioTracks for the media element change, an addtrack event with a new MediaStreamTrack is generated for each track that was not previously selected or enabled; and a removetrack event is generated for each track that ceases to be selected or enabled. A MediaStreamTrack MUST end prior to being removed from the MediaStream.
Since a MediaStreamTrack can only end once, a track that is enabled, disabled and re-enabled will be captured as two separate tracks. Similarly, restarting playback after playback ends causes a new set of captured MediaStreamTrack instances to be created. Seeking during playback without changing track selection does not generate events or cause a captured MediaStreamTrack to end.
The MediaStreamTracks that comprise the captured MediaStream become muted or unmuted as the tracks they capture change state. At any time, a media element might not have active content available for capture on a given track for a variety of reasons:
- Media playback could be paused.
- A track might not have content for the current playback time if that time is either before the content of that track starts or after the content ends.
- A MediaStreamTrack that is acting as a source could be muted or disabled.
- The contents of the track might become inaccessible to the current origin due to cross-origin protections. For instance, content that is rendered from an HTTP URL can be subject to a redirect on a request for partial content, or the enabled or selected tracks can change to include cross-origin content.
Absence of content is reflected in captured tracks through the muted attribute. A captured MediaStreamTrack MUST have a muted attribute set to true if its corresponding source track does not have available and accessible content. A mute event is raised on the MediaStreamTrack when content availability changes.
What output a muted capture produces as a result will vary based on the type of media: a VideoTrack ceases to capture new frames when muted, causing the captured stream to show the last captured frame; a muted AudioTrack produces silence.
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding a media element cause captured video to stop. Similarly, the audio level or volume of the media element does not affect the volume of captured audio.
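The muting behavior described above could be observed by a consumer roughly as follows. This is a sketch under stated assumptions: `MutableTrackLike`, `expectedOutput` and `monitorMute` are hypothetical names introduced here, and the interface lists only the `kind` and `muted` attributes and the `mute` and `unmute` events that a captured MediaStreamTrack is expected to expose.

```typescript
// Hypothetical structural type covering only the track members used below.
interface MutableTrackLike {
  kind: string;
  muted: boolean;
  addEventListener(type: "mute" | "unmute", listener: () => void): void;
}

// What a consumer can expect from a captured track: a muted video capture
// keeps showing the last captured frame, a muted audio capture is silent.
function expectedOutput(track: { kind: string; muted: boolean }): string {
  if (!track.muted) return "live content";
  return track.kind === "video" ? "last captured frame" : "silence";
}

// Hypothetical helper: reports the expected output immediately and again
// whenever the availability of content for the track changes.
function monitorMute(track: MutableTrackLike, report: (status: string) => void): void {
  report(expectedOutput(track));
  track.addEventListener("mute", () => report(expectedOutput(track)));
  track.addEventListener("unmute", () => report(expectedOutput(track)));
}
```

Note that this reacts only to the muted state of the captured track; muting or hiding the media element itself is not expected to trigger either event.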
Captured audio from an element with an effective playback rate other than 1.0 MUST be time-stretched. An unplayable playback rate causes the captured audio track to become muted.
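The track lifecycle rules of this section (a captured track ends before it is removed, and a track can only end once) can be illustrated with a small model. This is not browser code: `ElementCaptureModel` and its members are hypothetical, source tracks are identified by made-up string ids, and events are recorded as plain strings rather than dispatched.

```typescript
interface CaptureEntry {
  sourceId: string;   // id of the selected or enabled track on the element
  capturedId: number; // id of the captured track created for it
}

// Hypothetical model of the captured track set of one media element.
class ElementCaptureModel {
  private nextId = 1;
  private active: CaptureEntry[] = [];
  readonly ended: number[] = [];  // captured tracks that have ended
  readonly events: string[] = []; // "addtrack:<id>" or "removetrack:<id>"

  // Call with the ids of the tracks that are currently selected or enabled.
  update(selectedOrEnabled: string[]): void {
    const kept: CaptureEntry[] = [];
    for (const entry of this.active) {
      if (selectedOrEnabled.indexOf(entry.sourceId) >= 0) {
        kept.push(entry);
        continue;
      }
      // The captured track ends first, then it is removed from the stream.
      this.ended.push(entry.capturedId);
      this.events.push("removetrack:" + entry.capturedId);
    }
    this.active = kept;
    for (const sourceId of selectedOrEnabled) {
      if (!this.active.some((entry) => entry.sourceId === sourceId)) {
        // An ended track is never reused: a new captured track is created.
        const capturedId = this.nextId++;
        this.active.push({ sourceId, capturedId });
        this.events.push("addtrack:" + capturedId);
      }
    }
  }

  capturedIds(): number[] {
    return this.active.map((entry) => entry.capturedId);
  }
}
```

Enabling, disabling and re-enabling the same source track in this model produces two distinct captured ids, while calling `update` again with an unchanged selection (as after a seek) records no events.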
4. HTML Canvas Element Media Capture Extensions
The captureStream method is added to the HTML [HTML5] canvas element. The resulting CanvasCaptureMediaStreamTrack provides methods that allow for controlling when frames are sampled from the canvas.
WebIDL
partial interface HTMLCanvasElement {
  MediaStream captureStream(optional double frameRequestRate);
};
Methods
- captureStream
The captureStream() method produces a real-time video capture of the surface of the canvas. The resulting media stream has a single video CanvasCaptureMediaStreamTrack that matches the dimensions of the canvas element.
Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a SecurityError exception if the canvas is not origin-clean.
A captured stream MUST immediately cease to capture content if the origin-clean flag of the source canvas becomes false after the stream is created by captureStream(). The captured MediaStreamTrack MUST become muted, producing no new content while the canvas remains in this state.
Each track that captures a canvas has a [[frameCaptureRequested]] internal slot that is set to true when a new frame is requested from the canvas.
The value of [[frameCaptureRequested]] on all new tracks is set to true when the track is created. On creation of the captured track with a specific, non-zero frameRequestRate, the user agent starts a periodic timer at an interval of 1/frameRequestRate seconds. At each activation of the timer, [[frameCaptureRequested]] is set to true.
In order to support manual control of frame capture with the requestFrame() method, browsers MUST support a value of 0 for frameRequestRate. However, a captured stream MUST request capture of a frame when created, even if frameRequestRate is zero.
This method throws a NotSupportedError if frameRequestRate is negative.
A new frame is requested from the canvas when [[frameCaptureRequested]] is true and the canvas is painted. Each time that the captured canvas is painted, the following steps are executed:
- For each track capturing from the canvas execute the following steps:
  - If new content has been drawn to the canvas since it was last painted, and if the [[frameCaptureRequested]] internal slot of track is set, add a new frame to track containing what was painted to the canvas.
  - If a frameRequestRate value was specified, set the [[frameCaptureRequested]] internal slot of track to false.
When adding new frames to track containing what was painted to the canvas, the alpha channel content of the canvas must be captured and preserved if the canvas is not fully opaque. The consumers of this track might not preserve the alpha channel.
Note: This algorithm results in a captured track not starting until something changes in the canvas.
Parameter | Type | Nullable | Optional | Description
frameRequestRate | double | ✘ | ✔ |
Return type: MediaStream
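The frame capture steps above can be sketched as a small model of a single capturing track. This follows a literal reading of the steps and is not browser code: `CanvasTrackModel` is a hypothetical name, frames are represented as strings, the timer is driven by calling `onTimer()` by hand, and a plain `Error` stands in for the NotSupportedError exception. It also assumes that requestFrame() sets [[frameCaptureRequested]] to true, which the text implies but does not state as a step.

```typescript
// Hypothetical model of one track capturing a canvas.
class CanvasTrackModel {
  // [[frameCaptureRequested]]: set to true when the track is created.
  frameCaptureRequested = true;
  readonly frames: string[] = [];
  // Interval of the periodic timer in seconds, or null when no timer runs.
  readonly timerIntervalSeconds: number | null;
  private readonly rateSpecified: boolean;

  constructor(frameRequestRate?: number) {
    if (frameRequestRate !== undefined && frameRequestRate < 0) {
      // Stand-in for the NotSupportedError thrown by captureStream().
      throw new Error("NotSupportedError: frameRequestRate is negative");
    }
    this.rateSpecified = frameRequestRate !== undefined;
    this.timerIntervalSeconds =
      frameRequestRate !== undefined && frameRequestRate > 0 ? 1 / frameRequestRate : null;
  }

  // Each activation of the periodic timer requests a frame.
  onTimer(): void {
    this.frameCaptureRequested = true;
  }

  // Assumption: a manual request sets the same internal slot.
  requestFrame(): void {
    this.frameCaptureRequested = true;
  }

  // Run each time the canvas is painted. `contentChanged` says whether new
  // content has been drawn to the canvas since it was last painted.
  onPaint(contentChanged: boolean, painted: string): void {
    if (contentChanged && this.frameCaptureRequested) {
      this.frames.push(painted);
    }
    if (this.rateSpecified) {
      this.frameCaptureRequested = false;
    }
  }
}
```

With no frameRequestRate the slot is never cleared, so every paint with new content adds a frame. With a rate of 0 only the initial request and later `requestFrame()` calls lead to captured frames.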
4.1 CanvasCaptureMediaStreamTrack
The CanvasCaptureMediaStreamTrack is an extension of MediaStreamTrack that provides a single requestFrame() method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.
WebIDL
[Exposed=Window]
interface CanvasCaptureMediaStreamTrack : MediaStreamTrack {
  readonly attribute HTMLCanvasElement canvas;
  undefined requestFrame();
};
Attributes
- canvas of type HTMLCanvasElement, readonly: The canvas element that this media stream captures.
Methods
- requestFrame
The requestFrame() method allows applications to manually request that a frame from the canvas be captured and rendered into the track. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.
Note: As currently specified, this results in no SecurityError or other error feedback if the canvas is not origin-clean. In part, this is because we don't track where requests for frames come from. Do we want to highlight that?
No parameters.
Return type: undefined
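A minimal sketch of manual frame capture for progressive rendering follows, assuming the canvas is captured with a frameRequestRate of 0. The `*Like` interfaces and the `captureCompletedFrames` helper are hypothetical and exist only so the sketch is self-contained; they list just the members used here.

```typescript
// Hypothetical structural types covering only the members used below.
interface FrameTrackLike {
  requestFrame(): void;
}
interface CanvasStreamLike {
  getVideoTracks(): FrameTrackLike[];
}
interface CapturableCanvasLike {
  captureStream(frameRequestRate?: number): CanvasStreamLike;
}

// Hypothetical helper: captures the canvas with manual frame control and
// returns a function that draws a whole picture before requesting a frame.
function captureCompletedFrames(canvas: CapturableCanvasLike): {
  stream: CanvasStreamLike;
  renderFrame: (drawSteps: (() => void)[]) => void;
} {
  // A frameRequestRate of 0 leaves frame capture under application control.
  const stream = canvas.captureStream(0);
  const track = stream.getVideoTracks()[0];
  if (!track) throw new Error("captured stream has no video track");
  const renderFrame = (drawSteps: (() => void)[]): void => {
    // Progressive rendering: intermediate states are not requested.
    for (const step of drawSteps) step();
    // The picture is complete, so request that it be captured.
    track.requestFrame();
  };
  return { stream, renderFrame };
}
```

How a real canvas element is passed in, and whether a cast to CanvasCaptureMediaStreamTrack is required to reach `requestFrame()`, depends on the DOM typings in use.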
5. Security Considerations
Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStreamTrack MUST be protected from access by the document origin.
How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a canvas element [HTML] causes the origin-clean flag of the canvas to become false; attempting to create a Web Audio MediaStreamAudioSourceNode [WEBAUDIO] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using WebRTC [WEBRTC] results in no information being transmitted.
The origin of the media that is rendered by a media element can change at any time. This is even the case for a single media resource. User agents MUST ensure that a change in the origin of media doesn't result in exposure of cross-origin content.
6. Change Log
This section will be removed before publication.
Changes since 2015-tbd-tbd
A. Acknowledgements
This document is based on the stream processing specification [streamproc] originally developed by Robert O'Callahan.
B. References
B.1 Normative references
- [HTML]
- HTML Standard. Anne van Kesteren; Domenic Denicola; Ian Hickson; Philip Jägenstedt; Simon Pieters. WHATWG. Living Standard. URL: https://html.spec.whatwg.org/multipage/
- [HTML5]
- HTML5. Ian Hickson; Robin Berjon; Steve Faulkner; Travis Leithead; Erika Doyle Navara; Theresa O'Connor; Silvia Pfeiffer. W3C. 27 March 2018. W3C Recommendation. URL: https://www.w3.org/TR/html5/
- [mediacapture-streams]
- Media Capture and Streams. Cullen Jennings; Bernard Aboba; Jan-Ivar Bruaroey; Henrik Boström; youenn fablet; Daniel Burnett; Adam Bergkvist; Anant Narayanan. W3C. 14 January 2021. W3C Candidate Recommendation. URL: https://www.w3.org/TR/mediacapture-streams/
- [RFC2119]
- Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
- [RFC8174]
- Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://tools.ietf.org/html/rfc8174
- [streamproc]
- MediaStream Processing API. Robert O'Callahan. W3C. 31 May 2012. W3C Note. URL: https://www.w3.org/TR/streamproc/
- [WEBAUDIO]
- Web Audio API. Paul Adenot; Hongchan Choi. W3C. 14 January 2021. W3C Candidate Recommendation. URL: https://www.w3.org/TR/webaudio/
- [WEBIDL]
- Web IDL. Boris Zbarsky. W3C. 15 December 2016. W3C Editor's Draft. URL: https://heycam.github.io/webidl/
- [WEBRTC]
- WebRTC 1.0: Real-Time Communication Between Browsers. Cullen Jennings; Henrik Boström; Jan-Ivar Bruaroey. W3C. 15 December 2020. W3C Proposed Recommendation. URL: https://www.w3.org/TR/webrtc/
B.2 Informative references
- [GETUSERMEDIA]
- Media Capture and Streams. Cullen Jennings; Bernard Aboba; Jan-Ivar Bruaroey; Henrik Boström; youenn fablet; Daniel Burnett; Adam Bergkvist; Anant Narayanan. W3C. 14 January 2021. W3C Candidate Recommendation. URL: https://www.w3.org/TR/mediacapture-streams/