Examples

This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g., giving the page access to the local camera) and then disabling the stream (e.g., revoking that access).

<button id="startBtn">Start</button>
<script>
const startBtn = document.getElementById('startBtn');
startBtn.onclick = async () => {
  try {
    startBtn.disabled = true;
    const constraints = {
      audio: true,
      video: true
    };
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    for (const track of stream.getTracks()) {
      track.onended = () => {
        startBtn.disabled = stream.getTracks().some((t) => t.readyState == 'live');
      };
    }
  } catch (err) {
    console.error(err);
  }
};
</script>
      

This example allows people to take photos of themselves from the local video camera. Note that the Image Capture specification [[?image-capture]] provides a simpler way to accomplish this.

<script>
window.onload = async () => {
  const video = document.getElementById('monitor');
  const canvas = document.getElementById('photo');
  const shutter = document.getElementById('shutter');
  try {
    video.srcObject = await navigator.mediaDevices.getUserMedia({video: true});
    // Wait until the video's dimensions are known before sizing the canvas.
    await new Promise(resolve => video.onloadedmetadata = resolve);
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    document.getElementById('splash').hidden = true;
    document.getElementById('app').hidden = false;
    // Capture the current video frame onto the canvas.
    shutter.onclick = () => canvas.getContext('2d').drawImage(video, 0, 0);
  } catch (err) {
    // Surface the failure on the splash screen instead of "Loading...".
    document.getElementById('errorMessage').textContent = `Camera error: ${err.message}`;
    console.error(err);
  }
};
</script>
<h1>Snapshot Kiosk</h1>
<section id="splash">
  <p id="errorMessage">Loading...</p>
</section>
<section id="app" hidden>
  <video id="monitor" autoplay></video>
  <button id="shutter">&#x1F4F7;</button>
  <canvas id="photo"></canvas>
</section>

Permissions Integration

This specification defines two [=powerful features=] identified by the [=powerful feature/names=] "camera" and "microphone".

It defines the following types and algorithms:

[=powerful feature/permission descriptor type=]
dictionary CameraDevicePermissionDescriptor : PermissionDescriptor {
  boolean panTiltZoom = false;
};

A permission covers access to at least one device of a kind.

The semantics of the descriptor are that it queries for access to any device of that kind. Thus, if a query for the "camera" permission returns {{PermissionState/"granted"}}, the client knows that it will get access to one camera without a permission prompt, and if {{PermissionState/"denied"}} is returned, it knows that no getUserMedia request for a camera will succeed.

If the User Agent considers permission given to some, but not all, devices of a kind, a query will return {{PermissionState/"granted"}}.

If the User Agent considers permission denied to all devices of a kind, a query will return {{PermissionState/"denied"}}.

`{name: "camera", panTiltZoom: true}` is [=PermissionDescriptor/stronger than=] `{name: "camera", panTiltZoom: false}`.

A {{PermissionState/"granted"}} permission is no guarantee that getUserMedia will succeed. It only indicates that the user will not be prompted for permission. There are many other things (such as constraints or the camera being in use) that can cause getUserMedia to fail.

[=powerful feature/permission revocation algorithm=]
This is the result of calling the [=device permission revocation algorithm=], passing {{PermissionDescriptor/name}} as the argument.

Permissions Policy Integration

This specification defines two [=policy-controlled features=] identified by the strings "camera" and "microphone". Both have a [=policy-controlled feature/default allowlist=] of "self".

A [=document=]'s [=Document/permissions policy=] determines whether any content in that document is allowed to use {{MediaDevices/getUserMedia}} to request access to the camera or microphone, respectively. If disabled in any document, no content in that document will be [=allowed to use=] {{MediaDevices/getUserMedia}} to request the camera or microphone, respectively. This is enforced by the [=request permission to use=] algorithm.

Additionally, {{MediaDevices/enumerateDevices}} will only enumerate devices the document is [=allowed to use=].
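
For example, a document that is itself allowed to use these features can delegate them to a cross-origin child frame through the allow attribute (the frame URL is illustrative):

<iframe src="https://call.example.com/embed.html"
        allow="camera; microphone"></iframe>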

Privacy Indicator Requirements

This specification expresses privacy indicator requirements using algorithms from the viewpoint of a single {{MediaDevices}} object. Implementers are encouraged to extrapolate these principles to unify presentation of indicators to cover multiple {{MediaDevices}} objects that can co-exist on a page due to iframes.

For each kind of device that {{MediaDevices/getUserMedia()}} exposes,

  • Define any<kind>Accessible (e.g. anyAudioAccessible, anyVideoAccessible) as the logical OR of the {{MediaDevices/[[kindsAccessibleMap]]}}[kind] value and all the {{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] values for devices of that kind.
  • Define any<kind>Live (e.g. anyAudioLive, anyVideoLive) to be the logical OR of all the {{MediaDevices/[[devicesLiveMap]]}}[deviceId] values for devices of that kind.

Define anyAccessible to be the logical OR of all any<kind>Accessible values.

Define anyLive to be the logical OR of all any<kind>Live values.
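
The following sketch illustrates the aggregation for one kind; it treats the internal slots as ordinary Maps of booleans, which they resemble but are not, since the real slots are internal to the [=User Agent=] and not script-accessible:

<script>
// Illustrative only: example state for the "video" kind with one camera.
const kindsAccessibleMap = new Map([['video', false]]);
const devicesAccessibleMap = new Map([['camera-1', true]]); // video deviceIds
const devicesLiveMap = new Map([['camera-1', true]]);       // video deviceIds

const anyVideoAccessible = kindsAccessibleMap.get('video') ||
    [...devicesAccessibleMap.values()].some(Boolean);
const anyVideoLive = [...devicesLiveMap.values()].some(Boolean);
console.log(anyVideoAccessible, anyVideoLive); // true true
</script>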

Then the following are requirements on the [=User Agent=]:

  • The [=User Agent=] MUST indicate to the user when the value of anyAccessible changes.
  • The [=User Agent=] MUST indicate to the user when the value of anyLive changes.
  • If the [=User Agent=] provides indication to the user per kind, then for each any<kind>Accessible value and any<kind>Live value, it MUST at minimum indicate when the value changes.
  • If the [=User Agent=] provides indication to the user per device, then for each {{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] value and {{MediaDevices/[[devicesLiveMap]]}}[deviceId] value, it MUST at minimum indicate when the value changes.
  • Any false-to-true transition indicated MUST remain observable for a sufficient time that a reasonably-observant user could become aware of it. This SHOULD be at least 3 seconds.
  • Any of the above transition indications MAY be combined as long as the combined indication cannot transition to false if any of its component indications are still true.

and the following are encouraged behaviors for the [=User Agent=]:

  • The [=User Agent=] is encouraged to provide ongoing indication of the current state of anyAccessible.
  • The [=User Agent=] is encouraged to provide ongoing indication of the current state of anyLive and to make any generic hardware device indicator light match.
  • If the [=User Agent=] provides indication to the user per kind, then for each any<kind>Accessible value and any<kind>Live value, it is encouraged to provide ongoing indication of the current state of the value. It is also encouraged to make any device-type-specific hardware indicator light match the corresponding any<kind>Live value.
  • If the [=User Agent=] provides indication to the user per device, then for each {{MediaDevices/[[devicesAccessibleMap]]}}[deviceId] value and {{MediaDevices/[[devicesLiveMap]]}}[deviceId] value, it is encouraged to provide ongoing indication of the current state of the value. It is also encouraged to make any device-specific hardware indicator light match the corresponding {{MediaDevices/[[devicesLiveMap]]}}[deviceId] value.
  • Any of the above ongoing indications MAY be used instead of the corresponding required transition indication provided the false-to-true transition requirement is met.

Privacy and Security Considerations

This section is non-normative; it specifies no new behavior, but instead summarizes information already present in other parts of the specification.

This specification extends the Web platform with the ability to manage input devices for media, specifically microphones and cameras. It also potentially allows exposure of information about other media devices, such as audio output devices (speakers and headphones), but the details of such exposure are left to other specifications. Capturing audio and video from the user's microphone and camera exposes personally-identifiable information to applications, and this specification requires obtaining explicit user consent before sharing it.

Ahead of camera or microphone capture, an application (the "drive-by web") is only offered the ability to tell whether the user has a camera or a microphone (but not how many). The identifiers for the devices are designed not to be useful as a fingerprint that can track the user between origins, but the presence of camera or microphone capability adds two bits to the fingerprint surface. This specification recommends treating the per-origin persistent identifier {{MediaDeviceInfo/deviceId}} the same way other persistent storage (e.g. cookies) is treated.

Once camera or microphone capture has begun, this specification describes how to get access to, and use, media data from the devices mentioned. This data may be sensitive; this specification advises that indicators be displayed when devices are in use, but both the nature of the permission and the in-use indicators are platform decisions.

Permission to begin capture may be given on a case-by-case basis, or be persistent. With case-by-case permission, it is important that the user be able to say "no" in a way that prevents the UI from blocking user interaction until permission is given, either by offering a way to give a "persistent NO" or by not using a modal permissions dialog.

Once capture of camera or microphone has begun, the web document gains the ability to list all available media capture devices and their labels. This ability lasts until the web document is closed, and cannot be persisted. In most cases, the labels are stable across origins, and thus potentially provide a way to track a given device across time and origins.
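
For illustration, a page can observe this by enumerating devices before and after capture starts; before then, the [=User Agent=] withholds labels (a sketch, with error handling omitted):

<script>
async function listDevices() {
  for (const {kind, label, deviceId} of await navigator.mediaDevices.enumerateDevices()) {
    // Before capture begins, label is the empty string; afterwards it is a
    // user-facing name such as "Built-in Microphone".
    console.log(kind, label || '(label withheld)', deviceId);
  }
}
</script>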

This specification exposes device information of devices other than those in use. This is for backwards compatibility and legacy reasons. Future specifications are advised to not use this model and instead follow best practices as described in the device enumeration design principles.

For open web documents where capture has begun or has taken place, or for web documents that [=Document/is in view|are in view=], the devicechange event can end up being fired at the same time across [=navigables=] and origins each time a media device is added or removed. User agents can mitigate the risk of correlating browsing activity across origins by fuzzing the timing of these events, or by deferring their firing until those web documents [=Document/is in view|come into view=].
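
The event in question is observed like this (illustrative); it is the timing of such callbacks that the mitigations above target:

<script>
navigator.mediaDevices.addEventListener('devicechange', async () => {
  // Fired when a media device is plugged in or removed; a user agent may
  // fuzz or defer its timing to hinder cross-origin correlation.
  const devices = await navigator.mediaDevices.enumerateDevices();
  console.log(`device list changed: ${devices.length} devices`);
});
</script>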

Once a web document gains access to a media stream from a capture device, it also gains access to detailed information about the device, including its range of operating capabilities (e.g. available resolutions for a camera). These operating capabilities are for the most part persistent across browsing sessions and origins, and thus provide a way to track a given device across time and origins.

Once access to a video stream from a capture device is obtained, that stream can most likely be used to uniquely fingerprint that device (e.g. via dead pixel detection). Similarly, once access to an audio stream is obtained, that stream can most likely be used to fingerprint the user's location down to the level of a room, or even to detect simultaneous occupation of a room by disparate users (e.g. via analysis of ambient audio or of unique audio purposely played out of the device speaker). User-level mitigation for both audio and video consists of covering up the camera and/or microphone, or revoking permission via [=User Agent=] chrome controls.

It is possible to use constraints so that the failure of a getUserMedia call returns information about devices on the system without prompting the user, which increases the surface available for fingerprinting. The [=User Agent=] should consider rate-limiting failed getUserMedia calls to limit this additional surface.
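
For example, a request with an unsatisfiable required constraint can be rejected before the user is ever prompted, and the rejection names the failing constraint (illustrative probe):

<script>
// If no camera supports this exact width, the promise rejects with an
// OverconstrainedError without any permission prompt, revealing information
// about the available hardware.
navigator.mediaDevices.getUserMedia({video: {width: {exact: 7680}}})
  .then((stream) => stream.getTracks().forEach((track) => track.stop()))
  .catch((err) => console.log(err.name, err.constraint));
</script>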

In the case of a stored persistent permission to begin capture, it is important that the user can easily find the list of granted permissions and revoke any permission they wish.

Once permission has been granted, the [=User Agent=] should make two things readily apparent to the user:

  • That the page has access to the devices for which permission is given
  • Whether or not any of the devices is presently recording (an "on-air" indicator)

Developers of sites with stored permissions should be careful that these permissions not be abused. These permissions can be revoked using the [[permissions]] API.

In particular, they should not make it possible to automatically send audio or video streams from authorized media devices to an endpoint that a third party can select.

Indeed, if a site offered URLs such as https://webrtc.example.org/?call=user that would automatically set up calls and transmit audio/video to user, it would be open, for instance, to the following abuse:

Users who have granted stored permissions to https://webrtc.example.org/ could be tricked into sending their audio/video streams to an attacker EvilSpy by following a link or being redirected to https://webrtc.example.org/?call=EvilSpy.

Extensibility

Although new versions of this specification may be produced in the future, it is also expected that other standards will need to define new capabilities that build upon those in this specification. The purpose of this section is to provide guidance to creators of such extensions.

Any WebIDL-defined interfaces, methods, or attributes in the specification may be extended. Two likely extension points are defining a new media type and defining a new constrainable property.

Defining a new {{MediaStreamTrack/kind}} of media (beyond audio and video)

At a minimum, defining a new media type would require

  • adding a new getXXXXTracks() method for the type to the {{MediaStream}} interface,
  • describing what a muted or disabled track of that type will render (see [[[#media-flow-and-life-cycle]]]),
  • adding the new type as an additional valid value for the {{MediaStreamTrack/kind}} attribute on the {{MediaStreamTrack}} interface,
  • defining any constrainable properties that are applicable to the media type for each source,
  • updating how the {{HTMLMediaElement}} works with a {{MediaStream}} containing a track of the new media type, including adding a corollary to [=stream/audible=]/inaudible for the new media type,
  • updating {{MediaDeviceKind}} if the new type has enumerable devices,
  • updating the {{MediaStreamTrack/getCapabilities()}} and {{MediaDevices/getUserMedia()}} descriptions,
  • adding the new type to the {{MediaStreamConstraints}} dictionary,
  • describing any new security and/or privacy considerations introduced by the new type, and
  • if the new type requires user authorization, defining new permissions for it, including a new PermissionDescriptor name associated with the new {{MediaStreamTrack/kind}}, and defining how these permissions, along with access starting and ending, as well as muting/disabling, affect any new and/or existing "on-air" and "device accessible" indicator states (see MediaDevices).

Additionally, it should include updating

  • the source definition,
  • the list of media stream consumers,
  • the description of the {{MediaStreamTrack/label}} attribute on the {{MediaStreamTrack}} interface,
  • the list of sinks (see [[[#the-model-sources-sinks-constraints-and-settings]]]), and
  • the best practice statements referring to video and audio.

It might also include

  • explaining how the media is expected to be used by potential consumers, and
  • giving examples in {{MediaStreamTrackState}} of how such a track might become ended.

Defining a new constrainable property

This will require thinking through and defining how Constraints, Capabilities, and Settings for the property will work. The relevant text in {{MediaTrackSupportedConstraints}}, {{MediaTrackCapabilities}}, {{MediaTrackConstraints}}, {{MediaTrackSettings}}, and {{MediaStreamConstraints}} is the model to use.
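
As an illustration of that model, a hypothetical new property (here called exampleTint, invented for this sketch and defined by no specification) would surface to script through the usual pattern:

<script>
// Hypothetical sketch: "exampleTint" is an invented constrainable property.
async function demo(track) {
  if (navigator.mediaDevices.getSupportedConstraints().exampleTint) {
    console.log(track.getCapabilities().exampleTint);  // e.g. {min: 0, max: 1}
    await track.applyConstraints({advanced: [{exampleTint: 0.5}]});
    console.log(track.getSettings().exampleTint);      // the applied value
  }
}
</script>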

Creators of extension specifications are strongly encouraged to notify the specification maintainers on the specification repository.
Future versions of this specification and others created by the WebRTC Working Group will take into consideration all extensions they are aware of in an attempt to reduce potential usage conflicts.

Defining a new sink for {{MediaStreamTrack}} and {{MediaStream}}

Other specs can define new sinks for {{MediaStream}} and/or {{MediaStreamTrack}}. At a minimum, a new consumer of a {{MediaStreamTrack}} will need to define:

  • how a {{MediaStreamTrack}} will be consumed in the various states in which it can be, including muted and disabled (see [[[#media-flow-and-life-cycle]]]).

Defining a new source of {{MediaStreamTrack}}

Other specs can define new sources of {{MediaStreamTrack}}. At a minimum, a new source of {{MediaStreamTrack}} will need to

  • define a new API to [=create a MediaStreamTrack=] of the relevant {{MediaStreamTrack/kind}}s from this new source ({{MediaDevices/getUserMedia()}} is dedicated to camera and microphone sources),
  • declare which constrainable properties, if any, are applicable to each {{MediaStreamTrack/kind}} of media this new source produces, and how they work with this source,
  • describe how and when to [=MediaStreamTrack/set a track's muted state=] for this source,
  • describe how and when to end tracks from this source,
  • if capture of the source is a [=powerful feature=] requiring [=express permission=], describe its permissions integration and permissions policy integration, and
  • if capture of the source poses a privacy concern, describe its privacy indicator requirements.

Acknowledgements

The editors wish to thank the Working Group chairs and Team Contact, Harald Alvestrand, Stefan Håkansson, Erik Lagerway and Dominique Hazaël-Massieux, for their support. Substantial text in this specification was provided by many people including Jim Barnett, Harald Alvestrand, Travis Leithead, Josh Soref, Martin Thomson, Jan-Ivar Bruaroey, Peter Thatcher, Dominique Hazaël-Massieux, and Stefan Håkansson. Dan Burnett would like to acknowledge the significant support received from Voxeo and Aspect during the development of this specification.