<!doctype html>
<html lang="en-us">
<head>
<title>Media Capture and Streams Extensions</title>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<script src="https://www.w3.org/Tools/respec/respec-w3c" class="remove"></script>
<script class='remove'>
"use strict";
// See https://github.com/w3c/respec/wiki/ for how to configure ReSpec
var respecConfig = {
group: "webrtc",
xref: ["html", "infra", "permissions", "dom", "mediacapture-streams", "webaudio", "webidl"],
edDraftURI: "https://w3c.github.io/mediacapture-extensions/",
editors: [
{name: "Jan-Ivar Bruaroey", company: "Mozilla Corporation", w3cid: 79152},
],
shortName: "mediacapture-extensions",
specStatus: "ED",
subjectPrefix: "[mediacapture-extensions]",
github: "https://github.com/w3c/mediacapture-extensions",
};
</script>
</head>
<body>
<section id="abstract">
<p>This document defines a set of ECMAScript APIs in WebIDL to extend the [[mediacapture-streams]] specification.</p>
</section>
<section id="sotd">
<p>This is an unofficial proposal.</p>
</section>
<section id="introduction">
<h2>Introduction</h2>
<p>This document contains proposed extensions and modifications to the
[[mediacapture-streams]] specification.</p>
<p>New features and modifications to existing features proposed here may be
considered for addition into the main specification post Recommendation.
Deciding factors will include maturity of the extension or modification,
consensus on adding it, and implementation experience.</p>
<p>A concrete long-term goal is reducing the fingerprinting surface of
{{MediaDevices/enumerateDevices()}} by deprecating exposure of the device
{{MediaDeviceInfo/label}} in its results. This requires relieving
applications of the burden of building user interfaces to select cameras and
microphones in-content, by offering this in user agents as part of
{{MediaDevices/getUserMedia()}} instead.</p>
<p>Miscellaneous other smaller features are under consideration as well,
such as constraints to control multi-channel audio beyond stereo.</p>
</section>
<section>
<h2>Terminology</h2>
<p>
This document uses the definitions {{MediaDevices}}, {{MediaStreamTrack}},
{{MediaStreamConstraints}} and {{ConstrainablePattern}}
from [[!mediacapture-streams]].</p>
<p>The terms [=permission state=], [=request permission to use=], and
<a data-cite="permissions">prompt the user to choose</a> are defined in
[[!permissions]].</p>
</section>
<section id="conformance">
</section>
<section id="camera and microphone picker">
<h2>In-browser camera and microphone picker</h2>
<p>The existing {{MediaDevices/enumerateDevices()}} function exposes camera
and microphone {{MediaDeviceInfo/label}}s to let applications build
in-content user interfaces for camera and microphone selection. Applications
have had to do this because {{MediaDevices/getUserMedia()}} did not offer a
web compatible in-agent device picker. This specification aims to rectify
that.</p>
<p>Device {{MediaDeviceInfo/label}}s are a significant fingerprinting
vector, but the existing APIs are well established. The scope of this
particular effort is therefore limited to removing
{{MediaDeviceInfo/label}}, leaving the overall constraints-based model
intact. This offers applications a more viable migration path than a move
to a new, less-powerful API would.</p>
<p>For the same reason, this specification augments the existing
{{MediaDevices/getUserMedia()}} function instead of introducing a new
less-powerful API to compete with it.</p>
<section id="new getusermedia semantics">
<h3>getUserMedia "user-chooses" semantics</h3>
<p>This specification introduces
slightly altered semantics to the {{MediaDevices/getUserMedia()}}
function called <code>"user-chooses"</code> that guarantee a picker will
be shown to the user in cases where the user agent would otherwise choose
for the user (that is: when application constraints do not narrow down
the choices to a single device). This is orthogonal to permission, and
offers a better and more consistent user experience across applications
and user agents.
</p>
<p>Unfortunately, since the <code>"user-chooses"</code> semantics may
produce user agent prompts at different times and in different situations
compared to the old semantics, they are somewhat incompatible with
expectations in some existing web applications that tend to call
{{MediaDevices/getUserMedia()}} repeatedly and lazily instead of using
e.g. <code>stream.clone()</code>.</p>
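<div>
<p>One way an application might avoid repeated prompts is to request the
camera once and hand out clones afterwards. The following is a minimal
sketch under that assumption, not a recommended implementation:
<code>createCameraProvider</code> and <code>getCameraTrack</code> are
hypothetical names, and <code>mediaDevices</code> stands in for
<code>navigator.mediaDevices</code>.</p>

```javascript
// Sketch only: createCameraProvider and getCameraTrack are hypothetical
// helper names; `mediaDevices` stands in for navigator.mediaDevices.
function createCameraProvider(mediaDevices) {
  let pending = null; // promise for the single capture request

  return async function getCameraTrack() {
    if (!pending) {
      // First call: this is the only place a picker may be shown.
      pending = mediaDevices
        .getUserMedia({ video: true, semantics: "user-chooses" })
        .then((stream) => stream.getVideoTracks()[0]);
      // Forget a failed request so that a later call can try again.
      pending.catch(() => { pending = null; });
      return pending;
    }
    // Later calls: reuse the chosen device without another request.
    const track = await pending;
    return track.clone();
  };
}
```

<p>The sketch does not handle the original track having ended; an
application would likely need to request the camera again in that case.</p>
</div>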
</section>
<section id="web compatibility">
<h3>Web compatibility and migration</h3>
<p>User agents are encouraged to provide the new semantics as opt-in
initially for web compatibility. User agents MUST deprecate (remove)
{{MediaDeviceInfo/label}} from {{MediaDeviceInfo}} over time, though specific migration strategies
are left to user agents. User agents SHOULD migrate to offering the new
semantics by default (opt-out) over time.</p>
<p>Since the constraints model remains intact, web compatibility problems
are expected to be limited to:</p>
<ul>
<li>
Sites that never migrated show e.g. "Camera 1", "Camera 2" etc.
instead of descriptive device labels
</li>
<li>
Sites with no device management strategy provoke a picker in the
user agent on every visit for users with more than one choice
of camera or microphone (a feature of sorts)
</li>
</ul>
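<div>
<p>The first case can be pictured with a small sketch of how a site's
device list might degrade once labels are no longer exposed. This is an
illustration only: <code>displayLabels</code> and
<code>GENERIC_NAMES</code> are hypothetical names, and each entry is
assumed to carry the <code>kind</code> and <code>label</code> members of
{{MediaDeviceInfo}}.</p>

```javascript
// Sketch only: displayLabels and GENERIC_NAMES are hypothetical. Each entry
// in `devices` is assumed to look like a MediaDeviceInfo ({kind, label}).
const GENERIC_NAMES = {
  videoinput: "Camera",
  audioinput: "Microphone",
  audiooutput: "Speaker",
};

function displayLabels(devices) {
  const seen = {}; // running count per device kind
  return devices.map((device) => {
    seen[device.kind] = (seen[device.kind] || 0) + 1;
    // Prefer the real label while it is still exposed; otherwise fall back
    // to a numbered generic name such as "Camera 2".
    return device.label || `${GENERIC_NAMES[device.kind]} ${seen[device.kind]}`;
  });
}
```
</div>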
</section>
<section id="mediadevices-interface">
<h3>MediaDevices Interface Extensions</h3>
<div>
<pre class="idl"
>partial interface MediaDevices {
readonly attribute GetUserMediaSemantics defaultSemantics;
};</pre>
</div>
<section>
<h2>Attributes</h2>
<dl data-link-for="MediaDevices" data-dfn-for="MediaDevices"
class="attributes">
<dt id="def-mediadevices-defaultSemantics"><dfn><code>defaultSemantics</code></dfn>
of type <span class="idlAttrType"><a>GetUserMediaSemantics</a></span>, readonly</dt>
<dd>
<p>The default semantics of {{MediaDevices/getUserMedia()}} in this
user agent.</p>
<p>User agents SHOULD default to <code>"browser-chooses"</code>
for backwards compatibility, until a transition plan has been
enacted where a majority of user agents collectively switch their
defaults to <code>"user-chooses"</code> for improved user privacy,
and usage metrics suggest this transition is feasible without
major breakage.</p>
</dd>
</dl>
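<div>
<p>An application might use the presence of this attribute to detect
support before opting in. The following is a sketch under that assumption:
<code>withUserChooses</code> is a hypothetical helper, and
<code>mediaDevices</code> stands in for
<code>navigator.mediaDevices</code>.</p>

```javascript
// Sketch only: withUserChooses is a hypothetical helper name, and
// `mediaDevices` stands in for navigator.mediaDevices.
function withUserChooses(mediaDevices, constraints) {
  // Assumption: a user agent implementing this proposal exposes
  // defaultSemantics, so its presence is used as the support signal.
  const supported = "defaultSemantics" in mediaDevices;
  if (!supported) {
    // Leave the constraints as they are for user agents without support.
    return { ...constraints };
  }
  return { ...constraints, semantics: "user-chooses" };
}
```

<p>A page could then call
<code>navigator.mediaDevices.getUserMedia(withUserChooses(navigator.mediaDevices, {video: true}))</code>.</p>
</div>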
</section>
</section>
<section id="mediastreamconstraints-dictionary-extensions">
<h3>MediaStreamConstraints dictionary extensions</h3>
<div>
<pre class="idl"
>partial dictionary MediaStreamConstraints {
GetUserMediaSemantics semantics;
};</pre>
<section>
<h2>Dictionary {{MediaStreamConstraints}} Members</h2>
<dl data-link-for="MediaStreamConstraints" data-dfn-for=
"MediaStreamConstraints" class="dictionary-members">
<dt><dfn><code>semantics</code></dfn> of type <span class=
"idlMemberType">{{GetUserMediaSemantics}}</span></dt>
<dd>
<p>In cases where the specified constraints do not narrow
multiple choices between devices down to one per kind, specifies
how the final determination of which devices to pick from the
remaining choices MUST be made. If not specified, then the
<a data-link-for="MediaDevices">defaultSemantics</a> are used.
</p>
</dd>
</dl>
</section>
</div>
</section>
<section id="getusermediasemantics-enum">
<h3>GetUserMediaSemantics enum</h3>
<div>
<pre class="idl"
>enum GetUserMediaSemantics {
"browser-chooses",
"user-chooses"
};</pre>
<table data-link-for="GetUserMediaSemantics" data-dfn-for=
"GetUserMediaSemantics" class="simple">
<tbody>
<tr>
<th colspan="2"><dfn>GetUserMediaSemantics</dfn> Enumeration
description</th>
</tr>
<tr>
<td><dfn><code id=
"idl-def-GetUserMediaSemantics.browser-chooses">browser-chooses</code></dfn></td>
<td>
<p>When application-specified constraints do not narrow multiple
choices between devices down to one per kind, the user agent is
allowed to make the final determination between the remaining
choices.
</p>
</td>
</tr>
<tr>
<td><dfn><code id=
"idl-def-GetUserMediaSemantics.user-chooses">user-chooses</code></dfn></td>
<td>
<p>When application-specified constraints do not narrow
multiple choices between devices down to one per kind, the user
agent MUST
<a data-cite="permissions">prompt the user to choose</a>
between the remaining choices, even if the application already
has permission to some or all of them.</p>
</td>
</tr>
</tbody>
</table>
</div>
</section>
<section>
<h2>Algorithms</h2>
<p>When the {{MediaDevices/getUserMedia()}} method is invoked, run the
following steps before invoking the {{MediaDevices/getUserMedia()}}
algorithm:</p>
<ol>
<li>
<p>Let <var>mediaDevices</var> be the object on which this method was
invoked.</p>
</li>
<li>
<p>Let <var>constraints</var> be the method's first argument.</p>
</li>
<li>
<p>Let <var>semanticsPresent</var> be <code>true</code> if
<var>constraints</var><code>.semantics</code> [= map/exists =],
otherwise <code>false</code>.</p>
</li>
<li>
<p>Let <var>semantics</var> be
<var>constraints</var><code>.semantics</code>
if <a href="https://heycam.github.io/webidl/#dfn-present">present</a>,
or the value of <var>mediaDevices</var><code>.<a data-link-for="MediaDevices">defaultSemantics</a></code>
otherwise.</p>
</li>
<li>
<p>Replace step 6.5.1. of the {{MediaDevices/getUserMedia()}}
algorithm in its entirety with the following two steps:</p>
<ol>
<li>
<p>Let <var>descriptor</var> be a {{PermissionDescriptor}}
with its {{PermissionDescriptor/name}} member set to the permission name
associated with <var>kind</var> (e.g. {{PermissionName/"camera"}} for
<code>"video"</code>, {{PermissionName/"microphone"}} for <code>"audio"</code>), and,
optionally, consider its {{DevicePermissionDescriptor/deviceId}} member set to any appropriate
device's <var>deviceId</var>.</p>
</li>
<li>
<p>If the number of unique devices sourcing tracks of
media type <var>kind</var> in <var>candidateSet</var>
is greater than <code>1</code> and
<var>semantics</var> is <code>"user-chooses"</code>,
then <a>prompt the user to choose</a> a device with
<var>descriptor</var>, resulting in provided media.
Otherwise, <a>request permission to use</a> a
device with <var>descriptor</var>, while considering
all devices being attached to a live and
<a>same-permission</a> {{MediaStreamTrack}} in the current
[=browsing context=] to mean having permission state
{{PermissionState/"granted"}},
resulting in provided media.</p>
<p><dfn>Same-permission</dfn> in this context means a
{{MediaStreamTrack}} that required the same level of
permission to obtain as what is being requested.</p>
<p>When asking the user’s permission, the user agent
MUST disclose whether permission will be granted only to
the device chosen, or to all devices of that
<var>kind</var>.</p>
<p>Let <var>track</var> be the provided media, which
MUST be precisely one track of type <var>kind</var> from
<var>finalSet</var>. If <var>semantics</var> is
<code>"browser-chooses"</code> then the decision of
which track to choose from <var>finalSet</var> is up
to the User Agent, which MAY use the value of the computed
"fitness distance" from the <a href=
"https://www.w3.org/TR/mediacapture-streams/#dfn-selectsettings">
SelectSettings</a>
algorithm, the value of <var>semanticsPresent</var>,
or any other internally-available information about
the devices, as inputs to its decision.
If <var>semantics</var> is <code>"user-chooses"</code>,
and the application has not narrowed down the choices
to one, then the user agent MUST ask the user to make
the final selection.</p>
<p>Once selected, the source of the
{{MediaStreamTrack}} MUST NOT change.</p>
<p>User Agents are encouraged to default to or present
a default choice based primarily on fitness distance,
and secondarily on the user's primary or system default
device for <var>kind</var> (when possible). User Agents
MAY allow users to use any media source, including
pre-recorded media files.</p>
</li>
</ol>
</li>
</ol>
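<div>
<p>The decision logic in the steps above can be modelled with two small
functions. This is an illustrative sketch, not a user agent
implementation: <code>resolveSemantics</code> and
<code>mustPromptUserToChoose</code> are hypothetical names, and the
candidate set is reduced to a plain list of device identifiers.</p>

```javascript
// Illustrative model of the steps above, not a user agent implementation.
// Both function names are hypothetical.

// Steps 3 and 4: note whether the application specified semantics, and fall
// back to the user agent's default when it did not.
function resolveSemantics(mediaDevices, constraints) {
  const semanticsPresent = constraints.semantics !== undefined;
  const semantics = semanticsPresent
    ? constraints.semantics
    : mediaDevices.defaultSemantics;
  return { semanticsPresent, semantics };
}

// Replacement step: a picker is required only when more than one unique
// device remains in the candidate set and the semantics are "user-chooses".
function mustPromptUserToChoose(candidateDeviceIds, semantics) {
  const uniqueDevices = new Set(candidateDeviceIds);
  return uniqueDevices.size > 1 && semantics === "user-chooses";
}
```
</div>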
</section>
<section>
<h2>Examples</h2>
<div>
<p>This example shows a setup with a start button and a camera selector
using the new semantics (microphone is not shown for brevity but is
equivalent).</p>
<pre class="example">
&lt;button id="start"&gt;Start&lt;/button&gt;
&lt;button id="chosenCamera" disabled&gt;Camera: none&lt;/button&gt;
&lt;script&gt;
let cameraTrack = null;

// Start with the previously chosen camera, if one was remembered.
start.onclick = async () => {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: {deviceId: localStorage.cameraId}
    });
    setCameraTrack(stream.getVideoTracks()[0]);
  } catch (err) {
    console.error(err);
  }
};

// Let the user switch cameras through the user agent's picker.
chosenCamera.onclick = async () => {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: true,
      semantics: "user-chooses"
    });
    setCameraTrack(stream.getVideoTracks()[0]);
  } catch (err) {
    console.error(err);
  }
};

// Remember the chosen device and show its label on the button.
function setCameraTrack(track) {
  cameraTrack = track;
  const {deviceId} = track.getSettings();
  localStorage.cameraId = deviceId;
  // The label is exposed on the track itself, not in its settings.
  chosenCamera.innerText = `Camera: ${track.label}`;
  chosenCamera.disabled = false;
}
&lt;/script&gt;
</pre>
</div>
</section>
</section>
<section>
<h2>Transferable MediaStreamTrack</h2>
<div>
<p>A {{MediaStreamTrack}} is a <a data-cite="!HTML/#transferable-objects">transferable object</a>.
This allows manipulating real-time media outside the context it was requested or created in,
for instance in workers or third-party iframes.</p>
<p>To preserve the existing privacy and security infrastructure, in particular for capture tracks,
the track source lifetime management remains tied to the context that created it.
The transfer algorithm MUST ensure the following behaviors:</p>
<ol>
<li><p>The context named <var>originalContext</var> that created a track named <var>originalTrack</var> remains in control
of <var>originalTrack</var>'s source, named <var>trackSource</var>, even when <var>originalTrack</var> is transferred into <var>transferredTrack</var>.</p>
</li>
<li>
<p>In particular, <var>originalContext</var> remains the proxy to privacy indicators of <var>trackSource</var>.
<var>transferredTrack</var> and any of its clones are treated as tracks using <var>trackSource</var>,
as if they had been created in and were controlled by <var>originalContext</var>.</p>
</li>
<li><p>When <var>originalContext</var> goes away, <var>trackSource</var> is ended, and thus <var>transferredTrack</var> is ended.</p></li>
<li><p>When <var>originalContext</var> would have muted or unmuted <var>originalTrack</var>, <var>transferredTrack</var> is muted or unmuted.</p></li>
<li><p>If <var>transferredTrack</var> is cloned into <var>transferredTrackClone</var>, <var>transferredTrackClone</var> is tied to <var>trackSource</var>.
It is not tied to <var>originalTrack</var> in any way.</p></li>
<li><p>If <var>transferredTrack</var> is transferred into <var>transferredAgainTrack</var>, <var>transferredAgainTrack</var> is tied to <var>trackSource</var>.
It is not tied to <var>transferredTrack</var> or <var>originalTrack</var> in any way.</p></li>
</ol>
</div>
<div>
<p>The WebIDL changes are the following:</p>
<pre class="idl"
>[Exposed=(Window,Worker), Transferable]
partial interface MediaStreamTrack {
};</pre>
</div>
<div>
<p>At creation of a {{MediaStreamTrack}} object, called <var>track</var>, run the following steps:</p>
<ol>
<li><p>Initialize <var>track</var>.`[[IsDetached]]` to <code>false</code>.</p></li>
</ol>
</div>
<div>
<p>The {{MediaStreamTrack}} <a data-cite="!HTML/#transfer-steps">transfer steps</a>, given <var>value</var> and <var>dataHolder</var>, are:</p>
<ol>
<li><p>If <var>value</var>.`[[IsDetached]]` is <code>true</code>, throw a "DataCloneError" DOMException.</p></li>
<li><p>Set <var>dataHolder</var>.`[[id]]` to <var>value</var>.{{MediaStreamTrack/id}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[kind]]` to <var>value</var>.{{MediaStreamTrack/kind}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[label]]` to <var>value</var>.{{MediaStreamTrack/label}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[readyState]]` to <var>value</var>.{{MediaStreamTrack/readyState}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[enabled]]` to <var>value</var>.{{MediaStreamTrack/enabled}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[muted]]` to <var>value</var>.{{MediaStreamTrack/muted}}.</p></li>
<li><p>Set <var>dataHolder</var>.`[[source]]` to <var>value</var>'s underlying source.</p></li>
<li><p>Set <var>dataHolder</var>.`[[constraints]]` to <var>value</var>'s active constraints.</p></li>
<li><p>Set <var>value</var>.`[[IsDetached]]` to <code>true</code>.</p></li>
<li><p>Set <var>value</var>.{{MediaStreamTrack/[[ReadyState]]}} to <a data-cite="!mediacapture-streams/#track-ended">"ended"</a> (without stopping the underlying source or firing an `ended` event).</p></li>
</ol>
</div>
<div><p>The {{MediaStreamTrack}} <a data-cite="!HTML/#transfer-receiving-steps">transfer-receiving steps</a>, given <var>dataHolder</var> and <var>track</var>, are:</p>
<ol>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/id}} to <var>dataHolder</var>.`[[id]]`.</p></li>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/kind}} to <var>dataHolder</var>.`[[kind]]`.</p></li>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/label}} to <var>dataHolder</var>.`[[label]]`.</p></li>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/readyState}} to <var>dataHolder</var>.`[[readyState]]`.</p></li>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/enabled}} to <var>dataHolder</var>.`[[enabled]]`.</p></li>
<li><p>Initialize <var>track</var>.{{MediaStreamTrack/muted}} to <var>dataHolder</var>.`[[muted]]`.</p></li>
<li><p>[=MediaStreamTrack/Initialize the underlying source=] of <var>track</var> to <var>dataHolder</var>.`[[source]]` with
[=MediaStreamTrack/Initialize the underlying source/tieSourceToContext=] equal to <code>false</code>.</p></li>
<li><p>Set <var>track</var>'s constraints to <var>dataHolder</var>.`[[constraints]]`.</p></li>
</ol>
</div>
<div>
<p>The underlying source is expected to be kept alive between the transfer steps and the transfer-receiving steps, or as long as the data holder is alive.
In effect, between these steps, the data holder is attached to the underlying source as if it were a track.</p>
</div>
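<div>
<p>The transfer and transfer-receiving steps above can be modelled with
plain objects. This is a toy sketch for illustration, not how a user agent
implements them: <code>transferSteps</code> and
<code>transferReceivingSteps</code> are hypothetical function names,
<code>isDetached</code> stands in for `[[IsDetached]]`, and a plain
<code>Error</code> stands in for the "DataCloneError" DOMException.</p>

```javascript
// Toy model of the steps above using plain objects. All names here are
// hypothetical; field names mirror the data holder slots.
function transferSteps(value) {
  if (value.isDetached) {
    // The steps throw a "DataCloneError" DOMException here; a plain Error
    // with the same name stands in for it in this model.
    const error = new Error("track is already detached");
    error.name = "DataCloneError";
    throw error;
  }
  const dataHolder = {
    id: value.id,
    kind: value.kind,
    label: value.label,
    readyState: value.readyState,
    enabled: value.enabled,
    muted: value.muted,
    source: value.source,
    constraints: value.constraints,
  };
  value.isDetached = true;
  // The original track reads as ended, but the source is not stopped.
  value.readyState = "ended";
  return dataHolder;
}

function transferReceivingSteps(dataHolder) {
  // The new track starts out attached, with the state captured at transfer.
  return { ...dataHolder, isDetached: false };
}
```

<p>In a page, the transfer itself would be triggered in the usual way for
transferable objects, for example
<code>worker.postMessage({track}, [track])</code>, assuming the user agent
supports transferring tracks.</p>
</div>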
</section>
</body>
</html>