"Wait signals" or "your message is important to us, please stay on the line", inserted prior to switching to a live interactive stream.
Insertion of announcements or alarm signals in otherwise live streams.
Insertion of static content (such as profile pictures) when the sender temporarily disables a camera.
These kinds of features already exist today and are implemented client-side using existing web technology.
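For illustration, here's a minimal client-side sketch of the profile-picture case (the element names are mine; a real app would also consult the sender's signaling), using the standard MediaStreamTrack mute/unmute events:

```ts
// Receiver-side substitution: show a static avatar while the remote camera
// is off. mute/unmute fire when frames stop/resume arriving.
// `videoEl` and `avatarEl` are hypothetical page elements.
function showAvatarWhileMuted(
  track: MediaStreamTrack,
  videoEl: HTMLVideoElement,
  avatarEl: HTMLImageElement
) {
  track.onmute = () => { videoEl.hidden = true; avatarEl.hidden = false; };
  track.onunmute = () => { avatarEl.hidden = true; videoEl.hidden = false; };
}
```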
Moving such app decisions to the sender seems like a step backwards, reminiscent of dial-tone and DTMF, whose justifications were non-web-client recipients (and from another era).
Even if we can produce compelling use cases that hinge on video being switched out sender-side, we already have replaceTrack for that, and VideoTrackGenerator to bring in other sources of media. Operating on decoded media is simple and well supported, and it allows local playback (self-view) of what is being sent. Re-encoding also ensures uniformity and scalability (e.g. SVC encoding).
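To illustrate, the sender-side switch is nearly a one-liner today (a sketch; `pc`, `announcementTrack`, and `cameraTrack` are assumed to exist, and this would run inside an async function):

```ts
// Switch what the remote side sees using RTCRtpSender.replaceTrack.
// No renegotiation is needed when the new track is compatible with
// the sender's negotiated parameters.
const sender = pc.getSenders().find(s => s.track?.kind === 'video');
if (sender) {
  await sender.replaceTrack(announcementTrack); // e.g. from canvas.captureStream()
  // ...later, cut back to the camera:
  await sender.replaceTrack(cameraTrack);
}
```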
Moving this logic to the encoded layer would seem to require significant API changes, which don't seem justified merely to optimize out a decoding/re-encoding step from time to time.
E.g. I can imagine an online teacher wanting the audience to see a training video instead of the teacher. But in that case, the better app IMHO is the one that sends the video using state-of-the-art tech for this (e.g. MSE, giving audience members/end-users the ability to pause/resume), not the one that encodes it into the WebRTC camera stream to simplify life for the web developer dealing with a unified RTCPeerConnection API, at a cost to end-users.
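For comparison, the MSE route is also short (a rough sketch; the codec string and segment URL are placeholders):

```ts
// Each audience member plays the training video locally through MSE,
// keeping native pause/seek instead of being locked to a live stream.
const video = document.querySelector('video')!;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', async () => {
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f, mp4a.40.2"');
  const segment = await (await fetch('/training-video/segment0.mp4')).arrayBuffer();
  sb.appendBuffer(segment);
});
```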
> § 3.10.3 Decoding pre-encoded media ... we have pre-encoded media (either dynamically generated or stored) that we wish to process in the same way as one processes media coming in over a PeerConnection
The "we" here seems to refer to web developers, not end-users. It therefore infers no benefits to end-users, and is not an acceptable use case to me. Moreover, even for web developers, it's not clear what benefits, if any, come from treating non-WebRTC media as WebRTC media. An RTCPeerConnection outputs a MediaStreamTrack which seems a versatile enough integration point.
This argument seems to hinge on the idea that non-encoded media is as easy to handle as encoded media for the "insertion" cases.
Non-encoded media is an order of magnitude larger than encoded media, and the transformation between the two takes significant CPU cycles.
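For a rough sense of scale (back-of-envelope numbers, not from any spec): raw 1080p30 video in I420 is about 1920 × 1080 × 1.5 bytes × 30 fps ≈ 93 MB/s (~750 Mbit/s), versus a few Mbit/s for a typical encode of the same stream.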
We know that people are bothered by the heat generated by our current conferencing applications. This use case is predicated on the theory that saving encode/decode cycles is a Good Thing.
Illogical point: Web developers don't have relevant pre-encoded media (except for testing). The users have media. So the last paragraph makes no sense to me.
aboba changed the title from "Introducing app content-decisions at the encoded video level isn't webby" to "Section 3.10.2: Introducing app content-decisions at the encoded video level isn't webby" on May 15, 2023