IETF Week: Media and QUIC (the future in 5 or so years)

vr000m Dailynista
edited November 2022 in Tech Updates πŸ“’

We closely track, experiment with, and play with ideas that are moving the Real-time Media Delivery field forward. This post is a summary of Microsoft's observations. Kudos to the team; the links within show the details, this is a high-level summary. If you have thoughts or comments -- feel free to share below.

I do want to preface this by saying that there is a lot of work still to do here, and we are in a very early experimental phase of audio/video and QUIC. Nonetheless, I am really excited about these initial results, and what we may be able to do in the future with webrtc and QUIC in the browser. Microsoft ran tests with webcodecs, RTP, and webtransport -- all in javascript, no wasm involved.

Before I dive into the results, let me quickly explain the setup. The sender captured media with getUserMedia, encoded it with webcodecs, packetised it in the application in javascript, and then sent it over webtransport. The application uses the ideas presented in rtp over quic and the default QUIC congestion control (read more discussion on congestion control in the github thread). The receiver receives the QUIC packets over webtransport, reconstructs the video frame, and plays it back using webcodecs; basically, packet dejittering and frame reconstruction happen in javascript. In this case, the Microsoft team is not using the dejittering and playback machinery available via webrtc, but rather doing the dejitter and playback in javascript. This is okay to do now, as the webrtc bindings are missing, and it explains why the decoding latency was high (foreshadowing the results, see below).
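To make the packetisation step concrete, here is a minimal sketch of how an application might split an encoded frame into packets before handing them to webtransport. The 6-byte header layout (2-byte frame id, 2-byte fragment index, 2-byte fragment count) and the function names are illustrative assumptions on my part, not the actual format Microsoft used:

```javascript
// Illustrative application-level packetiser: split one encoded frame into
// MTU-sized fragments, each prefixed with a small header so the receiver
// can reassemble the frame. Header layout is a hypothetical example.
const HEADER_BYTES = 6;

function packetise(frameId, encodedFrame, mtu = 1200) {
  const chunkSize = mtu - HEADER_BYTES;
  const count = Math.ceil(encodedFrame.length / chunkSize);
  const packets = [];
  for (let i = 0; i < count; i++) {
    const chunk = encodedFrame.subarray(i * chunkSize, (i + 1) * chunkSize);
    const pkt = new Uint8Array(HEADER_BYTES + chunk.length);
    const view = new DataView(pkt.buffer);
    view.setUint16(0, frameId); // which frame this fragment belongs to
    view.setUint16(2, i);       // fragment index within the frame
    view.setUint16(4, count);   // total fragments for this frame
    pkt.set(chunk, HEADER_BYTES);
    packets.push(pkt);
  }
  return packets;
}
```

Each resulting packet would then be written to a webtransport stream or datagram; the actual on-the-wire mapping is what the rtp over quic work defines.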

Summarising their findings:

  1. Glass-to-Glass (G2G) latency was considerably higher than the frame latency. For example, for 1 Mbps Full HD video at 30 FPS, the G2G latency was on average 630 ms, while the frame latency was 100 ms.
  2. They did not observe any frame re-ordering at the receiver. (interesting... we will need to delve deeper into the mapping of video frames to QUIC streams)
  3. Bandwidth utilisation was application limited, i.e., not enough video packets were generated (my initial thought is to consider filling the unutilised bandwidth with padding, repeating the last fragment of video data, or resending some part of the existing I-Frame/Golden Frame)
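On finding 2, the receiver-side reconstruction described in the setup could look roughly like the sketch below: buffer fragments per frame id, and release a frame to the decoder only once all of its fragments have arrived. It assumes each packet carries a small 6-byte header (frame id, fragment index, fragment count, 2 bytes each), which is an illustrative layout of mine, not the team's actual format:

```javascript
// Illustrative receiver-side reassembly: fragments are buffered per frame
// id; a frame is returned only when complete. Header layout is hypothetical.
function makeReassembler() {
  const pending = new Map(); // frameId -> { parts: Map<index, Uint8Array>, count }

  return function onPacket(pkt) {
    const view = new DataView(pkt.buffer, pkt.byteOffset, pkt.byteLength);
    const frameId = view.getUint16(0);
    const index = view.getUint16(2);
    const count = view.getUint16(4);

    let entry = pending.get(frameId);
    if (!entry) {
      entry = { parts: new Map(), count };
      pending.set(frameId, entry);
    }
    entry.parts.set(index, pkt.subarray(6));
    if (entry.parts.size < entry.count) return null; // still waiting

    // All fragments arrived: concatenate them in index order.
    pending.delete(frameId);
    const ordered = [...entry.parts.entries()].sort((a, b) => a[0] - b[0]);
    const total = ordered.reduce((n, [, c]) => n + c.length, 0);
    const frame = new Uint8Array(total);
    let offset = 0;
    for (const [, chunk] of ordered) {
      frame.set(chunk, offset);
      offset += chunk.length;
    }
    return frame; // complete encoded frame, ready for decode
  };
}
```

Sorting by fragment index is what makes the receiver robust to re-ordering, even though the experiments did not observe any.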

While the latencies are high and not ideal for real-time, these can be tuned and improved. Specifically, for real-time, the congestion control implemented in webtransport is not ideal. Ergo, the first improvement for webrtc-related real-time usecases would be to tune both the application and the transport level. Nonetheless, the vanilla results looked pretty good.
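On the application-tuning side, the padding idea from finding 3 (filling unutilised bandwidth when the encoder runs application limited) could be sketched as a simple per-interval budget check. The function name and the fixed-interval model are my own illustrative assumptions, not something from the experiments:

```javascript
// Illustrative padding budget: if the encoder produced fewer bytes in the
// last pacing interval than the target bitrate allows, return how many
// padding bytes to send so the path stays utilised and the congestion
// controller keeps an accurate view of available bandwidth.
function paddingBytes(targetBitrateBps, intervalMs, bytesSentInInterval) {
  const budget = Math.floor((targetBitrateBps / 8) * (intervalMs / 1000));
  return Math.max(0, budget - bytesSentInInterval);
}
```

For example, at a 1 Mbps target over a 100 ms interval, the budget is 12,500 bytes; if the encoder only produced 10,000 bytes, the sender would emit 2,500 bytes of padding (or, per the idea above, repeated video data instead of pure padding).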

See details of the results at: and slide 20 from the latest IETF meeting.