What client-side tricks do you use to scale WebRTC applications to larger call sizes?

mark_at_daily Community Manager, Dailynista admin

I've worked with teams building video calling applications that run in a browser for quite a while now. If I've learned anything, it's that there's always a new trick to learn or limitation to work around.

For example, when I joined Daily, I learned that to balance CPU and network traffic, it's best to keep the overall download bitrate under 3Mbps (and the upload bitrate under 1Mbps). This helps to: 1) keep the UDP traffic within a range most routers can handle and 2) keep CPU decoding load within a range most modern PCs can handle. A nice side benefit of establishing this limit is that it helps scope the UI/UX design for your team.
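To make the budgeting concrete, here's a tiny sketch of how a cap like that scopes the UI. The 3Mbps/1Mbps figures come from the numbers above; the helper name and the per-stream bitrates in the example are mine, purely for illustration:

```javascript
// Illustrative budget math: how many remote video tiles fit under a
// fixed total download budget? The budget constants mirror the limits
// described above; maxTiles() is a hypothetical helper, not a real API.
const DOWNLOAD_BUDGET_BPS = 3_000_000; // ~3 Mbps total download
const UPLOAD_BUDGET_BPS = 1_000_000;   // ~1 Mbps total upload

function maxTiles(perStreamBps, budgetBps = DOWNLOAD_BUDGET_BPS) {
  return Math.floor(budgetBps / perStreamBps);
}

// maxTiles(600_000) → 5  (five ~600 kbps streams fill the budget)
// maxTiles(150_000) → 20 (twenty low-quality ~150 kbps streams fit)
```

That's why a hard bitrate ceiling translates so directly into design decisions like grid size and pagination.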

What types of tricks have you learned or limitations have you hit in building and scaling WebRTC applications?


  • aconchillo Dailynista

    On the client side, as you mention, you can keep traffic low by sending at a low bitrate, which also keeps CPU usage low (because there's less to encode). One way to do that is simply to send a smaller-resolution video; however, that means the other end will receive that small video as well. So there's always a trade-off in what to do.
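    One concrete browser knob for this is `RTCRtpSender.setParameters()` with `maxBitrate`, which caps the encoder without touching resolution. A sketch, assuming a standard `RTCPeerConnection`; the helper name and the 700 kbps figure are illustrative:

```javascript
// Sketch: cap the encoder bitrate on an outgoing sender.
// getParameters()/setParameters() are standard WebRTC APIs; the helper
// name capSenderBitrate and the example bitrate are my own choices.
async function capSenderBitrate(sender, maxBitrate) {
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    params.encodings = [{}]; // some browsers return an empty list pre-negotiation
  }
  for (const encoding of params.encodings) {
    encoding.maxBitrate = maxBitrate; // bits per second
  }
  await sender.setParameters(params);
}

// Usage (in a browser, on a live RTCPeerConnection):
// for (const sender of pc.getSenders()) {
//   if (sender.track && sender.track.kind === "video") {
//     capSenderBitrate(sender, 700_000);
//   }
// }
```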

    But there's also the receiving side (the tracks you download from other clients), so in the UI you can always play tricks with which tracks you receive. For example, in a very large call there's no need to render (or even receive) video streams that aren't visible on the screen. That's probably the trick that will save you the most, since it spares you from having to process (download, decode, and render) a bunch of streams.
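    The selection logic for that can be quite small. A sketch, assuming you track per-tile visibility (in a browser you'd typically feed `visible` from an IntersectionObserver on each video element); all the names here are illustrative, not any particular SDK's API:

```javascript
// Sketch: decide which remote tracks to subscribe to based on which
// tiles are on screen. selectSubscriptions() and the tile shape
// ({ participantId, visible }) are hypothetical; the point is that the
// decision is pure data, driven by visibility observations.
function selectSubscriptions(tiles, maxVisible = 12) {
  const visible = tiles.filter((t) => t.visible);
  const hidden = tiles.filter((t) => !t.visible);
  return {
    subscribe: visible.slice(0, maxVisible).map((t) => t.participantId),
    unsubscribe: hidden.map((t) => t.participantId),
  };
}

// Example: a tile scrolled off screen gets unsubscribed:
// selectSubscriptions([
//   { participantId: "alice", visible: true },
//   { participantId: "bob", visible: false },
// ])
// → { subscribe: ["alice"], unsubscribe: ["bob"] }
```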

    Simulcast might help as well. It won't help much with what you send as a client; if anything, it makes that side a bit worse, since you need to encode more than one stream (e.g. low, medium, and high quality) and upload all three. But on the receiving side you can choose which quality to display: for example, rendering one high-quality stream (the active speaker in a video conferencing app) and ten low-quality streams (the non-speakers). That reduces both download bandwidth and CPU decoding usage.
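    In the browser, the standard way to request simulcast is `sendEncodings` on `addTransceiver()`. A sketch; the `rid` names, scale factors, and bitrates here are illustrative choices, not required values:

```javascript
// Sketch: three simulcast layers for an outgoing camera track.
// sendEncodings with rid entries is the standard WebRTC way to request
// simulcast; the specific layer names and bitrates are my own picks.
function simulcastEncodings() {
  return [
    { rid: "low",    scaleResolutionDownBy: 4, maxBitrate: 150_000 },
    { rid: "medium", scaleResolutionDownBy: 2, maxBitrate: 500_000 },
    { rid: "high",   scaleResolutionDownBy: 1, maxBitrate: 1_200_000 },
  ];
}

// Usage (in a browser, before creating the offer):
// pc.addTransceiver(cameraTrack, {
//   direction: "sendonly",
//   sendEncodings: simulcastEncodings(),
// });
```

    The SFU (or the receiving client, via the SFU) then picks the layer per viewer, which is what makes the 1-high-plus-10-low layout cheap on the downlink.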