Video on Demand: Part 1 (The Hitchhiker's Guide to Computer Networks)

Data from the 2019 Global Internet Phenomena Report
Video is the dominant type of traffic on the Internet. In 2019, video on demand constituted 60% of total global Internet traffic. The image to the right shows the huge gap between the bandwidth demand of video streaming and that of the second-highest application, web traffic (think email, blogs, and news). This means that if you are in the computer networks business, you are pretty much in the business of video delivery. So the biggest question is: what is the best way to deliver video?

The "best" in this context can mean a couple of different things. It can mean delivering the best quality video to the user. Video quality refers not only to whether a video is in HD or UHD but also to whether it freezes and hangs. The "best" way to deliver video can also mean the most cost-efficient way. If we can save 1% of the cost of delivering all videos on the Internet, we can save tens of millions of dollars in network operations costs. This post will explore some of the technologies used to deliver high-quality video at low cost.

Before we go any further, I recommend that you do the following experiment: start watching any YouTube video on your phone, then put your phone in airplane mode after the first 5 seconds of the video. The video will not stop immediately after the phone goes into airplane mode. Rather, it will keep playing for a few more seconds before it stops. The first question, with a somewhat obvious answer, is: where did the extra few seconds of video come from if the phone was not connected to the Internet? Clearly, they were downloaded before you put your phone in airplane mode. This leads us to the following questions: Why does YouTube (or any other video streaming service, for that matter) do this? How much video is downloaded, and how is that determined? Have you ever watched a video that switched between high and low quality? What's up with that?

Adaptive Bitrate (ABR) Video Streaming Algorithms

Videos are not stored as one lengthy file. Rather, videos are broken into small chunks that are downloaded in sequence as you watch the video. Each chunk holds around 2-10 seconds of video. Each chunk is stored at several bitrates, which correspond to quality levels such as Standard Definition (SD), High Definition (HD), and Ultra High Definition (UHD). Chunks that have been downloaded but not yet watched are stored in a buffer (the gray line ahead of the red line in your YouTube player, like the GIF above).

The time to download a chunk should be shorter than the time it takes to watch it. Otherwise, you will watch a chunk and then wait a little until the next one is downloaded. This is actually what happens when a video freezes and rebuffers (like the GIF above). Rebuffering literally means downloading a chunk into an empty buffer (refilling the buffer) because you have watched everything that was in the buffer. The higher the bitrate (the quality or definition) of a video, the longer it takes to download its chunks. This means that if your Internet speed is low and you force the player to download high-quality chunks, the player will have to do a lot of rebuffering.
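To make the download-time argument concrete, here is a small back-of-the-envelope sketch. The function names and the numbers (a 4-second chunk, a 4000 kbps link) are mine, picked only for illustration, and the model ignores real-world details like latency and protocol overhead.

```python
def download_time_s(chunk_seconds, bitrate_kbps, bandwidth_kbps):
    """Seconds needed to download one chunk (idealized: size / speed)."""
    chunk_kbits = chunk_seconds * bitrate_kbps  # size of the chunk in kilobits
    return chunk_kbits / bandwidth_kbps


def stall_time_s(chunk_seconds, bitrate_kbps, bandwidth_kbps):
    """Extra wait per chunk when the buffer is empty (0 means no rebuffering)."""
    extra = download_time_s(chunk_seconds, bitrate_kbps, bandwidth_kbps) - chunk_seconds
    return max(0.0, extra)


# A 4-second chunk encoded at 5000 kbps over a 4000 kbps link takes longer
# to download than to watch, so the player stalls on every chunk.
print(download_time_s(4, 5000, 4000))
print(stall_time_s(4, 5000, 4000))
# The same chunk at 2500 kbps downloads faster than it plays: no stall.
print(stall_time_s(4, 2500, 4000))
```

Whenever the chunk's bitrate is higher than the available bandwidth, the download time exceeds the playback time, which is exactly the rebuffering case described above.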

ABR algorithms are the part of the video player app on your phone or smart TV that decides how many chunks to download and the quality of each chunk. The "Adaptive" in ABR refers to choosing the bitrate of the downloaded chunks based on your Internet speed. If your Internet speed drops, say because someone else on your network starts watching another video, the adaptive algorithm reduces the bitrate of the downloaded chunks. It will also increase the bitrate when your Internet speed improves.
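The simplest form of this adaptation can be sketched in a few lines: pick the highest bitrate that fits under the measured speed, with some headroom. This is only a toy rate-based rule, not how any particular player works. The bitrate ladder and the 0.8 safety factor are made-up values.

```python
# Hypothetical bitrate ladder in kbps; real services choose their own.
BITRATE_LADDER_KBPS = [300, 750, 1200, 2850, 4300]


def choose_bitrate(throughput_kbps, ladder=BITRATE_LADDER_KBPS, safety=0.8):
    """Return the highest bitrate that fits within a fraction of the throughput.

    The safety factor (an assumption) leaves headroom for measurement error.
    If nothing fits, fall back to the lowest bitrate rather than stop playing.
    """
    budget = throughput_kbps * safety
    candidates = [b for b in sorted(ladder) if b <= budget]
    return candidates[-1] if candidates else min(ladder)


print(choose_bitrate(5000))  # fast link: a high rung of the ladder
print(choose_bitrate(1000))  # someone else starts streaming: step down
print(choose_bitrate(200))   # very slow link: lowest rung
```

Calling `choose_bitrate` again before each chunk is what makes the player "adaptive": the choice follows the measured speed up and down.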

Researchers have identified three properties that almost all modern ABR algorithms try to optimize: 1) The quality of the video (obviously), although if you are watching on a small screen you won't really notice the difference between HD and UHD. In other words, what the ABR algorithm considers the best quality for a video depends on the size of the screen you are watching it on. 2) The number of times the video rebuffers while you are watching. ABR algorithms will sacrifice quality to avoid rebuffering. 3) The startup delay: the ABR algorithm should start the video as soon as possible.

It is not trivial to balance all of these factors. Try to start the video as fast as possible by downloading the lowest bitrate, and you risk shifting between bitrates too many times as you try to improve quality. Delay the start just a bit to download the best quality video, and you risk it taking too long, or your Internet speed changing midway and forcing you to switch anyway. Fill the buffer too much and you consume too much of the user's data plan; fill it too little and you risk the video rebuffering. Most modern ABR algorithms find a good balance, but there is still room for improvement.
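One common way to make this balancing act concrete is to fold the factors into a single score and compare playback sessions by that score. The sketch below is my own simplified version of that idea. The weights are hypothetical, and real algorithms tune them (and the exact formula) differently.

```python
def qoe_score(bitrates_kbps, rebuffer_s, startup_s,
              rebuffer_penalty=4.0, startup_penalty=1.0, switch_penalty=1.0):
    """Toy quality-of-experience score for one playback session.

    bitrates_kbps: bitrate of each chunk that was played, in order.
    rebuffer_s:    total seconds spent frozen while rebuffering.
    startup_s:     seconds between pressing play and the first frame.
    All penalty weights are illustrative assumptions.
    """
    quality = sum(b / 1000 for b in bitrates_kbps)  # reward, in Mbps per chunk
    # Penalize jumps between consecutive chunks (shifting bitrates too often).
    switches = sum(abs(b - a) / 1000
                   for a, b in zip(bitrates_kbps, bitrates_kbps[1:]))
    return (quality
            - rebuffer_penalty * rebuffer_s
            - startup_penalty * startup_s
            - switch_penalty * switches)


steady = qoe_score([2000, 2000, 2000], rebuffer_s=0.0, startup_s=1.0)
jumpy = qoe_score([1000, 4000, 1000], rebuffer_s=0.0, startup_s=1.0)
frozen = qoe_score([4000, 4000, 4000], rebuffer_s=3.0, startup_s=1.0)
print(steady, jumpy, frozen)
```

With these weights, the steady session beats both the jumpy one (same average bitrate, more switching) and the high-quality one that froze for three seconds, which matches the intuition that players give up some quality to avoid rebuffering.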

The ABR algorithm measures your Internet speed and, based on that, picks the highest video quality that will still avoid rebuffering. The image below shows how the best bitrate changes with the available bandwidth (Internet speed). The ABR algorithm downloads enough chunks to keep the video running even if your Internet connection is temporarily interrupted, but not so many that it downloads parts of the video far ahead that you might never watch. This is why the video kept going for a while after you put your phone in airplane mode.
Illustration of the most appropriate bitrate as the available bandwidth changes over time (source of the images)
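The "enough but not too much" part of the buffer logic can be sketched as a target buffer level: the player requests chunks only while the buffer is below the target. The 20-second target and 4-second chunks below are made-up numbers, and real players use more elaborate rules.

```python
import math


def chunks_to_request(buffer_s, target_buffer_s, chunk_s):
    """Whole chunks needed to bring the buffer up to at least the target.

    Returns 0 when the buffer is already at or above the target, which is
    what keeps the player from downloading video you may never watch.
    """
    deficit = max(0.0, target_buffer_s - buffer_s)
    return math.ceil(deficit / chunk_s)


# 6 seconds buffered, 20-second target, 4-second chunks.
print(chunks_to_request(6, 20, 4))
# Buffer already full: nothing to download, and these 20 seconds are what
# keeps playing after you switch to airplane mode.
print(chunks_to_request(20, 20, 4))
```

In this simple model, the video you saw after going offline is just whatever was sitting in the buffer at that moment.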

Developing new ABR algorithms remains an active area of research. ABR algorithms rely on knowing your Internet speed now and guessing what it is going to be in the future (in order to download the right number of chunks to avoid interruptions). Guessing what your Internet speed is going to be is especially challenging when you are on the move and using your wireless data. Many factors affect the Internet speed you get on your phone when you are using your data plan. Different ABR algorithms take different factors into account and adapt to them in different ways. My colleagues at MIT developed a machine learning-based ABR algorithm that uses deep learning to identify the best bitrate given a number of factors, including the video, screen size, and network conditions.
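To give a feel for the "guessing" part, here is one simple heuristic: predict the next chunk's throughput as the harmonic mean of the last few measurements. To be clear, this is not the deep learning approach mentioned above, just a minimal sketch of a classical predictor. The window size of 5 is an assumption.

```python
from statistics import harmonic_mean


def predict_throughput_kbps(samples_kbps, window=5):
    """Predict future throughput from the most recent measurements.

    The harmonic mean is pulled toward the low samples, so a single fast
    burst does not make the prediction overly optimistic.
    """
    recent = samples_kbps[-window:]
    if not recent:
        raise ValueError("need at least one throughput sample")
    return harmonic_mean(recent)


# Two fast measurements and one slow one (say, you walked into an elevator).
history = [4000, 4000, 1000]
print(predict_throughput_kbps(history))
print(sum(history) / len(history))  # the plain average is more optimistic
```

A conservative prediction like this trades a little video quality for a lower chance of rebuffering when the network suddenly gets worse.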


Another interesting problem in ABR algorithms arises when we deal with 360-degree videos. Try playing the 360 video above: you can look around by moving your phone or just swiping on the video. When you are completely facing the sand, the video doesn't display the ocean-facing parts. Should the part you are seeing (the sand) be downloaded at the same quality as the part you are not seeing (the ocean)? Note that if you download both at the same quality, you will be downloading a lot of large chunks, which can force the overall quality to be low. However, if you download the parts you are not viewing at low quality, you can improve the quality of the parts you can actually see. Such decisions can be made based on which parts of the video are interesting and by modeling and predicting how users move through such videos.
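Here is a toy sketch of that trade-off. It assumes the 360 video is split into tiles (six here, named after cube faces), gives every tile a low baseline bitrate, and spends the rest of the bandwidth budget on the tiles in view. The tile names, the 200 kbps baseline, and the even split are all assumptions for illustration.

```python
def allocate_tile_bitrates(tiles, visible, budget_kbps, low_kbps=200):
    """Split a bandwidth budget across the tiles of a 360-degree video.

    Every tile gets the low baseline so that something is available if the
    viewer turns around. Leftover budget goes evenly to the visible tiles.
    """
    baseline_total = low_kbps * len(tiles)
    in_view = [t for t in tiles if t in visible]
    if budget_kbps <= baseline_total or not in_view:
        # Not enough budget (or nothing known to be in view): split evenly.
        return {t: budget_kbps / len(tiles) for t in tiles}
    boost = (budget_kbps - baseline_total) / len(in_view)
    return {t: low_kbps + (boost if t in visible else 0.0) for t in tiles}


tiles = ["front", "back", "left", "right", "up", "down"]
focused = allocate_tile_bitrates(tiles, visible={"front"}, budget_kbps=3000)
uniform = allocate_tile_bitrates(tiles, visible=set(), budget_kbps=3000)
print(focused["front"], uniform["front"])
```

With the same 3000 kbps budget, the tile you are looking at gets four times the bitrate it would get under a uniform split. The catch, as noted above, is that this only works if the player predicts correctly where you will look next.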

In this post, I only cover the video streaming algorithm. There are other aspects of video on demand that are interesting from a networking point of view, like where to store the videos and how multiple videos share the network. I will cover those problems in follow-up posts.
