The Hitchhiker's Guide to Computer Networks

Introduction

One of the few reliable things that didn't crumble under the pressure of quarantining and working from home is our digital infrastructure. It remains largely unaffected as we conduct most of our business activities online. This is no accident. It is the result of 40+ years of continuous research to make that infrastructure faster, cheaper, and more reliable. My goal in writing this series is to shed some light on the hidden aspects of our digital infrastructure. In doing so, I hope to give the reader a behind-the-scenes view of what we take for granted when we are watching HD videos, buying our groceries online, and chatting with people on the other side of the planet.

A typical Internet user pays for their Internet connection and online services (Netflix, Amazon Prime, Spotify, iCloud) with some expectations of the quality of experience they will receive. Users also expect that the way they interact with the digital world will keep improving. That's partly why they upgrade their phones, tablets, and laptops and buy new devices like smart assistants. In this series, I will discuss how different parts of the infrastructure work and how that affects such users. Specifically, I will view each challenge in making the Internet tick through one of three aspects of our everyday interaction with the digital world:

Quality of Experience: When we do anything online, we expect it to work right every time. This expectation of good quality of experience comes from the reliability the infrastructure has consistently shown. But have you ever watched a video that keeps switching between standard definition and high definition only to end up freezing? Of course you have, and that's what I refer to here as bad quality of experience. There are a lot of moving parts, from your home network, to your Internet service provider, to your video service provider, that have to work well both together and separately to ensure that you have a good quality of experience online.

Flow of money: The network of connected people and corporations that forms the Internet does not only capture the flow of information between them, it also captures the flow of money. You pay your Internet service provider, video service provider, music service, etc. Those service providers in turn pay other parts of the infrastructure like Internet transit providers (Internet service providers for the Internet service providers) and cloud service providers (the place where all your videos, pictures, music, and emails are stored and processed). This flow of money drives much of how the infrastructure evolves, which can end up affecting your monthly Internet bill.

Gateways to the digital world: At the turn of the century, we interacted with the Internet primarily through text-based content viewed on big screens. Twenty years later, we are primarily interacting with it through video on small screens. This required an evolution in personal devices as well as in the digital infrastructure (moving from carrying text to carrying 4K videos requires a 10,000-fold increase in capacity). The future promises even more developments where our interaction with the digital world is more seamless (think augmented reality and virtual reality with devices less bulky and awkward than the Oculus and Google Glass). Between us and that future there are some challenges that we need to overcome to make the devices more natural and the infrastructure even faster.

In every post in this series, I answer three questions regarding each aspect of the infrastructure: What is it? Why should I care? And why haven't we solved all of its problems already? Hopefully, the answers to the first two questions will get the reader to ask the third. In my answers to the third question, I will attempt to highlight some of the challenges we face in improving and understanding the infrastructure we have built.

Target audience: This series is developed specifically for non-technical people. It is not meant to provide a deep technical description of any particular area. I will avoid most technical terms and the alphabet soup that I think acts as a barrier to understanding the exciting hidden parts of our digital infrastructure.

Overview of our Digital Infrastructure

Before I start explaining different aspects of the foundation of our modern online lives, I will give a very brief and intentionally cartoonish view of the Internet today.

Map of Internet marine cables

When I refer to a network from now on, I refer to a set of connected devices that can communicate within a single infrastructure that is owned and operated by a single entity. For example, within your home network your phone, laptop, smart TV, digital assistant, and smart lights can all communicate through your home wireless router and they are all owned by you. Even when not connected to the Internet, this network can provide some useful functions carrying your data between devices. My wife and I exchange pictures that each of us took of our toddler every night. This exchange happens within our home network. Networks can be very large, spanning whole countries or even multiple continents. For instance, Google's private network carries data between four continents.

The Internet is a network of networks. This means that by definition the Internet is not owned or operated by any single entity. The definition also has a very important implication: all the networks that form the Internet have to agree on some set of primitives that allow them to interact without a glitch. This set of primitives has to be very small (so as to avoid imposing strict limitations on the operator of any single network), but it also has to be expressive enough to handle the diversity of the networks and their operators. This is no trivial matter. All companies in all countries that want to be part of the Internet have to agree on it (think the US, Russia, and China agreeing on and using the same "language"), while each company and country must still be able to enact its own policies within its own network. The main primitive of the Internet, the one everyone has to agree on, is that when a network is given a piece of data with a destination address, the network will "attempt" to deliver the data to its destination or forward it to another network that knows how to deliver it. The word "attempt" matters: delivery is not guaranteed. Network operators can choose to prioritize the delivery of data based on the source, destination, or nature of that data, dropping low-priority data.
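For readers who like to see ideas as code, the primitive described above can be pictured with a small Python sketch. This is a toy model under my own assumptions, not how real routers are built: the names `Packet`, `Network`, `next_hop`, and `min_priority`, the priority scale, and the hop limit are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Packet:
    destination: str      # address of the device the data is meant for
    payload: str
    priority: int = 1     # hypothetical scale: higher means more important


@dataclass
class Network:
    name: str
    local_addresses: Set[str] = field(default_factory=set)
    # destination address -> neighbouring network that knows how to deliver it
    next_hop: Dict[str, "Network"] = field(default_factory=dict)
    min_priority: int = 0  # operator policy: drop anything below this value

    def handle(self, packet: Packet, max_hops: int = 16) -> Optional[str]:
        """Return the name of the network that delivered the packet,
        or None if the packet was dropped somewhere along the way."""
        if packet.priority < self.min_priority:
            return None                      # dropped by this operator's policy
        if packet.destination in self.local_addresses:
            return self.name                 # delivered inside this network
        neighbour = self.next_hop.get(packet.destination)
        if neighbour is None or max_hops == 0:
            return None                      # nobody known to deliver it: give up
        return neighbour.handle(packet, max_hops - 1)


# Three toy networks: a home, its service provider, and a video service.
home = Network("home", {"laptop"})
isp = Network("isp")
video = Network("video-service", {"video-server"})
home.next_hop["video-server"] = isp
isp.next_hop["video-server"] = video

delivered_by = home.handle(Packet("video-server", "play"))

# The provider now decides to drop low-priority data; the same request fails.
isp.min_priority = 2
dropped = home.handle(Packet("video-server", "play", priority=1))

print(delivered_by, dropped)
```

Notice that no network in the sketch promises delivery. Each one only tries, and each one applies its own policy without asking anyone else, which is exactly the balance between agreement and independence described above.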

Not all networks are created equal. In fact, networks are typically created to serve a specific purpose (from the hardware they use to the language that hardware speaks). Your home router creates your home network. Your Internet service provider has a different set of devices forming its network. The purpose of an Internet service provider's network is to connect residential and small business networks to the rest of the Internet. Transit providers are large networks that connect other big networks to each other. For instance, a transit provider may be what connects your Internet service provider (and consequently your home network) to Facebook's network. Modern online services (including Facebook) typically reside in datacenters. Datacenter networks carry data within a datacenter for processing (like Facebook finding your face in a picture). The figure below shows how your home network is connected to the rest of the world through different types of networks.

A simple view of the modern digital infrastructure

Given this picture, think about the quality of experience you are having when watching a video. Specifically, if you are watching a Netflix video and it starts to freeze or switch between standard definition and high definition, whom should you blame? You should be able to tell by now that the answer to this question is not so simple. It can be that your roommate is watching another video and that's causing glitches in your home network. It can also be that your Internet service provider is overloaded by customers and its network is glitching. The problem can also be in the transit provider connecting your Internet service provider to the datacenter that has your video. It can also be a problem with Amazon's datacenter network, where most Netflix data is stored. In the specific case of video, some of these scenarios are much more likely than others, but the likelihood of who's to blame changes with the online service you are using and the conditions of each of the networks. This should also make clear why each of these service providers deploys its own optimizations to avoid any glitches with your video (each optimization is typically a different research question).

If you examine the picture above carefully, you will see that a single company can play multiple roles. This significantly affects the flow of money. For instance, Google can be your Internet provider through Google Fiber or Google Fi. It can also be your video and music provider through YouTube and YouTube Music. This means that a single company can be in control of your view of the Internet and the main beneficiary of your online activity. This evolution in roles happens at a fast pace and at various scales, leading to a very peculiar property for a completely artificial system: the Internet is so big and evolves so quickly that we don't fully understand some aspects of its behavior and how it impacts our own behavior as its users. There is a whole research area that tries to measure and understand the evolution and properties of the modern Internet.

About the Rest of the Series 

In follow-up posts (hopefully weekly, more realistically biweekly), I will focus on a specific type of network or a problem that still faces the builders of our digital infrastructure. I will try to alternate between topics based on which aspect of our lives they touch (quality of experience, money, or gadgets). I hope to cover the following list of topics:

Video: Video is the dominant type of traffic on the Internet and the center of our lives in the quarantine, so it seems like a good place to start. I will answer questions like: why does the quality of video change when the connection is not good? And why do live videos glitch more than recorded ones? I will also explore some of the frontiers in video streaming, including streaming 360-degree HD videos and augmented reality.

Internet Architecture: This topic is concerned with defining the primitives that all networks that form the Internet have to agree on and how money flows (who pays whom). I will answer questions like: why is the Internet resilient and not easy to control? and how can freedom of speech be compromised on the Internet (despite its resilience and inherent support of freedom of speech)? I will also discuss some of the exciting recent attempts to make the evolution of the Internet easier without compromising its resilience and security.

Internet of Things: As processors become cheaper and smaller, we are able to stick them on any device and call it a smart device (smart light bulbs, smart microwaves, smart scales, etc.). The typical implication of such "smartness" is that the device can be controlled from your phone from anywhere. This means that the infrastructure has to support billions of connected devices. In covering this (maybe recurring) topic, I will answer questions like: what are some of the new applications of the Internet of Things? And why can't we just plug in all the new devices and make them work like my phone?

Congestion Control: I will tell you the story of the protocol that "saved the Internet" and how the Internet is "not quite saved" to this day.

Stay tuned for more posts and more topics to be covered in this series.
