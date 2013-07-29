Stanislav Shalunov, cofounder and CTO of connectivity startup Open Garden, was the chair of the Transport Working Group at the Internet2 consortium from 2000 to 2006, where he developed the Low Extra Delay Background Transport (LEDBAT) protocol to solve the problem of traffic congestion on the consortium’s network. Today, the LEDBAT protocol is used by companies across the world, including BitTorrent and Apple, to send large files quickly across the web, accounting for 13%-20% of all Internet traffic. I spoke to Shalunov over Skype about how he came up with LEDBAT, and how it works.

Many scientists at the time were using Internet2 to transport amounts of data that were much larger than anything TCP had been used for. A canonical example is the Large Hadron Collider, but there were many other projects. The reason that Internet2 existed was to enable these very high-speed connections.

At the time, there was no LEDBAT yet. One of the core problems with TCP, which was the standard way to send data, is that you need a very low loss rate if you’re going to send fast. For example, if you are sending across the continent or between the East Coast and Europe at 10 gigabits per second, you need, roughly speaking, no more than one loss per 90 minutes. These are rough numbers, but this is an extraordinary requirement for a network to meet. This makes everything very expensive and very hard to engineer. If you have the slightest imperfection in the way fiber is attached anywhere in the path, you’ll have a much higher loss rate. In fact, at the time, I calculated that this requirement bumped up against physical limits: the bit error rate it demanded was within about one order of magnitude of the physical limit.
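The "one loss per 90 minutes" figure can be reproduced with a back-of-the-envelope calculation. The sketch below is not from the interview: it assumes the widely used simplified steady-state model for loss-based TCP, rate ≈ (MSS / RTT) · sqrt(3/2) / sqrt(p), together with an assumed 100 ms round-trip time and 1500-byte packets. The function name is made up for illustration.

```python
# Rough estimate of how rare losses must be for loss-based TCP to sustain a rate.
# Assumption (not from the interview): the simplified steady-state model
#   rate ≈ (MSS / RTT) * sqrt(3/2) / sqrt(p)
# where p is the per-packet loss probability.

def seconds_between_losses(rate_bps, rtt_s, mss_bytes=1500):
    """Average time between packet losses needed to sustain rate_bps."""
    # Average window (in packets) needed to fill the path at this rate.
    window_pkts = rate_bps * rtt_s / (mss_bytes * 8)
    # Invert the model: window = sqrt(3/2) / sqrt(p)  =>  p = 1.5 / window^2
    loss_prob = 1.5 / window_pkts ** 2
    # Packets sent per second at the target rate.
    pkts_per_s = window_pkts / rtt_s
    # One loss every 1/p packets, on average.
    return 1.0 / (loss_prob * pkts_per_s)

# 10 Gbit/s over an assumed 100 ms round trip with 1500-byte packets.
minutes = seconds_between_losses(10e9, 0.100) / 60
print(f"about {minutes:.0f} minutes between losses")
```

With these assumed inputs the result lands close to the 90 minutes quoted above. A longer path or smaller packets would make the requirement stricter still.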

It’s hard to make something go that fast, that close to physical limits. The reason TCP has this property is that it relies exclusively on loss for congestion indication. That’s what needs to tell the host that there is congestion in the network so that the host can slow down. The only way that TCP will listen is if you drop a packet. If you drop one packet per 90 minutes you don’t tell the sender very much. For 90 minutes you are completely silent, and then you tell them to slow down, but even then they don’t know by how much.

It’s very hard to take advantage of this loss to build high-speed transport protocols. It was obvious to me that we needed to look at delay as well as loss. When you start taking delay into account, you get information with every packet that you receive. That’s much more information. You may have 100,000 packets per second. Suddenly, you have plenty of information about the exact state of the network at any given time.

You can then very rapidly adapt the moment that something happens on the network. You can respond just right. You don’t have to hunt for the right rate, slowly increasing over the course of an hour only to find yourself needing to halve your rate. This was important.
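To make the idea of a per-packet delay signal concrete, here is a minimal sketch of a delay-based window update. It is loosely modeled on the window update in the published LEDBAT specification (RFC 6817), but heavily simplified: the class name, the constants chosen, and the single running minimum used as the base delay are illustrative assumptions, and real implementations add delay filtering, base-delay history, and loss handling.

```python
# Minimal delay-based congestion window sketch (simplified, illustrative only).
TARGET_S = 0.100      # target queuing delay in seconds (assumed value)
GAIN = 1.0            # how aggressively to react to the delay signal (assumed)
MSS = 1500            # bytes per packet (assumed)
MIN_CWND = 2 * MSS    # never shrink the window below two packets

class DelayController:
    def __init__(self):
        self.base_delay = float("inf")   # lowest one-way delay seen so far
        self.cwnd = float(MIN_CWND)      # congestion window in bytes

    def on_ack(self, one_way_delay_s, bytes_acked):
        """Update the window from a single acknowledgement's delay sample."""
        # The smallest delay ever observed approximates an empty queue.
        self.base_delay = min(self.base_delay, one_way_delay_s)
        # Anything above the base delay is attributed to queuing.
        queuing_delay = one_way_delay_s - self.base_delay
        # Positive below the target (speed up), negative above it (slow down).
        off_target = (TARGET_S - queuing_delay) / TARGET_S
        self.cwnd += GAIN * off_target * bytes_acked * MSS / self.cwnd
        self.cwnd = max(self.cwnd, MIN_CWND)
        return self.cwnd

ctl = DelayController()
for delay in (0.020, 0.020, 0.220):   # queue empty, empty, then 200 ms of queuing
    print(f"delay={delay * 1000:.0f} ms -> cwnd={ctl.on_ack(delay, MSS):.0f} bytes")
```

Every acknowledgement nudges the window up or down in proportion to how far the measured queuing delay is from the target, so the sender gets a graded signal instead of waiting for a dropped packet.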

The other thing that we realized while researching this was that it solves a different problem with TCP as well. Normally, for TCP to experience losses it must first cause the buffer to overflow. Every Internet bottleneck has some amount of buffer space. Sometimes people measure it in bytes, kilobytes, or megabytes, but that’s not the right way of measuring it. The right way of measuring it is in units of time. That way it is scaled with respect to the speed of the link. These buffers, for the Internet to work with TCP, must be at least a few hundred milliseconds. But sometimes you find places where these buffers are much larger.
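Measuring a buffer in units of time is a simple conversion: the time it takes the link to drain a full buffer. The short sketch below illustrates why the same number of bytes means very different things on different links. The 256 KiB buffer size, the link speeds, and the function name are arbitrary choices for illustration.

```python
def buffer_delay_ms(buffer_bytes, link_bps):
    """Time, in milliseconds, for a link to drain a completely full buffer."""
    return buffer_bytes * 8 / link_bps * 1000

# The same 256 KiB buffer on three links of very different speeds.
buffer_bytes = 256 * 1024
for link_bps in (1e6, 100e6, 10e9):
    delay = buffer_delay_ms(buffer_bytes, link_bps)
    print(f"{link_bps / 1e6:>8.0f} Mbit/s: {delay:.2f} ms of buffering")
```

On the slow link this buffer holds about two seconds of traffic, while on the fastest one it holds a fraction of a millisecond, which is why a byte count alone says little about how much delay a full buffer adds.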