HTTP 2.0 Interview - Mark Nottingham, IETF httpbis working group chair

Late last year, the HTTP 2.0 standardization movement started at the IETF httpbis working group. I had a chance to interview Mark Nottingham (Akamai), chair of the working group. He drives the HTTP 2.0 standardization. It was a very exciting interview, so I tried to deliver this article as raw as possible. I wish you enjoy it !

--- Tell us about the history of HTTP 2.0.

HTTP 1.1 and HTTP 1.0 had well understood limitations. Especially around the fact that, when you have a connection open, you can make requests and get responses. But in HTTP 1.0, you can put only one request out, and you have to wait until you get the response back to get another one out.

Then we introduced pipeline in HTTP 1.1 to try to address that, but even then, if you had a request that took a long time to process, or the response was very large, it would block other requests on that connection. And pipelining itself has a lot of issues, which we have been exploring for the last couple of years.

It was even known back then, that it would be better in the HTTP world to have multiplexing. The ability to make multiple requests and responses freely, and even interleaving with each other on the wire. That was attempted; there was a effort in the early part of this century called HTTPng. And that didn't go anywhere. Because it didn't get any implement or interest. There wasn't enough activity at that time. So, it died, and we had HTTP 1.1, for the last decade or so.

And now, in the last few years, a lot of folks have been looking at web performance and how important it is. And how one can get a last little bit of performance out the web stack. This limitation came up again.

In the HTTP community, we always thought, "we tried once to do HTTPng and it didn't go anywhere, we won't be able to do that again". But because of the dynamics of the situation, especially because of browsers being very aggressive about evolution and innovation, it became possible.

Mike Belshe and Roberto Peon, who are the two authors of SPDY came up with this proposal. And because Google is in a really unique position with having a web site as well as the browser, they were able to do that experiment. You say it's Google, but it's not really Google. It really was Mike and Roberto on their 20% time doing stuff on their own. Slowly and surely it convinced people, it is something important and it was really good. Back in 2009, when they started, I had dinner with Mike, he came to me and said "We want to take this to the standards. We don't want this to be Google proprietary". And then Firefox got involved, they got very interested, they got the upside of it. And other parties got involved. Akamai got interested, and Microsoft came into the table.

And now we are in the point, that it's pretty obvious that we have enough moment within the industry to make a change. The implementers, the folks who want to do this, can affect change and make it better. We have an opportunity to version the protocol that is the basis of the entire web. Which is very exciting, it is very risky of course, if we fail, it would not be good for anyone.

But because we have the implementers, the implementers give back information to us, and we work closely with them. We are meeting this week, are all implementers of browsers, middleware, and servers and web sites. That's exciting to have all those people in the room.

For me, the interesting part of this and the tricky part is, we have to deliver on the promises; getting better performance, mostly. But at the same time, we can't sacrifice the interoperability with the rest of the web. We can't break the web in the process. And we haven't talked about it a lot, but for me, there is a implicit balance that how the web is put together. There are different parties involved; the end user is at the browser, there is the web site and they have the content, and the intermediaries, there are the network providers. For better or worse there needs to be a balance between these parties about how the web operates. I don't want to say that we won't upset that balance, but we have to be very very careful, because the unintended side effects could be very disastrous. So, that's a heavy responsibility in what we are doing.

Multiplexing

From a technical standpoint, we are looking at couple of things. We're looking at doing full multiplexing, which is the format on the wire is pretty straight forward. You just have to have a stream effectively and associate things together, so that they get back as HTTP request and responses. The tricky part of that is, once you get full multiplexing, now all of a sudden, you still have the potential for thing to block each other. "I as the server, I am sending the responses to you as the client. Do I send you the GIFs first, do I send the JavaScript or the CSS first?". And that needs to be discussed.

So there is the mechanism in the protocol called prioritization. You can have hints exchanged. But what's most important for the browser right now, for example, "I am now presenting this part of the content, so give me these images, or give me this JavaScript". And you need some form of flow control as well. You need to be able to say, "You are shoving all these things to me, you are filling up my buffer, you need to slow down". That's very tricky as well because, is it about the entire TCP connection, or is it per flow, we are talking about per flows these days, and how does that interact with TCP flow control layer underneath. We don't want to reinvent that from scratch. So, we will need aid of the guys down the transport layer, to make sure that works well.

Header compression

Another major component is header compression. People has known that HTTP headers are very repetitive. For example, every request goes with the user agent which is always the same.

It goes with the referer which is similar, and often the same, each time. And the headers for content negotiation, formats and language and so forth. But we need to keep all those things, because this is how the web works now.

We can't break the web. We can't just decide "we are going to change all these, and strip them out", because a lot of web sites use all these information for better or worse. So, we have to be able to encode and compress that to make it more efficient on the wire.

And, the illustration that I commonly use is, Patrick McManus, who is implementing SPDY and HTTP/2.0 in Firefox, did the experiment where he structured a page with 83 images, etc. on it. The browser gets the HTML, and the HTML comes back. And the browser has to do the 83 requests for the images, CSS and so on. Because of the way HTTP works now, actually the size of the headers, even if you do multiplexing, the number of packets that headers fit into, and because of the congestion window, you are looking at least 3 to 4 round trips, even with prefect network condition, for those 83 requests out.

Where as, if you can compress those headers down, you can get multiple requests in one packet, and thats huge. This was in SPDY 3 we used Zlib.

So Zlib was the SPDY 3 approach, but we do not use it now. The folks realised there was a attack against it, thats called CRIME.

Info: Bug 779413 - (CVE-2012-3977) SPDY compression infoleak (CRIME)

When you are doing this over TLS, you can exploit, you can inject things into the payload for example, cookies or injecting links in. You can make browsers do things pretty easily. You can even pull some text out, even if it is TLS. So, for example, you can get the session keys or the password. And that's unacceptable.

So, mostly the efforts around compression, is figure out, how do we deliver the efficiency of the headers, while still protecting these kinds of attacks. And, thats what we will talk about this week.

There are three or four different proposals on the table right now, I think we will have two or three more. We've got a testbed and figure out how these things work.

But the compression is exciting for me, because of the discussion about the mobile case, which is more prevalent, and it could really improve the performance. And, there is also a chance if we can do it right, it won't be so expensive to add new headers. For example, Ilya Gregoric from Google has a proposal out there to talk more richly about the capabilities of the client. Right now, folks are using JavaScript to query "my use agent has this screen size, and it's got these capabilities in the JavaScript engine". But, if you can express these in the HTTP headers, you can do a lots of interesting things on the wire within the intermediaries in the network. But, you don't want to do that in every request, if they are getting serialized in every request. But, with the compression, it would not be a big concern. So, now we can have much richer metadata without the cost of adding that metadata. If the compression works well. Which is exciting.

So, you got the multiplexing with the prioritization, and the flow control. You got the header compression and the considerations around that. Those are the two major lags.

TLS

There's been a long running discussion around making TLS mandatory. There are couple of reasons for that. One is that deploying new protocols in the web is hard. There are a lot of intermediaries out there who is badly written and does not understand the HTTP upgrade mechanism to a new protocol. We want to get the numbers on that soon, to see how that's going to work.

The other reason was, that a bunch of the browser guys feel that this is opportunity to upgrade security on the web. The encryption is CPU cheap enough, and the certificate is cheap enough too. People really shouldn't have deploy web sites without good security. Especially when you have attacks when have someone sitting in the WiFi point, and could capture session keys for most web sites, and login. Firesheep was a great demonstration.

Politically speaking, making TLS mandatory was not going to fly. Just because all the different parties were involved. Especially folks that's using HTTP on the backend and their APIs, or they were using it in places with non-traditional uses, or even intermediary vendors that didn't want the extra overhead. Having said that, it's not upgrading to one thing, we are now designing the protocol to be able to upgrade to number of things.

Now the browser and servers can negotiate, "I support HTTP 1.1, I support HTTP 2 over TCP, and I support HTTP 2 over TLS". Which one of those different mechanisms that browsers support is up to them. And I wouldn't be terribly surprised if some of them choose not to implement just HTTP 2 over TCP.

Personally I think that is a noble goal. But at the same time, the web security is broken so many other ways. Especially when we talk about TLS, there are man in the middle TLS in firewalls, national borders, companies and educational stations, and they can see everything going by. Some people are horrified when they find that out, but in the industry where I am in, it's very old news. I think that more people should know that too.

--- There was a news that some CA were cracked.

A lot of people doesn't know the news. The little lock mark on the browser does not mean what it used to mean. And I think we need to fix that.

There is a effort in IETF called DANE(DNS-based Authentication of Named Entities, http://datatracker.ietf.org/wg/dane/charter/). It is based on DNSSEC. The CA system, you root your trust to any number of different vendors, and we know it doesn't work very well.

In DNSSEC, trust is rooted in your DNS root. And it is a very constrained thing, it is like "This DNS comes for this party". Somebody had the idea, "What if we put the TLS key into the DNS". The cryptographic information identifies the use of the web site. So, at that point, you can lookup in DNSSEC, and you can find that it is signed in DNSSEC and is associated with the domain name. There are some idea that say the certificate itself does not have to be in the DNS, but that is an ongoing discussion.

A new CA can not be installed in the browser to get around that. Someone who has access to the Certificate Authority can not get around that.

The problem with DANE is, it is very difficult to deploy. DNSSEC is very difficult to deploy. There are bunch of other discussion about the web security, but this is one of the most important ones I think.

--- You have mentioned about prioritization and flow control. Does it mean that there might be a need for the HTTP server administrator to configure the priority of each content ?

It's very implementation specific. We've been talking about this, a bit in the working group. It is not something we are going to specify, we will look at the market if it were.

A very simple strategy might be like, "I will send JavaScript and CSS first, and everything else after that". Because those are very important for most browsers when they paginate. Some servers may choose to do a data driven approach, there are a lots of interesting things you can do there.

From Akamai's perspective, it is very interesting. Because, that gives us a chance to look into our network, and optimize, and continuously improve it.

--- When HTTP 2.0 start to fly, will there be any changes to the Web industry people? For example, will the Web design companies, need to learn a new technique or configuration ?

I don't believe that they would need to know that much, to that depth. I certainly hope not.

What would change is, there are whole bunch of techniques and work arounds on HTTP 1.1 and how it behaves. For example, we have CSS sprites. These techniques to get around HTTP 1.1 limitations can go away, and we can move to more sensible designs. I think that will be the biggest change people will see.

But people will need guidance. The working group has looked at writing documents and guidance around how to successfully transition from one to the next. The Web performance community will create guides and tools. And of course companies like Akamai will help their customers.

--- Would that be something like Akamai's Front End Optimization ?

Yes, this will definitely change how FEO works.

But this is like changing things, so that FEO can get rid of some annoying things out of the way. And work on some really exciting things in the future.

For example, when you are loading JavaScript right now, the browser must load the entire file before the parser starts parsing the JavaScript. Google folks has done some really interesting techniques on chopping the JavaScript, so you can evaluate it successfully and you can get faster user experience. But the way you have to do that is really ugly, its a multiple frames with multiple files. From a web developer experience, it's a nightmare.

There are some discussion going on at HTML5, and a JavaScript load can take multipart mixture. Maybe a new media type, to allow the parser to incrementally load what comes in. That's kind of next generation FEO, which is very exciting.

And, the other part of the conversation that you don't hear much now is, HTTP 2 and fixing things in the application layer. And soon as we do that, problem at the transport layer becomes much more apparent.

So, a lot of the attention is at the transport layer. There are bunch of new proposals out there right now. "How do we improve TCP?", and people are even talking about TCP replacement.

HTTP 2 and TCP is totally a different conversation, but, we are now also looking at how do we improve that layer. We are poking a stick at it, and see how we can fix it. Some of the discussion is, "we need to get rid of that, and replace it to something else".

That's exciting, and also very dangerous, especially talking with folks in the IETF.

--- I think TCP performance discussion has been there for decades.

Yes. There are the same conversation again and again.

And, I think new blood is very helpful in that. I talked to few of them, and I realized that something is coming.

And, it needs to be written, and they are excited about it. But some of them are not excited. You know.

--- In the introduction of the current draft, it says that SPDY's TCP connection will be "Fewer TCP connections". But in 4.1, it says "Clients SHOULD NOT open more than one HTTP/2.0 session to a given origin concurrently". How many TCP connections will there be in SPDY ?

Right now we see about 4 to 8 TCP connections for current HTTP. I think, what we will see with HTTP 2 will be like moving to 1 or 2 TCP connections. Which is much better for fairness.

You still get a little bit of parallelism there. But, if we improve the limitations of the TCP stack, maybe it will be one.

This is the tricky part, but along with the throughput you have the fairness issue.

Part of the problem right now, is that HTTP flows are pretty short. TCP never gets the chance to really engage. If SPDY flows are longer, and HTTP flows are longer, then TCP has the chance to operate properly. And that's how it should be. But, again, this is all very new.

Google is very comfortable with this because they have a lot of operational experience with it. But the rest of the world is just getting there. Getting the testbeds out, betas out, and looking at how it works in long term.

There are also network affects when everyone using it. It might be fine when Google and Google Chrome uses it, but when the entire world uses it, it's a different ball game. So, we have a lot of work to do. But, it's exciting.

--- I understand that HTTP 2.0 will enhance performance. But, as in the TCP throughput equation described in Sally Floyd's RFC5348, TCP has a limitation, and I guess that browsers can get more throughput with multiple HTTP 2.0 connections. Nowdays, there are browsers that have DNS prefetch mechanism for more performance. Wouldn't the browser developer implement with many HTTP 2.0 sessions ?

I haven't looked at the newest ones, but the previous versions of the browser only used one or two SPDY connections. The developers are honourable people, and they know the effects on fairness. And they want to make the world a better place.

But yes, you will always have the pragmatics there. For example, you have the guy writing the download manager making 50 TCP connections to download one thing. There are not much you can do about that.

But, I think the browsers understand that they have a place in the ecosystem.

For me, the interesting part of this is, when you are in the mobile situation, there can be a HTTP 2 proxy with 1 or 2 TCP connections, and it can keep the TCP connection hot with header compression. There are a lot of scenarios that would be on a typical browsers, we would have to think of about as well.

--- About negotiation of HTTP and TLS, would it be possible to use the same TCP port for with and without TLS ?

Probably not. Theoretically it could, but probably not.

What it might mean is for example, you connect to HTTP but under the cover it is TLS. Maybe you are not checking certificates in TLS, and doing opportunistic encryption only, it would protect only against passive attacks. It won't protect against an active attack. But it would prevent attacks like sniffing in a cafe. People talked about it. It's not clear if it' going to go. Having that flexibility is a nice thing.

--- In the current draft, the HTTP method uses "HTTP 1.1" and the upgrade header shows that the request can use HTTP 2.0. Will that mean that the newer HTTP versions might all use HTTP version number "HTTP/1.1" in the request line ?

It is not clear. What happens there is, you use upgrade headers.

Another is you use TLS for negotiation. So, you will use HTTP 2.

But we are also talking about putting a hint in DNS. Probably it would be a new record type. When you are doing your lookup, you can know if the server is HTTP 2.0 capable. The browsers seem interested in with that one.

And then there is the notion that SPDY had before, it's a alternate protocol mechanism, it will send a header back in response, and tells where to use SPDY. We are looking at 3 or 4 different proposals for getting to HTTP 2.0. The idea is, you want to avoid the round trip if possible.

For the common case, it would be requesting and getting HTTP 2.0, and it would be fine. But if you want to post something, or multiplex anything, it would waste a round trip there.

--- The flow control says "hop-by-hop" in the current draft. What would be a "hop-by-hop" mean ?

We haven't discussed the flow control much yet. We are going to talk about that a lot this week.

The draft you are reading there is the SPDY draft, with bunch of stuff thrown in there with preliminary discussions. I wouldn't read it too much any of that, and we are pretty much saying "don't implement that draft". We will come up with a draft pretty soon, maybe after couple of weeks, and we will say, "This is the first one you should implement, and give us feedback on this".

--- How about PING ?

We haven't discussed about PING at all. It's used for the keep connection alive or to test the connection before.

One scenario is, "I have a connection open, and it's idle for a little while, and I need to send a POST across it". A lot of implementation will not use a old connection for a POST. If the connection dies while you are sending a POST, you do not know what data was really sent to the server, and that's a bad situation. So, you use a new connection to reduce that problem.

PING allows you to check the connection before sending a POST. That's useful.

--- Please tell me about Akamai's SPDY implementation.

We have a SPDY implementation, it's in beta right now. Now we have a bunch of beta customers working with, and we are trying to get instrumentation out of that. My goal is to get numbers out of that. Because we are in a unique situation.

Google has their site, but it is Google specific workload. We have a very broad set. We can have different views on different sites. That is very valuable for us.

It is a slow process, of course. Because it's something new, we need to roll it out, and have the comfort that will not make our customers worse off.

--- How much time do you think it will take for HTTP 2.0 to be a RFC ?

No comment. :)

--- On the IETF working group charter page, the milestone says the working group last call would be in 2014.

Those charters are always aspirational.

RFC writing is very painful sometimes. Everybody has an opinion, and especially when thing are contentious, or valuable, it's difficult to get consensus. And that's how it works.

The most two involved organizations are the IETF and the W3C. So, we keep a foot in both.

I always question, "which one is more sensible to work in ?".

--- How does IETF and W3C collaborate ?

We have a conference call every so often. We have a very good working relationship. We point out thing that might need attention, we coordinate. For example, we had a transition from the IETF to W3C, because we thought it was a better place for the work to be.

But organizations are very resource constrained and sometimes politically constrained too.

--- Would some specification of HTML5 change because of HTTP 2.0 ?

I don't think so.

We had a pretty good working relationship with the HTML5 guys as well. They are very interested, they have taken a very unusual approach in the traditional standards world, but it's turning to be a very successful one.

We have changed some things in HTTP 1.1 to accommodate to HTML5. The working group I am working in also revises HTTP 1.1, and clarifying it.

The next logical thing we need to do, is take the WebSockets API and stick it into HTTP 2, and we can reuse that connection. And I think everybody agrees it makes perfect sense.

WebSockets API is interesting, but it's very low layer API. And it's good when you have only one server and client. But when you try to scale it, you have to build a proprietary infrastructure.

From a specification standpoint, HTTP 1.1 was written very badly, because they were done very quickly. But once you understand the architecture of HTTP 1.1, there is no reason you can not do something like WebSockets. It's just that, when the implementations were written, it was written with some use cases in mind, and they blocked out other use cases.

And one of the challenges in HTTP 2, is get the implementations written in such way that it is backward compatible with HTTP 1.X but is forward compatible with a lot of new use cases, many of we haven't thought of yet. That needs an abstractions of the APIs, the lower level APIs especially, have to be good enough, like being able to stick in WebSockets into it for example.

--- Would HTTP 2.0 make trouble shooting harder than now ?

The TLS can make debugging harder, but also, the header compression might make the debugging very difficult. This has come up a number of times.

It's very hard to work around once you are keeping even a little bit of state between the header sets and the connection, so they can remove the duplicates, then when you fire the debugger, tcpflow or wireshark or whatever, after the stream has started, you can't recover that state.

So far, people have grumbled about that, but, consensus is still that the benefits are worth while. It's a balancing act, judge the benefits against the drawbacks.

We will try to mediate that pain as much as possible.

--- Thank you very much.

Feb. 2013

Japanese translated version