HTTP? How it started, how it’s going

An interview with Roy Fielding, co-author of the HTTP protocol and inventor of REST, on the evolution of HTTP from early days to current work on QUIC & HTTP/3.


Transcript

Hello, welcome to this interview. We hope it’s all going to work well; you know, having two people remotely connected means the bus factor is doubled or halved, I’m not sure how to say that. We’re taking more risks. We have Roy calling in from California. I’m in Switzerland myself. The servers are somewhere, we have no idea where. So we hope it’s all going to hold. So this is an interview with Roy Fielding on HTTP. I’m looking forward to that a lot, and I hope you are also. I’ve been doing HTTP pretty much since it has existed, I would say, or maybe starting a few years later. And Roy’s been doing that as well, but on a totally different level. So we’re looking forward to that. We will be taking audience questions. We have Radu Cotescu in the background watching the chat. So make sure you use the stage chat: there’s the event chat, which is for general discussion about the conference, and the stage chat is especially for this talk. That’s where you can ask your questions, and Radu will select a few of them, depending on how time goes.

So I’m not sure if Roy Fielding needs an introduction, but I’m still going to try to do one. Roy works as a senior principal scientist at Adobe. Besides having invented REST, the REST architectural style, as part of his PhD, he’s been working on the HTTP protocol and specifications almost since the beginning. He is also co-founder and current chairman of the Apache Software Foundation. We’re also working together there; I’m currently on the board of directors, so we also interact there, but that’s a different context, I would say. So hi Roy, welcome to this interview.

Hello, Bertrand. Good morning. I guess it’s pretty early here.

Yeah, and here it’s just getting dark. So we’re really doing a global interview. So the first question: what’s the story behind you getting involved in HTTP? How did that happen? I think you know Tim Berners-Lee personally, so you were involved in the very beginning of that, but how did that happen?

Well, what happened was I was a graduate student at UC Irvine when the web started and when I got involved. Really, the web started as a proposal in 1989 and became a real project in 1991. And I did not get involved until 1993, but the first version of the web was very simple. HTTP itself was just one line: GET and then a URL. And the way I got involved: I was working as a graduate student, I had finished my classes and was basically just messing around with the internet. Working at UCI, I had great access: a direct connection to the internet, a Sun workstation on my desk, and a lot of free time to play with the internet. So I spent a lot of time on NetNews and picked up the web when it first came out via NCSA, in their alpha version of the NCSA Mosaic browser. So that was the beginning of 1993.

Yeah, and what was the geographical area that the web covered? Was it already global at the time, or was it just California, Silicon Valley, or what was it?

It was already global. I mean, the first place was in Geneva at CERN, very close to you. And I think the second site was at SLAC at Stanford. So it started out in the high energy physics research community, where there are very high-end labs. There aren’t very many labs, but there are a few around the world. And other physics researchers spread it throughout the universities: they’d basically go to CERN, they’d find out about the web, and they’d take it home with them when they left their tour at CERN.
And so there were individuals active around the web, but not very many sites. I think when I started, there were 48 sites. In fact, I’m sure there were 48 sites, because one of the first things I did was try to find out what this thing was about. I went through the page of all of the published websites out there, and there were about 48. I started at one end of the page, looked at them all, looked at all their links, went down to the bottom, and finished it in one day. And it was totally cool: I had seen the entire web in one day. And when I got back to the bottom of the page, all of a sudden there were two more sites, because they had announced that the 50th site had been published while I was reading. That was like, oh cool, this is obviously growing. But it was a very new project and it was all computer scientists and physicists who were working on it. There was nothing commercial about it at all.

Right. And the physicists were interested in terms of sharing research data, research information?

Yeah, basically the notes and the experiments. More than anything else, just their personal notes of what’s going on. But the first websites were things like the phone book: information that was already stored on systems inside CERN that nobody could access from the outside. And so they provided gateways into those systems as the easiest first problem. It’s the same with SLAC; SLAC provided access to their library system at Stanford.

Right. Yeah. I remember you mentioned Geneva. So my first experience with the web, I will never forget that actually. I was dialing in by modem to a CERN workstation in Geneva, and I was paying for that access. And I remember very much that at that time I was already doing some stuff with Usenet news, and FTP to retrieve files, and there was Archie to search for files. And then I was on this workstation, I used the Lynx text browser, and I didn’t click, I pressed Enter on a link. And then suddenly I was on a server on the other side of the ocean. And that was the magic of the web. Today we don’t think about that anymore, except when it’s slow.

Yeah, it’s funny. Probably one of the most awesome browsers was Lynx, written by Lou Montulli and others at the University of Kansas.

It’s still there. I used Lynx the other day because I wanted to extract the text of an HTML page, and lynx -dump, I think, gets you the text without all the stuff that we put around it nowadays. You mentioned the simplicity of HTTP, and for me that was fascinating. I used to teach before I joined Day Software and then Adobe; I was teaching Unix administration kinds of classes, and I was teaching my students to implement a dead-simple web server, because the protocol was: open a socket, receive one line with a GET and the path, and send the content back. Was that simplicity by design? Was that a conscious decision to make it dead simple, I would say? Or was that because there was no time to make it more complicated?

Yes. I mean, certainly in the first conversation I had with Tim Berners-Lee about the design of the web, simplicity was the first thing he mentioned. More than anything else, a low entry barrier for people editing for the web, for starting with the web. But also, he was just trying to simplify things: his early exposure to the internet protocols had been fairly complicated. The socket interface by that time was pretty well established for TCP.
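For illustration, here is a minimal sketch of that dead-simple, HTTP/0.9-style server in Python; the port and the canned document are made up, and a real server would read the requested path and return the corresponding file:

```python
# Minimal HTTP/0.9-style server sketch: read one request line
# ("GET /path"), send the document, close. Port and content are
# hypothetical; a real server would map the path to a file.
import socket

PORT = 8090
DOC = b"<h1>Hello, 1991</h1>\n"

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        with conn:
            request = conn.makefile().readline()  # e.g. "GET /notes.html"
            # HTTP/0.9 has no status line and no headers: just the body.
            conn.sendall(DOC)
```

Something like `printf 'GET /\r\n' | nc localhost 8090` would exercise it.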

And he had been using FTP as the initial transfer protocol. But what he found was that with FTP, you first log in, and then you list the files, and then you go down through the directories to get to the directory you want, and you list those files, and then you retrieve them. It’s just too many steps. So what he was interested in doing was reducing the perceived latency for people who are clicking on a hypertext link. They already know exactly where they’re going, because that’s what the hypertext link tells you. It tells you all those steps.

The URL tells you all those steps. So what he wanted was the simplest possible direct, low-latency response. And at the time, this is back in ’91, there were no other data formats to worry about; there was just HTML. So he’d send the document ID, which was the early form of the URL, and then get back an HTML page. And if the page started with a PLAINTEXT tag, then the rest of it was plain text. So the whole of the protocol was described in one page.

Yeah, it’s fantastic. I think we can be very thankful for that design, because otherwise we might have ended up having to type FTP commands forever. Sometimes people see something and say: it works, why do you need to change it? So it’s great that someone said, no, no, that’s way too complicated, let’s make it much simpler.

Yeah, it really comes out of the URL design. That really hasn’t changed much in all this time. I worked on the URL specs as well, but for those it was more a matter of explaining the philosophy than making changes. HTTP has gone through a lot of changes: to support images first of all, and then to support all the different things that we wanted to add to the web.

Yeah, definitely. Speaking of changes: okay, I’m a web developer, and I also do some work in the server space via Apache Sling, which is a resource-oriented request processing framework, we could say. So what are the next few important changes that a web developer like me should be keeping track of in the evolution of HTTP? I understand some changes are really meant for backbones or data centers, but as a developer, what should I care about?

Yeah, I mean, right now there’s a huge culture change in the way the standards are developed. When we started with HTTP/1.0, there were a lot of different implementations of HTTP, different browsers and different servers. A lot of different companies were interested, mostly from the user side, so you had people like Digital and IBM and Sun who were interested in the development of the protocol because it was all new, and universities as well.

And really it was the open source folks who ruled the roost on that, because we could share our tools and quickly collaborate across the internet. And protocol development is another form of open development. It’s not open source in the sense of code, but it’s open source in the sense that we write these RFC documents, we collaborate on their development, and we do that over email, almost entirely via email. In fact, these days the editorial collaboration for the current version of the HTTP specs is all done via GitHub, just like any software project. So we do these things, but the culture has changed quite a bit, because the people who mostly implement HTTP, who are most impacted by changes to HTTP, are the large CDNs, the content distribution networks, like Fastly and Akamai and Cloudflare. And then the others are the huge mega sites. So Google and Facebook and Apple are very interested in improvements to the protocol, because it can make a huge difference to how they deploy their systems and how efficient their entire revenue stream is.

And unlike before, when the open source folks ruled the roost, now open source is everywhere. So all the standards developers are used to working in the open; they’re used to collaborating via email. And right now what’s happening is QUIC, which I’ll get to in a minute. QUIC is basically a replacement for TCP over UDP, and it’s just been approved by the IETF. It’ll be published in a couple of months in the final version, but the drafts are done. And HTTP/3 is in IETF last call. That’s the last stage, where the Internet Engineering Steering Group decides if the specification is complete enough to publish as a proposed standard.

Right. So QUIC is a replacement for TCP that runs over UDP. Could it be used for different… are there any plans to use it for other things, like file sharing or other protocols? Or is it currently really focused on HTTP semantics?

There are a lot of people who plan to use it for various things, but they haven’t been allowed to. Essentially, in order to get QUIC done, in order to get the specifications out, they wanted to choose one protocol to deliver. And really, the reason Google developed QUIC in the first place was to deliver HTTP. This is a protocol that the Google server teams developed, called Google QUIC. They deployed it a long time ago, I think three years ago. And it’s very effective if you have control over both clients and servers in particular. In the case of Google, they have Chrome and their own services, primarily services like YouTube, where large amounts of data are being streamed and there are problems with head-of-line blocking if you do it all over TLS. So if you do encrypted data transfer, one of the problems that occurs is that if you lose one of those packets, everything behind it on the stream has to wait. It doesn’t work very well in a lossy environment. And one of the primary goals for HTTP/3, or HTTP over QUIC, is to enable multiple streams, like HTTP/2 multiplexing, but done over UDP with essentially multiplexed TLS streams below it, so that if you lose one encrypted packet, it only affects that one stream, as opposed to all the streams on the same connection.
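To make the head-of-line blocking point concrete, here is a toy Python illustration (not a real transport): with one ordered byte stream, a single lost packet stalls everything behind it, while independent streams only stall themselves:

```python
# Toy model of head-of-line blocking. Packets arrive in order for two
# multiplexed streams; one packet of stream-1 is "lost" on the wire.
packets = [("stream-1", 0), ("stream-2", 0), ("stream-1", 1), ("stream-2", 1)]
lost = {("stream-1", 1)}

# TCP-style: one ordered sequence, so the hole stalls every later
# packet, even those belonging to other streams.
delivered_tcp = []
for pkt in packets:
    if pkt in lost:
        break  # everything after the hole waits for retransmission
    delivered_tcp.append(pkt)

# QUIC-style: ordering is per stream, so the hole only stalls its own
# stream; the other stream keeps delivering.
delivered_quic: dict[str, list[int]] = {}
for stream, seq in packets:
    got = delivered_quic.setdefault(stream, [])
    if (stream, seq) in lost:
        continue  # only this stream waits for the retransmission
    if seq == len(got):  # next in-order packet for this stream
        got.append(seq)

print("TCP-style delivered: ", delivered_tcp)   # stream-2's last packet is stuck
print("QUIC-style delivered:", delivered_quic)  # stream-2 is unaffected
```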

So I guess I should describe an overview of how these protocols worked out. We had HTTP/0.9 to begin with; that was from 1990 to ’93.

And this was the one I was mentioning, where you send just one line and you get your HTML back. Yeah, that was the simplest possible two-page spec. And it did very little for you, except provide a connection via TCP. So then we wanted to introduce inline images to HTML, and in order to do that, we also wanted to send more than just HTML pages over the web. So the web became a way to access multiple content types. And the way to do that within the internet was using the MIME types from email. The easiest way, should I say, to get consensus on a decision within the IETF is to do the same thing everyone else did and just tweak it slightly. Right. It’s not exactly good design, in the sense that it’s not ideal for the application you’re working on, but it’s pretty good. What we’re doing is essentially reusing all of the engineering knowledge that’s been gained from those past systems, learning from it, and extending it just a little, so that we spend 99% of our time arguing over little tiny things instead of massive design changes. Because it’s very difficult to obtain consensus if you’re not almost in agreement.

So HTTP/1.0 added the MIME format and header fields so you can send metadata.

And it was developed by the entire web. We frequently think of Tim Berners-Lee as the father of the web, but really he didn’t develop 1.0. It was the entire www-talk mailing list, with 60 people talking continuously over the course of a year, and each of the groups plugging things into their software and trying out various things. And when I became the HTTP editor in 1994, 1.0 was just all over the place. There were extensions to do just about everything you could imagine via the web. And we actually cut HTTP in half: out of all the deployed experiments, we picked the ones that we knew would work between multiple implementations that were developed independently, and that’s what was published as HTTP/1.0. And at the same time, Henrik Nielsen, who was working with me on the HTTP specs, and I started designing HTTP/1.1. And 1.1 was a clean implementation of most of what people had experimented with, but done in a way that let us add safe caching and entity tags, persistent connections, the Host header field so you could have multiple hosts per IP address, chunked encoding, things like that. Being able to frame messages so that you could transfer them effectively across the internet, or at least know when you’ve lost data on the receiving end. And that’s what ended up in RFC 2068, which I think was ’96 or ’97. And then it was republished as RFC 2616. And then 10 years ago, we got together again and updated everything for RFC 7230 through 7235. Basically we split the entire protocol out into six specs, got each of those concepts documented pretty well, and people have been pretty happy with it. And then published those. And right now we’re finishing what turned out to be a three-year project to separate all of HTTP semantics from HTTP/1.1 messaging. So now we have an HTTP core semantics document that all of the versions of HTTP can refer to. And that’s in working group last call right now.
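As an aside, chunked transfer coding is simple to picture; here is a minimal decoding sketch in Python (real implementations also handle trailer fields and enforce size limits):

```python
# Minimal sketch of HTTP/1.1 chunked transfer decoding: each chunk is
# a hexadecimal size line, the data bytes, then CRLF; a zero-size
# chunk marks the end of the message.
import io

def read_chunked(stream) -> bytes:
    body = b""
    while True:
        size_line = stream.readline().split(b";")[0].strip()  # drop extensions
        size = int(size_line, 16)
        if size == 0:
            stream.readline()  # consume the final CRLF
            return body
        body += stream.read(size)
        stream.readline()  # consume the CRLF after the chunk data

# Example framing: two chunks ("Wiki" and "pedia"), then the terminator.
wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(read_chunked(io.BytesIO(wire)))  # b'Wikipedia'
```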

You said there were about 60 people working on the spec around the HTTP/1.1 time. Has that number stayed pretty much constant, or does it vary?

No, it varies all the time. And there were about 60 people working on the web project itself, on the public mailing list; this was for 1.0. 1.0 came out as: oh, now we’re going to send MIME, and to tell you that it’s MIME, we’ll stick HTTP/1.0 at the end of the first line, whereas before it had nothing. And then it was just a matter of experimentation through the whole process of developing 1.0 amongst everyone. I added conditional requests, so the If-Modified-Since header field, and also the Date field value: it looks like an internet message format date, whereas it used to have a two-digit year. Things like that are what I added to HTTP, about eight months before I became the HTTP editor.
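For illustration, a conditional request as Roy describes it can be made with Python’s standard http.client; example.com stands in for any origin server:

```python
# Sketch of a conditional GET: if the resource has not changed since
# the given date, the server can answer 304 Not Modified and skip the
# body entirely. example.com is a placeholder host.
import http.client

conn = http.client.HTTPSConnection("example.com")
conn.request("GET", "/", headers={
    # An HTTP-date in internet message format, four-digit year and all.
    "If-Modified-Since": "Sat, 01 Jan 2022 00:00:00 GMT",
})
resp = conn.getresponse()
if resp.status == 304:
    print("Not modified: reuse the cached copy.")
else:
    print(resp.status, len(resp.read()), "bytes of fresh content")
conn.close()
```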

Right. Yeah. So going back to the things you were mentioning about QUIC, and it being more efficient when you’re using TLS and so on. As a web developer, I would pretty much get that magically, right? Once my server layer is updated, should I change the way I’m implementing my web servers or dynamic web serving? Or maybe it enables new things that we could not do in the past?

Well, mostly the intention is that you won’t even notice. The intention is that all the features that you rely on now won’t change at all. But there are performance changes, in the sense that with HTTP/2, and much more so with HTTP/3, you can do all of your requests on one multiplexed connection instead of on multiple connections. So things like what people constantly ask for, batch methods, where you could send 50 POSTs up one stream before getting the answer: you don’t need that at all anymore, because you can send independent multiplexed streams across HTTP/2 or HTTP/3 and it has the same effect (there’s a sketch of this below). There are still differences in the way things are implemented, and awareness of backwards compatibility. Most of the people who are implementing HTTP/3 right now are the big sites and the big CDNs. So if you’re using Cloudflare or Fastly or Akamai, then you’re going to get it automatically. In some cases you can disable it if it’s not working out for you, but usually you get the latest version of HTTP, whatever is currently possible with the client, automatically. And it’s particularly true with Google services: Chrome has implemented Google QUIC for a long time, because they could look at the client and look at the server, and since they knew both supported the protocol, whichever version of the protocol it was, they could dynamically choose the best version of the protocol to use. That’s what most of YouTube has been using for a long time now.

Right, so maybe some workarounds that we did to cope with the problems of multiple connections, we can just forget about those, but no other changes basically at the application level, right?

In theory, yes. If it all works out and people don’t implement bugs, there are no intermediaries in between, and you’re not using a proxy. There are a lot of caveats, in the sense that things can go wrong, or you may be stuck with a person using a modem from some strange location; there’s nothing we can do about that. But if you’re on a high-end connection and you are directly connected to Google Fiber and all that stuff, then yeah, it’s very nice. What’s really nice, from my perspective, is that even with all these additions to the protocols, they’re still carrying the same semantics. So anything you can say via HTTP/3, any application you could build via HTTP/3, can be sent via HTTP/1 if you need to. So you can still support those older systems, and the clients and the servers will adjust their behavior accordingly, hopefully. Of course, with any new protocol, there’s always a chance that we messed up.

Yeah, maybe you should add that to the spec, you know: say, you should not implement any bugs. I have one for “you should not lie”, or “you must not lie”. Yes, right.
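Here is the multiplexing sketch mentioned above: many concurrent requests sharing one connection, using the third-party httpx library (assuming it is installed with its optional HTTP/2 extra, e.g. pip install 'httpx[http2]'); the URLs are placeholders:

```python
# Sketch of why batch methods became unnecessary: with HTTP/2 (and
# HTTP/3) many requests multiplex over one connection. URLs are
# hypothetical; the server must also speak HTTP/2.
import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient(http2=True) as client:
        urls = [f"https://example.com/items/{i}" for i in range(50)]
        # 50 concurrent GETs share a single multiplexed connection;
        # no special batch method is needed.
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        print({r.http_version for r in responses})  # e.g. {'HTTP/2'}

asyncio.run(main())
```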
We’re starting to get some audience questions. We have a question from John: what do you think our next wow moment will be with the web? It’s always hard to predict the future, but, like when I first clicked a hyperlink, for me that was a big change. What’s next in your opinion, or what would you like to be next?

I’ve always thought of wow moments as: oh my God, we’ve done something wrong. No, the biggest wow moments for me have always been associated with content. So when I first started, what really made me want to work on the web, as opposed to all the other research I was doing, was some sites in Chicago about the Maori people of New Zealand, because I’m part Maori, and really the only access I’d have to that information was via the internet, because there was nothing in the libraries around here.

And so my big wow moment was: I can go to a site in Chicago and find out information about people in New Zealand, and everyone can share that information all over the world, instantly. In fact, what I used to say is that at any given time, there is one person in the world who knows more about a given subject than anyone else, and you’re guaranteed to find them on the web. That’s just where they are. And a situation like that, where you can bring together all of humanity and access them, is extremely powerful. Now, it’s both powerful for good and it can be powerful for bad, but what I find is that most people who are collaborating together are excellent people.

Yeah, I agree. That’s right. And yeah, I think you’re right to say that what makes a difference is the content, and enabling people to create content and to share it freely. It’s interesting what’s happening with, I would say, sometimes the older people: they have time to spend on the web and they will do surprising things. You know, they’re the old folks, but it can be life-changing, especially as you get older and you may be less able to move around. That’s a big difference.

But it’s also freedom. I mean, you know, one of the reasons I went to grad school was to have the freedom to choose what I wanted to do. And a lot of people, especially people who have retired, they’ve been working all their lives and suddenly, what are they going to do? Well, one of the things they can do from home is use the internet and create content of one variety or another.

That’s right. It’s a very empowering feeling that you can do that. Yeah, I think we’ve seen that with the COVID situation, where there were fewer activities, less travel. I’ve been doing some more music myself, and yeah, if you do it just for yourself it’s a bit boring, but you can share it on the web. Maybe you get two likes, and that’s a good thing. I’m slowly improving my webcam setup. So, cool, great.

So, Achim is asking what your typical workday looks like. How much coding, how much spec work, how much time do you spend negotiating with people in the spec work, how does that look?

Right now it’s a mess. I’ve got a 10-year-old son, and so my daily schedule tends to revolve around interrupts. But, you know, mostly I spend most of my time doing email. I get about 1,000 emails a day. Half of those are spam, so half of them are fairly easy to filter out, but the other half are a mixture of developers working at Adobe, developers working at the Apache Software Foundation, and people who are trying to find a solution and think that maybe I have the answer for them. And the rest of my time is spent on Twitter. No, not seriously. Actually, a lot of time is spent just trying to figure out what on earth is going on with the world, especially this year.

Yeah, all right, and another question, from Casimir: how did you end up at Day Software? How did it start? I don’t think I even know.

Oh, Day. Well, I was at UCI, and I finished my dissertation on REST in 2000. Just before I finished, in 1999, I got together with a group of local software developers in Orange County and we put together a company called eBuilt. eBuilt developed sites, basically developed new sites and services for other companies and for venture capital firms, and so we quickly grew from zero people to, I think, 350 people in one year. In Orange County that would be a fairly large company, particularly in software. And at the same time, I was chairman of the Apache Software Foundation.
So, I became their chief scientist, because I was interested in the group and just because it was a great group of people, and they had a lot of fun. And everything was just going great until the stock market collapsed, and suddenly all the venture capital money got pulled out of the startups’ hands, and they didn’t have money to pay the contract developers, which were us. So, we went from 300 developers down to 120 in a matter of months, which was depressing as hell. And during that process, I had met David Nuescheler from Day; he was the CTO of Day, and he asked me to help out with the Java standards process and explain how he could start what eventually became JSR 170, or the JCR API. So, I helped him get started. I taught him about the JCP and how the Java standards process works in reality, as opposed to, you know, what was in the PR stuff, and so that enabled Day to create the JCR interface. In the process, you know, I got to know the company, I got to know the developers; one of the guys that I had helped found eBuilt with had gone to work for Day, and so I knew it was a good group of people, and so I shifted over to Day. We had a US headquarters building in Newport Beach and the real headquarters in Switzerland, and roughly 30 days after I was hired, the board of directors found out that the CFO and others were spending all of the money on advertising, and they shut down the American subsidiary. So, it was one of those uniquely 2001 experiences, you know, post-stock-market-collapse and all that stuff. But I did like working with David, and the overall architecture exactly fit what I thought was needed for content management, because building content management systems was by far the biggest request from companies at the time. So, I thought the software was good, the team was incredibly awesome, and that’s how I ended up staying. The Newport Beach office turned into an eight-person office instead of a hundred-person one, and we just kept plugging at it.

So, was that eight technical people?

No, I think maybe two or three technical people, through the lean times of the early 2000s. But the team in Basel, Switzerland, had this great idea: you know, when times are bad in the industry, the company keeps paying you but you work half-time, and it’s essentially like extending unemployment, except you’re not unemployed, you’re actually working, just half the time. And through that mechanism, pretty much the entire original Day Software team was able to make it through the downturn in the stock market. And at the end of it, you know, we were well positioned to become the best content management system. And, you know, for a long time, Day CQ was far and away the best technologically, but we didn’t have the name to go with it. We didn’t have a well-established, marketed brand. And so when Adobe purchased Day 10 years ago, it immediately made us a worldwide content management product, or brand. Yeah, right. You know, we had sales worldwide already, but we weren’t in the first three of every conversation. People would have to learn about us before they’d be willing to trust us with their software, whereas now we’re generally the first pick.

Yeah, seen from the other side: I was with Day Software just three years before the acquisition. And really, when we inherited the Adobe sales force, you know, the sales people and everything, that made a huge difference, because I agree the product was technically very good. And in Europe sometimes we’re a bit shy on sales. I think, you know, we engineers, we’re happy building the thing, and sometimes we’re a bit shy about selling it. So that was a great move. Yeah.

But it’s a good balance, though, because in the States we’re not shy enough. So having both, having the worldwide scope the way we did, was very beneficial to how we approached everything, particularly with developers, because we were doing open source projects the whole time. Doing open development the entire time, and attracting open source developers like yourself into the company while we were still small, was significant. It had a huge impact.

Yeah. Plus we have, sorry. Yeah, good people to work with. Yeah. Yeah, I totally agree.

Guna is asking about simplicity. You said simplicity was a benefit to adoption. Is that still a design goal today? And is it still possible to remain reasonably simple in the new world of HTTP/3?

It is, but it’s more that the simplicity is in the interfaces. So think of the overall simplicity of HTTP as an API: HTTP has a set of methods.

Here are the header fields. All the resources have the same interface; they don’t have a different interface for each resource. Things like that. All of that design is still there. So we still focus on the simplicity of the interface, as opposed to creating a different interface for every application (there’s a small sketch of this below).

Yeah. It certainly is. But in terms of implementation, HTTP/2 is much more difficult than HTTP/1. I mean, it’s a huge difference in the implementation cost of getting it right. It’s the same with HTTP/3: with HTTP/3, you’re dealing with basically implementing the entire stack of HTTP and TCP and TLS within one implementation. It gives you a lot of flexibility, less dependence on other people’s software, but it is much more difficult for the infrastructure to implement. But, as it goes, fewer people are implementing the infrastructure anyway. Most people now get their HTTP services via those large CDNs, and the smaller sites can keep using HTTP/1 if they want. In terms of technological advancement, there’s not a huge benefit over HTTP/1 unless you’re doing authentication and fairly complex requests, multiple requests per site.

Right. Yeah. So the simpler cases might use simpler versions of the protocol, and then if you’re really doing high-traffic stuff, you need the latest thing, right? Right. Yeah.
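The uniform interface Roy describes can be pictured with a tiny sketch: one generic function works against any resource, because every resource shares the same methods; the URLs are hypothetical:

```python
# Sketch of HTTP's uniform interface: the same small set of methods
# applies to every resource, so one generic function needs no
# per-resource stubs. URLs below are hypothetical.
import http.client
from urllib.parse import urlsplit

def call(method: str, url: str, body: bytes | None = None) -> bytes:
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc)
    conn.request(method, parts.path or "/", body=body)
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return data

# The same interface works for wildly different resources:
# call("GET", "https://example.com/phonebook")
# call("PUT", "https://example.com/notes/1", b"hello")
```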

Okay. There’s two minutes left. We have one question from Yuval: do you think that technology and standards can help protect the freedom of sharing, almost, I would say, freedom of speech, in countries which are not generally willing to allow that? Can open standards help fight the closedness that some countries want?

Yes and no. Yes, in the sense that the fact that we’ve made these technologies ubiquitous and easily accessible to anyone who can download software has made it extremely difficult for individual areas to get cut off from the internet. But countries that are focused on doing that are well aware of how to cut off the internet. So there are countries who can flip a switch, and all internet access into the country is removed. And there’s simply nothing we can do from a standards perspective to get around that. The only real way that will change is satellites, which can’t easily be interfered with, being used as a replacement for the high-speed cables.

Because it’s just more difficult to interfere with them. But again, most large social problems can’t really be solved via technology. Sometimes technology can bring together people who want to solve them together, but it’s fairly rare that it changes things by itself. It’s still human beings on both sides of the connection.

Yeah, that’s right. Okay, so it looks like Hopin is going to… Hopin is working like a Swiss clock; it’s going to cut us off in 18 seconds. So, any good last words that you can say in 12 seconds?

No, thank you, Bertrand. And like you said, we’re still working together at Apache and at Adobe and having a great time. So good luck with the rest of the conference, with the rest of Meetin. I wish I could be there in person, but of course, we’re all stuck in our little places.

Yeah, that’s right. But it’s been great. Thank you. Thank you very much, Roy. And the conference is just beginning, so if people are still hearing us, they should stay around. Thank you and bye bye. Bye bye.
