First Things First, What is the Web?

What is the World Wide Web? Well, that’s easy. It’s the Uniform Resource Locator (URL), the Hypertext Transfer Protocol (HTTP), and Hypertext Markup Language (HTML). Three acronyms loosely connected on a decentralized string of servers and clients all bound together so that we can watch cat videos and check email and read the news and post pictures of food that is either about to be eaten or has just been eaten, among many, many other things.

That’s it. Thanks for reading! See you next time!

You want more? Fair enough, let’s dive deeper. The web is, fundamentally, a stitched-together collection of what were, at the time, emerging technologies. Only a handful of them were invented solely for the web, but all of them were extended through key innovations and insights that ultimately made up the web. Tim Berners-Lee, its creator, took a few great ideas others had already built and transformed them into something incredible, decentralized, and fully open. The web really came together when he took hypertext, put it on the Internet, and made it simple. Extremely simple. So simple that many thought the web was fairly useless. And then, in a few years, it took over the world.

I. Hypertext: The Door to a Digital World

If you pick up the strands of technology laced under the network of the web, you’ll eventually find yourself at hypertext, a concept difficult to explain or comprehend before computers, though, as a roughly sketched idea, it predates them by at least a century. So, the tricky bit about discussing hypertext is knowing where to start.

Does one, for instance, begin with the 1968 software demonstration by Douglas Engelbart, a demo that would eventually go by the moniker “The Mother of All Demos” and contained one of the first ever hyperlinks out in the wild? Or do we travel back to the final days of World War II, as Alex Wright did in his article The Secret History of Hypertext, when, in 1945, Vannevar Bush imagined a way to catalog and link together the entirety of human knowledge on an elaborate collection of microfilm? Or do we dive into the world of fiction: the annotated novels of Thomas Pynchon, the winding stories of Jorge Luis Borges, or, more recently, the digitally disconnected poetry of Judy Malloy, all of which articulate the idea of hypertext through fragmented prose?

The Garden of Forking Paths by Jorge Luis Borges is considered an early precursor to hypertext

Let’s start with something entirely uncomplicated. The link. The link is everything. If you’ve only ever experienced hypertext through the web, you’re very familiar with the link, that bit of underlined blue text that jumps you from one website to another with the click of a button (like this link to the definition of hypertext on Wikipedia). This is, however, the diet, watered-down version of the possibilities of the hyperlink and, by extension, hypertext.

Hypertext is, essentially, linked text. The word itself was coined by Ted Nelson in 1965, as he worked on a way to index and store the disparate and forking pathways of the human mind. Nelson observed that in the real world, we often give meaning to the connections between concepts; it helps us grasp their importance and remember them for later. The proximity of a post-it to your computer, the orientation of ingredients in your refrigerator, the order of books on your bookshelf. Invisible though they may seem, each of these signifiers holds meaning, either consciously or subconsciously, that is only fully realized when you take a step back. Hypertext was meant to bring those same kinds of meaningful connections to the digital world.

So, hypertext describes documents (or media) that are, in some way, linked to other documents (or media). The way in which they are linked, the very point of their connection, adds value that the individual texts could not possibly convey on their own.

In practical terms, it means you can boot up some type of software on your computer, create a quick blob of text, and then reference another blob of text, tell the computer why you made that particular connection, then easily jump back and forth between the two blobs by clicking a few buttons. Then you can create another link, jumping off from the second. Take it one step further and you may even get around to organizing your links, or separating them out into different sections, or creating one set of links to help your students learn about the history of Greek mythology and another to aid your colleagues as they trace constellations across the sky.

My point here being that the kind of software hypertext enabled was some real blue-sky stuff. By the mid-’80s, an entire hypertext community of software engineers and designers was positively buzzing with new software, and had begun to take the idea of hypertext and make it a reality. They created programs and applications for researchers, academics, even for off-the-shelf personal computers. They let users organize and arrange content in a way that made sense. Every research lab worth its salt had a skunkworks hypertext project. Together they built entirely new paradigms into their software, processes and concepts that feel wonderfully familiar today but were completely outside the realm of possibility just a few years earlier.

And they were doing some really interesting things with links.

At Brown University, where Ted Nelson had worked on some of the earliest hypertext research, Norman Meyrowitz, Nancy Garrett, and Karen Catlin were the first to breathe life into the hyperlink, introduced in their program Intermedia. At Symbolics, Janet Walker was toying with the idea of saving links for later, a kind of speed dial for the digital world – something she was calling a bookmark. At the University of Maryland, Ben Shneiderman sought to compile and link the world’s largest source of information with his Interactive Encyclopedia System.

Dame Wendy Hall, at the University of Southampton, sought to extend the life of the link further in her own program, Microcosm. Each link made by the user was stored in a linkbase, a database apart from the main text specifically designed to store metadata about connections. In Microcosm, links could never die, never rot away. If their connection was severed, they could point elsewhere, since links weren’t directly tied to text. You could even write a bit of text alongside links, expanding on why the link was important, or add separate layers of links to a document: one, for instance, a carefully curated set of references for experts on a given topic, the other a more laid-back set of links for a casual audience.

Dame Wendy Hall working on her Microcosm hypertext software, one of the more advanced of its kind. Image from Department for Digital, Culture, Media and Sport.

There were mailing lists and conferences and an entire community that was small, friendly, and fiercely competitive, locked in an arms race to find the next big thing. It was impossible not to get swept up in the fervor. Hypertext enabled a new way to store actual, tangible knowledge; with every innovation the digital world became more intricate and expansive and all-encompassing.

Then came the heavy hitters. Under a shroud of mystery, researchers and programmers at the legendary Xerox PARC were building NoteCards. Apple caught wind of the idea and found it so compelling that they shipped their own hypertext application called HyperCard, bundled right into the Mac operating system. If you were an Apple user in the late ’80s, you likely have fond memories of HyperCard, an interface that allowed you to create a card, and quickly link it to another card, and another. Cards could be anything, a recipe maybe, or the prototype of your latest project. And one by one, you could link those cards up, visually and with no friction, until you had a digital reflection of your ideas.

Towards the end of the ’80s, it was clear. Hypertext had a bright future.

II. The Promise of Xanadu

I’d like to pause for a moment and come back to Ted Nelson. Nelson, who you may recall coined the term hypertext, was not a programmer. He was a man of ideas, and his ideas all seemed to coalesce into a single project that has dominated his entire career: Xanadu. There is no more definitive retelling of the storied and fraught history of the Xanadu project than Gary Wolf’s exhaustive article in Wired, The Curse of Xanadu. A story that tracks Nelson’s hypertext project as its team moves from a university, to Nelson’s own house, and straight through to one of the largest software companies in the world, Autodesk. A story that, for all its dramatic ups and downs, in-fighting, and massive investment, has very little to show in terms of actual hypertext software. Wolf called Xanadu “the longest-running vaporware project in the history of computing – a 30-year saga of rabid prototyping and heart-slashing despair.” Consider this. Wolf published his article in 1995. Twenty years later, Xanadu is still in semi-active development.

So we consider Nelson, and Xanadu, not for their tangible contributions to the hypertext community. There are others that have added projects and code and working ideas for use by the public. No, Xanadu is important because of its idea.

Nelson believed that the human mind could be mapped. Not with the circles and arrows of today’s mind-mapping software but through an intricate lacework of links and documents connected in ways previously unknown and undiscovered. Nelson was fascinated with the concept of an original. He wanted Xanadu to have an entry for every single topic you could ever imagine, like a vast and immersive version of today’s Wikipedia, and a network of two-way links that criss-crossed over one another, each link giving meaning not just to where it was linking to, but to where it was linked from, representing how topics were connected and the ways in which we, as human beings, think about them.

Wolf called Xanadu a curse. But it was also a promise. It was to be the ultimate realization of the hypertext dream. All knowledge, indexed and linked together to create a meaning we can only find in the synaptic firings of our mind. In an interview with the BBC, Nelson said of the web,

It’s massively successful. It is trivially simple. Massively successful like karaoke – anybody can do it.

But even today, we have nothing but the smallest prototype of what Xanadu could do. The web, as an idea, doesn’t hold a candle to Xanadu, or any of the other hypertext projects that actually came into being. Berners-Lee stumbled onto something much more important, much more fundamental. Complexity is the enemy of ideas. If you want to connect the world, you have to think simple.

III. The Web Meets Hypertext

I tell you all of this, without much mention of the web, to ground you in the world before it ever existed. I think there’s this myth that goes around that the web was some magical tabula rasa born into an otherwise disconnected Stone Age, which shook the world and ushered it into the Information Age. Truly, it’s the opposite. The web was rudimentary by hypertext standards and, initially, couldn’t live up to the hype.

It’s a story you likely know — it’s one I’ve told before — Tim Berners-Lee was toiling away at CERN, a research lab mostly known for high-level physics and its Large Hadron Collider, when, through a bit of corporate maneuvering, he managed to get internal funding for a project promising to deliver a cross-platform system for sharing and compiling and indexing the research notes of the many scientists that made up the lab. This was all, it would turn out, a bit of a ruse. From the beginning, Berners-Lee planned on creating a read-write system for the Internet, one that would allow for the free distribution of text and media over the wires of a global network.

Tim Berners-Lee, sitting in front of the NeXT computer he used to build the web

Berners-Lee was obsessed with this idea. But if he was going to share documents on the Internet, he needed a standard way to not only define them, but to link them together. Which, of course, brings us back to hypertext. He decided that his new project, what would ultimately become the World Wide Web, would use an almost impossibly simple language for its documents. A markup language of his own creation based on SGML, itself a standard mostly used for digital documentation and formatting documents for cross-platform printers, with a dash of hypertext mixed in. He called it HyperText Markup Language, or HTML.

HTML is, essentially, a collection of building blocks that can be used to create a structured document, what we now call a webpage. The first version of HTML allowed for headings, paragraphs, lists, things like that. Each had a specific tag (e.g. <h1>, <ul>) which could be read by a client, the browser, and output onto a screen. Then, of course, there was the link tag (<a href="#">Link</a>), that magic bit of hypertext technology that would soon become the defining element of the entire web. But more on that later.
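
To make those building blocks a little more concrete, here is a rough sketch of a page assembled from the elements just described, written out and saved by a few lines of Python. The markup is illustrative rather than a faithful copy of the earliest HTML drafts, and the file names are invented purely for the example.

    # A page built from early HTML's building blocks: a heading, a paragraph,
    # a list, and the all-important link. Illustrative only.
    page = """<h1>My Research Notes</h1>
    <p>A few documents worth keeping track of.</p>
    <ul>
    <li><a href="http://info.cern.ch/">The first web server, at CERN</a></li>
    <li><a href="hypertext.html">My own notes on hypertext</a></li>
    </ul>
    """

    # Save it to disk; any browser, then or now, can render something like this.
    with open("notes.html", "w") as f:
        f.write(page)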

As Claire Evans points out in her book Broad Band: The Untold Story of the Women Who Made the Internet, HTML, and the web, didn’t offer very much to the hypertext community. There were plenty of hypertext projects that were light-years ahead of where the web was at its conception. Links could only point one way, and they were inextricably tied to the document they linked from. Worse, if the destination of a link ever disappeared, the link itself would remain, left to rot away. There was no structure or hierarchy or centralized system of control. There was no way to organize links or add meaning to their connection beyond the text contained within the link itself. There were just links. One-way links that could point to anything.

HTML, and the web, was so underwhelming that its proposal was rejected for a demo at the Hypertext conference in 1991. But Berners-Lee grabbed his computer and brought it to the conference anyway, cornering anyone he could to talk up his new web project. When he got there, he realized the conference wasn’t set up with proper access to the Internet, so all he could really show off was a bit of HTML in an un-networked vacuum which, as we’ve seen, wasn’t all that interesting.

Berners-Lee hoped that his demo would convince some of the more seasoned hypertext veterans that the web was a platform worth building for. But all they could see was an elementary hypertext system running on an incredibly powerful but prohibitively expensive NeXT computer, with none of the features the pioneers of the field had worked out years ago. They were used to transferring files over FTP, downloading software onto their own computers, and building advanced, revolutionary applications for single users.

It was hard to see that the web was not software. It was a platform. It was simple by design, so that anyone could use it. The web didn’t need to be downloaded to expensive NeXT machines. In fact, it didn’t need to be downloaded at all. It all lived on the Internet. Anyone could create a website, and link to anyone else, and build whatever they wanted. They could create an entire page filled with just dead links. They could build a page that was just purple. And then, and this is kind of the point, they could link to other websites. No one was in control. Berners-Lee was imagining a hypertext system made up entirely of people, built on top of cutting-edge technology. But that was extremely hard to see at the beginning.

At the conference, and really anywhere he could, Berners-Lee began passing out the code for his own NeXT client, which generated HTML, as well as the code for HTTP, the piece that connected the web to the Internet. And that connection to the Internet gave it life.

IV. The Web Comes Online

The problem was operating systems. CERN was filled to the brim with researchers using different computers, applications, and operating systems. Berners-Lee wanted to transcend that. When a program like Microcosm or NoteCards was released to the public, it had to be built and rebuilt for every operating system it hoped to support. That was just how things were done.

Computers with different operating systems, they don’t communicate well. So the question was, how can we get all these different computers talking to each other? The answer: talk slowly and talk simply. Guarantee the messages are straightforward. Build in redundancies. Prepare for failure. Make it universal. Turn any computer into both a transmitter and receiver. That was the elegance of HTTP.

HTTP, the Hypertext Transfer Protocol, sits on top of the Internet, and it’s used to exchange files between computers. Some of these computers are servers; they broadcast the content of webpages. Most computers, though, act as clients; they request webpages from servers and display them on the screen. HTTP is the glue. It’s how Berners-Lee managed to get different computers with different operating systems talking over a massive, decentralized network.

Except for a few key standardizations, HTTP added very little on top of TCP/IP, the suite of protocols that already carried traffic across the Internet. In fact, it stripped quite a bit away. A client, e.g. the browser you’re reading this on, can make only one of a limited set of requests, and only one at a time. It can, for instance, ask to GET a webpage. It can also ask to PUT a new webpage on the web, or DELETE a page entirely. It’s up to servers to react and respond to these requests. Once one does, it passes some content or data back to the client. In most cases, this is a fully rendered webpage. In some cases, it’s nothing at all.
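
The whole exchange is plain text, which makes it easy to watch for yourself. Here is a minimal sketch in Python that opens a connection, sends a single GET request, and prints the headers that come back. The host, example.com, is just a stand-in for any web server, and the request is written in the slightly later HTTP/1.0 style rather than the even sparser original.

    import socket

    # One complete HTTP conversation: connect, send a plain-text request,
    # then read the plain-text response until the server closes the connection.
    request = (
        "GET / HTTP/1.0\r\n"
        "Host: example.com\r\n"
        "\r\n"
    )

    with socket.create_connection(("example.com", 80)) as conn:
        conn.sendall(request.encode("ascii"))
        response = b""
        while chunk := conn.recv(4096):
            response += chunk

    # The status line and headers arrive first, separated from the page by a blank line.
    headers, _, body = response.partition(b"\r\n\r\n")
    print(headers.decode("latin-1"))

That really is the whole surface a simple client needs: a verb, a path, and whatever the server decides to send back.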

The very first web browser out of CERN

That’s all there is to it. HTTP is an incredibly straightforward protocol for sending and receiving webpages on the Internet.

HTTP is also stateless. Stateless means forgetful, essentially “dumb.” Each time a client makes a request to a server, the server responds to that request. It has no memory of previous requests and no way of storing information for future requests. Browsers say, “I want this page,” and servers say “Ok, here it is.” The next time around, the whole process starts again from scratch.

FTP, the dominant protocol for exchanging files before the web, is very much stateful. It can hold a connection open between servers and clients across subsequent requests. It’s incredibly robust, and nearly limitless in capacity. It can also be really slow, and that’s when it works right.

Why make HTTP so bare-bones? Why give it a forgetful brain with no memory? Well, it bought the web two things other protocols lacked: speed and adaptability. For the web to be impressive, it needed to be fast. Pages loading in milliseconds fast. Stripping all the cruft out of HTTP gave the web its wings: it could make requests quickly, even when those requests were still churning through phone lines. And paring its implementation down to a few simple requests meant just about anyone could use it. A programmer fairly well-versed in Internet technology could build a web server or client in an afternoon. If you followed the few rules HTTP laid out, you had everything you needed. That’s extremely important when you’re planning on giving the whole thing away for free. But I’m getting a bit ahead of myself.
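
That afternoon-project claim more or less still holds. As a sketch, and leaning on Python’s standard library to handle the request parsing, serving a directory of pages (including a file like the notes.html written earlier) to any browser on the network takes only a few lines:

    from http.server import HTTPServer, SimpleHTTPRequestHandler

    # Serve the files in the current directory to any browser that asks for them.
    # The standard library parses the requests here, but the protocol is simple
    # enough to implement by hand on top of a socket.
    server = HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler)
    print("Serving on http://localhost:8000/ (Ctrl-C to stop)")
    server.serve_forever()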

Like HTML, HTTP’s design was practically an exercise in restraint. On the surface, it felt like it did very little. But it gave the web a solid foundation, a thin layer of abstraction resting gently on top of the Internet. Perhaps Berners-Lee had some idea that we’d use that foundation to build an entirely new world. But there were still many that couldn’t quite see it.

V. Finding Your Way on the Web

All these wonderful technologies mixed together would mean very little if people couldn’t find where they wanted to go once they got to the web. In order to do that, Berners-Lee needed some way for people to give their browsers directions. Something that, and I’ll say this once again for posterity, was simple.

Out of this necessity came the Uniform Resource Locator. The URL. A specification that, in its very first draft, took up no more than a single page. A technology that Berners-Lee has gone on record on more than one occasion to state is the single most important piece of the World Wide Web.

It is, essentially, an address. We used to call it that, even. A web address. It’s nothing more than a string of words and slashes that tell a browser where we want to go.

These words and slashes aren’t random. The URL format is intentionally predictable and carefully constructed, but plain enough that you can instinctively recall its structure. You likely don’t even think about it much. But it’s actually made up of a few different conventions, each of which predates the web. Every time you hit a slash, you reach something else that Berners-Lee borrowed from around the Internet. The URL takes the domain name, the part of the URL they print on billboards (thehistoryoftheweb.com), combines it with what was, at the time of the web’s conception, an established and fairly standard file path syntax (/timeline), and adds a scheme to the front of it (http).
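
Those same pieces fall right out of any URL parser today. A quick sketch using Python’s standard library and the address mentioned above (the second address is invented purely for the example):

    from urllib.parse import urlsplit

    # The three pieces described above: a scheme, a domain name, and a
    # file-style path.
    parts = urlsplit("http://thehistoryoftheweb.com/timeline")
    print(parts.scheme)   # http
    print(parts.netloc)   # thehistoryoftheweb.com
    print(parts.path)     # /timeline

    # The same structure holds for other schemes, which was part of the point.
    print(urlsplit("ftp://ftp.example.org/pub/readme.txt").scheme)   # ftp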

Berners-Lee could have invented his own addressing scheme. He didn’t. He stuck to common best practices, conventions that were already bouncing around in the heads of programmers. It made the URL feel familiar to the people expected to build with it, even if they had never seen one before.

It’s also flexible enough to leave some room for interpretation. That scheme at the beginning, the one you usually see as http or https? That can be anything. So if a browser decides it wants to support FTP, it can easily be swapped out for ftp://. Web browsers didn’t have to just be web browsers. A good number of early browsers supported Gopher, a long-gone competitor to the web, that way, via gopher://. And websites didn’t have to link only to other websites; they could link to anything speaking a common Internet protocol. The file path at the end of a URL is wholly up to servers to decide how to handle. There are a number of agreed-upon best practices, but you can do whatever you want with it, as long as each page’s address is unique.

This gave browser makers and web developers options. Berners-Lee originally wanted URL to stand for Universal Resource Locator. It was only after a contentious debate that it was officially scaled back to Uniform, but the original name speaks to the URL’s intention. It was imagined as a common address for every Internet resource.

The URL changed everything. It closed the hypertext loop. Each page on the World Wide Web has a single address you can link to. It gave everyone a unique corner of the web they could call their own.

VI. Giving It All Away

Now here’s the twist. Here’s the part that changed everything. It’s the reason why Berners-Lee opted for versatility over complexity.

The web, from day one, was free. After the first couple of years, the web team at CERN managed to convince their employer to enter it into the public domain. That means nobody owns the web. Anybody can use it, or build on top of it. You don’t need permission to create a website, like you do with apps in an app store or pages on a social media site.

The first International Web Conference, held in Geneva in 1994. Image from ABC.net.

Upon its release, the web was even small enough and simple enough to fit on a floppy disk. Imagine that, the whole web on a single floppy. Every time Berners-Lee was at a conference, he’d pass copies out. Every mailing list he was on got sent a link to download it. The building blocks of the web could be downloaded quickly and understood in a matter of days.

Making the web open was its last great innovation. It’s the reason that people began to gravitate towards it even when there were more robust alternatives. It put the web in the hands of people, all people. At CERN, Berners-Lee gave the web solid footing. Within a few short years, people had already begun building new features and applications he hadn’t even dreamed of. The next stage of the web, its movement outside the hallways of CERN and into a much larger community, would come about precisely because of its default openness. But that’s a story for next time.

Sources