Death To "Link Rot": Here's Where The Internet Goes To Live Forever

According to one recent study, half of the links in Supreme Court decisions either lead to pages with substantially altered content or no longer go anywhere at all. Perma.cc, a startup based out of the Harvard Law Library, wants to see more work immortalized.

The phrase "link rot" probably summons many images for you--none of them good.

And while clicking on a dead link isn't quite as physically unpleasant as, say, touching a piece of slimy, disintegrating wood, bad links are weakening the web as surely as bad beams can compromise a building.

When websites disappear or change, any piece of work--be it a blog post, book, or scholarly dissertation--that linked to those resources no longer makes quite as much sense. And some of these now-moldering links are structurally important to the fragile, enduring edifice of human knowledge: in fact, according to one recent study, half of the links in Supreme Court decisions either lead to pages with substantially altered content or no longer go anywhere at all.

In the face of this decay, the authors of that paper, the legal scholars Jonathan Zittrain, Kendra Albert, and Lawrence Lessig, floated one possible fix: create “a caching solution” that would help worthy links last forever. Now, this idea is being put into practice by Perma.cc, a startup based out of the Harvard Law Library. Old-school institutions like law school libraries, it turns out, may be perfectly positioned to fight against the new-school problem of link rot. Libraries, after all, are “really good at archiving things,” as Perma's lead developer, Matt Phillips, puts it.

“We have quite a history of storing things safely that are important to people for a really long time,” says Phillips, a member of Harvard's Library Innovation Lab. “It's a failure if we're not preserving what's being created online.”

To start with, Perma.cc's small team of developers, librarians, and lawyers has designed an archiving tool that's as easy to use as any link shortener. Stick in a link, and you'll get a new Perma-link--along with an archive of all the information on the page that link leads to. Anyone can sign up as a user, and create links with a shelf life of two years, with an option to renew. A select group of users, though, can “vest” links--committing Perma.cc to store their contents indefinitely.
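The lifecycle described above (a two-year shelf life, an option to renew, and vesting for indefinite storage) can be sketched in a few lines of Python. This is only an illustration of the rules as the article states them, not Perma.cc's actual code: the `PermaLink` class, its field names, and the choice of 730 days for "two years" are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# "Two years" comes from the article; counting it as 730 days is an assumption.
SHELF_LIFE = timedelta(days=2 * 365)


@dataclass
class PermaLink:
    """Hypothetical model of an archived link; not Perma.cc's real schema."""
    url: str
    created: date
    vested: bool = False
    expires: Optional[date] = None

    def __post_init__(self) -> None:
        # Unvested links start with a two-year shelf life.
        if self.expires is None and not self.vested:
            self.expires = self.created + SHELF_LIFE

    def renew(self, today: date) -> None:
        # Renewal restarts the clock; vested links have no clock to restart.
        if not self.vested:
            self.expires = today + SHELF_LIFE

    def vest(self) -> None:
        # Vesting commits the archive to indefinite storage.
        self.vested = True
        self.expires = None

    def is_live(self, today: date) -> bool:
        return self.vested or today <= self.expires
```

An ordinary user's link would stop being live once its expiry date passes unless someone renews it, while a vested link stays live regardless of the date.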

Since launching last fall, the project has grown rapidly, signing up a couple thousand users and recruiting 45 libraries and dozens of law journals as partners. But only a fourth of Perma.cc's users--472 “vesting members” and 113 “vesting managers,” at current count--have the power to grant links immortality (or as close to it as Perma.cc can manage).

“The problem is, in practice, it's a very serious commitment to say this will be kept forever,” says Jack Cushman, who started contributing to Perma.cc as a volunteer, before joining formally as a Harvard Law School Library fellow. “It's not something that we can promise to everyone in the world to begin with.”

Instead, they began with legal scholarship, the section of the world they knew best. At the top of Perma.cc's power structure (modeled, in part, after how the Internet's Domain Name System operates) are law libraries, already trusted resources, full of people who have years of practice deciding what sort of intellectual work should be kept around over the long run and who should have access to it. Libraries pick “vesting organizations”--mostly law journals, the sort of publications these libraries already commit to preserving as print resources--to anoint with the power to make a Perma-link a permanent part of the project. Those publications each have vesting managers, who can confer Perma power on individual vesting members. The idea is that those vesting members will create Perma-links and build them into law review articles at the time they're being written, edited, and finalized.
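That chain of delegation, from library to vesting organization to vesting manager to vesting member, can be modeled as a simple tree in which each party may only hand authority one step down. The sketch below is a reading of the article's description, not Perma.cc's implementation: the `Party` class, the role strings, and the rule that a chain must trace back to a library are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Roles from the article, highest authority first.
CHAIN = ["library", "vesting_organization", "vesting_manager", "vesting_member"]
# Per the article, vesting managers and vesting members can vest links.
VESTING_ROLES = {"vesting_manager", "vesting_member"}


@dataclass
class Party:
    """Hypothetical participant in the delegation chain."""
    name: str
    role: str
    granted_by: Optional["Party"] = None

    def delegate(self, name: str) -> "Party":
        # Authority can only be passed to the next role down the chain.
        idx = CHAIN.index(self.role)
        if idx + 1 >= len(CHAIN):
            raise PermissionError(f"{self.role} cannot delegate further")
        return Party(name, CHAIN[idx + 1], granted_by=self)

    def can_vest(self) -> bool:
        # Walk up to the root; the chain must start at a library (assumed rule).
        root = self
        while root.granted_by is not None:
            root = root.granted_by
        return root.role == "library" and self.role in VESTING_ROLES
```

Under these assumptions, a member created by `library.delegate(...).delegate(...).delegate(...)` can vest links, while someone who merely claims the `vesting_member` role without a chain back to a library cannot.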

“In the legal field, specifically, it's all about showing your work, showing your reasoning, and showing the evidence you're marshaling to support a particular point of view,” says Adam Ziegler, Perma.cc's project manager (who once worked as a lawyer himself). “When the resources aren't available to your reader, you're not doing your job.”

When a user puts a link into Perma.cc, it creates a perfect recording of what a web browser would see. Once that recording is made and archived, anyone looking to re-trace the author's logic has access to an exact copy of the web page that author was looking at when she constructed her argument.

That copy, ideally, will be stored at a host of libraries, on the theory that the best way to ensure the project endures is to build redundancy into the system. For centuries, Cushman points out, important documents have survived because people have valued them enough to make duplicates. Now, it might be possible to access the only copy of a particular document from anywhere with an Internet connection. But if something happens to that one copy, then that information is gone forever. If it's important information, it's safer to keep multiple copies, in various forms, in a number of places, far away from each other. That's exactly what libraries do, and have always done.
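The redundancy argument can be made concrete with a small sketch: store the same snapshot at several sites, identify it by a checksum, and accept whichever copy still matches that checksum. The article does not say how Perma.cc replicates or verifies its archives, so everything here (the function names, the in-memory dictionaries standing in for library servers, and the use of SHA-256) is an assumption for illustration.

```python
import hashlib
from typing import Dict, Optional

# Each "site" is a dict standing in for one library's storage (an assumption).
Sites = Dict[str, Dict[str, bytes]]


def digest(data: bytes) -> str:
    """Content fingerprint used both as storage key and integrity check."""
    return hashlib.sha256(data).hexdigest()


def replicate(snapshot: bytes, sites: Sites) -> str:
    """Write the same snapshot to every site and return its key."""
    key = digest(snapshot)
    for store in sites.values():
        store[key] = snapshot
    return key


def recover(key: str, sites: Sites) -> Optional[bytes]:
    """Return the first copy that still matches its fingerprint, if any."""
    for store in sites.values():
        data = store.get(key)
        if data is not None and digest(data) == key:
            return data
    return None
```

With three sites, one copy can be lost and another corrupted, and `recover` still returns the original bytes from the third. Only when every copy is gone or damaged does it return `None`, which is the single-copy failure the paragraph above warns about.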

Perma.cc is also banking on the relationship that the legal world already has with its law libraries, as trusted repositories of essential information. It's easier to convince people to store important information in the same place they've always stored it.

“When judges write their opinions, they're not just thinking about whether the information is reliable, but whether the web page is still going to be around in a year,” says Cushman. “The books that they cite are in stacks at the library and have been there for a hundred years. The Internet is a second-class citizen as a source of information. We're legitimizing the Internet as a resource for courts.”

And, potentially, for the wider world, too. "The problem is not unique to the legal world," says Ziegler. "We have a big, big vision to really take a bite of the link rot problem on a much larger scale." Outside of libraries, the team's gotten inquiries from curious law firms and legal publishers, and the next logical step for expanding would be into other fields of scholarship. On the grandest scales, though, this type of archive should be of interest to any publisher that wants to improve its readers' experiences--anyone using the Internet who cares about the long-term integrity of the text they're creating.

But libraries--even the Harvard Law Library--don't necessarily last forever. And, as with any startup, there's a relatively high chance that Perma.cc could fail. There is, however, a contingency plan to keep any links entrusted to the project from starting to slowly decay, like any old, rotten bit of the Internet. While the project is alive, the Perma team plans to submit or make their archives available to services like the Internet Archive. If it ever shuts down, the team is committed to maintaining already-vested links for at least two years, during which they'll search for a new host and give users tools to move their archived links elsewhere.

After that, there will be a copy of the entire database at the Harvard Law Library, where it will stay, mummified, for as long as anyone remembers it's there.

But when a startup is embedded within an institution--such as the New York Times' more innovative projects or the data team within the Obama campaign--success depends not on buyers but on buy-in. Perma.cc's funding comes both from the library's budget and from grants supporting the project (the budget, says Ziegler, is "very lean, well under $100,000"), and one of its most valuable resources is access to coders, designers, archivists, and lawyers already working at the law school and its library--people who might have plenty of other work to distract them, but who contribute to Perma.cc because they're excited about the project. Ultimately, the Perma.cc team has to keep justifying to Harvard that it's a valuable use of resources, and, if it wants to survive, start amassing long-term commitments from other libraries to serve as depots for the Perma.cc archive.

[Image: Getty Images, Logan Mock-Bunting]

5 Comments

  • Um... ever hear of Archive.org? The Wayback Machine? Am I ringing any bells here? For this entire article to be written without referencing the main system we have had for almost two decades for archiving web sites (yes including all their links, just go to the Wayback Machine and paste in the broken link to see what it historically pointed to), is... well it's like writing about a new encyclopedia startup without mentioning Wikipedia.

    I'll give you this: I can certainly see where 'fast' in fastcompany.com comes in to play here! Maybe you should add 'loose' - fastandloosecompany.com -- would be even more on point here!

  • Sarah Laskow

    Perma.cc is working with the Internet Archive. The Wayback Machine has limits -- it's a broad archive but it doesn't go particularly deep, and there's no guarantee any particular bit of the internet will be archived there. This is a more targeted effort, in which people make decisions about what specific pages should be saved.