Rolling Stone on Internet Archive

Jon Blistein, Inside the $621 Million Legal Battle for the ‘Soul of the Internet’, Rolling Stone, 29 September 2024

Major record labels have sued the online library Internet Archive over thousands of old recordings, raising the question: Who owns the past?

In the old chapel of a former Christian Science church in San Francisco, late-afternoon sun pours orange through the windows, and several giant servers are hard at work. The tall, black towers fill two large alcoves, and their cooling fans emit a serene, industrial hum as blue lights blink. Each flicker, says the man in charge of this operation, Brewster Kahle, is a virtual patron of the enormous digital library called the Internet Archive.

Across the room, near where the pulpit would be, sits a sculpture of chunky, red, antiquated computer monitors flashing bygone web pages — a snapshot of the World Wide Web in 1997, and the earliest entry to the Internet Archive’s enormous digital collection.

But it’s the statues — lined up against the walls and in the pews — that grab your attention. Hundreds of miniature replicas of past and present Internet Archive employees. Three more just arrived yesterday, Kahle, 63, tells me last fall as he points out his own. The statue’s wire-frame glasses match his; the white hair, however, is more tight and curled than the tufts that halo his actual head, making him look a bit like Doc Brown from Back to the Future. Kahle’s statue carries a book in one hand and a computer mouse in the other, the latter held out like an offering.

“Nobody makes any money here, right?” says Kahle. “So you do this for some other reason. Why do you do it? Because you want to be proud of what you do.”

I ask what you have to do to earn a statue.

“You have to work at the Archive for three years,” replies Kahle, who founded the Archive as a nonprofit in 1996. “You have to spend your time doing a public service. Oh, did I tell you?” he adds as a casual aside. “The major record labels are trying to destroy us.”

Ceramic sculptures of Internet Archive employees at the company’s headquarters in San Francisco. Jason Scott/Courtesy of the Internet Archive.

To many, the Internet Archive is its own kind of sanctuary — a vestige of a bygone internet built on openness and access, a Silicon Valley standout interested not in series funding or shareholder value, but in the preservation of any piece of the cultural record it can get. But to the corporations and people that own the copyrights to large swaths of that record, the Internet Archive is like a pirate ship stuffed with digital plunder. Two lawsuits have brought these long-simmering tensions to the courts and public consciousness, with financial repercussions in the hundreds of millions that could bring down the internet’s greatest library.

“The Library of Alexandria for the Digital Age”

Before founding the Internet Archive, Kahle worked as a computer scientist, making major contributions to personal computing and the early internet during the Eighties and Nineties.

With the Archive, he says, “The whole idea was to build the Library of Alexandria for the digital age. To build universal access to all knowledge.”

The Archive is best known for its preservation of the ephemeral expanses of the World Wide Web, available through its one-of-a-kind archive/search engine, the Wayback Machine. But this is just one facet of its collection: Working with museums, libraries, and individual donors and contributors, the Archive has amassed more than 145 petabytes of material (if you took more than 4,000 digital photos every day for the rest of your life, you might end up with 1 petabyte). Much of this material is obsolete or out of print — books, microfilm and microfiche, old software, video games, obscure VHS tapes, TV news programs, historic radio shows, and hundreds of thousands of concert recordings.

“It’s a research library. It’s there to record and make available an accurate version of the past,” Kahle says. “Otherwise, we’ll end up with a George Orwell world where the past can be manipulated and erased.”

But this work has long rankled one of the most powerful forces in the United States — rights holders — and the threat of copyright lawsuits has always loomed over the Archive. Lawrence Lessig, the legal scholar and Archive ally, even predicted Kahle would wind up in court in a 2001 New York Times interview, days after the Wayback Machine launched.

It took nearly two decades — during which the Archive occasionally faced smaller legal challenges — but Lessig was right. In June 2020, several book publishers sued the Internet Archive following the launch of its pandemic-era National Emergency Library, which made its collection of scanned books available to borrow freely and without restrictions amid school, university, and library closures. The publishers claimed mass, willful copyright infringement and won a summary judgment in the lower courts last March. (The Archive appealed, but lost again earlier this month.)

Internet Archive founder Brewster Kahle. Rory Mitchell/The Mercantile 2020/CC-by-4.

The same day the district court settlement was announced in August 2023, a set of music-industry clients — led by major record labels Universal Music Group and Sony Music — filed their own copyright-infringement lawsuit over another Archive endeavor, the Great 78 Project: an unprecedented effort to digitize 78 rpm records, the obsolete shellac discs that emerged in the 1890s and remained the dominant format for audio recordings until vinyl surpassed them in the 1940s and 1950s.

The Great 78 Project bills itself as a “community project for the preservation, research and discovery of 78 rpm records”; the labels, in their lawsuit, call it an “illegal record store.” They claim the availability of these digitized 78s constitutes “wholesale theft of generations of music,” with “preservation and research” used as a “smokescreen.” They further argue that the project “undermines the value” of the original recordings, and “displace[s]” authorized streams that actually generate royalties and revenue.

Attorneys for the labels forwarded interview requests to the Recording Industry Association of America, which declined to comment for this article. Ken Doroshow, chief legal officer for the RIAA, previously said that the suit was meant to address “the industrial-scale infringement of some of the most iconic recordings ever made.”

If you want to explore the old, weird expanses of recorded-sound history, there’s no better resource than the Great 78 Project. To build it, the Internet Archive contracted the services of expert audio preservationist George Blood, whose team has digitized and uploaded (with detailed metadata) more than 400,000 recordings since 2017. Click around, filter by year, genre, language, and you’ll find an infinite scroll of discs — most issued by long-defunct labels like Victor, Vocalion, Edison, Oriole, Okeh, and Brunswick — their front labels photographed and laid out in a grid, each one leading to a web page with a straight rip of the crackly recording. A folk, blues, or country tune, a lost jazz gem or minor big-band hit, a Yiddish comedy bit, Hungarian opera, Argentine tango, polka, foxtrot, gospel, hymns, or even just the sound of a person laughing because that’s what people wanted to hear when it became possible to record a human voice.

Blood, who is also named as a defendant in the lawsuit, calls this “preservation of the cultural record” one of the “great accomplishments” of the Great 78 Project. “Probably 95 percent or more of this content is not available anywhere,” he tells Rolling Stone. “Whether they were small labels, or obscure pressings, they have been lost to time.”

Digitizing a 78 for the Great 78 Project. George Blood/Courtesy of the Internet Archive.

Of these hundreds of thousands of recordings, the record labels sued over the uploading of 4,142 (an amended complaint from earlier this year added 1,393 recordings to the initial 2,749). Most are by recognizable legacy acts whose music is still widely available: Billie Holiday, Louis Armstrong, Elvis Presley, Chuck Berry, Hank Williams, Frank Sinatra, Benny Goodman, Ernest Tubb, and Peggy Lee. (These recordings are now no longer available on the Great 78 Project, per Kahle.) The potential damages are staggering — $150,000 per recording (the statutory maximum for an infringing incident), with a possible total of more than $621 million. If the labels win, with a broad enough judgment, it could end the Internet Archive. (The most recent action in the case was a private mediation session between the parties earlier this week. As it stands, the case is moving forward with the discovery phase scheduled to last through most of 2025.)

Above the doorway to Kahle’s office is a street sign: Librarian Place. With a self-deprecating drone, Kahle ticks off some of the achievements hanging on his wall: “I’m in the Internet Hall of Fame, American Academy of Arts and Sciences, American Antiquarian Society.” Not missing a beat, he marvels at the new distinction these lawsuits seem to have foisted upon him: “And suddenly we are bringing down capitalism.”

“Openness Is the Way to Go”

Kahle grew up in Scarsdale, New York, the son of a mechanical engineer who instilled in him a post-WWII ethos — “You can build things, you can try to make things better” — that later cross-pollinated with hippie idealism as Kahle studied engineering and computer science at MIT in the late Seventies and early Eighties. Kahle bolstered his studies with courses in history, Buddhism, and library science, even as many of his MIT peers treated the humanities like gym class.

Kahle was enamored with libraries, and they informed his two major contributions to the early internet era. In the Eighties, he joined the supercomputer company Thinking Machines, where he helped develop the Wide Area Information Server (WAIS), an early online publishing system and search engine modeled after the way people asked questions of librarians. His next company, Alexa Internet, founded in 1996, was named for the Library of Alexandria and crawled the web for information to create a quasi-card catalog for the internet.

During our conversations, Kahle name-checks a variety of 20th-century texts such as “As We May Think,” Computer Lib/Dream Machines, and Practical Digital Libraries, all of which conjure similar visions of a future where information, and people, are liberated through technology and libraries. Kahle believed the internet could replace the world he grew up in, where information was confined to a few TV channels, textbooks, and newspapers. That was a “game of very few winners,” he likes to say. He wanted to make a game with many.

When WAIS spun off into its own company in the early Nineties, Kahle had the chance to develop it further with Steve Jobs at NeXT. He declined. Jobs, says Kahle, “was not interested” in building out WAIS with search tools that were fully open to the public.

“Openness is the way to go,” Kahle says, “even though I won’t become as rich, because who cares about getting rich? How do we make it so there’s lots of writers, publishers, booksellers, and libraries that have their own niches? How do we make it so it’s many-to-many-to-many, without any central points of control?”

Kahle still got rich. WAIS sold to AOL for $15 million in 1995. And Amazon, enchanted by Alexa’s web-crawling capabilities, bought it for a reported $250 million in stock in 1999. (As part of the deal, Amazon agreed to keep donating those web crawls to the Internet Archive for preservation.)

Though Kahle’s ideals have never wavered, his creations were subsumed by a Silicon Valley behemoth feeding off all things antithetical to his vision of an open internet: advertising models, insane capital markets, and the ultimate “poison” (as he calls it), monopoly power. This was how you got tight controls on information, locked up behind towering paywalls. A game of few winners.

“We’ve taken the promise of the internet and shafted it,” Kahle says. “We convinced people — I was one of them — to turn to their screens to answer questions.”

Tamiko Thiel, Danny Hillis, Carl Feynman, and Brewster Kahle (clockwise from top left) at Thinking Machines, May 1985. Tamiko Thiel/Courtesy of the Internet Archive.

So, as the internet zagged, Kahle took his millions and brilliance and built his bastion. And though he spurned Silicon Valley’s hypercapitalist bloodlust, he continued to embrace its swashbuckling, sometimes heedless pursuit of a goal. To grow the Internet Archive, that meant dancing around and prodding the limits of copyright law. Though to the Archive’s detractors, this often looked like blatant disregard.

In their lawsuit, the labels hit the Archive for its “long history of opposing, fighting, and ignoring copyright law, proclaiming that their zealotry serves the public good. In reality, Defendants are nothing more than mass infringers.”

Still, “serving the public good” seemed to earn the Archive some leeway. Jessica Litman, a University of Michigan Law School professor and copyright expert, notes that the Wayback Machine was able to skirt major challenges because it “became a really well-accepted resource,” and no one else (including the Library of Congress) was willing to put up the money, or take the copyright risk, to index the web.

When rights holders did demand something be taken down, the Internet Archive obliged: “Always with respect and in conversation,” Kahle says. Sometimes, a compromise was reached, like in 2005, when the Internet Archive and the Grateful Dead found a solution to keep the band’s myriad concert recordings available on the “Live Music Archive.”
