What Ever Happened to Google Books?

It was the most ambitious library project of our time—a plan to scan all of the world’s books and make them available to the public online. “We think that we can do it all inside of ten years,” Marissa Mayer, who was then a vice-president at Google, said to this magazine in 2007, when Google Books was in its beta stage. “It’s mind-boggling to me, how close it is.”

Today, the project sits in a kind of limbo. On one hand, Google has scanned an impressive thirty million volumes, putting it in a league with the world’s larger libraries (the Library of Congress has around thirty-seven million books). That is a serious accomplishment. But while the corpus is impressive, most of it remains inaccessible. Searches of out-of-print books often yield mere snippets of the text; there is no way to gain access to the whole book. The thrilling thing about Google Books, it seemed to me, was not just the opportunity to read a line here or there; it was the possibility of exploring the full text of millions of out-of-print books and periodicals that had no real commercial value but nonetheless represented a treasure trove for the public. In other words, it would be the world’s first online library worthy of that name. And yet the attainment of that goal has been stymied, despite Google having at its disposal an unusual combination of technological means, the agreement of many authors and publishers, and enough money to compensate just about everyone who needs it.

The problems began with a classic culture clash when, in 2002, Google began just scanning books, either hoping that the idealism of the project would win everyone over or following the mantra that it is always easier to get forgiveness than permission. That approach didn’t go over well with authors and publishers, who sued for copyright infringement. Two years of insults, ill will, and litigation ensued. Nonetheless, by 2008, representatives of authors, publishers, and Google did manage to reach a settlement to make the full library available to the public, for pay, and to institutions. The settlement also called for putting terminals in libraries, though the parties never got around to doing that. But that agreement then came under further attacks from a whole new set of critics, including the author Ursula Le Guin, who called it a “deal with the devil.” Others argued that the settlement could create a monopoly in online, out-of-print books.

Four years ago, a federal judge sided with the critics and threw out the 2008 settlement, adding that aspects of the copyright issue would be more appropriately decided by the legislature. “Sounds like a job for Congress,” James Grimmelmann, a law professor at the University of Maryland and one of the settlement’s more vocal antagonists, said at the time. But, of course, leaving things to Congress has become a synonym for doing nothing, and, predictably, a full four years after the court decision was first announced, we’re still waiting.

There are plenty of ways to attribute blame in this situation. If Google was, in truth, motivated by the highest ideals of service to the public, then it should have declared the project a non-profit from the beginning, thereby extinguishing any fears that the company wanted to somehow make a profit from other people’s work. Unfortunately, Google made the mistake it often makes, which is to assume that people will trust it just because it’s Google. For their part, authors and publishers, even if they did eventually settle, were difficult and conspiracy-minded, particularly when it came to weighing abstract and mainly worthless rights against the public’s interest in gaining access to obscure works. Finally, the outside critics and the courts were entirely too sanguine about killing, as opposed to improving, a settlement that took so many years to put together, effectively setting the project back a decade if not longer.

In the past few years, the Authors Guild has usefully proposed a solution known as an “extended collective licensing” system. Using a complex mechanism, it would allow the owners of scanned libraries of out-of-print books, such as Google or actual non-profits like the HathiTrust library, to make a limited set of those books available with payouts to authors. The United States Copyright Office supports this plan. I have a simpler suggestion, nicknamed the Big Bang license. Congress should allow anyone with a scanned library to pay some price (say, a hundred and twenty-five million dollars) to gain a license, subject to any opt-outs, allowing them to make those scanned books available to institutional or individual subscribers. That money would be divided equally among all the rights holders who came forward to claim it in a three-year window, split fifty-fifty between authors and publishers. It is, admittedly, a crude, one-time solution to the problem, but it would do the job, and it might just mean that the world would gain access to the first real online library within this lifetime.

The New Hillary Library?

According to statistics furnished by the American Library Association for 2012, academic libraries spent $2.8 billion on information resources, of which half was for electronic serial subscriptions. Stanford University pays $1.2 million for annual subscriptions to four hundred RELX journals, which contain a large number of articles written by its own faculty.

.. Of course, there is no escape from the costs of publishing journals. But nonprofit, professional associations, such as the American Historical Association, have demonstrated that the costs can be covered by publishing excellent work at reasonable prices. The problem concerns commercial journals that have a monopoly on the literature in specialized fields, and they can be combatted by competition—that is, the creation of open-access journals.

.. Libraries lend themselves to utopian fantasies. They can be places for combining endless, unexpected trains of thought, as in Alberto Manguel’s The Library at Night. They can also set off nightmares, as in “The Library of Babel,” a dystopian fantasy by Jorge Luis Borges, one of Manguel’s predecessors at the National Library of Argentina. Borges’s vision of a hopeless search for truth in an infinite world of books suggests the sense of helplessness that can overcome anyone lost in cyberspace.

.. Hayden’s main problem concerns copyright, which covers most books published after 1923 and all books published after 1964. An unknown number of books published between those dates are orphans—that is, books covered by copyrights whose owners, if they exist, cannot be identified, even by long and costly research. No one knows how many orphan books exist—probably hundreds of thousands, perhaps more than a million.

 

The Oracle-Google Case Will Decide the Future of Software

But since the appeals court has already ruled that APIs are subject to copyright, that could open a whole new frontier of lawsuits aimed at startups and open source projects that have copied APIs in order to ensure their products are compatible with popular commercial products.

For example, several companies have built open source software that works with various cloud services in an attempt to make it easier for customers to move their applications from, say, Amazon to their own data centers. Basho and SwiftStack, to name just two, each offer storage products that are compatible with Amazon’s cloud storage service S3. Since APIs are subject to copyright, Amazon could in theory go after both companies for copyright violations.
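
To make concrete what "compatible" means here, the following is a minimal Python sketch, not anything taken from Amazon, Basho, or SwiftStack. Every name in it (ObjectStore, VendorStore, CompatibleStore, put_object, get_object, archive_report) is hypothetical and chosen only for illustration. It assumes the "API" is simply the set of method names and signatures, while each vendor writes its own implementation behind them.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical storage API: names and signatures only, no behavior."""

    def put_object(self, bucket: str, key: str, body: bytes) -> None: ...

    def get_object(self, bucket: str, key: str) -> bytes: ...


class VendorStore:
    """Stand-in for the original commercial service (illustrative only)."""

    def __init__(self) -> None:
        # One flat mapping keyed by (bucket, key).
        self._objects = {}

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._objects[(bucket, key)] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]


class CompatibleStore:
    """Independent implementation that reuses the same names and signatures."""

    def __init__(self) -> None:
        # Different internals: one nested mapping per bucket.
        self._buckets = {}

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._buckets.setdefault(bucket, {})[key] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket][key]


def archive_report(store: ObjectStore) -> bytes:
    """Customer code written against the API, not against any one vendor."""
    store.put_object("reports", "q1.txt", b"quarterly numbers")
    return store.get_object("reports", "q1.txt")
```

The customer's archive_report function runs unchanged against either store, which is the practical point of copying an API. The dispute described above is over whether the shared names and signatures themselves, rather than the code behind them, can be protected.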

Meanwhile, many open source operating systems, such as FreeBSD and those based on Linux, use a standard API called POSIX, which is based on the API of AT&T’s Unix operating system. Under the appeals court’s ruling, AT&T could go after the makers of POSIX operating systems.

“Both of those scenarios are more likely after Oracle v. Google, regardless of how the jury decides,” says Mitch Stoltz, a senior staff attorney at the Electronic Frontier Foundation.

.. Many newer development platforms, including Google’s Go language and Apple’s Swift, are licensed under more liberal terms than Java and allow for-profit companies to use and modify them.