Google and Its Enemies

The much-hyped project to digitize 32 million books sounds like a good idea. Why are so many people taking shots at it?

Dec 10, 2007, Vol. 13, No. 13 • By JONATHAN V. LAST

In 1998 Larry Page and Sergey Brin founded a company called Google, about which you likely know quite a bit. The outgrowth of work Page and Brin began in 1996 on hypertextual search engines, Google has moved from darling little high-concept innovator to Microsoft-like behemoth in record time. Google employs over 15,000 people, has a stock price hovering near $700 a share, and is the all-powerful advertising and search force on the Internet. It is gradually pushing and purchasing its way into entertainment, business software, and even the cellular telephone market.

Before Page and Brin started Google, however, they were graduate students working on Stanford's Digital Library Technologies project, which sought to digitally store and catalogue books, newspapers, and scholarly journals. Page, in particular, seems to carry a torch for this endeavor. In 2002 he approached his alma mater, the University of Michigan, about digitizing the library. It was the birth of the Google Library Project, one of the most ambitious undertakings in the history of the written word. It was also a move that would create for Google--a company obsessed with its own beneficence--a crowd of enemies.

In July 2004, Google began quietly scanning and digitizing Michigan's library. Five months later, in December 2004, the company officially announced the "Google Print for Libraries" project. (After the effort hit snags and received some bad press, it was rebranded "Google Book Search.") Google partnered with five major libraries--Michigan, Stanford, Harvard, Oxford's Bodleian, and the New York Public Library--in an attempt to scan the pages of 15 million volumes. These digital books would be kept and indexed in a Google database, which would be made available, for free, to the public.

The scope has changed in the intervening years. Initially Google planned to scan the 15 million books in six years. That projection was revised upwards to more than 20 million books, and the New Yorker recently reported that Google is now aiming to scan at least 32 million books, besting the number of titles in the largest bibliographic database, WorldCat. It hopes to finish within ten years. As one Googlehead told the New Yorker's Jeffrey Toobin, "I think of Google Books as our moon shot."

It remains to be seen how realistic this goal is. Google will not divulge how many books it is scanning currently, or how many titles are already in its database, which went live to the public in May 2005 at books.google.com. To get a rough sense of things, the University of Michigan library has 7 million volumes and Google estimates it will have annexed them all by 2013, noting that it is scanning tens of thousands of books each week. Google will not reveal how it scans the books. As for the cost, this too is closely guarded by Google. In a similar venture, Microsoft is spending $2.5 million to scan 100,000 books, or about $25 a book; if that rate were to hold across 32 million books, Google might spend as much as $800 million.

Google has also expanded its list of library partners to include 13 additional libraries, ranging from the Bavarian State Library to the University of Virginia. Most of the agreements are private, so it is unclear what the participating institutions get from the deal, other than a digital copy of books they already own. For Google, the potential upside must seem enormous: The ebook movement of a few years ago failed, but the Holy Grail of the digital library movement remains a massive archive of books, all searchable, which can be accessed from anywhere on the planet. Already a company called On Demand Books has created a machine called "Espresso" which can take the digital text of a book, print it, and bind it into soft cover in about four minutes. The commercial promise--and downright coolness--of Google's undertaking staggers the mind. Which is why many recent accounts of the project, from Toobin's to Jason Epstein's in the New York Review of Books to Michael Hirschorn's in the Atlantic, vibrate with fidgety, egg-headed excitement.

Not everyone is thrilled, though. As a class, users seem underwhelmed by the product itself, poking fun on blogs at the page-scans, the titles included, and the odd results that appear in response to search queries. Google's book-reader interface is unwieldy: It is difficult to navigate through the books; what may be read is full of poorly explained limits; and "page unavailable" messages often appear in the middle of books. Some books are presented without advertisements. Others have ads embedded in the browser window, which appear to run on a keyword algorithm similar to Google's AdWords service. The entry for Mark Twain's Life on the Mississippi, for instance, carries ads for sightseeing tours on the Mississippi River and a volume from Twain's collected works.