Subject: the man behind project gutenberg

an interesting article, despite the author's attempts to sound like holden caulfield on speed...

-----------------------------------------------------------------------

from Wired magazine, Issue 5.02 - February 1997

Hart of the Gutenberg Galaxy

What kind of a man wants to put the 10,000 most important books online by 2002 and make them available for free? (Hint: the kind of man who puts sugar on his pizza.)

By Denise Hamilton

I am sitting with Michael Hart at Garcia's, a pizza place near the University of Illinois at Champaign-Urbana, where we are having a perfectly normal conversation about digital libraries and are preparing to tuck into our dinners. Hart is an e-text visionary and cofounder of Project Gutenberg, whose aim is to put copies of the world's greatest books on the Net for free. Hart is also a world-class eccentric with the social skills of a cranky 2-year-old. As I look up from my glistening pepperoni, Hart, who is 50, does the damnedest thing I've seen done to a pizza. Tearing open two dozen sugar packets, he sprinkles them methodically over his deep-dish pie until it is layered with white crystals. Then he digs in. "This may seem weird to you," he confides between mouthfuls, "but it's the only way I can get enough fuel to keep working. This right here is 2,000 calories, and it should keep me going for a while." Then the sugar begins coursing through his system, and Hart takes on the bug-eyed look of someone who has just hoovered up a few fat lines of Peruvian flake. The words tumble out. Sweat beads his upper lip. "Aaaah," he says, leaning back in the wooden booth. "I love working till I drop. I love looking back on the day and knowing I put another book online. That's what I'm here for. If I feel like I'm crashing, I eat. If that doesn't work, I sleep. I live by the nanosecond and burn the candle at both ends. With a blowtorch."
For more than a quarter century - eons, in Net time - Hart has made Project Gutenberg his mission in life. Or, more accurately, his fervid, manic obsession. While other geeks have grown rich mining the silicon motherlode, he's been sitting like a troglodyte in his Urbana basement, tending a utopian vision named for the 15th-century German printer whose amazing invention first made books accessible to the masses. As the 20th century wanes, Hart is using the Internet to try to finish what Johann Gutenberg started, pumping into cyberspace digitized copies of Rebecca of Sunnybrook Farm, Moby Dick, and Orlando Furioso and the works of Henry James, Plutarch, and Dostoyevsky - basically, any and everything that's ever been published between hard covers. It's one of civilization's oldest dreams: the universal library. And in an era of ever slicker Web graphics, blinking GIFs, and Java-powered billboards, he's doing it with militantly plain ASCII text files - just the words, please - downloadable anytime anywhere in the world by anyone with a functioning computer, a phone line, and a modem. No ads, no credit card numbers, no charge. Even by the Net's generous standards, Hart is a wild romantic. He refuses to keep usage logs. He's not a big fan of copyright extension laws. He considers the Web an appalling waste of bandwidth - Gutenberg's main files reside on an FTP site. What passes for organization is a loose network of volunteer typists, scanners, editors, and proofreaders, plus a couple of pro bono attorneys, all held together by email. Other than that, Hart operates with something pretty close to no strings attached - political, institutional, or otherwise. Not that a lot of the groups that share his aims would have him. Since Hart first typed the American Declaration of Independence into a U of I mainframe 26 years ago, Gutenberg has put about 1,000 titles online, on a budget smaller than some big-time electronic publishers allocate for market research. 
His self-imposed goal is to reach 10,000 books by the end of 2001, Gutenberg's 30th anniversary. And with 9,000 to go, the questions rise like the greasy steam from our pizzas: Is Hart a deluded crackpot, a high tech saint, or just another Net eccentric? And what is a project with seriously historic ramifications doing in the hands of someone who puts sugar on his pizza? To find out, I'm pulling up outside the big, century-old, ivy-covered brick house that Hart bought years ago with a modest inheritance. In my honor, he's wearing slacks but quickly changes into the uniform he will sport for the rest of my three-day visit: black bicycle shorts that emphasize his pot belly, a red T-shirt, a baseball cap, and Nike Air Jordans two sizes too big that cost US$1 at a garage sale. Beneath the cap is short, thinning hair. He has a booming laugh and a large, clean-shaven fleshy face with graying sideburns. When I ask to use the bathroom, Hart gestures into the basement, past peeling plaster and ancient caulking. "Don't bother flushing," he hollers down. "I've got a garden hose, and I'll just run some water through it." I begin to realize that this is a man with clearly defined priorities. "Other than to redesign democracy, I can't think of anything more important than Gutenberg," Hart tells me when I return. "It's the Archimedes lever: Give me a place to stand, and I'll move the world. Well, that's what I'm doing. Our goal is to give away 1 trillion e-text files by December 31, 2001. That is 10,000 titles each to 100 million readers, which is only 10 percent of the present number of computer users. Since that number will double by 2001, we will meet our goal if only 5 percent of all users download a file." Easy as that. Hart calls himself an electronic Johnny Appleseed, sowing the Internet with books for the global proletariat. 
He foresees the day when, armed with the trashiest laptop and a modem, tribesmen in Borneo's rain forest will be able to click onto Gutenberg and download texts. But why would anyone want to read a book on a computer when they can hold a bound copy in their hands, I ask. Because it's fast, convenient, and free, Hart shoots back, along with a list of the people who are using Project Gutenberg already: Kids doing research for classes. People in foreign countries who want to practice English. Grandmothers turning their computer-savvy grandkids on to The Jungle Book. Folks looking for rare Robert Louis Stevenson volumes - Gutenberg has the author's entire 30-plus volume oeuvre - not carried by their local library or bookstore. And people like Jo Churcher of Toronto, who is blind. She downloads Gutenberg texts and runs them through a speech synthesizer that reads them out loud, like books-on-tape, only free. "Gutenberg has finally allowed blind people to begin building up libraries of their own," says Churcher, who likes the project so much she has helped scan in 12 books, including The Pickwick Papers, for Gutenberg's files. We move into Hart's living room, a dimly lit cavern piled with modems, computer cards, hard drives, videocards, and power supplies, all leaning drunkenly against each other. There are hundreds of boxes of software; thousands of records; CD-ROMs and CDs; moth-bitten beaver, mink, and monkey coats; hippie op art; what Hart says is an original Modigliani; a gaily painted child's hobbyhorse; a Cable piano from the 1910s; a Thomas Edison Dictaphone from the '30s; and rooms filled with books, including 50 complete dictionaries, three copies of Chaos by James Gleick, and nine of Mistral's Daughter by Judith Krantz. "She's usually a ditz," he explains, "but this was a good book." Hart is a garage sale fanatic, hitting up to 100 each weekend as he criss-crosses the leafy suburbs of Champaign-Urbana on a bicycle (he doesn't have a car). 
He also forages daily among neighborhood dumpsters, where U of I students routinely discard stereo components and new floppy disks. But his favorite dumpster is located behind the U of I computer lab. We drive there, and Hart hops right in, wallowing thigh-deep amid pizza cartons and plastic milk bottles because he knows the school often throws out old equipment, stuff he can use for Gutenberg or trade at the weekly "geek lunch" he attends with other tech-heads in the U of I orbit. Today, he fishes out a set of Unix manuals, an 8mm tape cartridge, and two copies of Mathematica for Sun's Solaris operating system. Slim pickings. Nothing like the time he scored an entire AT&T 7300 minicomputer. Hart doesn't need high-end stuff to run Gutenberg - it's just nice to have around. Every few years, he gets a computer from Apple, NeXT, IBM, or Hewlett-Packard. Bell & Howell once donated a $50,000 scanner to help volunteers input books quickly. And he's thinking about setting up an email server, so he can offer free accounts to Gutenberg volunteers. Just talking about the idea gets him giggling with glee. As a child, Hart's idols were Peter Pan and Albert Einstein. In adulthood, the two have twined like a double helix to define his personality so that within one conversation, he can go from visionary brilliance to bluster to bumble. "Michael is one of the smartest guys I've met in my years," says Greg Newby, an assistant professor and assistant dean at the School of Library and Information Science at the U of I and a senior research scientist at the National Center for Supercomputing Applications. "He's also one of the least mature. That makes for an interesting combination." Hart grew up in Tacoma, Washington, where his parents worked as government codebreakers during World War II. In peacetime, his father was a Shakespeare professor and a CPA. His mother was a math and education professor and ran a women's clothing store. 
Dyslexic but precocious, Hart was messing with algorithms while other kids were fumbling with Lego. He plowed through the U of I in two years, graduating first in his class with a self-created major in man-machine interfaces. Hanging around the university's computer lab after graduation, he dreamed of a universal digital library and posted an e-text manifesto that grew into Project Gutenberg. With the US bicentennial looming, he chose the Declaration of Independence as his inaugural text - he also happened to have a copy of the text in his backpack. Next he did the Bill of Rights, then the Gettysburg Address. But the project moved sluggishly because Hart inputted everything manually after his day job of selling stereos. In those early mainframe days, Hart had to operate within the impracticable bounds of a 10-Kbyte donated storage space. But as computers got faster and smaller, Gutenberg began to look more like a real possibility. Things really broke loose in 1988, when Hart typed in his first full novel, Alice in Wonderland. Mark Zinzow, a senior research programmer at the U of I who met Hart that year, remembers thinking that the project - which at the time had 10 books online and a 1200-baud connection - was completely harebrained. "But it was also a noble goal," says Zinzow, "and I thought, If he wants to tilt at windmills, I'll help him get on his horse." Although Hart no longer had a formal affiliation with the university, Zinzow gave him access to email and mailing lists, and more important, an FTP server. Soon, he was watching in amazement as Gutenberg pumped out a gig of data each day. Around the same time, Hart was invited to place his digital collection under the official auspices of the Benedictine University, a tiny Roman Catholic seminary in nearby Lisle, Illinois. The symbolism did not escape Hart: monasteries had once been repositories of knowledge in the Dark Ages, copying and preserving books for posterity. 
Wasn't Gutenberg doing the same thing electronically, preserving e-texts long after paper and microfilm would crumble into dust? The monks went even further, naming Hart an adjunct professor of electronic text and giving him a $12,000 annual salary. They kicked in some extra money for living expenses, most of which eventually came from sales of a Gutenberg CD-ROM that has hit 100,000 copies. (Regularly updated, the current edition boasts more than 500 titles, on a single disc.) As the project grew, Hart's management style stayed laid-back. He refused to draw up a master list of books for the Gutenberg pantheon, preferring to let volunteers input their favorite tomes. Initially, Gutenberg got the usual suspects - the Bible, Virgil's Aeneid (in English and Latin), and Hamlet. Then came more unusual fare: The Book of Mormon, Herland (a 19th-century feminist novel), and Flatland (science fiction, about 4-D travel). Deadlines were flexible. There was no follow-up. Sometimes the e-texts arrived months or even years late. Other times, they just dropped into limbo. What keeps things going is Hart's fanaticism. A DOS man and a neo-Luddite when it comes to GUI interfaces, he says he's never even used the World Wide Web - he doesn't like the graphics. But the five machines that make up Gutenberg Central boot so fast you can barely see the start-up box flash by. And when a fresh new book from a volunteer comes in on a winter night, he'll wake at 3 a.m. and sit before the glowing screens in a sweatshirt resembling a hooded monk's cowl he got from the Benedictines. In summer, he works in nothing but biker shorts, blasting classical music or classic rock. To get his Vitamin D and avoid rickets, Hart relies on a full-spectrum lamp that mimics the sun's rays. Nearby is a rumpled mattress where he sleeps when felled by exhaustion. 
When I visited, he was proofing The Violet Fairy Book - a 1901 collection of stories edited by Andrew Lang - and checking that the format, spacing, and margins conform to official Gutenberg style. After scanning for errors, he adds a header - "Welcome to the World of Plain Vanilla Electronic Texts. Readable By Both Humans and Computers, Since 1971" - includes some boilerplate legal information, and updates his directory. Then he hits Enter. Bing - the book is in an FTP site on PrairieNet, a community access computer system in the Midwest. From there, it will ripple around the world, to interested individuals, literary Web sites, and other digital libraries. Hart has to keep working, or he'll drown in data. Each day, he receives up to 400 pieces of email, chats with dozens of volunteers, and works on upcoming books. Besides running Gutenberg, he's also the central figure behind "Ask Dr. Internet," a free service run by a Gutenberg-like group of tech-heads that has evolved into its own full-time job. Even the most clueless AOL newbie gets an answer - often with a liberal icing of Hart's own curious views about things like the Web and graphical interfaces. With Gutenberg, he gets help from 750 volunteers around the world. Lawyers work pro bono, researching a book's copyright to ensure it has entered the public domain. Tech-heads like Zinzow provide sysadmin consulting and tend computers held together with gum and baling wire. Scholars input and proofread e-texts. A group of 50 Russian academics, for instance, recently did Webster's Unabridged Dictionary by hand. The 45 million keystrokes took them six months, for which they were paid $5,000 by one of Hart's financial supporters. Some texts are labors of love, by volunteers who tap along for years to finish one title. But most of the heavy lifting is done by a hardcore group - mainly academics - with access to state-of-the-art laser scanners. 
Geoffrey Pawlicki, a longtime supporter who put Shakespeare's Antony and Cleopatra online, recalls meeting Hart in 1980: "He had files in a backpack and was always running around hooking people up with modems. Initially he was dismissed as a crackpot, but the same could be said of Ted Turner." Not that Project Gutenberg's founder and a media mogul are likely to ever be confused - for starters, Hart's totally uninterested in tracking Gutenberg usage. "I don't care where a book goes," he says, "I just want it to sprout legs and run." He knows that 10,000 files are downloaded daily on the U of I server, but that doesn't give the whole picture since the archives are mirrored on hundreds of sites the world over and heavily redistributed. (The New Zealand Digital Library, for instance, boasts 492 Gutenberg titles in easy-to-use HTML.) Many users post Gutenberg texts on their own Web sites, so Hart's work can be found on Tarzan sites, Shakespeare sites, and The Jungle Book sites, to name just a few. The only thing Hart asks is that anyone who uses a Gutenberg text tack on a "small print" header that says in part: "Why is this 'Small Print!' statement here? You know: lawyers. They tell us you might sue us if there is something wrong with your copy of this e-text, even if you got it for free from someone other than us, and even if what's wrong is not our fault. So, among other things, this 'Small Print!' statement disclaims most of our liability to you. It also tells you how you can distribute copies of this e-text if you want to." One result is that Gutenberg texts reach people who have no idea what an FTP server is, let alone how to use one. Hart gets a lot of email from the United States, Britain, Canada, Singapore, and Germany, so he assumes those are big markets. Many people comment on the CIA World Factbook, so he thinks that's his most popular book. (It's updated annually, and he can put it online immediately because government publications are public domain.) 
Other favorites seem to be the Bible, Alice in Wonderland, and the collected works of Shakespeare. But even broaching the idea of doing more to find out what people might want from Project Gutenberg makes Hart see red. He imagines the sinister implications of tracking folks who want to download, say, Salman Rushdie's The Satanic Verses in a country like Iran. And besides, any effort to "steer" the project misses an essential point. "Once we put a book out," he says, "it goes everywhere there are computers and readers. That's unlimited distribution. And in a world based on competition for everything, that's the biggest threat." So in one sense, Hart is an electronic David, striking a literary blow against an establishment Goliath that tries to control information through restrictive copyright law, downloading fees, and red tape. And for that, he has a lot of fans. "Michael's one of the few people I know who's not motivated by greed," says Zinzow. "He's trying real hard to do a good deed for the world, and he's really underappreciated. If Michael wasn't here, access to books on the Internet would be only for the rich. He's an electronic Robin Hood, keeping the Sheriff of Nottingham from creating a monopoly." Newby agrees. "When you have Microsoft buying up the right to thousands of art images, and the federal government taking away copyright, going backward and copyrighting things that weren't copyrighted before, it's not just paranoia; he's trying to fight the evil forces." Hart is happy to agree. "Some people say, 'I am the most powerful because I have the most power.' I say, 'I am the most powerful because I give the most power away.'" And there's no reason to doubt that his dogged pursuit of that ideal is one of the things that has enabled Gutenberg to survive all these years. The project has doubled each year since 1991, when he had 12 books online, and the way Hart sees it, Gutenberg should continue to do so. 
That means 1,600 books for 1997, 3,200 for 1998, and so on. But there are limits to what he can do without institutional backing and only a handful of hardcore volunteers. Even supporters like Newby say the man can barely handle his current workload and will be hard-pressed to take on more unless he delegates authority. Lately Hart's been trolling the Internet for someone to pass the torch to. But it's hard to imagine him letting go. "No one wants to be a supervisor, so I have to do it all myself," he says plaintively. Hart also refuses to waver from his initial vision, and this has limited his scope. If he made some concessions, it's possible Gutenberg might have 100,000 books online today - a respectable library - instead of 1,000. Indeed, Hart says, various academic institutions and even some Texas oil interests have offered to bankroll Gutenberg over the years, in exchange for control. One university offered him a six-figure salary, he says, to bring the project to their campus. He turned them all down flat. "Almost everybody out there wants to charge for books, and they want real control over which books we do and which edition comes out," Hart grouses. "They want a bit in my mouth. I don't trust them." He's right to worry. In recent months, complaints about Gutenberg's use of over-taxed U of I computing resources led to an ultimatum from school officials: find a formal university sponsor or get off. Hart tried the library, the School of Library and Information Science, and a computer network run jointly by the Big 10 football schools. No luck. "The library, which would be the logical place, wasn't interested in sponsoring it - they might have been concerned about the project's academic credentials," says Bob Penka, an associate director at the U of I's Computing and Communication Services office. "Whether that's snootiness or what, I don't know." 
Hart has lost his U of I email account, but so far the project is getting a stay of execution - at least until the next round of complaints. Is Gutenberg too important to be left to a solitary eccentric? Maybe it should be organized by the Library of Congress, the National Endowment for the Humanities, or some such organization. But herein lies another problem. Hart isn't the only one staking out literary dibs in cyberspace. The Georgetown Center for Text and Technology counts more than 300 online library projects in nearly 30 countries, including the Dartmouth Dante Project, with 600 years of commentary on Dante's Divine Comedy, and the Oxford Text Initiative, which charges users to download its scholarly publications. And for all its high-minded goals, Gutenberg's simplicity works against it. It lacks the bells and whistles, the flashy graphics and sophistication that grant-givers love. Instead, Hart plods along in ASCII, doesn't give a hoot about market research, and wants to give his product away. The lack of recognition rankles. "There are literally a billion dollars of grant money out there, and I'm never going to get any of it," he says. And those 300 competitors? "All those projects will never produce even one mainstream book that you or I will ever see." Digital library experts say he's got a point. "Funders tend to be interested in projects that use state-of-the-art technology or advance knowledge, or that create something new and exciting," says Ann Bishop, a coprincipal investigator with the U of I's Digital Library Initiative project, which is working with commercial publishers to put academic journals online. "Projects that aren't high tech tend to get lost in the shuffle. Perhaps if he played it up as something vital to national education, he would get more funding. But I don't know if he's interested in twisting it that way." 
Ultimately, Hart's biggest impediment may be US copyright law, which pretty much prohibits Gutenberg from publishing anything written after 1920. The law now protects a work for 50 years after the author's death, which means that Hemingway, Genet, and Garcia Marquez won't be online anytime soon. Even The Odyssey and Plato's Republic are off-limits if Gutenberg wants to post a translation published after 1920. One exception is when an author gives special permission, such as cyberpunk author Bruce Sterling, who approached Hart to release an e-text of his book The Hacker Crackdown. The copyright problem isn't getting any easier. The first US copyright law covered 14 years, with a possible 14-year extension. A 1909 amendment doubled that to 28 years, and in 1976, the law was extended again. Now Congress is considering going to life plus 70 years. "Do you realize that under the proposed law, the blueprint for the Wright Brothers' airplane would still be under patent?" asks an incensed Hart, who has testified in Washington against the new bill's passage. "If they win, we'll have to get a certain number of titles and then quit. Unless we want to do the Robin Hood thing. And I'm too old to be a revolutionary." A glint appears in his eye. "But I'll do Gone with the Wind on my deathbed," he vows. "The book and the movie. It should be out there." I leave Urbana feeling that Hart just might meet his short-term goal. If he were any less obsessed, he would have given up a long time ago. Instead, he is doubling each year. But I also wonder how long the project can keep expanding exponentially. Unless Hart can draft reinforcements or hook up with a sponsor, eventually it probably will stall, and that's a shame. As the plane takes off, I recall Hart's description of the mad rush that comes from sitting in his basement, watching the newly transcribed titles come in, working on them, and then launching books into cyberspace. 
His words echo in my head: "Bennett Cerf never published a book a day, but I'm doing it. Do you have any idea of how powerful that makes me feel? But sometimes I go to bed at night, and I don't know what I'm going to do in the morning because no one's sent me anything. That's kind of scary."

Denise Hamilton is an obsessive reader who writes regularly about business and culture for the Los Angeles Times, New Times, and other publications.

Michael Hart and Project Gutenberg can be reached at hart@pobox.com.

Copyright 1993-97 Wired Magazine Group, Inc. Compilation copyright © 1994-97 HotWired, Inc. All rights reserved.

Ann K. Parsons
Professional Tutor
email: akpgsh@rit.edu akp@vivanet.com
http://www.vivanet.com/~akp/index.html
"All that is gold does not glitter, Not all those who wander are lost." J.R.R. Tolkien