Whitman Archive at Ten: Some Backward Glances and Vistas
Whitman famously said, “Missing me one place search another, / I stop somewhere waiting for you.” These days he seems to be waiting everywhere: a recent Google search yielded 805,000 hits for the phrase “Walt Whitman”; Yahoo! claimed over a million. The Walt Whitman Archive was the first hit in both searches; also highly ranked were the Library of Congress's site devoted to the recovered notebooks and the homepages of the Walt Whitman Arts Center and Walt Whitman High School. I wonder what sort of content is featured on some of the lowest-ranked sites, too. Unfortunately, both search engines link only to the first 1000 sites. Of course, there's plenty of variety in the one-tenth of one percent of the sites that are linked. We can view photos from a "walking tour of" the poet's "old haunts in Manhattan," read an essay on "Walt Whitman, Prophet of Gay Liberation," find out the latest activities of the Walt Whitman Sailing Society, or download seemingly countless term papers (including a "6 page analysis of 'Leaves of Grass'"). On the web Whitman is interpreted, summarized, marketed, defended, and vilified. Some sites provide miniature lessons in collecting Whitman or in History of the Book scholarship; some use his image to sell clocks, lapel pins, coffee mugs, and mouse pads; others invoke Whitman in the creation of original art. On the web we find poems, paintings, and engravings inspired by Whitman. One of the more audacious artistic uses of Whitman is the Flash animation “Walt Whitman” by performance artist My Robot Friend, in which a techno-pop song with a raucous beat is joined with flying images of Whitman, excerpts from "Salut au Monde!" and photographs of nude teenage boys. And now we have new possibilities for Whitman studies with the Mickle Street Review. That is, with an online journal we have opportunities for born-digital critical and creative responses to Whitman, responses that can be expressly designed for web presentation and that take full advantage of the medium.
The web serves many functions. It creates fresh opportunities for expressiveness; makes locating an elusive quotation relatively easy via simple string searching; and allows ordinary people to participate in critical debates. This is all valuable. But there is also an extraordinary amount of junk on the web. Typically what appears on the web is idiosyncratically built and irregularly maintained. Stuff moves around, stuff vanishes, stuff can’t be trusted: it’s a questionable environment, at best, for serious scholarship. Little of the academic material on the web has undergone peer review. Yet online publication has advantages, too, and I hope to clarify some of them in these reflections on the Whitman Archive as a scholarly research tool.
The Whitman Archive ranks highly on Google because we were an early web presence; early adopters seem to have a huge advantage in Google rankings. Ed Folsom and I have described the Archive as a “large electronic research and teaching tool that sets out to make Whitman’s vast work, for the first time, easily and conveniently accessible to scholars, students and general readers.”
The Whitman Archive was begun in 1995 and has had a long-standing affiliation with the Institute for Advanced Technology in the Humanities at the University of Virginia (IATH). A large team (located primarily at the
The project began fairly early in the development of the web. Tim Berners-Lee had been creating HTML, HTTP, and the first web pages for a couple of years at CERN, a particle physics laboratory, when, in August 1991, he first publicized his new World Wide Web project. In 1993 version 1.0 of the Mosaic web browser was released, and public interest soon followed. That same year I had a brief conversation with a colleague at Texas A&M, Jimmie Killingsworth, about the possibilities of electronic media for presenting the multiplicity of Whitman’s texts. Ed Folsom was also attentive to the implications of new technical developments. When he drafted his 1994 essay “Prospects for the Study of Walt Whitman” he included a discussion of the possibilities of electronic editing. None of our musings took concrete form, however, until after I had moved to the
One key early decision, made just a couple of months after we began work, was to contact Ed Folsom and suggest that we get together at a conference to talk. I knew the project would be stronger if it could be collaborative across institutions, and I couldn’t think of anyone better than Ed to help envision its possibilities and help direct its development. Having two established scholars involved would help our professional credibility—not a small matter when working in a medium that even now struggles for acceptance in the academy (witness, for example, the resistance of tenure and promotion committees to online work and the MLA’s refusal to include online archives in its bibliography of scholarship).
The project began with no funding whatsoever. In those stone soup days we worked opportunistically with files we happened to have in electronic form: Ed had placed online all of the photos of Whitman for storage, so we built a framework for displaying them. I had recently published a book, Walt Whitman: The Contemporary Reviews, and, since this material was all out of copyright and Cambridge University Press was willing to provide me with the electronic files, we were able to put the reviews online quickly too. Meanwhile we began to publish electronic versions of Leaves of Grass, providing both page images and text.
We developed an HTML site, what I’d now call a prototype for a serious site, though at the time we didn’t think of it as a practice run. As is now widely recognized, HTML is not well-suited to scholarly purposes. There are many problems with HTML: one key problem is that it is a “display-descriptive” markup language that tells a web browser whether to make something italic or 14 point type or blue but does not declare what the structure of a text is. To render something in italic is not to say that it is a foreign word, or a word of emphasis, or a title. If you don’t declare what a thing is (rather than how it looks), you can’t retrieve it in searching, you can’t easily compare it to other things of the same kind, and you can’t redisplay it in a different way for a different purpose. Our first undertaking did not rigorously adhere to best practices in humanities computing, nor were we much attuned to international standards. There were also problems with the navigation of the site. For example, once you entered deeply into, say, the Works section of the Archive, there was no quick way to move to another section. You had to back out via several clicks of the back button because there was no navigation bar present on all the pages. On the positive side, we made a fair amount of content available, and despite flaws behind the curtains, we gained positive publicity in the Chronicle of Higher Education and Washington Post.
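The difference between describing how a thing looks and declaring what it is can be shown in a few lines. The following is only a minimal sketch in Python, not the Archive's actual encoding: the sample sentence is invented, and the element names `title`, `foreign`, and `emph` are simplified, TEI-style names assumed for the purpose of illustration.

```python
import xml.etree.ElementTree as ET

# Display-oriented markup: three different kinds of thing, all tagged "italic".
# (Invented sample sentence, not a transcription from the Archive.)
DISPLAY = (
    "<p>I read <i>Leaves of Grass</i>, cried <i>Allons!</i>, "
    "and was <i>very</i> moved.</p>"
)

# Descriptive markup: the same sentence, with each span declared for what it is.
# Element names are simplified, TEI-style assumptions.
DESCRIPTIVE = (
    "<p>I read <title>Leaves of Grass</title>, cried <foreign>Allons!</foreign>, "
    "and was <emph>very</emph> moved.</p>"
)


def texts(xml_string, tag):
    """Return the text of every element with the given tag name."""
    root = ET.fromstring(xml_string)
    return [element.text for element in root.iter(tag)]


# With display markup, a search can only ask for "everything in italics".
italics = texts(DISPLAY, "i")

# With descriptive markup, a search can ask for titles or foreign words alone.
titles = texts(DESCRIPTIVE, "title")
foreign_words = texts(DESCRIPTIVE, "foreign")

print(italics)
print(titles)
print(foreign_words)
```

The first query cannot tell a title from a word of emphasis; the second and third can, and the same descriptive file could later be redisplayed with titles underlined rather than italicized without touching the transcription.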
In 1996 Primary Source Media, a commercial publisher, approached us with plans to accomplish in roughly a year what we had planned to do much more gradually. They could bring to the project resources that we didn’t have. They were interested in producing a marketable product quickly; our inclination has always been to build, more slowly, a painstakingly accurate archive full of annotations, introductions, and other scholarly features. We decided to work with Primary Source Media, and the result was a useful product, Major Authors on CD-ROM: Walt Whitman. The CD was marketed primarily to libraries and is no longer sold. The undertaking had some important consequences. We persuaded Primary Source Media to donate the out-of-copyright texts of Leaves of Grass to the Whitman Archive by arguing that their sales would be based on making available, in searchable form, the New York University Press Collected Writings of Walt Whitman. The Primary Source Media alliance has been invaluable in our editing of Whitman’s manuscripts.
Another key step in our development occurred in 2000, just before I moved to the
As of April 2005 we have all six American editions of Leaves of Grass completely transcribed and posted as XML files. (XML, or more specifically the Text Encoding Initiative implementation of XML, is the de facto international standard for serious humanities computing projects.) Users can access the entire file or particular chunks: individual titles or clusters. Each page of transcription is accompanied by a facsimile reproduction of the text, allowing users to check our transcriptions and to study what Jerome McGann calls bibliographic codes, the way a text makes meaning through non-linguistic textual features such as margins, typeface, ornamentation, and so on. When time allows, we expect to add introductions to the various editions that will provide information on composition and reception history, variations among different issues of the same edition, and explanations of key features that our work has uncovered. For example, it was the need to categorize the material on the page in XML encoding that led us to the realization that what had long been regarded as twelve untitled poems in 1855 were not in fact untitled.
We also intend to make the British editions of Whitman’s poetry readily available. In fact, they have been available for quite some time, though they’ve been kept behind the scenes. Some of you may know that the brown site featured William Michael Rossetti’s Poems by Walt Whitman and Ernest Rhys’s edition of Leaves of Grass. These volumes were originally contributed to the Archive by Ed Whitley. We are in the process of adjusting Whitley’s valuable work so that it conforms to the current encoding practices of the Archive. The Rossetti edition has been converted to XML and will be mounted in the coming months, after being thoroughly vetted. The Rhys edition is also in our future plans, though that conversion work is not yet scheduled.
In addition to these American and English editions of Leaves of Grass, we have gathered and processed approximately 4000 high-resolution, archival-quality image files of Whitman’s poetry manuscripts. These working drafts are documents of rare importance. By the end of the summer we expect to have a complete archive of images. One surprising fact is that after a half century of work on the NYUP Collected Writings, and after a great mass of peripheral material had been meticulously edited and annotated, Whitman’s poetry manuscripts remain unaccounted for in that print edition: left uncollected, neither transcribed nor annotated, not even listed anywhere. We feel fortunate to be the editors who are able to give these documents sustained attention for the first time.
Of course having digital images is one thing and transcribing, encoding and annotating them is another. We have a manuscript tracking database that helps us keep track of the flow of work through various stages: transcribed and encoded; checked; edited; and “blessed”—the last term meaning that a text has gone through every stage of checking, conforms to our project’s highly articulated encoding practices and displays properly on the website. This process is slow and painstaking because careful transcription of messy documents is by its nature time-consuming. Moreover, we are dealing with unusually complex material for web presentation. We have a growing list of publicly available transcriptions of poetry manuscripts, currently numbering over 80. These are fully transcribed, encoded, and proofread multiple times. An additional twenty-three manuscripts have been completed; we are withholding them until we can fix bugs that keep them from displaying properly. Approximately 250 additional manuscripts have been completely transcribed and encoded and are now in various positions in the pipeline as we do the methodical checking before public presentation.
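The flow of a manuscript through these stages can be pictured as a simple ordered sequence. What follows is a hedged sketch in Python, not a description of our actual tracking database: the stage names come from the list above, but the identifiers (`ms-001` and so on), the `advance` helper, and the rule that only “blessed” texts are publishable are assumptions made for illustration.

```python
from collections import Counter
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages in order; names follow the stages listed in the essay."""
    TRANSCRIBED_AND_ENCODED = 1
    CHECKED = 2
    EDITED = 3
    BLESSED = 4


def advance(stage):
    """Move a manuscript to the next stage; BLESSED is the final stage."""
    return Stage(min(stage + 1, Stage.BLESSED))


def is_publishable(stage):
    """Assumption for this sketch: only blessed texts go on the public site."""
    return stage is Stage.BLESSED


# Hypothetical manuscript identifiers, invented for the example.
tracking = {
    "ms-001": Stage.TRANSCRIBED_AND_ENCODED,
    "ms-002": Stage.EDITED,
    "ms-003": Stage.BLESSED,
}

# One manuscript passes its final check and is blessed.
tracking["ms-002"] = advance(tracking["ms-002"])

# Summarize how many manuscripts sit at each point in the pipeline.
summary = Counter(stage.name for stage in tracking.values())
print(summary)
```

Keeping the stages strictly ordered is what makes a pipeline report possible: at any moment one can count how many texts are waiting at each step.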
Most of the manuscripts we’ve edited thus far are brief, rarely more than one or two leaves in extent. But there are important exceptions. Whitman’s notebooks constitute a special category of manuscripts. A team at
In order to gain bibliographic control of our manuscript editing project we developed, thanks to funding from the
Significant progress has been made on additional areas of the Archive as well. We have added a section with texts by Whitman’s disciples, with plans to present many of the key contemporary accounts written by those closest to the poet (sometimes with the poet’s own active involvement). This effort is being headed by Matt
Work also continues on other fronts: periodicals, interviews, reviews, and images of Whitman. Susan Belasco, my colleague at the
and his team at
The Whitman Archive presents a modest amount of recent criticism: currently we display 80 entries from the Walt Whitman Encyclopedia, two essays by Martin Murray, and an account of the controversy over the sequence of love poems known as “Live Oak, with Moss.” Most recent criticism is entangled with copyright issues, so rapid development of this part of the site is unlikely. There are some opportunities, however. In the future we’d like to make available all back issues of Walt Whitman Quarterly Review, for example.
We also plan to offer online some full-length critical books for which we have secured copyright. We’ll start with books written or edited by the Archive staff. A book Ed Folsom and I have forthcoming, Re-Scripting Walt Whitman: An Introduction to His Life and Work, will appear, as will my own To Walt Whitman, America. We’ll also add Leaves of Grass: The Sesquicentennial Essays, a volume emerging out of the
What will be some of the key challenges for the Whitman Archive in the future?
Sustaining ourselves financially is crucial. This work won’t progress adequately unless we are able to keep our talented staff. The challenge is to pay people when we offer a “free” site that brings in no revenue. It’s good to remember that the site is free to the end user, but it certainly isn’t free to produce. Hardware, software, digital scans, travel, phone, consultation with technical experts (not to mention salaries) all add up. We have thus far managed to develop the Whitman Archive via the strong support of several universities and the generosity of federal funding agencies and one private foundation. Whether we will be able to sustain this economic model into the future is an open question. One promising development is that the
We’ve now been five years in phase two of the Archive, 2000-2005, and again arguments for a redesign are becoming compelling. The reason for even considering a redesign boils down to the difference between frames and tables. In the current view, one of our typical pages is actually made up of four distinct frames. The frameset is stable, allowing the navigation bar to remain in one constant place on the screen as the text scrolls down. But there are disadvantages to this design. Printing is a problem since a printer ordinarily will print a sheet for each frame. Searching is an even bigger problem: a person who uses an internet search engine to find, say, Pfaff’s at the Whitman Archive will get the page shorn of the navigation bar. Another problem is that all the poetry manuscripts, or all the reviews, have the same URL, thus making it difficult for people to cite a particular spot in the Archive. (There is a way to get a more specific URL by right-clicking, but most users are probably unaware of this possibility.)
Most of these problems could be resolved through the use of tables, which are becoming a more common way for large web sites to deal with similar display issues. Each page would print more easily, have a unique and visible URL, and be more easily navigable overall. Older browsers, however, do not support all of the newly developed features of tables, such as having an unmoving header that stays at the top of the page as the user scrolls through the material. We are currently trying to craft a layout that will improve upon our current one for users of newer browsers without sacrificing navigability for those with older browsers. This difficulty of design is one that we consistently face: what sorts of technological capabilities can we reasonably expect from our users? It is important to move our design forward with technology and tastes to keep current. But at what point can we fairly put the onus on the user to keep up with developments in browsers and monitors without alienating the very people we hope to serve?
What are the areas of greatest potential for the Archive? A good goal for us, I think, is to make the Whitman Archive not only a rich resource but an enabling interpretive tool that advances how analysis itself is done. We need to find a way to trace poetic change as it ripples through versions, to make this visually clear. Exactly how this will be accomplished remains unknown. I can foresee how some kinds of visualizations could work. For example, I think we could take cues from an internet site called “Name Voyager” to trace changes in Whitman’s diction from the 1840s to 1890s. Which editions make proportionally the greatest use of key words, say comrade, black, lover, slave, bowels, soul, onanist, Walt, tabouschnik, mossbonkers?
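The question about proportional word use can be made concrete with a small calculation: count how often a word occurs in an edition and divide by the total number of words in that edition. The following is a rough sketch in Python under stated assumptions. The two “editions” are invented toy strings, not Whitman's text or real counts, and the function names `relative_frequency` and `peak_edition` are hypothetical; a real version would read the transcribed edition files and would need more careful tokenizing.

```python
import re
from collections import Counter


def relative_frequency(text, keywords):
    """Return each keyword's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: (counts[word] / total if total else 0.0) for word in keywords}


# Toy placeholder texts standing in for full editions (invented, not real data).
editions = {
    "edition A": "soul and body soul comrade",
    "edition B": "comrade comrade lover soul and more words here",
}
keywords = ["soul", "comrade", "lover"]

# One row per edition, one proportion per keyword.
table = {
    label: relative_frequency(text, keywords)
    for label, text in editions.items()
}


def peak_edition(word):
    """Return the edition in which the word has its highest proportional use."""
    return max(table, key=lambda label: table[label][word])


for word in keywords:
    print(word, "->", peak_edition(word))
```

Dividing by the length of each edition matters because the editions differ greatly in size; raw counts alone would favor the longest book. The resulting table of proportions is the kind of data a Name Voyager-style graph could plot over time.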
We’d like for the site to enable and to promote interpretations that hadn’t been possible before. We’ve wondered about the possible use of Geographic Information Systems or GIS. We’d start with a base map with detail down to the block level. Period maps no doubt exist for DC,
What are the greatest weaknesses of the Archive? When we ask where and how the NYUP Collected Writings will live, we have a predictable and more or less reassuring answer: on library shelves. All we know for certain about such projects as The William Blake Archive and The Complete Writings and Pictures of Dante Gabriel Rossetti: A Hypermedia Research Archive is that they must live and thrive simply because they are too valuable to let die. None of us, though, can yet say with any certainty under what conditions some of the early breakthrough projects in humanities computing will survive. Fortunately, the library community is deeply committed to overcoming the challenges that are presenting themselves. I believe and hope that the loss to the humanities would be just too great to let major projects die. Still, in all soberness, we have to realize that this is an experimental time, and what may happen in the future is anyone’s guess. It has seemed to us nonetheless that these experiments had to be undertaken, because Whitman, like Rossetti and Blake, presents intractable problems for conventional editorial approaches. His work has characteristics (for example, an obsessive, seemingly ceaseless revisionary effort that has been called a “fluid text”) that despite gargantuan editorial efforts have resisted adequate representation in conventional print formats.
The Whitman Archive is providing a complete record of the “American bard,” thus giving the general public and scholars at all levels the opportunity to read and study the work of this central spokesman for
Whitman used pens and pencils, paper and magazines, type and books to create Leaves of Grass. In the last 140 years we’ve used the same implements to study him. But in the last ten years we’ve brought other tools into play, and we should reflect on the consequences and implications of those tools. We are only beginning to grasp how electronic technology will impact our study of Whitman, how elements which are now invisible might become visible. At the Whitman Archive, we hope, paradoxically, that the place for rich exploration of Whitman in future years will be no place at all, but the fluid, expansive, exploratory realm of the web.