"Each volume completes its epoch and is an entity in itself."
June 13, 2007 8:14 AM

Charles Evans (1850-1935) created his American Bibliography as a labor of love. Evans, an orphan whose education ended at age fifteen, was fifty-one and unemployed when he began singlehandedly cataloging every printed document published in America between 1639 and 1820. At the time of his death thirty-four years later, he had set down 35,854 entries through 1799, twelve volumes totaling over 5,500 pages. It took two decades (1950-1968) for a team of bibliographers to transfer the pamphlets he cited onto microfilm, and three more years (2002-2005) to digitize them. The result, Evans Digital Edition, is a full-text searchable collection of 2.3 million pages of pamphlets. Some see it as a revolutionary innovation that will democratize the historical profession, but others are not so sure--the original cost $25 a volume, but Evans Digital Edition costs $20,000-$100,000 to subscribe.
posted by nasreddin (11 comments total) 2 users marked this as a favorite

The inaccuracy of the OCR is most likely why the promised--and powerful--feature of making the underlying text available in ASCII form is missing. Any findings must be transcribed by hand rather than being copied and pasted. A second problem is that sessions time out without being saved. If one wants to explore the 8,670 instances of the word "freedom," she can forget about lunch unless it is at the computer. After fifteen minutes or so the session times out, saving neither results nor searches.

For this they want $20,000-$100,000?
posted by languagehat at 9:03 AM on June 13, 2007

Is it still possible to get it for 25 bucks a volume? That would only be $300 for the whole thing. Couldn't someone else make a Digital database, also, and then charge more reasonable prices for it?
posted by Greg Nog at 9:03 AM on June 13, 2007

Well, Evans usually does not include the full text of the documents, which is obviously important. Still, the microfilm version (which does include the text) was much cheaper, as far as I know.

A full edition of Evans today is available for $150.00.

For what it's worth, I've found Evans Digital to be extremely useful for research. Being poor and deprived of adequate research time, I've dug up enough material there that sometimes I barely have to dip into archives at all. There are some amazing things you can find there--like a six-part book which satirizes the events running up to the American Revolution by imitating Biblical prose, one whole book of the Bible.
posted by nasreddin at 9:11 AM on June 13, 2007

The bibliography itself is available for free at the first link.

What they did is scan all the documents and pamphlets that were referenced in the bibliography.

Personally I think that they would make a lot more money if they just gave them to Google and let searching be free, but then charge for the download of the pamphlet of interest.

Google may end up scanning a lot of these documents themselves anyway, in which case this company's reason to exist will be gone.
posted by eye of newt at 9:15 AM on June 13, 2007

I am a historian of early American history and I have mixed feelings about this commercial digitization project.

The first reproduction of the collection was not onto microfilm as stated above but rather microcards--opaque linen card stock with the page images reproduced on each in really tiny, tiny print. A massive machine of lights and lenses was needed to read them. It was a cumbersome technology that never caught on, and the few institutions that had the microcard readers (and the even rarer microcard printers) experienced tremendous difficulty and expense keeping them operating. Many have given up, rendering their microcard collections inaccessible.

Beyond the difficulties of the obsolete media, finding anything in the cards could be cumbersome. It has been 15+ years since I wrestled with the system, but I recall that you had to look up a name or topic in one huge set of volumes, write down some reference numbers, and then look in another set of volumes to find out what those numbers meant. Something like that. It was a pain to find something specific.

But what a collection! It includes nearly every damn thing committed to print for the entire colonial and early national period in English speaking North America. Gallows narratives from New England criminals about to swing. Autobiographies from obscure and fascinating figures. Blood curdling descriptions of frontier warfare. Crackpot scientific pamphlets about how to clear disease-producing miasma from Boston's air. What a pleasure it was in graduate school to get lost in the Evans Collection.

So at first I was thrilled to see that the collection was digitized and searchable. But the expense is staggering. My little four-year teaching college would have to pay upwards of $60,000 for access (and then a yearly fee of a couple grand forever), far more than the library budget for my whole department for multiple years. And this for documents that are, after all, free of copyright and whose originals are held by not-for-profit historical foundations. There was a dust-up about the cost on the early American history discussion list where one scholar called the project a "seizure of the commons." If you call the company they will say oh yes, lots of colleges your size have invested in the collection. But if you look at their list of subscribers you will not find many smaller schools. Indeed, you won't find many schools at all.

I have a couple of large grants at my school right now and I am seriously considering concentrating all our resources and trying to purchase the Evans Collection. But to do so will ruin our book budget, our technology budget, and our professional development budget for years. I wish they had priced it for volume sales rather than for profit per sale.
posted by LarryC at 9:25 AM on June 13, 2007 [2 favorites]

Evans Digital Edition costs $20,000-$100,000 to subscribe.

USER = mefite, PSWD = mefite.
posted by StickyCarpet at 9:30 AM on June 13, 2007

Ah, larryc, you're right--it is microcards/microprint. That distinction is even made in the article I linked! Stupid mistake.

I have used the microfilm version (not the cards), though, and it's a hassle. Digital is much better. I guess I should count my blessings that I have access.
(larryc, if you ever need anything specific from Evans, feel free to email me)
posted by nasreddin at 9:30 AM on June 13, 2007

Here is a nice article from the estimable Common-Place about the joys of researching on the digital Evans: Tales from the Vault: From Movable Type to Searchable Text by Cathy N. Davidson.

posted by LarryC at 9:32 AM on June 13, 2007

While I'm fascinated by projects like this, the price tag just makes me think of all those wonderful people who aren't going to be able to afford this at all. And while a few might be able to do so, libraries these days aren't always bringing people to knowledge, but rather acting as bouncers to keep the masses out.
posted by metabrilliant at 5:47 PM on June 13, 2007

Any findings must be transcribed by hand rather than being copied and pasted.

You know, they could solve this problem really quickly if they just put the scans up on the internet and let people provide textual translations—Wiki-style. Of course, they wouldn't be able to make any money that way...
posted by Civil_Disobedient at 2:06 AM on June 14, 2007

Interesting—my (New Zealand) university's on the subscription list. I'll be checking this out when I have time.

eye of newt: Personally I think that they would make a lot more money if they just gave them to Google and let searching be free, but then charge for the download of the pamphlet of interest.

Google may end up scanning a lot of these documents themselves anyway, in which case this company's reason to exist will be gone.

But that's not the way to go, either. Google's only direct interest is in ad revenue; the content being digitized itself is really incidental and not central to Google's business at all. And this shows in what they've done so far. Their scans are often poor-quality and the scope is patchy and incomplete. But once Google's done their (half-arsed) work on something, it's assumed 'complete', and no-one can get funding to do it properly.

Ideally, what needs to happen is for a consortium of universities and research libraries that hold this material to get together; get funding; pay someone competent to scan the texts; send the images off to some nice, cheap content transformation company in Delhi or Hyderabad for live-body, non-OCR transcription and conversion to TEI-lite; and then make the texts and images as freely and widely available online as possible. It's so frustrating that this new technology and the means for disseminating it exist alongside such a lack of imagination and basic mercenariness over what to do with it.
posted by Sonny Jim at 3:59 AM on June 14, 2007
