No more "It was the best of times, it was the blurst of times"
June 20, 2017 11:51 AM

Public domain ebooks with modern typography, full proofing, complete metadata, and version control. Ebook projects like Project Gutenberg transcribe ebooks and make them available for the widest number of reading devices. Standard Ebooks takes ebooks from sources like Project Gutenberg, formats and typesets them using a carefully designed and professional-grade style guide, lightly modernizes them, fully proofreads and corrects them, and then builds them to take advantage of state-of-the-art ereader and browser technology.
posted by Cash4Lead (85 comments total) 157 users marked this as a favorite
 
OH MY GOODNESS.
I do love a well formatted ebook, especially the classics.
I am awful at formatting my own; good formatting is a science and an art.
Off to download!
posted by Major Matt Mason Dixon at 11:57 AM on June 20, 2017 [1 favorite]


Great post, great title.

"Stupid monkey!"
posted by history_denier at 12:01 PM on June 20, 2017 [9 favorites]


oh this is swell. I recently read Moby Dick and I bought it from Amazon because that was easier to read than the scans.

Can anyone recommend some of these other books?
posted by rebent at 12:01 PM on June 20, 2017


Can anyone recommend some of these other books?

I read Scaramouche literally a couple of weeks ago, in the usual crappy ebook version, and it rocked the house. I wish I had known about this then!

And if you haven't read Dubliners, run don't walk.
posted by doubtfulpalace at 12:05 PM on June 20, 2017 [1 favorite]


Can anyone recommend some of these other books?

Chekhov's short stories are bangers, in particular "Gooseberries" and "The Lady with the Dog".
posted by Cash4Lead at 12:16 PM on June 20, 2017 [6 favorites]


Nice.

I usually just buy things from Barnes and Noble. It's easier to just buy from my nook than to search and download on my computer, go upstairs and get the nook, and sync it. I will pay 99 cents for that.

But the worst is paying AND getting a shitty formatted book with typos.

These people care a lot more than me I'm realizing. The typography section in particular.

Older books often contain archaic spelling and hyphenation that can be distracting for today’s readers.

I like that part. Hmmm...
posted by bongo_x at 12:19 PM on June 20, 2017 [4 favorites]


I once read an entire public domain ebook that had somehow replaced all the double-Ls with a space, so it read something like, "Are you wi ing to put the  ama's bu et at the top of the hi ?"

It was maddening. Thank you, Standard Ebooks.
posted by Rock Steady at 12:22 PM on June 20, 2017 [16 favorites]


The "light and tasteful modernization" disturbs me a little.
posted by We had a deal, Kyle at 12:22 PM on June 20, 2017 [28 favorites]


I'm a bit confused. Are they doing this because they're awesome people or because there's money to be made in making better free books?

Either way, I suppose I'm happy, but I'm curious about the motive.
posted by Phreesh at 12:35 PM on June 20, 2017


Metafilter: Light and tasteful modernized theme.
posted by Kabanos at 12:36 PM on June 20, 2017 [5 favorites]


Wow nice! ManyBooks has also been doing this for years: formatted ebook editions of out-of-copyright works, mostly from Project Gutenberg. But the site's gotten kinda spammy in the past couple of years; it looks like someone's tried to leverage it into a business with freemium come-ons for new books. You have to hunt past the promotions for the classics. Also, the ManyBooks conversions are pretty rough; often you'll find a bunch of Gutenberg boilerplate up front and no effort at proper typesetting.

I compared editions of Huckleberry Finn: ManyBooks, Standard Ebooks, and a couple of free Kindle versions on Amazon. I think Standard Ebooks is a bit of a nicer page and I really appreciate the links to a GitHub repo. ManyBooks supports more formats but Standard Ebooks has the important ones, although it's a shame they don't have a simple PDF.

I had a hell of a time viewing the Standard Ebooks edition on my iPad. The email-to-Kindle gateway at Amazon doesn't recognize the .azw3 file I downloaded, and it won't even show up in the iPad Kindle app after I transferred the file over with a cable. Neither the ePub nor AZW file renders in GoodReader either. I did finally get the ePub to show up in iBooks and it looks quite nice. The two free Amazon editions also look nice. The ManyBooks version has the janky stuff up front but is otherwise perfectly readable.

Free books are a wonderful thing. One last tool that's nice to have in your pocket is The Magic Catalog of Project Gutenberg E-Books. In a twist that would make Borges proud, it's an eBook that is full of download links for other eBooks. Install it on a Kindle and you've got a lifetime's reading at your fingertips for free. IIRC the editions you get are very rough, probably just straight text file dumps, but they are readable and free.
posted by Nelson at 12:43 PM on June 20, 2017 [9 favorites]


Can anyone recommend some of these other books?

A Room With a View is a personal favorite! It's very good.
posted by BungaDunga at 12:47 PM on June 20, 2017


The "light and tasteful modernization" disturbs me a little.

I was concerned about that, but the specific changes mentioned in their typesetting manual are reasonable enough ("to-night" to "tonight", "&c." to "etc.", &c.) I didn't see anything that could mislead the reader.
posted by doubtfulpalace at 12:49 PM on June 20, 2017 [10 favorites]


Very cool.
posted by solotoro at 12:52 PM on June 20, 2017


The "light and tasteful modernization" disturbs me a little.

Yeah. Internet people keep reinventing editing from first principles — one of those "now you have two problems" kinds of thing.
posted by RogerB at 1:02 PM on June 20, 2017 [6 favorites]


I have an ebook from Gutenberg or somewhere -- The Worm Ouroboros by E.R. Eddison -- that I've cleaned up considerably in Calibre; I'd like to submit it to these guys but to do it you have to join their mailing list which seems very high-traffic right now. :\
posted by edheil at 1:10 PM on June 20, 2017


... lightly modernizes them, ...

uh-uh.
posted by jamjam at 1:11 PM on June 20, 2017 [2 favorites]


lightly modernizes

Really looking forward to reading "The Keg of Barrel-Aged Dark Lord."
posted by uncleozzy at 1:15 PM on June 20, 2017 [7 favorites]


The "light and tasteful modernization" disturbs me a little.
Hamlet: To be or not to be, right? Should I just plug it? Literally a pain in the ass.
Ophelia: What up, G?
Hamlet: Chillin...
posted by Ogre Lawless at 1:18 PM on June 20, 2017 [6 favorites]


Ugh they totally nerfed the slings and arrows of outrageous fortune in the latest update.
posted by uncleozzy at 1:23 PM on June 20, 2017 [8 favorites]


For a while now I've had a preference for The University of Adelaide's free public domain ebooks over Project Gutenberg's. I haven't made a close comparison, but the ones I've read have seemed to be better formatted in the Adelaide versions. Adelaide has stuff Gutenberg doesn't, and vice versa. Probably a majority of stuff I read on my Kindle comes from there. Standard Ebooks looks right up my alley.
posted by demonic winged headgear at 1:31 PM on June 20, 2017 [6 favorites]


Full proofing based on what text? I got into a fight a few years ago with Project Gutenberg over a Jane Austen text -- it had a word wrong ("brother" and it made no sense at all because it should read "mother"), and I cited correct texts and academic blah blah blah, but they said the book they had scanned (a Barnes & Noble cheap edition, I later found), said brother so they didn't care about the problem at all. So now I kind of hate ebooks even though I still read them.
posted by JanetLand at 1:34 PM on June 20, 2017 [6 favorites]


I'm a bit confused. Are they doing this because they're awesome people or because there's money to be made in making better free books?

They appear to release all their texts, even after they have edited and proofread them, under the CC0 1.0 Universal public domain dedication, meaning that YOU could take them, publish them, and sell them if you want to.
posted by muddgirl at 1:42 PM on June 20, 2017 [4 favorites]


The email-to-Kindle gateway at Amazon doesn't recognize the .azw3 file I downloaded

Amazon does make it difficult to transfer AZW files. For my iPad, I used Dropbox as my intermediary. In the Dropbox app on your iOS device: select the AZW file, bring up the dialog which lets you choose what opens the file, choose Kindle, and it should not only open but be automatically added to the library.

If you don't have Dropbox, I'm sure any app which lists files and provides a "share" button will work.
posted by honestcoyote at 1:42 PM on June 20, 2017


Amazon does make it difficult to transfer AZW files.

You need Calibre.
posted by signal at 1:45 PM on June 20, 2017 [10 favorites]


honestcoyote, signal: If you have a Kindle device (though I don't know if this works for Kindle Fire devices), you can just plug it in to your computer over USB. It will mount as a drive, and you can copy the AZW files into the documents folder.

That said, Calibre is useful for other things, especially if you have eBooks in ePub or other formats you want to read on the Kindle: it can convert them to .mobi format. Very handy.
posted by SansPoint at 1:58 PM on June 20, 2017 [1 favorite]
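For anyone who would rather script the conversion than click through the Calibre GUI, here is a minimal sketch. It assumes Calibre's command-line tool `ebook-convert` is installed and on your PATH; the function names `conversion_commands` and `convert_all` and the folder layout are made up for illustration, not anything Calibre provides.

```python
import shutil
import subprocess
from pathlib import Path


def conversion_commands(folder: Path, target_ext: str = ".mobi") -> list[list[str]]:
    """Build one `ebook-convert INPUT OUTPUT` command per .epub file in folder.

    Hypothetical helper: it only builds the argument lists, it runs nothing.
    """
    commands = []
    for src in sorted(folder.glob("*.epub")):
        # Output sits next to the input, same name, new extension.
        dst = src.with_suffix(target_ext)
        commands.append(["ebook-convert", str(src), str(dst)])
    return commands


def convert_all(folder: Path, target_ext: str = ".mobi") -> None:
    """Run the conversions, failing early if Calibre's CLI is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    for cmd in conversion_commands(folder, target_ext):
        subprocess.run(cmd, check=True)
```

Calling `convert_all(Path("books"))` would then write a `.mobi` next to each `.epub` in a `books` folder; pass `target_ext=".azw3"` if that is what your reader wants.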


Knee-jerk reaction: oh come on.
Goes to site.
Browses briefly.
Oh look Zuleika Dobson!
posted by chavenet at 2:05 PM on June 20, 2017 [2 favorites]


> You need Calibre.

Yup. Calibre changed my life. And this looks great; looking over their Browse E-books page, I saw:
The Gadfly

Ethel Voynich

The Gadfly is set in 1840s Italy, at a time when the country was chafing under Austrian rule. The titular character is a charming, witty writer of pointed political satires who finds himself running with a crowd of revolutionaries. The plot develops as the revolutionaries struggle against the government and as the Gadfly struggles with a mysterious hatred of the Church, and of a certain Cardinal.

The novel, with its complex themes of loyalty, romance, revolution, and struggle against both establishment and religion, was very popular in its day both in its native Ireland and other countries like Russia and China. In Russia, the book was so popular that it became required reading. Since its publication it has also been adapted into film, opera, theater, and ballet, and its popularity spurred Voynich to write sequels and prequels.
Sounded interesting, so I downloaded it, added it to Calibre, sent it to my Kindle, and clicked on it. Very nice job, a pleasure to read.

No Trollope yet, but give them time...
posted by languagehat at 2:09 PM on June 20, 2017


I've read more than a few ebooks which thanks to a combination of bad kerning and bad OCR are unintentionally hilarious. Imagine an article about how clickbait only cares about getting clicks, where 'cl' comes out as 'd'. That kind of thing.
posted by adept256 at 2:14 PM on June 20, 2017 [2 favorites]


I love what they've selected so far. Classic, classic, classic, "Space Viking", classic...
posted by Kikujiro's Summer at 2:14 PM on June 20, 2017 [2 favorites]


I've used USB for my eink Kindle with my AZW files. When I said "difficult", I was mostly just referring to the email gateway rejecting AZW, which makes things a little less convenient.

And I use Calibre too. Mostly for removing DRM. Which I highly recommend for everyone who uses Kindle in some way. I recently switched to reading on my iPad from using the eink exclusively. I normally like having my entire library on device, so I started transferring my purchased books over. For a few books, Amazon helpfully said: "You have exceeded the number of devices allowed by the publisher." So I grabbed those books from my books backup folder on Dropbox while holding up a metaphorical middle finger to Amazon and the publishers.

I expect most people here who have multiple devices will run into this problem eventually. Especially when books are installed on devices which you haven't used in a while. I suppose there's some way to remotely remove the books from old devices by using Amazon's web interface, but removing the DRM means not having to deal with it.

Calibre was also great for mass-converting the de-DRMed files into .epub if I ever get a reader which uses that format.
posted by honestcoyote at 2:14 PM on June 20, 2017 [3 favorites]


Amazon doesn't really make it hard to transfer AZW files, it's just none of the several ways I tried worked for this one file. Works fine for other files, like the one from ManyBooks. It may be a problem specific to the Kindle iPad app; I don't have an actual hardware Kindle handy to test it with. If you can get Huckleberry Finn azw3 to work let us know.
posted by Nelson at 2:20 PM on June 20, 2017


I suspected the "light modernization" would raise an eyebrow here. I have no idea how good a job they are doing of it, but I loved this exchange between someone who was vehemently against it, and someone involved with the project:
DoubleCribble 3 days ago | parent | flag | favorite | on: Standard Ebooks: Free and liberated ebooks, carefu...

Thank you for the reply. I can only imagine that the application of "Light Modernization" to this
masterpiece would not only change the pronunciation of the words, it would add the implied
missing words thereby disrupting the meter and rhyme and thus subsequently ruin the sonnet.
I understand the point of making things readable but when you start changing the spelling, you
actually ruin the art form. Sonnet > Light Modernized Poem

Let me not to the marriage of true minds
Admit impediments. Love is not love
Which alters when it alteration finds,
Or bends with the remover to remove.
O no, it is an ever-fixed mark
That looks on tempests and is never shaken;
It is the star to every wand'ring barque,
Whose worth's unknown, although his height be taken.
Love's not Time's fool, though rosy lips and cheeks
Within his bending sickle's compass come;
Love alters not with his brief hours and weeks,
But bears it out even to the edge of doom.
If this be error and upon me proved,
I never writ, nor no man ever loved.
And in reply:
acabal 3 days ago [-]

Ironically, you've posted the lightly modernized version of 116. :) In the 1609 printing it looks like:
  O no, it is an euer fixed marke
  That lookes on tempeſts and is neuer ſhaken;
  It is the ſtar to euery wandring barke
  Whoſe worths vnknowne, although his higth be taken...
posted by markr at 2:25 PM on June 20, 2017 [32 favorites]


> I suspected the "light modernization" would raise an eyebrow here.

I admit, I was suspicious, but if it is done by a careful human editor with good taste (as opposed to a brute-force algorithm), there's certainly a place for it.

Sometimes - most of the time, actually - I'm interested in reading for pleasure, as opposed to reading for close criticism of the author's original intent.
posted by RedOrGreen at 2:29 PM on June 20, 2017


This is excellent. There are many books on Gutenberg - not the big titles, usually - that suffer badly from poor OCR or coming from awful 19th century editions. The versions of Marco Polo available are hot garbage; barely readable and burdened by dull forewords or biographies of long-dead translators hundreds of pages long. I appreciate their presence as a historical curiosity but having someone clean them up is excellent.

Calibre too is absolutely fantastic but cleaning up epubs in it is still a chore. If it is a human editor going through the texts and fixing typographical artefacts like the ones markr pointed out (I've seen too many 'H's rendered as 'LL's), godspeed to them.

@ demonic winged headgear: I think there's some difference in when texts pass into the public domain in Australia vs the rest of the world, which might account for their different holdings. I know James Joyce was on Adelaide's site much earlier than Gutenberg.
posted by ocular shenanigans at 2:35 PM on June 20, 2017 [1 favorite]


Yes!!! Wow, thanks, this is an utterly fantastic project. As much as there is confusion about the world of books changing, it's clearly inevitable, and the epub/mobi formats are the best so far for just reading. There's a ways to go for satisfying functionality (flipping to an earlier chapter to review a detail, sigh), but it's the direction. I may take a stab at "Two Years Before the Mast", one of my favorites.
posted by sammyo at 2:53 PM on June 20, 2017


I'm not a typographer or antique book specialist, but I've worked with manuscripts all the way up to incunabula, and while I developed a detached sensitivity to formatting, I never realized how much formatting really impacts one's ability to enjoy a book until I got an ereader. I was irrationally angry after about the 50th page or so of fighting with the book I was trying to read. A good epub is a glorious thing, and so I'm thrilled to hear about this, if only because I've done enough with trying to OCR my own digitized scans to know that the level of effort that goes into editing poorly processed scans (or poorly scanned pages, depending on the level of hell in which one begins) is immense. Bravo to these volunteers.
posted by eclectist at 3:28 PM on June 20, 2017 [3 favorites]


They're doing their editing on github, too, so if you're curious about what changes their modernization has done, you can go look: e.g. Pride and Prejudice
posted by vibratory manner of working at 3:46 PM on June 20, 2017 [3 favorites]


I don't know, I find it really interesting to see how the spelling and grammar changed when I'm reading old books. Unless it's just too hard to read, then it needs a translation, or I guess a blatant "this text was modernized". I don't see the point of "slight modernization".
posted by bongo_x at 3:49 PM on June 20, 2017 [4 favorites]


If you have an iDevice, I recommend trying out Kybook 2 in the App Store. It reads ePub, mobi, azw, fb2, PDFs and mp3s. Supports Adobe's DRM and you can organize your library with folders. Mainly for the Russian-language market, which is why you haven't heard of it. I gave up on Marvin's quirky UI and I'm really pleased.
posted by Jesse the K at 4:12 PM on June 20, 2017 [1 favorite]


I don't know, I find it really interesting to see how the spelling and grammar changed when I'm reading old books.

What they have to work with are Project Gutenberg texts, whose relationship to the spelling and grammar used by the original author simply can't be counted on. It's not as if they have a policy of only scanning first editions. So you're just getting someone's version of someone's version in any case. Given that constraint, an upfront policy of harmonizing their editions into a comfortable form for the casual modern reader is completely defensible.

I like to read Chaucer in well-researched Middle English as much as the next guy, but if I want that kind of experience I don't go looking for it in free e-books. Did I mention free? They're handing us something really nice for free.
posted by doubtfulpalace at 4:14 PM on June 20, 2017 [5 favorites]


Did I mention free? They're handing us something really nice for free.

No doubt. I really don't like to criticize free things, and just think they ought to save themselves some work here. It's obviously something they think is worth the extra effort.
posted by bongo_x at 4:32 PM on June 20, 2017


They have no Trollope. I may have to remedy that.
posted by SecretAgentSockpuppet at 4:34 PM on June 20, 2017


I thought the covers were clever, too. Like the Manet they chose for Zola's work. Thanks for the post.
posted by misozaki at 4:48 PM on June 20, 2017


I was concerned about that, but the specific changes mentioned in their typesetting manual are reasonable enough ("to-night" to "tonight", "&c." to "etc.", &c.) I didn't see anything that could mislead the reader.

To me, that kind of meddling is much, much more aggravating than non-curly quotes or hyphens for em dashes. I recognize that I'm in the minority here but it kind of bummed me out. It's like stumbling across a pizza restaurant that looks really good except then you realize that for some reason they top every single pizza with cheetos.
posted by No-sword at 4:50 PM on June 20, 2017 [8 favorites]


edheil, I have had pleasant results from sending a typos-cleaned-up version back to Gutenberg; it got posted as the new version. They didn't want to deal with anything I'd changed linebreaking on, but someone was happy to check my diffs against the scans. I wish I had a phone reader that let me edit the text files while reading. Later I spent a week with PG trying to figure out if a word was "woad" or "wood"; the letter was broken in the original folio, and either was plausible, and it made a difference, and I believe we finally put in a transcriber's footnote that it was ambiguous.

But then there's JanetLand's experience. Huh.

---

I have zero interest in light modernization and think we lose information when "week-end" is regularized into "weekend", but hey, I don't have to download them.
posted by clew at 5:29 PM on June 20, 2017 [5 favorites]


> To me, that kind of meddling is much, much more aggravating than non-curly quotes or hyphens for em dashes. I recognize that I'm in the minority here but it kind of bummed me out. It's like stumbling across a pizza restaurant that looks really good except then you realize that for some reason they top every single pizza with cheetos.

Seriously? Changing "to-night" to "tonight" and "&c." to "etc." is like topping every single pizza with cheetos? Not sure if you're exaggerating for effect, but that's a pretty off-the-charts level of reaction to exceedingly minor changes. Also, see the point above about the texts they're working with not being the original, first-edition texts. Unless you only ever read original, first-edition texts, I'm not sure why these are any worse than books you buy at the store. Which cost, you know, money.
posted by languagehat at 5:55 PM on June 20, 2017 [10 favorites]


For what it's worth, I myself would prefer unmodernized versions, especially if I had confidence in their provenance. But I'm a middle-aged native speaker who's read a lot of old books. These ebooks have a worldwide all-ages audience. It's a good place to remove barriers to access by making the kind of cosmetic changes that inevitably get made to non-specialist editions over time (see the Shakespeare upthread for an example of how these things naturally accumulate).

I'm also old enough to remember when PG was ASCII-only, and the meaning of the text was often compromised by the lack of such essentials as italics and diacritical marks. That puts this stuff in perspective.
posted by doubtfulpalace at 6:24 PM on June 20, 2017 [1 favorite]


In rebuttal, languagehat, I spy Right Ho, Jeeves upon the shelf and it would be rummy indeed to "correct" Wodehouse.
posted by SPrintF at 7:04 PM on June 20, 2017


I would encourage anyone who wishes to have an informed opinion about their modernization approach to read their typography manual and tips for editors and proofreaders. I don't see how anyone can accuse them of "correcting" an author, Wodehouse or otherwise.
posted by doubtfulpalace at 7:34 PM on June 20, 2017 [5 favorites]


I don't think anyone was suggesting burning at the stake, just expressing preferences.
posted by bongo_x at 7:41 PM on June 20, 2017


Aaaaiieeee the book details include word counts. And time estimates! YAAAAY!

I read fanfic. I read a lot of fanfic. I have been reading fiction on an ereader for almost 10 years, and I now think of "book length" as a matter of word count. And none of the major stores list word count; they list "pages," which is utterly meaningless for ebooks.

*downloads small swarm of classics I've been meaning to read but didn't like the Gutenberg versions*

(I have great love for Gutenberg - but the project is over 40 years old, and the texts they did first are often the worst. They hadn't yet sorted out their standards, and they often haven't gone back to correct errors in the early editions.)
posted by ErisLordFreedom at 7:54 PM on June 20, 2017 [2 favorites]


Also see the 88 words that receive special treatment in the modernize-spelling script.
posted by Phssthpok at 8:31 PM on June 20, 2017 [4 favorites]


Not sure if you're exaggerating for effect, but that's a pretty off-the-charts level of reaction to exceedingly minor changes. Also, see the point above about the texts they're working with not being the original, first-edition texts. Unless you only ever read original, first-edition texts, I'm not sure why these are any worse than books you buy at the store. Which cost, you know, money.

No, they're not worse than books you read at the store, but they're also not any better. Except for being free and searchable... but so are their source texts, and those have the advantage of retaining their &cs. That's my point—it sounds like a cool initiative, but (from my perspective) what they're producing is a bunch of texts that are easier on the eyes but less fun to read, i.e. yet another set of texts with unnecessary compromises, where they could have made something that only had improvements.

(I'm under no illusions that Gutenberg is a repository for first editions or whatever. That doesn't make the orthographic quirks they do preserve any less enjoyable.)

If the cheetos thing was too much, how about this: You find a new pizza restaurant run by the daughter of a guy whose pizza place was the only decent one in your area for years. You're excited because word on the streets is that she takes everything good about her dad's place and adds pleasant lighting, nicer furniture, etc. You go there and find that, yeah, it's a very pleasant space, but instead of inheriting her dad's international network of unique and distinctive ingredient suppliers, she just buys everything from an online supermarket in bulk because she thinks variety and character detract from the pizza experience. And everyone around you agrees! You go home to scowl at a pleasantly misshapen onion.
posted by No-sword at 10:09 PM on June 20, 2017 [2 favorites]


Also see the 88 words that receive special treatment in the modernize-spelling script

Might want to get a mod to fix that, you seem to have linked to the drain-all-joy-from-life-and-render-everything-a-drab-grey() function by accident.
posted by No-sword at 10:12 PM on June 20, 2017


xhtml = regex.sub(r"\b([Cc])lew(s?)\b", r"\1lew\2", xhtml) # clew -> clue


Belay that, matey.
posted by clew at 10:34 PM on June 20, 2017 [6 favorites]
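For the curious, here is a rough sketch of the kind of substitution rules being discussed, using the examples mentioned upthread ("to-night" to "tonight", "&c." to "etc.", and clew to clue). It is not the project's actual script: it uses Python's standard `re` module rather than the `regex` module in the quoted line, the function name `modernize` is made up, and the `lue` replacement is an assumption taken from the quoted "clew -> clue" comment.

```python
import re


def modernize(text: str) -> str:
    """Apply a tiny, illustrative subset of spelling modernizations."""
    # to-night -> tonight, to-day -> today, to-morrow -> tomorrow
    text = re.sub(r"\b([Tt])o-(night|day|morrow)\b", r"\1o\2", text)
    # &c. -> etc.
    text = re.sub(r"&c\.", "etc.", text)
    # clew -> clue, keeping the capitalization and an optional plural "s"
    text = re.sub(r"\b([Cc])lew(s?)\b", r"\1lue\2", text)
    return text


print(modernize("The clews we found to-night, &c."))
# prints: The clues we found tonight, etc.
```

The `\b` word boundaries are what keep a rule like this from touching longer words that merely contain "clew"; they do nothing to protect proper nouns, so a real pass would still need a human checking the result.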


The Magnificent Ambersons is a great read! Also Cranford. Also Picture of Dorian Gray if you've never read it, freaked my shit out.
posted by Eyebrows McGee at 10:42 PM on June 20, 2017


This is a great initiative. I like their attention to quality: editing, digital formats etc.

I once improved on an archive.org epub of By The Open Sea by August Strindberg. Basically just fixing OCR errors. I've just offered it to Standard Ebooks; people will find it so more easily there.
(In the meantime you can find my improved epub and its repo here)
posted by joost de vries at 12:10 AM on June 21, 2017


Ugh they totally nerfed the slings and arrows of outrageous fortune in the latest update.

Well, they had to. There was an exploit where a level 30 Dane could turn around and use them to tank an entire sea of troubles. It was totally OP.
posted by Mr. Bad Example at 3:42 AM on June 21, 2017 [4 favorites]


I don't think anyone was suggesting burning at the stake, just expressing preferences.

Allow me to be the first ...
posted by oheso at 3:56 AM on June 21, 2017


But honestly, if the "light de-colorization" is really a matter of a script, as the one Phssthpok linked, would it be so hard to release both the before and the after versions?
posted by oheso at 4:04 AM on June 21, 2017 [1 favorite]


It's an awesome project, although I'm not too happy about the light modernization. I like these quaint spellings like to-night and &c. Modernizing typographic antiquities like the long s (ſ), which might actually be confusing for readers who aren't aware that such a thing existed and read it as "f", would've been sufficient, imo. But then, it's their project, their rules.
Most of the books I downloaded from Gutenberg I found unreadable because of scan errors, extra letters or punctuation due to a spot of dirt on the page, and so on. So yay, proofread scans, at least!

But honestly, if the "light de-colorization" is really a matter of a script, as the one Phssthpok linked, would it be so hard to release both the before and the after versions?

That would be great.
posted by ojemine at 4:41 AM on June 21, 2017


For those looking for recommendations, I highly recommend Three Men in a Boat by Jerome K. Jerome. If you have ever been camping, you can relate. University of Adelaide has a version as well.
posted by fings at 6:44 AM on June 21, 2017


  Also see the 88 words that receive special treatment in the modernize-spelling script

Like all house style issues, this contains a bunch of arbitrary-and-capricious. I mean, naïve → naive but tete-a-tete → tête-à-tête? You either go full-on Webster, or you leave alone.

Still, pretty books. Nice that other people care, too. I found the late (charming/infuriating/inspiring/sadly missed) Michael Hart's insistence on ASCII ALL THE THINGS! for PG tiresome, because better typography means better legibility. It was possible to attempt better typography even on the pre-2000 web without resorting to manually inserting &shy;s in every word.
posted by scruss at 6:55 AM on June 21, 2017


I don't see how anyone can accuse them of "correcting" an author, Wodehouse or otherwise.

They are normalizing punctuation and quotation from British to American, which presumably does result in changes to Wodehouse's grammar.
posted by We had a deal, Kyle at 7:41 AM on June 21, 2017


But honestly, if the "light de-colorization" is really a matter of a script, as the one Phssthpok linked, would it be so hard to release both the before and the after versions?

The script would have to be run before the proof-reading pass, so it's not that simple. But it's all in their git repo, so anyone who thinks that Ovid, Dante, and Shakespeare should be modernized to Edwardian standards BUT NO FURTHER GODDAMMIT can make their own fork. For that matter, they can add ugly Bodoni knockoff fonts, gigantic spaces between sentences, and bowdlerization. If you're going to pick an arbitrary printing period to be the e-book standard, why not go full monty?

I mean, naïve → naive but tete-a-tete → tête-à-tête? You either go full-on Webster, or you leave alone.

I can't find it, but IIRC their standard dictionary is Merriam-Webster, which (like, I would guess, every other modern dictionary) shows "naive" and "tête-à-tête."

They are normalizing punctuation and quotation from British to American, which presumably does result in changes to Wodehouse's grammar.

Since punctuation and quotation aren't grammar, and would have been handled by Wodehouse's publishers differently on either side of the Atlantic rather than by Wodehouse, my presumption is different from yours.
posted by doubtfulpalace at 8:03 AM on June 21, 2017 [2 favorites]


Well I struggle to read/understand classic literature as it is, so I'm all for light modernization. I know I've never been the best at reading classic novels, but I am a native English speaker and a former bookworm and I do have, like, a full liberal arts education so I can't possibly be the worst at it either. Think of us little unsophisticated readers when you're critiquing books aimed at a broad audience?
posted by R a c h e l at 8:13 AM on June 21, 2017


Yay! They have The Wind In The Willows!
posted by tallmiddleagedgeek at 8:26 AM on June 21, 2017 [1 favorite]


so anyone who thinks that Ovid, Dante, and Shakespeare should be modernized to Edwardian standards BUT NO FURTHER GODDAMMIT
I want the orthography &c. to be kept at the age of the translation. This is partly aesthetic and partly to remind me that the translation will have had its assumptions.

Given the parlous state of copyright law this will leave us with mostly Edwardian translations, true.
posted by clew at 10:08 AM on June 21, 2017 [2 favorites]


I want the orthography &c. to be kept at the age of the translation.

Well-researched and -proofed volumes of the sort you describe would be very welcome. Somebody should do them! But PG, as is, does not. Their versions are neither fish nor fowl. Standard Ebooks is cooking fish. People who want fowl are going to have to cook their own. People who are satisfied with fishfowl can mourn the loss of their diæreses right here, I guess.
posted by doubtfulpalace at 10:26 AM on June 21, 2017


Literally the first example I thought of checking was Defoe's Journal of the Plague Year. Here's Project Gutenberg:
It was about the beginning of September, 1664, that I, among the rest of my neighbours, heard in ordinary discourse that the plague was returned again in Holland;
Here's the original 1665 edition:
IT was about the Beginning of September 1664, that I, among the Rest of my Neighbours, heard in ordinary Discourse, that the Plague was return'd again in Holland ;
If the PG text happens to contain "coöperation," should that be left in to give a spurious old-timey impression given how much the text has been modernized already? I say no. Give me 1665 or give me 2017.
posted by doubtfulpalace at 10:49 AM on June 21, 2017 [1 favorite]


One more thought: original orthography goes best with original typography and design, rather than free-flowing ePub. Anyone not constrained by screen size, bandwidth, or eyesight issues should consider looking for facsimile editions on the Internet Archive.
posted by doubtfulpalace at 11:16 AM on June 21, 2017 [1 favorite]


Even that “original” Defoe is somewhat modernized. In a scan of an original 1722 edition (the book was published several decades after the 1665 plague) from Oxford University, the first paragraph is styled:
IT was about the Beginning of Sep­tember 1664, that I, among the Reſt of my Neighbours, heard in ordinary Diſcourſe, that the Plague was return’d again in Holland; for it had been very violent there, and particularly at Amſterdam and Ro­tterdam, in the Year 1663. whether they ſay, it was brought, ſome ſaid from Italy, others from the Le­vant among ſome Goods…
posted by mbrubeck at 11:21 AM on June 21, 2017 [2 favorites]


Whoops, not only did I mix up the date of the events described with the date of publication, I was taken in by a facsimile that wasn't.

Fortunately, my blunder supports both my point about modernization and my point that this stuff needs to be researched for real.
posted by doubtfulpalace at 11:33 AM on June 21, 2017


Standard Ebooks is cooking fish. People who want fowl are going to have to cook their own. People who are satisfied with fishfowl can mourn the loss of their diæreses right here, I guess.

Standard Ebooks is not cooking fish. It is cutting fishfowl into the shape of fish using a half-automated process. There are those of us who, if we can't get fowl, at least prefer honest Edwardian fishfowl, which has its own charm.

But in any case, Defoe is not really the right author to demonstrate our point. (Although for the record, I do put my money/time where my mouth is and pay for something as close as possible to fowl for works like this. Thank God that Penguin has started to see the light in this regard.) Anyway, something like Zuleika Dobson works much better. This was published in 1911. That really wasn't very long ago. You can still buy a copy that was published not long after that, if not an actual first edition. My people are not crazy or deluded for not wanting an editor in 2017 to change the original "every ones" into "everyones."

Now, if some readers prefer "everyone," that's great! I hope more people enjoy this masterpiece. Good on Standard Ebooks for making that happen. But I don't think we've done anything wrong by coming into a thread about the project and pointing out that their chosen approach lessens the value of what they do from our perspective.
posted by No-sword at 2:32 PM on June 21, 2017 [4 favorites]


Yeah, it's quite the straw man to point out Chaucer when we're talking about books written 100-150 years ago. Those books are not difficult to read, and if they are, it's not because of this. As you get closer to the present you are making changes in style, not readability.

I applaud the effort by these people, it's certainly going to fill a need for some. I'm just not that concerned with the type of quotation marks or words being split in an older style. I just want the misspellings fixed and the format readable.

A pet peeve of mine, one I hate, hate, hate, is writing that uses dialect written phonetically (is there a word for that?). But I'm not thinking that people should go back and correct all that text to make it more readable for me. I just skip those things; they are not for me.
posted by bongo_x at 2:52 PM on June 21, 2017


Standard Ebooks is not cooking fish. It is cutting fishfowl into the shape of fish using a half-automated process.

I can't speak to their success level before reading one of their books, but their stated goal is editing their books to reflect modern usage. That's cooking fish, in this metaphor. Its being half-automated is neither here nor there.

Obviously, Zuleika Dobson is closer to modern usage than Defoe, and is more readable in its original state. But the corollary to that is that the editing process changes it less. And there's no non-arbitrary way to decide how archaic is too archaic.

There are those of us who, if we can't get fowl, at least prefer honest Edwardian fishfowl, which has its own charm.

I can understand that. What I don't understand is how existing PG texts satisfy that preference. When I read them, they just seem like a mess, from the admittedly tweaky perspective we're using here. The above-linked modernize-spelling script is as much about repairing ASCII damage as about modernizing spelling.

I, too, would love to see the book from which my nickname is taken in an electronic edition that matches the glory of the paper edition I own. But I'd prefer a Standard Ebooks edition to the half-assed version you get by pouring PG into an ePub, by a long way.

I don't think we've done anything wrong

Who said you did? We just disagree about some things. (Not about Zuleika Dobson, though. As you say, it's a masterpiece.)
posted by doubtfulpalace at 3:24 PM on June 21, 2017 [1 favorite]


What I don't understand is how existing PG texts satisfy that preference.

Well, they don't! But Standard Ebooks don't either. They're a maddeningly sideways step where they could have been an unambiguously forward one. Maybe what I really object to is the name. If they were called "Beautiful Ebooks" or "Readable Ebooks" that would be a lot easier to take, and also less likely to attract unwelcome attention from the gods.
posted by No-sword at 3:59 PM on June 21, 2017 [1 favorite]


Well, the great thing about standards is that there are so many to choose from. *ducks*
posted by doubtfulpalace at 4:11 PM on June 21, 2017


writing that uses dialect written phonetically (is there a word for that?).


Eye dialect
posted by the man of twists and turns at 4:49 PM on June 21, 2017 [1 favorite]


I really thought you were fucking with me until I looked it up.
posted by bongo_x at 7:08 PM on June 21, 2017


I think instead the reference was to pronunciation respelling.
posted by oheso at 5:05 AM on June 26, 2017


Distributed Proofreaders doesn't seem to be mentioned on this page, which is especially dumb of me because I realized I've been conflating it with Project Gutenberg. DP does a great deal of the proofreading for texts that make it into PG.

Relevant: DP has been ongoing for years, has been run by a succession of volunteers, and I'm given to understand the codebase is a deep, deep fossil bed of experience. I mean, it took only one glance for me to find a Standard Ebooks overcorrection that changes meaning in some cases! The one I found was eponysterical, but that just means it's easy for me to see that error. I am hardly sanguine that there aren't equally bad ones I don't see. DP has ever so many more special cases, all of which got argued out at some point. (Greek in English-language texts. Math formulae. Tables. Maintaining some record of page numbers and print editions linked to the online text; I was an active volunteer during that discussion; IIRC the history of literally ancient location references in classical texts convinced enough of the people writing the code that it could be useful in the future. Also I think we needed it for DP versions of actual classical texts.)

On the other hand, the DP beginner's guidelines do refer to reproducing what the "author" "meant" when they clearly have access only to what the printer did. OTOH, that access is mostly what all of us have at best, so it still seems like a good guideline to me.

More subjectively, I totally fail to see how fancier online layout is less able to handle the standards of old typesetting -- I'm always happy when we get ligatures back. And Greek, and math symbols.

Standard Ebooks' current-taste-ifying seems like an okay filter to run on top of DP-PG, but man alive is it inadequate to the general problem of "out of copyright texts", and the nice Git layout &c is work that should probably be lent to the codebase that has been working on the general problem.

Even tinier, I am in hindsight more confused by the stupid typo maintained because it matched a Barnes & Noble edition -- the latter sounds like it's probably in copyright and therefore absolutely can't be the official source of a PG text.
posted by clew at 3:30 PM on June 27, 2017 [4 favorites]


Even tinier, I am in hindsight more confused by the stupid typo maintained because it matched a Barnes & Noble edition -- the latter sounds like it's probably in copyright and therefore absolutely can't be the official source of a PG text.

You're probably right -- I don't know for a fact that that was the scanned edition, but after searching through every text of Sense & Sensibility I ran across for the next year, that was the one edition where I found the wrong word. But it certainly could be a reprint of an older, out of copyright edition.
posted by JanetLand at 6:57 AM on June 28, 2017 [1 favorite]


It's possible that the B&N edition was set from a PG text.

(Another of my interactions with PG was inquiring whether "typomancy" was an error for "tyromancy" when used to mean "divination from cheese". The final decision was, probably so, but it was an old enough error that PG was going to keep it in a PG-version of a book that seems to have been using it for many print editions. I think "typomancy" should mean "divination by misprints" now.)
posted by clew at 11:39 AM on June 28, 2017 [4 favorites]



