I before E, except after... W?
July 24, 2017 5:56 AM

Most kids who grow up speaking English learn the "I before E" rule, complete with its subparts "except after C" and "or when sounding like A". And some people learn some of the major exceptions, like "weird" and "height" and "caffeine" (so many exceptions, in fact, that as Stephen Fry and QI point out, the rule is essentially useless). But not many people go as far as Nathan Cunningham and use their programming skills to see whether C is really the letter that should be cited as the main exception.

As it turns out, there are six consonants that follow the rule less (H, J, N, R, S, and W). Cunningham addressed some of the criticisms in a follow-up post -- particularly, that "when sounding like A" doesn't matter to this analysis, and whether frequency of word usage should matter.
posted by Etrigan (46 comments total) 12 users marked this as a favorite
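For anyone who wants to poke at the idea, here is a minimal sketch of the kind of tally the post describes: count how often "ie" and "ei" follow each letter. It is not Cunningham's actual code; the function names and the tiny sample list are made up for illustration, and a real run would read the full word list instead.

```python
from collections import Counter

def count_by_preceding_letter(words):
    """Tally 'ie' and 'ei' occurrences by the letter immediately before them."""
    ie, ei = Counter(), Counter()
    for word in words:
        w = word.lower()
        # Start at 1 so every pair has a preceding letter.
        for i in range(1, len(w) - 1):
            pair = w[i:i + 2]
            if pair == "ie":
                ie[w[i - 1]] += 1
            elif pair == "ei":
                ei[w[i - 1]] += 1
    return ie, ei

def ei_share(ie, ei, letter):
    """Fraction of ie/ei occurrences after `letter` that are 'ei' (None if unseen)."""
    total = ie[letter] + ei[letter]
    return ei[letter] / total if total else None

# Hypothetical sample; swap in a real word list to get meaningful numbers.
sample = ["receive", "ceiling", "science", "weird", "weigh",
          "believe", "friend", "height"]
ie, ei = count_by_preceding_letter(sample)
for letter in sorted(set(ie) | set(ei)):
    print(letter, ei_share(ie, ei, letter))
```

Note that a word such as "weightiest" is counted once for "ei" (after W) and once for "ie" (after T), which matches the double-listing behavior quoted later in the thread.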
 
The source is a txt file of over [Math Processing Error] words.

programming skills!
posted by thelonius at 6:06 AM on July 24, 2017


I have always pointed out that word is spelled "weird".
posted by oneswellfoop at 6:09 AM on July 24, 2017


The source is a txt file of over 350,000 words.

Works here.
posted by Etrigan at 6:09 AM on July 24, 2017


In conclusion, English is a language of contrasts.
posted by chavenet at 6:12 AM on July 24, 2017 [4 favorites]


Would be cool to weight words by how common they are; wondering if there are a lot of archaic words on one side of the divide or the other...
posted by kaibutsu at 6:37 AM on July 24, 2017 [4 favorites]
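A rough sketch of what frequency weighting could look like: each word contributes its usage count rather than 1. The `freqs` mapping and its numbers are invented for illustration; a real version would need counts from an actual corpus.

```python
def weighted_ei_share(freqs, letter):
    """Share of 'ei' (vs. 'ie') after `letter`, weighted by word frequency.

    freqs maps word -> usage count. Returns None if the letter is never
    followed by either pair.
    """
    ie_key, ei_key = letter + "ie", letter + "ei"
    ie_weight = sum(n * word.count(ie_key) for word, n in freqs.items())
    ei_weight = sum(n * word.count(ei_key) for word, n in freqs.items())
    total = ie_weight + ei_weight
    return ei_weight / total if total else None

# Made-up counts, purely to show the mechanics.
freqs = {"receive": 50, "ceiling": 10, "science": 100, "species": 40}

weighted = weighted_ei_share(freqs, "c")
# Setting every count to 1 recovers the unweighted, one-word-one-vote share.
unweighted = weighted_ei_share({w: 1 for w in freqs}, "c")
print(weighted, unweighted)
```

With these invented numbers the weighted share of "cei" comes out lower than the unweighted one, because the two "cie" words happen to be given the larger counts. Real data could go either way.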


i before e
WORKSFORME
posted by mikelieman at 6:54 AM on July 24, 2017


It may be that my brain is pre-coffee, but even though I've always accepted this as gospel, I can't think of any words with "cei" in them right now. Please halp?
posted by Polyhymnia at 6:55 AM on July 24, 2017


Look up. What do you see?
posted by asperity at 6:57 AM on July 24, 2017 [6 favorites]


Would be cool to weight words by how common they are

Totally. A fun exercise, but given that it's advice aimed mostly at novice spellers, a lexicon of 350,000 words—when the average adult native speaker maxes out at a tenth of that—maybe isn't testing the spirit of the rule as well as it might.
posted by wreckingball at 6:57 AM on July 24, 2017 [1 favorite]


ceiling receive perceive deceive conceited
posted by kyrademon at 7:01 AM on July 24, 2017 [6 favorites]


receipt conceive transceiver
posted by kyrademon at 7:04 AM on July 24, 2017 [3 favorites]




The source is a txt file of over [Math Processing Error] words.

programming skills!


(Aside: I'm not sure why they represented that number with embedded TeX, but using TeX in a web page is a completely different skill from the text processing and data munging that the rest of the post is about).
posted by a snickering nuthatch at 7:12 AM on July 24, 2017


"I before E, except after C, or when followed by G, as in neighbor and eight" is the version I learned.
posted by Carol Anne at 7:12 AM on July 24, 2017 [1 favorite]


My new rule- i before e, except after space.
posted by MtDewd at 7:16 AM on July 24, 2017


We don't even attempt any other rules like this, do we? O before U unless it's Latin/Italian...
E before A unless it's a diphthong or Latin...

We just learn the spellings. Why do E and I need a rule?
posted by Segundus at 7:30 AM on July 24, 2017 [2 favorites]


The follow-up post (the "addressed some criticisms" link) brings to mind an interesting alternative method. Consider separating words into frequency bins and finding separate rules for different bins. You don't need to require the rules used in less-common bins to be refinements of rules in more-common bins because it can be assumed that, when learning a less-common bin, the learner is already aware of the spelling of more common words.

So perhaps among the 500 most common words, X precedes Y except after Z, but among the next 20k Y precedes X except after Q.

Notice I just mentioned bin widths (1-500, 500-20k, 20k+) but of course it may be that certain bin widths result in "sharper" rules, so you might try to quantify the overall goodness of a set of bin widths and search for bin widths that maximize goodness. This in turn may reveal that (bullshitting here) the most common 730 words are mostly Saxon-derived words that hew to certain conventions but the next 10k are mostly borrowed words with a different set of conventions, etc...
posted by a snickering nuthatch at 7:30 AM on July 24, 2017 [1 favorite]
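The binning idea can be sketched in a few lines, assuming a list of words already sorted from most to least common. The ranking below is made up and the bin edge is arbitrary; searching over bin widths for "sharper" rules would be a loop around this.

```python
def split_into_bins(ranked_words, edges):
    """Cut a most-common-first word list at the given ranks, e.g. [500, 20000]."""
    bins, start = [], 0
    for edge in edges:
        bins.append(ranked_words[start:edge])
        start = edge
    bins.append(ranked_words[start:])  # everything past the last edge
    return bins

def majority_order(words, letter):
    """Return 'ie' or 'ei', whichever is more common after `letter` (None on a tie)."""
    ie = sum(w.count(letter + "ie") for w in words)
    ei = sum(w.count(letter + "ei") for w in words)
    if ie == ei:
        return None
    return "ie" if ie > ei else "ei"

# Hypothetical ranking, not real frequency data.
ranked = ["receive", "ceiling", "science", "species", "fancies"]
bins = split_into_bins(ranked, [2])
rules = [majority_order(b, "c") for b in bins]
print(rules)
```

Each bin gets its own rule, with no requirement that the rule for a rarer bin agree with the rule for a more common one.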


I read the Cunningham piece before watching the QI bit, and it was confusing to me for a while, because Cunningham seems to be focused on the 'except after C' part (or at least I went in there with that idea because of the link text), but Fry was more concerned with the 'I before E' part.
That and 'ceiling!'
posted by MtDewd at 7:42 AM on July 24, 2017


On the one hand, I absolutely believe this so-called rule ought to be consigned to oblivion.

On the other hand this...
"You’re forgetting the rest of the rhyme, ‘and when said A as in neighbour and weigh’ - people really seemed to get annoyed when I told them that this is completely irrelevant to whether or not the ‘except after c’ rule works; if you have a ‘cei’/’cie’ word, it doesn’t matter how it’s pronounced, ‘cei’ is what the rule suggests is correct, and ‘cie’ incorrect."
... is just the rhetorical equivalent of sticking your fingers in your ears and saying "la la la I can't heeeeeear you." Picking half of a guideline and evaluating it while ignoring the other half is pretty pointless.
posted by Shmuel510 at 7:49 AM on July 24, 2017 [1 favorite]


From the article: "As far as I’m aware there is no regular expression for pronunciations (yet), so I’ll have to settle for interrogating the short form of the rule."

A quick Google finds the CMU Pronouncing Dictionary. It would still be some work to match up the pronunciations with the corresponding bits of the spelling.
posted by madcaptenor at 7:59 AM on July 24, 2017
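As a very rough first pass at the "sounding like A" clause, one could flag "ei" words whose CMUdict pronunciation contains the EY phoneme anywhere. This sketch assumes the classic plain-text layout (word, whitespace, ARPAbet phonemes with stress digits, `;;;` comment lines, `WORD(1)` for alternates). The sample lines are typed from memory in that layout and should be checked against the real file. It does not do the harder part mentioned above, which is aligning phonemes with the specific letters.

```python
import re

def parse_cmudict(lines):
    """Parse CMUdict-style lines into {word: [list of phoneme lists]}."""
    pron = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blanks and comment lines
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word).lower()  # WORD(1) -> word
        pron.setdefault(word, []).append(phones)
    return pron

def sounds_like_a(phones):
    """True if any phoneme is EY (the vowel in 'weigh'), ignoring stress digits."""
    return any(p.rstrip("012") == "EY" for p in phones)

def ei_words_with_long_a(pron):
    """'ei' words with at least one pronunciation containing EY."""
    return sorted(w for w, variants in pron.items()
                  if "ei" in w and any(sounds_like_a(v) for v in variants))

# Illustrative lines only; verify against the actual dictionary file.
sample_lines = [
    ";;; sample entries",
    "WEIGH  W EY1",
    "NEIGHBOR  N EY1 B ER0",
    "CEILING  S IY1 L IH0 NG",
    "WEIRD  W IH1 R D",
]
pron = parse_cmudict(sample_lines)
print(ei_words_with_long_a(pron))
```

Because the check is word-level, a word with "ei" in one syllable and an EY sound in another would be flagged incorrectly, so treat the output as candidates to review.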


Previously
posted by Wolfdog at 8:31 AM on July 24, 2017


I before E, except half the time
posted by FatherDagon at 8:53 AM on July 24, 2017 [4 favorites]


I before E, modulo transpositions.
posted by Wolfdog at 8:59 AM on July 24, 2017


The best part about this rule is that when you point out an exception to high school English teachers, they just pile on exceptions and act like they've always been there. "Or if it rhymes with weigh, or on the first Thursday after a full moon, or when using received pronunciation, but not North of the Mason-Dixon..."
posted by ckape at 9:03 AM on July 24, 2017 [2 favorites]


A rule-of-thumb is not an actual rule.
posted by Jode at 9:30 AM on July 24, 2017 [1 favorite]


animation
posted by Wolfdog at 9:31 AM on July 24, 2017


Science!
posted by gurple at 9:39 AM on July 24, 2017


I adore thee - except after tea;
Ye abhor I - except after pie!
I therefore flee, and thereafter die:
Ye restore I, with laughter and glee.
posted by the quidnunc kid at 10:26 AM on July 24, 2017 [6 favorites]


I before E except after C
We live in a weird society...
posted by Greg_Ace at 10:56 AM on July 24, 2017 [5 favorites]


Only knowing the first half of the i before e rhyme screwed over my third grade self, who was doing very well in the spelling bee until he was given "beige."
posted by ejs at 11:11 AM on July 24, 2017


We live in a weird society...

That's neither here nor there!

My pre-third-cup-of-coffee brain is having a hard time thinking of cie words though
posted by TwoWordReview at 11:12 AM on July 24, 2017


And now after googling list of words containing cie, I see that the vast majority of them are 'cie' by way of suffix such as 'chanciest' or 'brilliancies', so I wonder if the analysis would be different if applied only to root words?
posted by TwoWordReview at 11:14 AM on July 24, 2017 [1 favorite]


H, J, N, R, S, and W

- height (note: followed by g)
- ?????
- neighbour (note: followed by g and sounding like a)
- reign (g/a)
- seinfeld
- weight (g/a)

J? I literally can't think of any 'jei' words. And I've been trying for two full seconds! Help me, Wolfram Alpha!

And... nothing.
posted by Sys Rq at 11:19 AM on July 24, 2017


I wonder if the analysis would be different if applied only to root words?

Absolutely. See also abovementioned "caffeine": caffe+ine.
posted by Sys Rq at 11:24 AM on July 24, 2017 [1 favorite]


J? I literally can't think of any 'jei' words. And I've been trying for two full seconds! Help me, Wolfram Alpha!

Here's the word list that was used. "yajein" and "yajeine" are the "jei" words - this may be some sort of chemical. For "jie" I get two words: "geitjie" (Afrikaans for gecko, but appears in some English-language texts) and "uintjie" (some sort of plant and showed up in the National Spelling Bee once.)

These are, um, not real words.
posted by madcaptenor at 11:38 AM on July 24, 2017


i before e, except after Labor Day.
posted by Kabanos at 1:42 PM on July 24, 2017


Don't you mean Labuor Day?
posted by madcaptenor at 1:52 PM on July 24, 2017 [1 favorite]


Leighbour
posted by Kabanos at 2:01 PM on July 24, 2017 [3 favorites]


And now after googling list of words containing cie, I see that the vast majority of them are 'cie' by way of suffix such as 'chanciest' or 'brilliancies', so I wonder if the analysis would be different if applied only to root words?

If the rule were "I before E, when pronounced as long 'e,' except after C (unless in the endings of superlatives and plurals whose root word ended in CY)" I think that would work most of the time. Still basically a pointless rule. If it has a red line under it, you guessed wrong.
posted by Pater Aletheias at 5:11 PM on July 24, 2017
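A quick way to test the suffix idea is to drop any "cie" word that is just a -cy word with an ending attached, and see what remains. This is a heuristic sketch with a hypothetical mini-lexicon: it only treats a word as derived if the reconstructed -cy root is itself in the list, and the set of endings checked is an assumption.

```python
# Endings that turn a final 'y' into 'ie...' (assumed list, not exhaustive).
SUFFIXES = ("ies", "ier", "iest", "ied")

def is_derived_from_cy(word, lexicon):
    """True if `word` looks like a -cy root plus one of SUFFIXES."""
    for suffix in SUFFIXES:
        if word.endswith("c" + suffix):
            root = word[:-len(suffix)] + "y"  # agencies -> agency
            if root in lexicon:
                return True
    return False

def root_cie_words(lexicon):
    """'cie' words that are not explained by a -cy root in the lexicon."""
    return sorted(w for w in lexicon
                  if "cie" in w and not is_derived_from_cy(w, lexicon))

# Hypothetical mini-lexicon for illustration.
lexicon = {"agency", "agencies", "spicy", "spiciest",
           "science", "species", "ancient"}
print(root_cie_words(lexicon))
```

Words like "species" survive the filter because no "specy" root exists in the list, which is the behavior the proposed exception implies.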


I before E, except in German when the opposite seems to be the rule.
posted by SemiSalt at 5:44 PM on July 24, 2017


German uses both permutations, and it's really easy to remember which one for which word: "ei" is pronounced like the letter "I" (when reciting the alphabet in English); "ie" is pronounced like the letter "E".
posted by one for the books at 6:41 PM on July 24, 2017 [1 favorite]


Note that German-speakers pronounce "e" like English-speakers pronounce "i", and vice versa. I had a German-speaking differential equations professor once whose English was very good, but he always got this backwards, and those letters come up a lot...
posted by madcaptenor at 7:25 PM on July 24, 2017


The effort here is so misguided because it misinterprets a rule of thumb for diphthong spelling as an actual spelling rule that would apply wherever the combination of letters occurs. This sentence in the article is the key to the folly: "As such, while ‘weightiest’ can appear in both the ie_words and ei_words list, the word ‘zeitgeist’ only appears once in ei_words." If weightiest counts as an ie_word, then so do comparatives like fleecier and juicier, superlatives like raciest and spiciest, plurals like prophecies and agencies, etc. Clearly, this will add a ton of noise to their word list and make the whole exercise pointless. The second part of this sentence highlights the other major methodological error: foreign loanwords.

In conclusion — no sir, I don't like it.
posted by stopgap at 7:38 PM on July 24, 2017 [1 favorite]


If this rule is looking slightly dubious, we should probably add more rules, just to be safe. I propose: "H before E, except after R or like eh as in 'feh' or 'meh'."

Why yes, I did run the numbers:
sfenders@paiva:~$ grep 'he' /usr/share/dict/words | wc -l
4048
sfenders@paiva:~$ grep 'eh' /usr/share/dict/words | wc -l
326
sfenders@paiva:~$ grep 'reh' /usr/share/dict/words | wc -l
115
sfenders@paiva:~$ wc -l < /usr/share/dict/words
99171

posted by sfenders at 8:14 PM on July 24, 2017


I before E except after B, E, H, W, and often also J, C, and O, I say, after scanning through some old e-mail for frequencies.
posted by enf at 2:42 PM on July 25, 2017


"Even Einstein got it wrong in his own name -- twice!" -- Gallagher (I think).
posted by TwoToneRow at 6:59 PM on July 25, 2017




This thread has been archived and is closed to new comments