

The Weirdest Language in the World
July 2, 2013 6:09 PM

Idibon, a company that specializes in language processing, decided to rank the world's languages to see which had the most unusual features. The winner was Chalcatongo Mixtec, a language spoken by 6000 people in Mexico. The most normal language? Hindi.

The remainder of the top 10 weirdest:
2. Nenets
3. Choctaw
4. Diegueño (Mesa Grande)
5. Oromo (Harar)
6. Kutenai
7. Iraqw
8. Kongo
9. Armenian (Eastern)
10. German

English comes in at #33, which ranks it as pretty weird. (via)
posted by Tsuga (95 comments total) 48 users marked this as a favorite

 
Strange that Mandarin is considered "weird". Beyond the Chinese characters and the pronunciation, the grammar is pretty simple, far simpler than English.
posted by KokuRyu at 6:21 PM on July 2, 2013 [2 favorites]


I'm not a linguist, but I'm surprised !Kung wasn't on the list.
posted by double block and bleed at 6:24 PM on July 2, 2013


Oaxaca represent
posted by hobo gitano de queretaro at 6:28 PM on July 2, 2013 [3 favorites]


Where's 'Gibberish'?
posted by mazola at 6:30 PM on July 2, 2013 [2 favorites]


All right, now explain why the criteria used to make this list make it meaningful and interesting.
posted by Nomyte at 6:31 PM on July 2, 2013 [5 favorites]


Looking at the full list spreadsheet, !Kung isn't there, but its language family Ju|'hoan is. It's 99th out of 1693 for weirdness. The actual report doesn't mention it because it's limited to "languages that have a value filled in for at least two-thirds of features (239 languages)"
posted by aubilenon at 6:32 PM on July 2, 2013


Some of the surprising rankings could be due to individual features being less important - Mandarin has pretty standard grammar, but maybe it scores "weird" on several vowel features which add up. !Kung has some really weird sounds but may score normal on all grammar portions, for example.

Are they taking writing into account at all? It doesn't look like it.
posted by 23 at 6:32 PM on July 2, 2013 [1 favorite]


You know what really makes sense about Hindi? The writing system. With (notoriously orthographically challenged) English as my native language, I was really just flat out astonished to discover just how precise and logical devanagari script is. I mean, aside from the fact that it describes a full range of consonant, vowel, and semivowel sounds with extreme clarity, the traditional order of the alphasyllabary has a logical phonetic basis. We should all be using it. Come on, Westerners: get with the superior writing program. We replaced Roman numbers with Indian numbers already.
posted by BlueJae at 6:33 PM on July 2, 2013 [11 favorites]


Ah, aubilenon did me one better. Also looks like writing is not a factor (hadn't spotted the raw data before).
posted by 23 at 6:34 PM on July 2, 2013


My mistake, Ju|'hoan does have enough information (20 features filled in out of a possible 21). But it's like 13th weirdest among those with enough information, is why it doesn't get a shout-out.
posted by aubilenon at 6:35 PM on July 2, 2013


One of the features that distinguishes languages is how they ask yes/no questions [...] The word order switching that we do in English only happens in 1.4% of the languages.

Wow, that's a lot fewer than I would have thought. I wonder how annoying it is for people learning English (or one of the other twelve) to get used to that.
posted by Peter J. Prufrock at 6:39 PM on July 2, 2013


With (notoriously orthographically challenged) English as my native language, I was really just flat out astonished to discover just how precise and logical devanagari script is.

So, when we write a consonant, we assume it just has a vowel built in. Let's call it a. But what if it's a different vowel that follows the consonant? Hmm, I guess we could have a thing that goes after the consonant it follows, or maybe below it sometimes. Oh, unless it's i, that one should go before the consonant it comes after. Seems clear enough. Oh, and let's have a letter for "no consonant."
posted by Nomyte at 6:39 PM on July 2, 2013 [1 favorite]


Are they taking writing into account at all? It doesn't look like it

Pretty large swaths of linguistics ignore writing systems completely. And I think that's probably a good idea for this project - loads of these languages don't even have a writing system at all.
posted by aubilenon at 6:40 PM on July 2, 2013


Yeah, the Latin alphabet hasn't aged well. In English and several Romance languages the random intermingling of historical and phonetic spellings creates most of the difficulties of Chinese characters without the advantages.
posted by 23 at 6:40 PM on July 2, 2013 [1 favorite]


Come on, Westerners: get with the superior writing program. We replaced Roman numbers with Indian numbers already.

Yeah, English is pretty messed up in this way. I've spent a bit of time in Malaysia in the past few years, and love the way they basically sat down and worked out the most logical, phonetic way to spell all the words using the Latin alphabet. Look at the names for currency - ringgits and sens. Not "cents", sens. Because "c" is used for words with a hard "k" sound, "s" is reserved for, well, an "s" sound. Like at the start of the word "cents".

I agree Mandarin has wonderfully logical and sensible grammar - but the lack of a phonetic link between the written and spoken language ruined it for me.
posted by Jimbob at 6:41 PM on July 2, 2013


Oh, and let's have a letter for "no consonant."

And also, when we have an r, let's write it somewhere near the consonant it's going to be said after. Or before. Hell, if we have any two or more consonants in a row, let's basically make up a new letter. How many different combinations can we come up with here?!
posted by jacalata at 6:43 PM on July 2, 2013


I wonder how annoying it is for people learning English (or one of the other twelve) to get used to that.

It's really bizarre. Makes one think of middle-school arithmetic and "factoring out" do from verbs, so that goes = (do + 3RD) · go, and then you "pull out" the (do + 3RD) to the place in the sentence that follows the interrogative pronoun. Gah.
posted by Nomyte at 6:43 PM on July 2, 2013


Piraha down at #38 will get some folks riled up.
posted by escabeche at 6:44 PM on July 2, 2013


Why is German in there? English is basically German with some words spelled and/or pronounced differently.

Deutsch ist gar nicht so kompliziert!
posted by Hairy Lobster at 6:53 PM on July 2, 2013 [2 favorites]


(Okay guys, I'll admit the short i thing in written Hindi is a little weird. But it makes my name look cool when I write it.)
posted by BlueJae at 6:54 PM on July 2, 2013


You know what really makes sense about Hindi? The writing system.

Yeah, I like Hangul (Korean) too. Very well-thought-out system.

Japanese would be great if they stuck to hiragana/katakana, it's super easy to pronounce and read. The heavy use of kanji, however, catapults it to the "super annoying" end of the writing system difficulty scale along with Chinese. Memorizing thousands of characters is just crazy, especially when you already have a phonetic system (sure it helps with homonyms, but in English we just deal with it ...)
posted by wildcrdj at 6:55 PM on July 2, 2013 [5 favorites]


English is basically German with some words spelled and/or pronounced differently

No, English is German after getting caught with French in that machine from the Jeff Goldblum version of The Fly.

That's why it's so weird.
posted by escabeche at 7:08 PM on July 2, 2013 [17 favorites]


Japanese would be great if they stuck to hiragana/katakana, it's super easy to pronounce and read.

It's so easy to read that they decided to have *two* phonetic alphabets with the same sounds, but different letters, just so the language wouldn't get polluted by writing loan words using the wrong alphabet. yup. definitely a sensible way to go about things.
posted by ennui.bz at 7:08 PM on July 2, 2013 [5 favorites]


@ BlueJae: the ancient Sanskrit grammarians were mad prescriptivist scientists. I think a lot of the precision is due to them (I took Sanskrit and Hindi in college. Can't remember a lick, but I do remember being impressed in the same way)
posted by jpe at 7:12 PM on July 2, 2013 [1 favorite]


Why is German in there? English is basically German with some words spelled and/or pronounced differently.

Deutsch ist gar nicht so kompliziert!


I could never keep all the pronoun/preposition/LET'S GENDER EVERYTHING stuff in my mind well enough to progress. Which is sad because I really like the language when I can stop worrying about that stuff and get a nice flow going. Also, who doesn't love the crazy word compounding?
posted by jason_steakums at 7:17 PM on July 2, 2013 [1 favorite]


This is some cool analysis they did, although I'm not sure if taking an average across a bunch of ranks really tells you anything, especially when so many of these languages are missing feature data. They restricted the list to those languages which have at least 2/3 of their selected 21 features, but the filled-in features don't necessarily overlap between languages, so for all we know Hindi could have really weird "57A: Position of Pronominal Possessive Affixes" but since the data is missing it wins as "least weird."

I downloaded their sheet and restricted it to languages which have all 21 features filled in, to make it an even playing field, and Abkhaz, Spanish, and Chinese show up as the weirdest and Turkish and Hungarian the least weird. Go figure.

I haven't carefully read their description of the analysis, but I think the problem is that a language like Turkish or Hungarian could have a weird feature that is so uncommon that almost no other languages have it, and as such this feature is not considered as part of their 21 features. For example, the Turkish "miş" suffix which is used (among other things) to describe or infer something which the speaker does not have direct personal knowledge of, e.g. "He should have arrived at noon [but I am relaying information and cannot personally verify that he did, indeed, arrive]." I've never encountered anything like it in any other language and for me that would put Turkish among the weirder languages of the world, but whatever.

The other thing is that their ranking of weirdness is going to depend heavily on their language universe; then depending on what's defined as a language vs. a dialect, certain features could be overweight or underweight. For example (and this is just finger in the air speculation), suppose some researchers of Amazon rainforest languages document 100 related languages which all happen to share one feature which English does not have. Then English appears weird relative to these languages, because the 100 languages get a score of 99% for that feature and English gets 1%. Obviously this is an extreme example, but simply by virtue of being consolidated and well-established, a language like English is probably going to do somewhat poorly in this type of ranking vs. the thousands of other languages/dialects which have not undergone this consolidation.

I wonder what the results would look like if you weighted them by number of speakers? Probably, you would end up with a list that looks very similar to the list of languages by speaker population, but due to shared features it wouldn't be exactly the same.
posted by pravit at 7:24 PM on July 2, 2013 [7 favorites]
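
The missing-data worry in the comment above can be sketched in a few lines of Python. This is only a toy illustration, not Idibon's actual formula: the language names (A to D), the feature names (f1 to f3), the values, and the scoring rule (average of "1 minus the share of languages with the same value" over whatever features are filled in) are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical toy data: feature values per language; None marks a missing value.
# These are made-up placeholders, not WALS data.
LANGUAGES = {
    "A": {"f1": "x", "f2": "p", "f3": "m"},
    "B": {"f1": "x", "f2": "p", "f3": None},
    "C": {"f1": "x", "f2": "q", "f3": "m"},
    "D": {"f1": "y", "f2": "p", "f3": "n"},
}
FEATURES = ["f1", "f2", "f3"]


def value_shares(languages, features):
    """For each feature, the share of languages (with data) that have each value."""
    shares = {}
    for f in features:
        values = [lang[f] for lang in languages.values() if lang.get(f) is not None]
        shares[f] = {v: values.count(v) / len(values) for v in set(values)}
    return shares


def weirdness(languages, features, min_coverage=2 / 3):
    """Average (1 - share) over each language's filled-in features.

    Languages with less than `min_coverage` of the features filled in are skipped.
    """
    shares = value_shares(languages, features)
    scores = {}
    for name, lang in languages.items():
        filled = [f for f in features if lang.get(f) is not None]
        if len(filled) / len(features) < min_coverage:
            continue
        scores[name] = mean([1 - shares[f][lang[f]] for f in filled])
    return scores


two_thirds = weirdness(LANGUAGES, FEATURES)                 # partial data allowed
complete = weirdness(LANGUAGES, FEATURES, min_coverage=1.0)  # all features required
```

With this toy data, B comes out as the least weird language under the two-thirds threshold purely because its third feature is missing; once complete data is required, B drops out and A takes its place.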


English is basically German with some words spelled and/or pronounced differently.

Well, except we use the same helping verbs for all verbs, rather than sometimes "is" and sometimes "has". Can't put main verbs at the end. No gendered nouns or declined impersonal articles. So fewer giant arbitrary lists (mass nouns, though...).

Of course back when English was starting out verbs still had a number for "two people" so there's been time for growth.
posted by 23 at 7:24 PM on July 2, 2013


Why is German in there?
German gets a high weird score in their analysis due to the feature "143A: Order of Negative Morpheme and Verb" (among others). In other words, because in German the negative ("nicht") generally goes after the verb, not before, like in most other languages of the world, e.g. "Das geht nicht" rather than "Das nicht geht."
posted by pravit at 7:29 PM on July 2, 2013


For example, the Turkish "miş" suffix which is used (among other things) to describe or infer something which the speaker does not have direct personal knowledge of, e.g. "He should have arrived at noon [but I am relaying information and cannot personally verify that he did, indeed, arrive]."

That is such a cool feature! Sort of a built-in anecdotal evidence flag?
posted by jason_steakums at 7:32 PM on July 2, 2013 [3 favorites]


I'd like to nominate Korean for Best Language for Swearing. Although one particular word (YT, NSFWinKorea) seems to get overused.
posted by shortfuse at 7:36 PM on July 2, 2013


Strange that Mandarin is considered "weird".

I'm only a noob at Mandarin, but it seems to me that although the grammar is simple*, the language is incredibly idiomatic. Almost any time I try to string together a few words into a basic sentence, it's grammatically correct but still the wrong way to put something.

* no grammatical gender, no conjugation, no declension, no cases, word order is often flexible
posted by qxntpqbbbqxl at 7:40 PM on July 2, 2013 [1 favorite]


For example, the Turkish "miş" suffix which is used (among other things) to describe or infer something which the speaker does not have direct personal knowledge of, e.g. "He should have arrived at noon [but I am relaying information and cannot personally verify that he did, indeed, arrive]."

German has something like this, although it's conveyed in the verb conjugation. You can use the subjunctive mood (I think that's what it's called?) to convey hearsay. "Er ist angekommen" means "he has arrived," while "er sei angekommen" is basically "he allegedly has arrived [but I'm just conveying information and cannot personally verify this]." It's used often in news reports to show that a person being quoted has claimed something, but the reporter doesn't know for sure that it's true.
posted by mandanza at 7:46 PM on July 2, 2013 [1 favorite]


For example, the Turkish "miş" suffix which is used (among other things) to describe or infer something which the speaker does not have direct personal knowledge of, e.g. "He should have arrived at noon [but I am relaying information and cannot personally verify that he did, indeed, arrive]."

I've heard of this type of affix -- not the specific meaning, but the general idea that the affix tells you something about how you know what you are saying -- in other language families.
posted by jeather at 8:03 PM on July 2, 2013


That is such a cool feature! Sort of a built-in anecdotal evidence flag?
Yup, built-in CYA flag. Though its grammatical use goes beyond that, that's really the one "mood" that I remember being unique to Turkish.

Strange that Mandarin is considered "weird".
Mandarin gets its weird score due to: "6A: Uvular Consonants" (German scores very weird on this too), "101A: Expression of Pronominal Subjects" (pronouns are often left out), and "111A: Nonperiphrastic Causative Constructions" (from my layman's understanding, this refers to Chinese using a "helping" verb like 给 or 弄 in sentences like "You made the carpet dirty" or "You made her cry". Apparently Chinese is unique in having to use a helping verb vs. other languages which modify the verb (e.g. Japanese has a verb suffix which is used to denote causative action). But I'm a tad confused on how English is really any different in this respect; if anyone wants to take a crack at it the WALS article is here).
posted by pravit at 8:23 PM on July 2, 2013


Ok, now do programming languages.
posted by radwolf76 at 8:25 PM on July 2, 2013 [1 favorite]


programming languages? I'd have to go with brainfuck or piet for "weirdest". Since C is the ur-sprach of like 75-90% of commonly used languages today, I'd have to say it is the "least weird"...

---------
I think the world would be a lot wiser if we in English had "miş", and it was like mandatory on talkshows and personal discussions amongst the general populace about politics, science, religion and... well pretty much everything.
posted by symbioid at 8:30 PM on July 2, 2013 [1 favorite]


VBA would probably count among the weirdest of the widely-used languages. Optional indexing of arrays from 0 or 1 (and all the joy that ensues when you use someone else's code module and they index their arrays differently from you), different syntax to make a call to a function with no return values as opposed to one which returns a value...
posted by pravit at 8:35 PM on July 2, 2013


I think the world would be a lot wiser if we in English had "miş", and it was like mandatory on talkshows and personal discussions amongst the general populace about politics, science, religion and... well pretty much everything.

I was thinking that too, but I worry it would still be easy for people to include the "miş" but still go on like it wasn't there, like how "allegedly" has lost all meaning other than as a technicality in a lot of the media.

Still, it would be an excellent language tool for the sciences.
posted by jason_steakums at 8:37 PM on July 2, 2013 [1 favorite]


Weirdest programming language is definitely Aheui because it's 2D and uses Hangul.
posted by 23 at 8:44 PM on July 2, 2013 [2 favorites]


I would be remiss if I didn't mention (previously) re miş:
For instance, some languages, like Matses in Peru, oblige their speakers, like the finickiest of lawyers, to specify exactly how they came to know about the facts they are reporting. You cannot simply say, as in English, "An animal passed here." You have to specify, using a different verbal form, whether this was directly experienced (you saw the animal passing), inferred (you saw footprints), conjectured (animals generally pass there that time of day), hearsay or such. If a statement is reported with the incorrect "evidentiality," it is considered a lie. So if, for instance, you ask a Matses man how many wives he has, unless he can actually see his wives at that very moment, he would have to answer in the past tense and would say something like "There were two last time I checked." After all, given that the wives are not present, he cannot be absolutely certain that one of them hasn't died or run off with another man since he last saw them, even if this was only five minutes ago. So he cannot report it as a certain fact in the present tense. Does the need to think constantly about epistemology in such a careful and sophisticated manner inform the speakers' outlook on life or their sense of truth and causation?
posted by Jpfed at 8:45 PM on July 2, 2013 [25 favorites]


Hindi's fluidity in expressing such a variety of sounds, particularly vowels, helped me figure out how to pronounce Suomi place names with all the pretty little dots and double oos and triple iiis
posted by infini at 9:10 PM on July 2, 2013


Japanese would be great if they stuck to hiragana/katakana, it's super easy to pronounce and read.

No, no, a thousand times no! Chinese characters ("kanji") make Japanese super simple to parse... My wife reads books in about 1/4 of the time it takes me to finish an English book of similar length.

After the first 1000, learning kanji is not so hard, since the radicals and other elements indicate, generally speaking, how the "on'yomi" or "Chinese pronunciation" will sound, and as for the "kun'yomi", or native Japanese pronunciation, these are words a foreign learner picks up along the way anyway.

I find it very difficult to read children's books in Japanese, because they leave out the kanji and instead just use hiragana. It's difficult to latch onto the words.

Kanji is also ancient and poetic. Learning to write the damn things becomes a transformative, meditative exercise, similar to learning the "kata" of a Japanese martial art.

I sweated blood to learn kanji, and I am very glad that I made the effort.
posted by KokuRyu at 9:18 PM on July 2, 2013 [2 favorites]


Memorizing thousands of characters is just crazy, especially when you already have a phonetic system (sure it helps with homonyms, but in English we just deal with it ...)
Why would it be any harder than memorizing thousands of words with semi-illogical spellings, such as in English? Chinese characters are composed of a common set of sub-shapes, it's not like each character is a completely unique image.
Weirdest programming language is definitely Aheui because it's 2D and uses Hangul.
Stranger than Piet, a 2D programming language that uses blocks of color?
posted by delmoi at 9:19 PM on July 2, 2013


I think people are missing the point here slightly, especially when talking about Mandarin. What makes a language weird (in the article's terms) is not that it's easy for an English speaker to understand the grammar but that it has features which are not found in the majority of the world's languages. So, just because English doesn't have noun classes or grammatical gender or a large number of verb inflections or whatever doesn't make it somehow normal. Many languages have those.

As the Language Log article about this site mentioned, the problem with this analysis is that we don't really know how to weight these different features in relation to each other.


For example, the Turkish "miş" suffix which is used (among other things) to describe or infer something which the speaker does not have direct personal knowledge of, e.g. "He should have arrived at noon [but I am relaying information and cannot personally verify that he did, indeed, arrive]."

I've heard of this type of affix -- not the specific meaning, but the general idea that the affix tells you something about how you know what you are saying -- in other language families.


This is called evidentiality. While it must be noted that not all languages have this as an obligatory part of the grammar, all languages have methods of expressing the same idea. In English you can use particles such as "like" or the favourite of the news media "allegedly".

What mandatory marking of evidentiality doesn't do is stop people lying about the source of information or using direct experience marking when they can only infer knowledge of the event.

In fact this thread's favourite evidential, "-miş", has been noted by psycholinguist Dan Slobin to be just like this. One of the extensions of meaning is that of surprise, even feigned surprise, something the speaker is unprepared for. Later even though the source of knowledge of the event hasn't changed, they may use "-di" the direct experience marker because the event is less marked.
posted by elephantday at 9:23 PM on July 2, 2013 [6 favorites]


I think Klingon is pretty bizarre, personally...
posted by These Birds of a Feather at 9:40 PM on July 2, 2013


My sister has lived in Armenia for multiple years now and is always talking bad about herself because her Armenian is still so basic. I sent her this article to make her feel better. It's not you, sis, it's Armenian.
posted by town of cats at 9:48 PM on July 2, 2013


Strine represent
posted by Kerasia at 9:51 PM on July 2, 2013 [2 favorites]


Still, Vogon poetry is the worst.
posted by not_on_display at 10:30 PM on July 2, 2013 [3 favorites]


I never before this moment considered the possibility that Vogon poetry being the worst poetry in existence was a function of the language it was being composed in.
posted by vibratory manner of working at 11:14 PM on July 2, 2013


I sweated blood to learn kanji, and I am very glad that I made the effort.

Stockholm syndrome?
posted by jacalata at 11:29 PM on July 2, 2013 [2 favorites]


Jimbob: "I agree Mandarin has wonderfully logical and sensible grammar - but the lack of a phonetic link between the written and spoken language ruined it for me."

There is, of course, a phonetic link between the majority of Chinese characters and their pronunciations. It just isn't all that obvious. But after a while studying Mandarin, you can start to guess that, say, 磅 is pronounced something like bang or pang. (It's actually bàng.) Even less accurate than English orthography, but still helpful.

(And this is where I link to Mark Rosenfelder's "If English was written like Chinese".)
posted by jiawen at 11:33 PM on July 2, 2013 [2 favorites]


I could never keep all the pronoun/preposition/LET'S GENDER EVERYTHING stuff in my mind well enough to progress.

Yeah, gender can be a PITA if you grew up speaking a more or less genderless language like English. I used to raise chuckles from my Parisian landlady who thought my French was fine except that I clearly was flipping mental coins for half my nouns. And if you speak more than two languages with gender, you have double or treble the fun... it is only through pure rote that I can keep in my head that the English face is in French the masculine le visage, in Spanish the feminine la cara and in German the neuter das Gesicht.
posted by ricochet biscuit at 11:38 PM on July 2, 2013 [1 favorite]


You can always use the masculine el rostro in Spanish to say face.
posted by Doroteo Arango II at 12:08 AM on July 3, 2013 [1 favorite]


> "Why is German in there? English is basically German with some words spelled and/or pronounced differently. Deutsch ist gar nicht so kompliziert!"

... No.

Separable verb prefixes.

Four cases, three genders, and singular vs. plural means that there are essentially sixteen different ways to say the word "the" (among others), depending on context. Except it only has six forms, distributed randomly among those sixteen variants.

Compounding resulting in words like Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz.

I will not defend the many ludicrous aspects of English, but German is its own special brand of insane.
posted by kyrademon at 12:59 AM on July 3, 2013 [1 favorite]
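
The "sixteen slots, six forms" count in the comment above is easy to check. A minimal sketch in Python, using the standard German definite-article paradigm (four cases by masculine, feminine, neuter, and plural); the variable names are just illustrative.

```python
# Standard German definite articles, indexed by case and by gender/number.
ARTICLES = {
    "nominative": {"masculine": "der", "feminine": "die", "neuter": "das", "plural": "die"},
    "accusative": {"masculine": "den", "feminine": "die", "neuter": "das", "plural": "die"},
    "dative":     {"masculine": "dem", "feminine": "der", "neuter": "dem", "plural": "den"},
    "genitive":   {"masculine": "des", "feminine": "der", "neuter": "des", "plural": "der"},
}

# Every (case, gender/number) combination is one "slot" for the word "the".
slots = [(case, column) for case, row in ARTICLES.items() for column in row]

# The distinct spellings that fill those slots.
forms = sorted({form for row in ARTICLES.values() for form in row.values()})

print(len(slots), "slots,", len(forms), "forms:", forms)
# prints: 16 slots, 6 forms: ['das', 'dem', 'den', 'der', 'des', 'die']
```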


kyrademon: "Compounding resulting in words like Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz."
I simply don't see the problem in compound words (but then my native language does that as well). It seems so much more effective to just bunch it all together with a few splice s'es rather than trying to figure out the correct word order, prepositions and conjugation you would need to write the same thing in English.

Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz = Law for the delegation of monitoring beef labeling. Interestingly, the word order is the exact opposite in German and English.
posted by brokkr at 1:12 AM on July 3, 2013


> "... (but then my native language does that as well)."

Well, see, there's the thing.

I think people who grew up with certain linguistic conventions are often poor judges of how easy/difficult/useful/useless they are. If you grow up with it, it simply seems normal.

This applies to English as well as everything else. I've had people tell me that it is absolutely essential for English verbs to have multiple forms "to prevent confusion", as if somehow we would stop being able to hear whether nouns are singular or plural if that changed.

Not being a native speaker of German, having that massive compound word without any word start/stop signifiers seems unnecessarily confusing. (You could simply, as one example, use dashes, which would eliminate the word order/preposition/conjugation problems you bring up.) What are the internal words? Tierungsüberwach? Ungsüberwachung? While these things may be instantly obvious to someone used to the convention, it seems difficult to parse for those unused to it.

Admittedly, of the three linguistic hurdles I mentioned, this is the one I consider the most minor.
posted by kyrademon at 1:44 AM on July 3, 2013 [1 favorite]
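
On the "what are the internal words?" question above: if you already know the constituent words, finding the boundaries is mechanical. A rough sketch in Python, assuming a tiny hand-picked lexicon (the six entries below are chosen for this one word, and "Rindfleisch" is treated as a single unit) and assuming the only linking element is an optional "s" between parts. Real German compounding has more linking elements than that, so this is a toy, not a general splitter.

```python
# Hypothetical mini-lexicon, chosen by hand for this one example.
LEXICON = {
    "rindfleisch", "etikettierung", "überwachung",
    "aufgaben", "übertragung", "gesetz",
}
WORD = "Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz"


def split_compound(word, lexicon, linkers=("", "s")):
    """Return one split of `word` into lexicon entries, or None if there is none.

    After each part, an optional linking element (here only "s") may be skipped.
    """
    word = word.lower()
    if not word:
        return []
    # Try longer prefixes first, backtracking if the remainder cannot be split.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head not in lexicon:
            continue
        for linker in linkers:
            rest = word[end:]
            if not rest.startswith(linker):
                continue
            rest = rest[len(linker):]
            if linker and not rest:
                continue  # a linking element cannot be the end of the word
            tail = split_compound(rest, lexicon, linkers)
            if tail is not None:
                return [head] + tail
    return None


parts = split_compound(WORD, LEXICON)
# parts == ['rindfleisch', 'etikettierung', 'überwachung',
#           'aufgaben', 'übertragung', 'gesetz']
```

Fragments like "Tierungsüberwach" never come up as candidates, because no lexicon entry starts or ends there; without the lexicon, of course, the splitter is as lost as a non-speaker.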


Strange that Mandarin is considered "weird". Beyond the Chinese characters and the pronunciation, the grammar is pretty simple, far simpler than English.
posted by KokuRyu


I would hazard to guess that tone plays a big part in what non-native speakers would consider an extremely difficult aspect of Mandarin. Also, be careful… orthography <> language. This article, like most linguistics writing, seems to be rating languages based on sound, not on writing systems. Linguists (usually) don't care about orthography. They care about what comes out of your mouth (with an exception for sign languages), not what's written on paper.
posted by readyfreddy at 1:58 AM on July 3, 2013


Not being a native speaker of German, having that massive compound word without any word start/stop signifiers seems unnecessarily confusing. What are the internal words? Tierungsüberwach? Ungsüberwachung? While these things may be instantly obvious to someone used to the convention, it seems difficult to parse for those unused to it. - kyrademon


Luckily we don't have anything like that in English. (pneumonoultramicroscopicsilicovolcanoconiosis, Antidisestablishmentarianism, Supercalifragilisticexpialidocious, etc…). Word boundaries are always going to be a problem for non-speakers. That's what you know how to do as a native speaker, no?
posted by readyfreddy at 2:04 AM on July 3, 2013 [1 favorite]


I don't think what you're saying really contradicts my point.
posted by kyrademon at 2:16 AM on July 3, 2013


The r/d sound in the devanagari alphabet is native enough within specific languages that other Indians from different language groups are unable to pronounce it right. But then again, any country that has to put 22 different scripts on the national currency note is nothing but a hastily cobbled babel.
posted by infini at 2:23 AM on July 3, 2013


I could never keep all the pronoun/preposition/LET'S GENDER EVERYTHING stuff in my mind well enough to progress

Aren't gendered nouns basically the norm for Indo-European languages, with mostly ungendered languages like English being exceptions?
posted by acb at 4:43 AM on July 3, 2013


> "Aren't gendered nouns basically the norm for Indo-European languages, with mostly ungendered languages like English being exceptions?"

Well, that's still a minority of languages, but as far as I can tell the short answer is ... mostly yes.

Early Indo-European languages like Hittite have three genders. Some kept all three (like German, Greek, and most Slavic languages), some dropped down to two (like most Romance, Celtic, and Hindustani languages, and the odd case of Swedish), and some dropped it entirely (like English, Bengali, Persian, Armenian, Oriya, Assamese, and Afrikaans).

On the other hand, I've yet to hear an argument in favor of gendering nouns that makes any sense whatsoever.
posted by kyrademon at 5:07 AM on July 3, 2013 [2 favorites]


Oh, and Polish, for whatever reason, decided to add a few genders, why not.
posted by kyrademon at 5:08 AM on July 3, 2013 [1 favorite]


While suomeksi you don't even differentiate between her and him ;p
posted by infini at 5:23 AM on July 3, 2013


Not being a native speaker of German, having that massive compound word without any word start/stop signifiers seems unnecessarily confusing. (You could simply, as one example, use dashes, which would eliminate the word order/preposition/conjugation problems you bring up.) What are the internal words? Tierungsüberwach? Ungsüberwachung? While these things may be instantly obvious to someone used to the convention, it seems difficult to parse for those unused to it.

I don't know what to say other than 'isn't the brain wonderful'? This is seriously something that has never been an issue for me in German. Reading an exceptionally long compound word (which mostly show up in the names of laws, in my experience) fluidly does kind of require knowing the constituent words. But even without all the words it works. Take this example. I didn't really know what 'Etikettierung' was. But I know where the word that started after 'Rindfleisch' ends and, moreover, I know that if I hit, say, an 'ung' (as I did) it's likely just about the end of the word.

The German spelling reform in 1996 introduces some triple consonants (schiffahrt -> schifffahrt), at least partly in the name of making compound words easier to parse (and in the name of consistency) as that was the scenario where the word boundaries were disguised.

On the other hand, I've yet to hear an argument in favor of gendering nouns that makes any sense whatsoever.

I don't think people assert languages (other than conlangs, I guess) are meant to be either efficient (in whatever sense) or logical. Sticking with German as an example, I can't imagine the verb placement is either efficient or logical. But it still works.
posted by hoyland at 5:24 AM on July 3, 2013


I am flabbergasted that Pirahã didn't top the list. But I guess that many of the things that make it weird (very small number of color & number words, among other things) weren't the things they were looking at.
posted by Gordafarin at 6:08 AM on July 3, 2013


Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz

Turns out that one was too much even for the Germans: Germany drops its longest word.
posted by crazy_yeti at 6:25 AM on July 3, 2013


From what I remember from studying linguistics ten years ago (I was taught by Dan Everett but wasn't aware until later about his full experiences with Pirahã, only that he spent six months a year in the Amazon) a lot of 'tribal' languages have small numbers of colour and number words. 'One, two, many' isn't uncommon.

Basque not being weird really surprises me, given that nobody's actually sure where it came from.

I'm surprised Finnish isn't higher too, given that everyone I've confessed my inexplicable yearning to learn it to keeps telling me how terrifying and uninstinctive the grammar is.
posted by mippy at 6:39 AM on July 3, 2013


It's a bit lost in the rest of the blog post, but the World Atlas of Language Structures they're drawing data from is interesting in its own right. It even has a chapter (and map) on number of genders in languages.
posted by Panjandrum at 7:11 AM on July 3, 2013 [1 favorite]


> "It even has a chapter (and map) on number of genders in languages."

Interestingly, English actually counts as having three genders in their count, since it has three third-person pronouns that are gender differentiated (he, she, it). Which means the majority of world languages don't bother to do even that.
posted by kyrademon at 7:32 AM on July 3, 2013 [1 favorite]


telling me how terrifying and uninstinctive the grammar is.

Actually from what little I've been able to make out, not having studied formally, it seems as though some of the grammatical structure reminds me of Hindi - maybe only because it felt like an easier language from which to reach Suomi than to try it with clumsier third party English in between.
posted by infini at 7:41 AM on July 3, 2013


Having three agreements for number -- one, two, many -- isn't that weird. (We have two, obviously, one and many.) I always liked the expanded options for "we", me-and-you-and-maybe-others vs me-and-others-but-not-you.
posted by jeather at 7:47 AM on July 3, 2013


mippy: "I'm surprised Finnish isn't higher too, given that everyone I've confessed my inexplicable yearning to learn it to keeps telling me how terrifying and uninstinctive the grammar is."

Tell it to Quenya.
posted by Chrysostom at 8:21 AM on July 3, 2013


Any language that requires gendered articles is automatically weird as shit IMO.
posted by grubi at 9:20 AM on July 3, 2013


I'm a math person. I loved learning Hindi. Learning Mandarin made me want to kill myself. Now I know why.
posted by seemoreglass at 9:30 AM on July 3, 2013


Wait, what? Hindi is the most logical language, with its gendered nouns and weird object-gender-dictates-verb behavior? "The (male) boy did the (female) bread (female) eat" but "the (female) girl did the (male) fruit (male) eat" - that's the most logical?

I have occasional stress dreams about Hindi exams where I don't know what gender any noun is, and I'm tossing a coin in my head as I go. You're on much firmer ground with closely related languages like Bengali where nouns are mostly neuter gender and verbs don't take gendered forms... So I feel like I'm missing something here.
posted by RedOrGreen at 9:47 AM on July 3, 2013


I wonder if the non-native speaker version of learning Hindi is weird and odd due to the underlying roots of English attempting to make sense of it? How does it compare to learning it in the Indian context?
posted by infini at 9:58 AM on July 3, 2013


Wait, what? Hindi is the most logical language, with its gendered nouns and weird object-gender-dictates-verb behavior? "The (male) boy did the (female) bread (female) eat" but "the (female) girl did the (male) fruit (male) eat" - that's the most logical?

The statement wasn't that the most normal language was the most logical. 'Normal' was defined as 'sharing characteristics with a large number of languages'. So, for example, if you have a word order not used by other languages, you're going to be less 'normal', even if it could be decided your word order was objectively more logical than more common word orders.
posted by hoyland at 9:58 AM on July 3, 2013 [3 favorites]


I enjoyed Schnoebelen's comment on sentence structure regarding Hawaiian and the 8.7% of languages that start sentences with a verb: "I learned that verbs are a big commitment for me. I’m just not ready for verbs when I open my mouth."

Oddly enough the fact that Korean falls into the 41.0% of languages that are Subject Object Verb order helped me immensely. I still get stuck on verb conjugation (how polite to be, what level of certainty am I trying to express, yadda ya) and with SOV order I can just begin the verb and then trail off into inaudibility, hoping the listener will fill in the correct verb ending.
posted by spamandkimchi at 10:39 AM on July 3, 2013


I sweated blood to learn kanji, and I am very glad that I made the effort.

Heh, well I'm there now and maybe I'll see it on the other end. But when I talk to my Japanese friends about it, it often sounds as much like Stockholm Syndrome as your statement does to me :)

So far the only super-obvious benefit is that it's usually shorter, but since I don't really read anything on paper these days the size-benefit of that is pretty minor.

Kanji is also ancient and poetic. Learning to write the damn things becomes a transformative, meditative exercise

That's nice, but in the meantime it means my ability to speak is far beyond my ability to read, which is not my experience in other languages and more than a little frustrating.

It's so easy to read that they decided to have *two* phonetic alphabets with the same sounds

Fair, but it's still leagues easier to pronounce than English. It's more-or-less straight phonetic symbols, so it's not like English where you have to learn tons of exceptions to the supposedly-standard ways to pronounce things. ("why is that letter silent here but not here?", and so on).
posted by wildcrdj at 12:18 PM on July 3, 2013


23: Yeah, the Latin alphabet hasn't aged well. In English and several Romance languages the random intermingling of historical and phonetic spellings creates most of the difficulties of Chinese characters without the advantages.
... until the invention of the typewriter made a low-alphabet-count important.
posted by IAmBroom at 2:33 PM on July 3, 2013


hoyland: I don't think people assert languages (other than conlangs, I guess) are meant to be either efficient (in whatever sense) or logical. Sticking with German as an example, I can't imagine the verb placement is either efficient or logical. But it still works.
Interestingly, at least one study suggests that the rate of information flow may be fairly uniform in human languages (across the seven studied, at least); essentially, the simplicity of words in one language may be compensated by the need for fewer, but more complex, words in another (where complexity is measured in syllables).
posted by IAmBroom at 2:34 PM on July 3, 2013


Oh, and Polish, for whatever reason, decided to add a few genders why not.

Wait, what?

IIRC, Polish only has the standard Indo-European masculine/feminine/neuter trifecta.
posted by acb at 3:46 PM on July 3, 2013


Also, is the two non-sexual gender system in Swedish related to the gender system in Dutch (which, IIRC, also has two genders, not tied to sexes)? If so, how does this fit in with Icelandic having masculine/feminine/neuter (which suggests that Old Norse had the same)?
posted by acb at 4:05 PM on July 3, 2013


is the two non-sexual gender system in Swedish

Apparently (I just looked this up now) Swedish used to have masculine, feminine and neuter, but masculine and feminine merged into "common".

In Dutch (also something I just now looked up), whether masculine and feminine are distinguished seemingly varies regionally.

Old Norse totally had masculine, feminine, and neuter genders.
posted by aubilenon at 4:37 PM on July 3, 2013


A few years back, I read a book on gender in Germanic languages, but I couldn't tell you much about it (though apparently I'm about to tell you what I remember, which could be totally wrong). I'm also not a linguist. IIRC Proto-Indo-European had an animate/inanimate distinction, and then (in Germanic anyway) a feminine emerged and things moved out of one of the genders (I forget which) into the feminine. There was also some geographic pattern to how this shakes out in modern languages--as you head east from Icelandic to Yiddish, there's a pattern of nouns migrating from feminine to neuter or vice versa (I really can't remember). Somewhere along the line, Danish and Swedish (and not so much Norwegian, I guess) collapsed back down to two.

As a random side note, when I was taking Danish, the textbook did not talk about 'common' and 'neuter' gender, but rather 'N-words' and 'T-words'. I think that was partly an intentional shift in how they wanted to talk about the gender of nouns and partly that the book was written to teach Danish in Denmark and thus had to be independent of the native language of the students.

IIRC, Polish only has the standard Indo-European masculine/feminine/neuter trifecta.

I think Polish cares whether a noun is human/animate/inanimate at least some of the time. But that's all I can tell you.
posted by hoyland at 5:33 PM on July 3, 2013 [1 favorite]


... until the invention of the typewriter made a low-alphabet-count important.

I wonder if the move towards touchscreens will have any effect on that? On phones / tablets / etc you can have different styles of input for different languages (the Android method for inputting Japanese is quite different than the standard romaji-based method on a fixed keyboard).

I mean, it will be a while before fixed keyboards go away if ever, but it's a potential future and I know some people who do the vast majority of their typing on touchscreens already (for fast typists like myself it's no good though, as I can't even come close to my speed on a physical keyboard).
posted by wildcrdj at 5:47 PM on July 3, 2013


... until the invention of the typewriter made a low-alphabet-count important.

Which is completely irrelevant. The Japanese kana systems fit easily on a standard keyboard and are strictly phonetic (with three or so exceptions for particles). There are Hangul typewriters.

The problem is not "oh-ho-ho silly phonetic languages", the problem is that if you use one system for concepts, grammar, and sounds something's going to drift (in practice, usually sound). This is why there are mountains of words in English that have spellings you must ignore in order to pronounce them properly, for mysterious historical reasons. And every time someone suggests spelling reform they get laughed out of the room, even though it worked just fine for Japanese and other languages.
posted by 23 at 6:42 PM on July 3, 2013


I wonder if the move towards touchscreens will have any effect on that? On phones / tablets / etc you can have different styles of input for different languages (the Android method for inputting Japanese is quite different than the standard romaji-based method on a fixed keyboard).

I've sometimes wondered if in the future English will be input in strictly phonetic form, with software turning it into the classical spellings. In some sense, though, that's what's already happening with spell checkers.

You could even bundle translation into this - if phonetic input is translated into conceptual expression, why couldn't the conceptual expression be a common ground? This veers into science fiction, though.

the Android method for inputting Japanese is quite different than the standard romaji-based method on a fixed keyboard

Are you referring to the one-consonant-per-key input? That's just based on feature phones, and Android supports several input methods - my phone came with romaji-based wapuro input as the default for Japanese, I think. And don't forget that most keyboards in Japan still have kana on the keys if you want to type them directly, though I've never met someone who mentioned using them.

posted by 23 at 6:49 PM on July 3, 2013


Given that Nokia has even launched Amharic, their decision making for keyboard/UI design for languages would be interesting to learn. Especially for all the feature phones that need it most.
posted by infini at 10:09 PM on July 3, 2013


> "I think Polish cares whether a noun is human/animate/inanimate at least some of the time."

In Polish, masculine nouns are also divided into animate and inanimate in the singular and into personal and non-personal in the plural.
posted by kyrademon at 2:29 AM on July 4, 2013


hoyland: "As a random side note, when I was taking Danish, the textbook did not talk about 'common' and 'neuter' gender, but rather 'N-words' and 'T-words'. I think that was partly an intentional shift in how they wanted to talk about the gender of nouns and partly that the book was written to teach Danish in Denmark and thus had to be independent of the native language of the students."
Many Danes don't know they're called common and neuter gender either. I didn't hear that terminology until I was about 20, I think, and then entirely by chance. I also have no idea, being a native speaker, how to figure out the gender of a noun other than "that sounds right".

That said, Danish/Swedish/Norwegian grammar seems fairly uncomplicated to me. The hard part about Danish has to be the pronunciation. I've yet to meet a non-native speaker that can pronounce my last name correctly.
posted by brokkr at 4:01 AM on July 4, 2013


It's so easy to read that they decided to have *two* phonetic alphabets with the same sounds, but different letters, just so the language wouldn't get polluted by writing loan words using the wrong alphabet. yup. definitely a sensible way to go about things.
Well, Japanese doesn't have spaces between words, so kanji provides a nice contrast to the hiragana around it, improving readability. Loan words don't have kanji, but katakana serves the same purpose. Kind of like how italics are sometimes used for foreign words or onomatopoeia in English. Katakana is also used for onomatopoeia and writing out native Japanese names for pronunciation, not just loan words, so the "pollution" argument seems specious.

And besides, the Roman alphabet is basically two phonetic alphabets with the same sounds, too: uppercase and lowercase letters. In the same way that uppercase and lowercase letters are often visually similar, so too are the equivalent hiragana and katakana characters.

So I don't really see the problem. Seems pretty sensible to me.
posted by floomp at 7:11 PM on July 6, 2013


23: ... until the invention of the typewriter made a low-alphabet-count important.

Which is completely irrelevant. The Japanese kana systems fit easily on a standard keyboard and are strictly phonetic (with three or so exceptions for particles). There are Hangul typewriters.
Not irrelevant to the Chinese system you and I were actually commenting on, but thanks for changing the subject suddenly.
23: The problem is not "oh-ho-ho silly phonetic languages", the problem is that if you use one system for concepts, grammar, and sounds something's going to drift (in practice, usually sound). This is why there are mountains of words in English that have spellings you must ignore in order to pronounce them properly, for mysterious historical reasons. And every time someone suggests spelling reform they get laughed out of the room, even though it worked just fine for Japanese and other languages.
You seem to speak only Japanese and English, from the arguments of this paragraph. French sound drifts have not created an a-phonetic alphabet; "eau" always indicates the same sound, as does a final "-ette" - even though they are no longer letter-by-letter phonetically represented. The same can be said of most Romance languages.

English doesn't have this problem because "languages drift"; it has this problem because it is the amalgam of two very different language groups. After 1066, all attempts to regularize spelling conventions for phonetic sounds became futile, and words like "dinghy" (of Hindi/Urdu origin, pronounced "ding-ee", not "din-jee" nor "din-fee") were brought in without much regard to regularization in spelling.
posted by IAmBroom at 10:55 AM on July 7, 2013


You can always use the masculine el rostro in Spanish to say face.

Finding more genders for a thing that I think of as genderless is not helping. If only all the gendered-language speakers could get together and decide that, for example, tables are feminine and chairs masculine or whatever, my life would be easier. Get on that, would you folks?
posted by ricochet biscuit at 11:12 AM on July 13, 2013 [1 favorite]



This thread has been archived and is closed to new comments