the language boom
December 7, 2003 6:48 AM

You're only posting this to upset languagehat aren't you?
His opinion of this research can be seen here.
posted by thatwhichfalls at 6:58 AM on December 7, 2003

Istanbul not Constantinople...
posted by tcaleb at 7:18 AM on December 7, 2003

Gobble gobble gobble, shouldn't this have been posted last week?
posted by PigAlien at 7:23 AM on December 7, 2003

If the Migs-signal is a martini glass, what's Languagehat's hat-signal? A stetson with a schwa on it?
posted by DaShiv at 7:38 AM on December 7, 2003

Interesting. I've often wondered why such geographically distant places as Turkey, Finland, Korea, and Japan have very similar grammatical structures, and are all members (including Basque and Hungarian) of the Ural-Altaic language group. More information here.

One can only guess how the English language and Chinese developed similar grammar.
posted by hama7 at 7:48 AM on December 7, 2003

The out-of-Anatolia (in company with the first wave of farmers) idea was the thesis of Colin Renfrew's book Archaeology and Language which so put the cat among the historical-linguist pigeons not too long ago. Whatever the truth of the matter, Archaeology and Language is one of the clearest and most consecutively-argued works of social science I have read, entirely free of the usual fog of soc-sci bafflespeak. Indeed, I can't think of any other soc-sci opus that can be described as clear and consecutive.

Test: If you can work your way through, say, Bertrand Russell's Introduction to Mathematical Philosophy and find it challenging but, with effort, perfectly clear, and yet cannot understand book X of social science or see how the book's claims are connected to their supporting empirical evidence (if any), you may be confident that the problem is with the book and not with you.
posted by jfuller at 7:57 AM on December 7, 2003

> Istanbul not Constantinople...

"Constantinople," said Father Chantry-Pigg, who did not accept the Turkish conquest.
"Byzantium," said I, not accepting the Roman one.

posted by jfuller at 8:03 AM on December 7, 2003

Hama7, the only features that all those languages have in common are the limitations imposed by government and binding. Or were you trolling?
posted by Mayor Curley at 8:04 AM on December 7, 2003

the only features that all those languages have in common are the limitations imposed by government and binding.

That, and grammar. I see this is more necessary than ever:

posted by hama7 at 8:23 AM on December 7, 2003

Aaargh! I call down the wrath of Cthulhu upon the fire you left me! As Phluzein says, "if this research is valid, why is it published in Nature and not in a relevant place, such as the Journal of Indo-European Studies?" You wouldn't want a linguist deciding which species were in danger of extinction, would you? Then don't let biologists confuse you about the history of language. The fact is that there are no definite links between language and anything else: not pottery, not population movements, not genes. Once we delve back before historical records, all we have to go on is linguistic reconstruction, and that can take us only so far (which is one reason I got out of historical linguistics). It's understandable that this frustrates people and that they try to use archeology and genetics and the like to provide more information, but it's a mug's game. Don't believe the hype.

hama7: Your first link is pure crackpottery, I'm afraid. The author of your second is a guy I happen to know (he taught me how to make good coffee thirty years ago, for which I will always be grateful) and a fine Uralicist (he wrote the article on "Uralic Languages" in Bernard Comrie's The World's Major Languages); you'll notice he's dubious about the putative relation between Uralic and Altaic ("the correspondences between the two groups of languages are unsystematic; they could be the result of borrowing or chance") and mentions further relationships only as hypotheses that have been suggested. As Andrew Dalby says, "scholars who worked on proving the wider relationship [between the Uralic and Altaic languages] had little success. The actual words of the languages could scarcely ever be related to one another. And it became obvious that language typology... was not in itself a firm guide to the genetic relationships between languages." (On preview: love the beacon!)

jfuller: Archaeology and Language may be "one of the clearest and most consecutively-argued works of social science I have read, entirely free of the usual fog of soc-sci bafflespeak," but it's still way over the edge, making unwarranted assumptions about how archeological digs must be related to the movements of languages. Clear writing is not necessarily correlated with dependable facts.
posted by languagehat at 8:30 AM on December 7, 2003

I'm not seeing any new assholes being torn in the linguists' blogs; I'm just seeing kvetching about an outsider moving in on their territory.

A notable exception is a comment to Languagehat's post, by carlos, who suggests that the data used are whack. However, I think that the issue merits some attention from experts in computational linguistics. Although their focus is primarily synchronic (as we in the more-or-less bankrupt structuralist business used to say), they're probably the only academic circle who understand both traditional linguistics and information retrieval techniques well enough to look at this research with the critical rigor it deserves.

Other than that, most of the criticism seems like academic territorial scent-marking. Now there's some biology that can inform the social sciences.
posted by condour75 at 8:31 AM on December 7, 2003

is there a good book that covers the current standard model (whatever it is) in historical linguistics? i've bought and read a couple of introductory books on linguistics (sorry, not at home, so don't have the titles), but they never had much detail (and were probably old).
posted by andrew cooke at 8:32 AM on December 7, 2003

Other than that, most of the criticism seems like academic territorial scent-marking.

i thought it was odd someone was being snide about bayesian stats. doesn't linguistics use statistics? (see plea for basic intro to said field above!).
posted by andrew cooke at 8:34 AM on December 7, 2003

W00t! Thanks hama7!
posted by DaShiv at 8:46 AM on December 7, 2003

condour75: Ah, so it's just "territorial scent-marking." Then I assume you would have no problem with linguists deciding which species were in danger of extinction? (Hey, all academic fields are equivalent in this brave new world! Let's have sociologists do brain surgery while we're at it!) You don't seem to realize that "experts in computational linguistics" would have no way of deciding which words were historically related and which apparent similarities were the results of chance or borrowing. Historical linguists argue about this stuff all the time; others haven't got a hope of figuring it out.

I repeat, there are no definite links between language and anything else. I realize it would be pretty to think otherwise.
posted by languagehat at 8:47 AM on December 7, 2003

I'm waiting for my comp ling buddy to wake up -- mostly, my layman's impression is that comp ling uses similar techniques, but in a totally different way. For instance, Bayesian filtering for the comparison of several texts (or, eventually, to eliminate spam) is a comp ling joint. It also deals with multidimensional vector spaces, similar to some of the techniques in collaborative filtering. (When Amazon tells you that you're going to like a book, it's because they compared your n-dimensional book-liking vector to others in the system, and extrapolated.) So generally, your comp linguist is using these techniques to analyse texts in the here and now, with nary a concern for the history of language.
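For the curious, the book-liking-vector idea can be sketched roughly like this. This is only a toy illustration, not how any real store does it: the function names, the ratings, and the choice of cosine similarity with a similarity-weighted average are all assumptions made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no overlap to compare on
    return dot / (norm_a * norm_b)


def predict_rating(target, others, book_index):
    """Guess the target reader's rating for one book.

    Hypothetical scheme: compare the target with every other reader on
    all the *other* books, then average the other readers' ratings of
    this book, weighted by how similar each reader is to the target.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    target_rest = [r for i, r in enumerate(target) if i != book_index]
    for other in others:
        other_rest = [r for i, r in enumerate(other) if i != book_index]
        sim = cosine_similarity(target_rest, other_rest)
        weighted_sum += sim * other[book_index]
        total_weight += abs(sim)
    return weighted_sum / total_weight if total_weight else 0.0


# Made-up ratings for three books; the last slot is the book to predict.
me = [5, 5, 0]
everyone_else = [[5, 5, 4], [1, 5, 2]]
guess = predict_rating(me, everyone_else, book_index=2)
```

The reader whose tastes line up with mine on the first two books pulls the guess toward their rating of the third. The extrapolation is nothing deeper than that.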

However, the math is mostly the same, and a comp linguist worth his salt should have the mathematical background to follow the methods used in phylogeny -- they're the same methods. Additionally, the methods should be used on a smaller, less speculative scale, to see whether they jibe with a language group and era for which there's a solid written record. Another poster at languagehat's wrote: "Is it just me, or do these yahoos never turn their methods on language families with well-established dating? I bet turning 'em loose on various varieties of Romance would be hilarious." I'd actually like to see them try just that.
posted by condour75 at 8:56 AM on December 7, 2003

The real news to me was that the Kurgan may have had something to do with the spread of language.

"Nuns. No sense of humor."
posted by Ty Webb at 8:58 AM on December 7, 2003

W00t! Thanks hama7!

Thank you, DaShiv.
posted by hama7 at 9:00 AM on December 7, 2003

there are no definite links between language and anything else

Maybe I misunderstood, but I thought they were using language - they were doing a statistical analysis based on words. Is it just the story that's added on afterwards about farmers that you object to? I got the impression that you thought the whole approach was pointless (ie the analysis, as well as the interpretation).

have no way of deciding which words were historically related and which apparent similarities were the results of chance or borrowing

Well that's why you use statistics, right?
posted by andrew cooke at 9:09 AM on December 7, 2003

Languagehat, biology routinely uses computational linguistic techniques. Markov models, for instance, were developed to analyze language but are now a core part of bioinformatics. So to suggest that either discipline can't learn from the other is not really accurate. Yes, this work should be done by historical linguists. But if they're consigned to the notion that "linguistic methods have gone about as fur as they can go", then maybe they need another trip to Kansas City. And again, this technique is not non-linguistic. At worst, the scientists using it are non-linguistic, but nobody's perfect.
posted by condour75 at 9:18 AM on December 7, 2003

Well that's why you use statistics, right?

Wrong. Statistics have nothing to do with it. Historical linguistics is an art, not a science; the basic decision (are two words related or not?) can be made only on the basis of thorough knowledge of the languages involved, the reliability of the evidence, and the methods of historical/comparative linguistics. It took a long time to recognize that Albanian is an Indo-European language, because the overwhelming majority of the vocabulary is borrowed (a great deal from Turkish) and the sound changes in the few inherited words have been so extensive (eg, gjashtë not only means 'six' but is related to the word six, just like Latin sex, Greek hex, and other more obvious cognates). You're not going to discover such things through mechanical methods.

Now, to get around this problem, some scholars have decided it doesn't matter whether words and languages are historically related; they redefine the problem so that what matters is statistical correlation, using similarity defined in ways that computers can handle. That may be fun, and of course it allows you to use all sorts of models from biology and elsewhere, but it's not linguistics in any sense I care about.

nobody's perfect

Yeah, but that doesn't mean we have to give up all hope of accuracy.
posted by languagehat at 9:50 AM on December 7, 2003

hmmm. does accuracy matter in art?
posted by andrew cooke at 10:07 AM on December 7, 2003

sorry, maybe that was a bit of a cheap shot. thanks for the answer.
posted by andrew cooke at 10:10 AM on December 7, 2003

Maybe I'm just talking out of my ass (I am), but I understand that in biology, they use the fact that gene mutation has a pretty well set rate - X number of genes will mutate (or rather x% of the entire genome) in Y years. My (admittedly weak) understanding of language "evolution", however, is that some languages change much quicker than others - a Spaniard could easily read something produced in the Iberian Peninsula a millennium ago, but I'll be damned if I can read Chaucer without a steady stream of Advil.
posted by notsnot at 10:14 AM on December 7, 2003

> it's still way over the edge, making unwarranted assumptions

But, remarkably for a book of this sort, no hidden assumptions. All of the assumptions Renfrew's model depends upon are explicitly called out and announced as such in the text. Indeed, CR set himself up for attacks on his assumptions by being so forthright; he practically printed them in red, for slower readers. Renfrew's formulation is subject to revision and development on many points--all science is. But the various attacks on his assumptions followed so swiftly on the heels of the book's publication that I expect many of them came from critics who would not themselves have been capable of teasing out and displaying hidden assumptions, had they been hidden. Since everyone knows assumptions are a vulnerability and an obvious point of attack, and everyone also knows there are mobs of junior profs and persons-seeking-tenure just aching for a big name to show any chink in his armor, this took immense self-confidence on CR's part. Which most senior scientists have; but it also took severe self-scrutiny, which is a less widespread gift.

> Clear writing is not necessarily correlated with dependable facts.

Granted. But foggy writing is highly correlated with foggy thinking. I'm presently slogging through L'Origine des Manières de Table in French, to see if it's any more compelling in the original than it was in translation. So far it is not (though it's vastly entertaining--all those gross, drippy myths.) Somebody dig this dude up and tell him the object of French prose is surtout la clarté.
posted by jfuller at 10:15 AM on December 7, 2003

Excellent post. Excellent discussion. Thanks for all the good info.
posted by anathema at 11:17 AM on December 7, 2003

I'm more or less with jfuller on the merits of Renfrew... It's worth checking out this Renfrew lecture, titled "At the Edge of Knowability: Towards a Prehistory of Languages" (2000) [PDF file] which discusses whether linguistic pre-history is knowable and comments on Archaeology and Language, acknowledging some of its weaknesses.
And languagehat: Certainly there are no definite links between language and anything else, the key word here being "definite"... The idea that language is a historic phenomenon totally unrelated (or necessarily unrelatable) to other historic processes occurring in the same time-frame seems somehow extraordinary... Let me quote a passage from Renfrew's lecture linked above:

The second piece of work, by Laurent Excoffier and his colleagues... used classical markers (gammaglobulin polymorphisms) sampled from a wide range of African populations, to establish a classificatory tree (dendrogram) for these African groups.
These were initially classified, at the sampling stage, according to the language family of the language spoken by each group. The dendrogram resulting from the genetic data yielded a grouping which had the effect of classing together populations which happened to be speakers of the same language family. So a classification based on gammaglobulin analyses turned out to produce also a classification which was valid in linguistic terms.

So in this case (unless Renfrew is misinterpreting the data) we do see a link between language and something else... I mean wouldn't a test of this research be how well it reproduces known relationships between languages?
As Renfrew concludes:

The issue of knowability is less likely to be decided on a priori criteria than by the success or failure of endeavour in the above areas.

I guess I don't understand why you think it's in principle impossible to get something interesting from this sort of analysis...
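To make the kind of test I mean concrete, here is a rough sketch of one way to score how well a grouping produced from non-linguistic data reproduces known language families. It uses cluster purity, which is my own choice of measure and not something taken from the Excoffier study or the Nature paper; the function name and the sample labels are invented for illustration.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, family_labels):
    """Fraction of populations that fall in a cluster dominated by their own family.

    cluster_ids:   cluster assigned to each population (e.g. by cutting a dendrogram)
    family_labels: language family each population was classified under
    """
    if not cluster_ids or len(cluster_ids) != len(family_labels):
        raise ValueError("need two non-empty sequences of equal length")
    members = defaultdict(list)
    for cluster, family in zip(cluster_ids, family_labels):
        members[cluster].append(family)
    # For each cluster, count how many members belong to its most common family.
    majority_total = sum(
        Counter(families).most_common(1)[0][1] for families in members.values()
    )
    return majority_total / len(family_labels)


# Invented example: six populations, three clusters, three families.
clusters = [0, 0, 1, 1, 2, 2]
families = ["family_A", "family_A", "family_B", "family_B", "family_C", "family_A"]
score = cluster_purity(clusters, families)  # 5 of 6 populations match their cluster's majority
```

A score near 1 would mean the grouping lines up with the families. Purity is a blunt instrument, though: it climbs to 1 trivially if every population gets its own cluster, so the number of clusters has to be fixed in advance for the comparison to mean anything.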
posted by talos at 4:22 PM on December 7, 2003

I should clarify. Clearly, language is correlated with other things; Frenchmen overwhelmingly speak French, and they also eat cheese, drink wine (though less than they used to), prefer their politicians pompous and literate rather than folksy and anti-intellectual, and doubtless have much in common genetically (compared with, say, Germans). My point is that all of this is usable only in the present, with the facts available for examination; it is useless as a tool to investigate the past, because we do not know what language correlated with before historical records tell us. All attempts to localize or describe "speakers of Proto-Indo-European" based on words for 'beech' or pottery styles or genetic research fail, because we have no idea how PIE got from one place to another (via migration? conquest? trading relationships?) or what the word we reconstruct as 'beech' may have meant before it was carried into a region where the beech tree was common -- or rather, we have too many ideas, and no conclusive way to decide between them. I'm not saying Renfrew's studies are pointless; he comes up with all sorts of suggestive ideas, and some of them may even be correct. But we'll never know (barring the invention of a time machine), and I strongly object to suggestive ideas being presented as newly discovered facts (something journalists, needless to say, are addicted to). Sorry if I came off as more hard-assed than I actually am.

I might also add that Indo-Europeanists are regarded by many other historical linguists as hidebound conservatives because of our insistence on lots of verifiable data and strict adherence to neogrammarian rules; we, on the other hand, try not to be too smug about our rich stores of data going back millennia, and while we feel sorry for Malayo-Polynesianists and the like who have to work from records going back a few centuries at best and often from contemporary dialects that are written down only by linguists, we don't feel their paucity of evidence entitles them to play by looser rules and pretend they're discovering truth. There's only so far back you can look using the lenses of linguistics, and for most language families it's not very far back at all.
posted by languagehat at 8:52 AM on December 8, 2003
