Does digital writing leave fingerprints?
August 2, 2011 11:38 AM

"When legal teams need to prove or disprove the authorship of key texts, they call in the forensic linguists. Scholars in the field have tackled the disputed origins of some prestigious works, from Shakespearean sonnets to the Federalist Papers."
Decoding Your E-Mail Personality: Ben Zimmer, of Language Log, discusses the Facebook case and forensic linguistics in the NY Times.

More on forensic linguistics (and forensic stylistics) over at Language Log. For a much more in-depth discussion of the Facebook case, including a plethora of links, Language Log's post High-stakes forensic linguistics has got you covered.

Other linguistic and forensic stylistic analyses and commentary:
Bellwethers to the state of your relationship
Donald Foster's Wikipedia page, follow-up NY Times piece about the real author of Shakespeare's sonnets, the Ramsey murder, and more issues for Foster and the use of forensic stylistics in the anthrax letters case.
Neural Network Applications in Stylometry: The Federalist Papers (this is a JSTOR link to a first page, abstract and downloadable PDF)
Recording police questioning (A NY Times short opinion)
Using linguistic analysis to determine exactly how many events took place in New York on that morning in September: An excerpt from 'The Stuff of Thought' and the final World Trade Center insurance settlement.
Interview with Carole Chaski and the Keyboard Dilemma (Google Books link)

Other forensic linguistics resources:
The Forensic Linguistics Institute, including a corpus of texts!
Institute for Linguistic Evidence
International Association of Forensic Linguists
International Association for Forensic Phonetics and Acoustics
posted by iamkimiam (13 comments total) 65 users marked this as a favorite
 
OMFG! I just had a linguistic nerdgasm! It feels just as good as it sounds!
posted by Wilder at 11:43 AM on August 2, 2011 [2 favorites]


so sorry, sweetie, I forgot to say "Thank you!"
posted by Wilder at 11:45 AM on August 2, 2011 [1 favorite]


The problem with a lot of this research is that the researchers overtrain for their data set. So you'll see a demo on the Enron corpus, which is a standard in the legal industry, and it will deliver outstanding results. Then you try it on some other body of data and it gets results that are not statistically significant. A similar problem exists with topic mapping, and all the proprietary tweaks to the current Hot Thing, the latent Dirichlet allocation algorithm.

By a fun coincidence, I'm building a forensic analysis network using bioinformatics software right now (or rather, the computer is...there's a great deal of number-crunching involved)...on the Metafilter infodump.
posted by anigbrowl at 12:11 PM on August 2, 2011 [5 favorites]


I'd venture to guess that my particular pattern of abuse of commas is unique enough that it wouldn't be any trouble at all to ID me.

I know I horribly abuse commas, but I'm like an addict, I just can't help myself.
posted by sotonohito at 12:14 PM on August 2, 2011


...on the Metafilter infodump.

The next phase of evolution is beginning.
posted by Sticherbeast at 12:18 PM on August 2, 2011


I know I horribly abuse commas, but I'm like an addict, I just can't help myself.

As long as you don't abuse your colon: that would be bad. Abusing your semi-colon; on the other hand; would be semi-bad.
posted by GenjiandProust at 12:19 PM on August 2, 2011 [2 favorites]


My thoughts while reading the first article: "I wonder what iamkimiam's take will be?".

d'oh!
posted by benito.strauss at 1:03 PM on August 2, 2011


My thoughts while reading the first article: "I wonder what iamkimiam's take will be?".

If it makes you feel any better, I thought the same thing.
posted by Zophi at 1:04 PM on August 2, 2011 [1 favorite]


Interesting. According to this theory I am Isaac Asimov when I write even-numbered chapters but I'm Philip K. Dick when I write odd-numbered chapters.
posted by localroger at 1:49 PM on August 2, 2011


Has anyone seen the three of you in the same room, eh?
posted by Anything at 1:58 PM on August 2, 2011


I would like the science of forensic linguistics to be surefire and accurate, but I have a paralyzing suspicion it is closer to phrenology than it is to physics.
posted by ricochet biscuit at 2:01 PM on August 2, 2011 [2 favorites]


iamkimiam: Neural Network Applications in Stylometry: The Federalist Papers

Busted link, methinks.
posted by Hairy Lobster at 2:07 PM on August 2, 2011


I like to think that each of my Facebook comments and tweets has my own style.
posted by Lovecraft In Brooklyn at 12:18 AM on August 3, 2011



