We study techniques for identifying an anonymous author via linguistic stylometry, i.e., comparing the writing style against a corpus of texts of known authorship. We experimentally demonstrate the effectiveness of our techniques with as many as 100,000 candidate authors. [...] In over 20% of cases, our classifiers can correctly identify an anonymous author given a corpus of texts from 100,000 authors; in about 35% of cases the correct author is one of the top 20 guesses.

On the Feasibility of Internet-Scale Author Identification [pdf] is a draft of a paper for the IEEE Symposium on Security and Privacy.
posted by BrotherCaine at 5:40 AM on February 22, 2012