So on the development set of 270 cases, we get 80% correct in 17 seconds (a rate of 16 Hz), and on the final test set we actually do a little better, 87% correct (but at only 14 Hz). Therefore I can declare victory: the program is doing better than 80% and better than 10 corrections/second.

That's really, really bad. Imagine if 20% of all the misspelled words were guessed incorrectly. That would actually suck. It would mean one in five words couldn't be guessed, and that's even worse than Firefox for me (I'd say that it guesses right 90% of the time). Word and Google must be better than 98% for me. And I'm a terrible speller.
import collections
model = collections.defaultdict(lambda: 1)

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. "Google uses Bayesian filtering the way Microsoft uses the if statement," he said. That's true. Google also uses full-text-search-of-the-entire-Internet the way Microsoft uses little tables that list what error IDs correspond to which help text. Look at how Google does spell checking: it's not based on dictionaries; it's based on word usage statistics of the entire Internet, which is why Google knows how to correct my name, misspelled, and Microsoft Word doesn't.
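For anyone wondering what the `collections.defaultdict(lambda: 1)` line quoted above is for: as far as I can tell it is a cheap form of smoothing, so that a word never seen in training still gets a count of 1 instead of 0. Here is a minimal sketch of the idea. The `words` and `train` helpers and the sample sentence are my own illustration, not necessarily what the linked essay does.

```python
import collections
import re

def words(text):
    # Hypothetical tokenizer: lowercase runs of ASCII letters.
    return re.findall(r"[a-z]+", text.lower())

def train(features):
    # Every key starts at 1, so unseen words are "rare" rather than "impossible".
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

# Made-up sample text, just to have something to count.
sample = "the quick brown fox jumps over the lazy dog the end"
model = train(words(sample))

print(model["the"])    # seen 3 times, plus the starting 1
print(model["zebra"])  # never seen, falls back to the default of 1
```

One thing to watch: looking up a missing key on a `defaultdict` inserts it, so use `model.get(word, 1)` or `word in model` if you only want to test membership.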
He had me at "collection of Sherlock Holmes stories."
posted by waldo at 7:01 PM on April 10, 2007