Brain, Word, and Man: Machine
December 14, 2016 1:36 PM

The story of how Google Brain improved Google Translate using neural networks. In nine months, the new technique was able to improve translations more than a team of human engineers had managed in ten years. This article describes how this was possible, and what it means for the future of artificial intelligence. Featuring a crash course in the history of A.I.; supercilious cats; and a cameo by John Searle. (SLNYT).
posted by Diablevert (21 comments total) 46 users marked this as a favorite
Microsoft’s new plan is to flood your entire life with artificial intelligence.

If you want a picture of the future, imagine Clippy stamping on a human face — forever.
posted by 1970s Antihero at 2:44 PM on December 14, 2016 [11 favorites]

Could this mark the start of the Singularity? John Connor, where are you?
posted by oluckyman at 3:32 PM on December 14, 2016

one user spent 9 hours and 53 minutes in one day actively talking with Zo, the company’s latest AI-powered chatbot

Did they check that it wasn't just another chatbot?
posted by effbot at 3:33 PM on December 14, 2016 [4 favorites]

Repeated Japanese produces... interesting results.

posted by vibratory manner of working at 4:05 PM on December 14, 2016 [2 favorites]

What a great article! Thanks for posting this!
posted by Greg Nog at 5:14 PM on December 14, 2016 [4 favorites]

If you want a picture of the future, imagine Clippy stamping on a human face — forever.

I suspect it won't be anything like that, just Clippy presenting you the options you would have preferred yesterday, reinforcing the person you used to be at the expense of the person you might want to become. Autocomplete that's learned all your old verbal tics and misspellings. A nav system that drives you past the bars and liquor stores you used to frequent before you were struggling to stay sober.
posted by mhoye at 5:54 PM on December 14, 2016 [13 favorites]

What a great article! Thanks for posting this!
posted by montag2k at 5:59 PM on December 14, 2016

What a gr at arti le! Thanks f r OTTOMAN GOAT HYBRID
posted by Greg Nog at 6:31 PM on December 14, 2016 [14 favorites]

Possibly I should have made clear at the outset that this was def a long read. But I thought Lewis-Kraus had the best attempt at explaining how neural nets work for a general audience that I've come across, and I've seen a helluva lot of writers try.

It's interesting to me in lots of ways, but one of them is the dumbbell problem, and especially how it applies to unlabelled data. It sort of relates to two of the Rumsfeld conundrums, known unknowns and unknown unknowns. With the dumbbell pictures, you ask the AI to show you dumbbells, and it's turning up vague pink blobs, you can see it's fucking up somehow, even if you don't know how, and that at least gives you something to work with. You know the data you gave it, you can go back and correct, or try. What about the biases inherent in the world? Inherent in large organic data sets like YouTube videos or Siri commands? Those are unknown unknowns, and in a lot of cases it's not going to be nearly so easy to catch a mistake. Say you had an A.I. that suggested songs in a genre, but always ignored songs with female singers or with Tennessee-based musicians or those where the bass hit a certain string of notes. How would you notice that? Who could ever pick up on the fact that it was giving you pink-blob-type answers to your dumbbell-search-type questions? For jumped-up monkeys, human pattern recognition skills are pretty darn good, but consistently recognizing the absence of a pattern where one should be may be beyond us. It takes a Holmes to deduce a solution from a dog that didn't bark.

Yet: This shit is so damn useful. We might be getting pretty damn close to the Babelfish. It's going to be everywhere, and no one will know how it works, exactly, not even its masters and creators. Feels like going back to trials by ordeal. Want a mortgage? How long shall we put you in the slammer for this crime? Enter your plea and wait for an inscrutable sign from the magic box...
posted by Diablevert at 7:55 PM on December 14, 2016 [4 favorites]

just Clippy presenting you the options you would have preferred yesterday

This, so much. The tools are getting cheap and easy to use, but the basic law of GIGO holds. A quite large percentage of big data AI is pretty basic statistics (most of the other slice is linear algebra), and we know how wretchedly stats can be used. And please don't get worried about the singularity yet: language translation is utterly amazing, and it's stunning that it's come this far, but it remains a bounded problem with some very clear constraints. The moment the littlest bit of "common sense" is needed you have Diablevert's pink dumbbell problem and the solution fails to converge.
posted by sammyo at 8:04 PM on December 14, 2016 [2 favorites]

The moment the littlest bit of "common sense" is needed you have Diablevert's pink dumbbell problem and the solution fails to converge

That's Noam Chomsky's take, but the article's is quite different. The pink dumbbell problem isn't that the tech isn't producing useful solutions. It's producing useful solutions whose flaws we can't identify. As a whole, the image search AI that had the dumbbell thing produced great results, it just had this weird error due to the data it had been trained on, and that error was spotted and corrected, because it was obvious to humans. It's entirely possible for a deep learning app to produce great, better-than-human results 99% of the time, with a few errors. But we may not be able to spot the errors.
posted by Diablevert at 8:33 PM on December 14, 2016 [3 favorites]

The huge event this year was adversarial training for generative nets. The idea is that you train two neural networks, one as a generator, and another (the discriminator) trying to tell 'real' things apart from what the generator is producing. The generator learns to try to beat the discriminator, and the discriminator tries to outsmart the generator, and in time, the results are either totally realistic, or at least realistic in almost eerie ways. The unreality seems more a problem of the initial training data than the technique, though.
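
For anyone who wants to see that two-player loop concretely, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and comes from no particular paper or library: the "real" data is just numbers drawn around 4.0, the generator is a one-line function (a * z + b), the discriminator is a single logistic unit, and the names sigmoid and train_toy_gan are made up. It is a toy that shows the alternating updates and is not tuned to converge.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=200, batch=64, lr=0.01, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator (hypothetical): fake = a * z + b
    w, c = 0.0, 0.0  # discriminator (hypothetical): D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        reals = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fakes = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in reals:
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fakes:
            d = sigmoid(w * x + c)
            gw -= d * x
            gc -= d
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient ascent on log D(fake), i.e. the
        # generator shifts its output toward whatever D calls "real".
        ga = gb = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            ga += (1.0 - d) * w * z
            gb += (1.0 - d) * w
        a += lr * ga / batch
        b += lr * gb / batch
    return {"a": a, "b": b, "w": w, "c": c}


if __name__ == "__main__":
    print(train_toy_gan())
```

The point of the sketch is only the shape of the loop: the discriminator gets one update on real versus fake samples, then the generator gets one update using the discriminator's current opinion. In this toy the generator's offset b starts at 0 and gets pushed toward the real data; real image models replace both one-liners with deep networks.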

The idea is both fundamentally new and somehow unsurprising. We learn so much through mimicry, and battle our imposter syndromes, and fake it till we make it.

The piece that's missing, imho, is just experience. The first book on AI I worked through, in 2002 or so, discussed the problem of perception. Robots just don't have terribly rich sensors, picking up a tiny fraction of what we do. There's a big difference between curated training sets and unmediated experience, just as there's a big difference between a video and a collection of a million images.

There are so many open problems. The latency issue they talk about in the article is a great example of one. But the progress is coming so quickly now, it's really kinda breathtaking.
posted by kaibutsu at 12:13 AM on December 15, 2016 [4 favorites]

Fantastic article, both for its contents and as a piece of journalism (I'm enjoying the author's earlier article on travel photography as a result). As a computer science undergrad in the late 1980s, I took a course on AI, which involved building our own expert systems; it seemed obvious what a challenge that was always going to be, compared with some of the promising machine-learning alternatives. I now see that this was a brief window when neural networks were taken seriously, before their proponents were cast into the wilderness for the next decade and a half. That seems crazy to me, as someone who left the field. In the 1990s I was reading neuroscience theories about how minds emerge in an evolutionary way. Surely these theories and AI research would cross-fertilize each other, leading to new insights in both domains? But it seems that for a long time they didn't. Maybe they will now.
posted by rory at 2:16 AM on December 15, 2016 [1 favorite]

I guess the dream of teaching symbol-manipulating systems all the facts about the world and waiting for AI to emerge is dead. But it's a shame. There's a certain grandeur to it. Heavy things are hard to move, cats don't like the bathtub, "collar" can mean the thing on your shirt or an arrest, and so on. And so on.
posted by thelonius at 3:59 AM on December 15, 2016 [1 favorite]

We're using the Google Website translator on one of our websites and the results are still underwhelming, even for English->French translation, which (according to the article) was "so good that the improvement won't be obvious". That is really a bold lie. Google Translate is a fantastic tool and I'm glad that it exists, but it returns a lot of garbage (I just received a mail saying that the Hausa translation is "meaningless").

The bizarre thing is that when I cut and paste the raw text of a page in Google Translate, the English->French translation is actually better than it used to be, but pasting the URL in Google Translate or using the Translator widget returns a different and much worse translation. Using the Translator Toolkit also returns the bad translation. It looks like Google is rolling out Neural Machine Translation only for Google Translate but not for its other translation tools. I can't find anything that explains these differences.
posted by elgilito at 4:15 AM on December 15, 2016

And still so bad at Dutch - English.

At least it's free.
posted by SpacemanRed at 4:23 AM on December 15, 2016

elgilito: We're using the Google Website translator on one of our websites and the results are still underwhelming, even for English->French translation

Given that the Rosetta Stone the article mentions for English<>French is Canadian parliamentary records, I'm expecting that your main problem is too many occurrences of "fuddle duddle".
posted by clawsoon at 7:23 AM on December 15, 2016 [1 favorite]

I'm expecting that your main problem is too many occurrences of "fuddle duddle".

I wish, but these are straightforward scientific/technical texts. The main problem is a lack of consistency, particularly when a noun or short noun phrase is translated differently throughout the text. One would think that the algorithm would pick a candidate translation and stick to it, but it doesn't, so a text like "[STUFF] blah blah but [STUFF] blah blah and then [STUFF] blah blah" is rendered as "[GOOD TRANSLATION] blah blah but [NONSENSICAL TRANSLATION] blah blah and then [INACCURATE TRANSLATION] blah blah". It also uses French articles at random (I guess they sound cool, like "Le Big Mac" in Pulp Fiction). The GNMT really improves grammar and syntax but it still suffers from such inconsistencies.
posted by elgilito at 8:40 AM on December 15, 2016

The allusion to Baidu's AI efforts made me wonder whether or not the authorities in China (or other authoritarian regimes) have started thinking about training AI systems to police dissent, far more efficiently than armies of human informants and censors can. Could one train a vast neural network to detect ideas which can, however obliquely, be incompatible with Socialism-with-Chinese-characteristics, or Iranian Islamism, or Russian National Bolshevism, or British values, or the values of a conservative Christian school?
posted by acb at 8:46 AM on December 15, 2016 [4 favorites]

This is a fantastic article, and I am so grateful you posted it - I probably would never have seen it otherwise.

It was so beautifully written, from the overall framing to the memorable portraits of all the individual players to the delightful anecdotes (the Dean facts were my favorite). And it linked to all the seminal papers! So wonderful. I don't think I'd read anything by the author, Gideon Lewis-Kraus, before, but I immediately wanted to find more.

(Sadly, I, too, am getting unimpressive results from Google Translate at the moment, Japanese to English, but perhaps it's not really rolled out all the time everywhere?)

This was such a thoughtful, thought-provoking, fascinating piece. Thank you so much for sharing it with us.
posted by kristi at 10:01 AM on December 15, 2016 [1 favorite]

these are straightforward scientific/technical texts

Technical things just fall apart in translation software. I imagine it is because of jargon: words that have specific meanings only within that topic. I hope this problem is tackled soon. There are a lot of good projects that never cross language barriers.
posted by bhnyc at 10:12 AM on December 15, 2016 [1 favorite]
