Deep Network, Deep Flaw
April 14, 2017 4:39 PM

A recent study of neural networks found that for every correctly classified image, one can generate an "adversarial", visually indistinguishable image that will be misclassified. This suggests potential deep flaws in all neural networks (with a hint of a solution).

(Original study link seems to be blocked.)
posted by blue shadows (47 comments total) 56 users marked this as a favorite
 
Original study seems to be up on the arXiv.
posted by aiglet at 4:49 PM on April 14, 2017 [1 favorite]


Further, after examining these findings, it seems clear that they are only shocking given an unrealistic conception of what deep networks actually do, i.e., an unrealistic expectation that modern feed-forward neural networks exhibit human-like cognition.

Of course, this "unrealistic conception" is exactly what the futurists have been selling and funders have been buying....
posted by GenjiandProust at 4:53 PM on April 14, 2017 [27 favorites]


Humans are vulnerable to 'supernormal stimulus' AKA hyperstimulus; we take advantage of this fact every time we gulp down a diet soda. In some way I feel like these images must be supernormal stimuli for the poor little neural networks.
posted by the antecedent of that pronoun at 4:54 PM on April 14, 2017 [3 favorites]


.. I wonder what an insect would look like, after a million years of coevolving to be undetected by the algorithms running in garden-tending robots to detect them ..
posted by the antecedent of that pronoun at 4:56 PM on April 14, 2017 [58 favorites]


Echoes of Minsky and Papert's Perceptrons.
posted by leotrotsky at 4:57 PM on April 14, 2017 [3 favorites]


Thanks for the well-rounded post; the last link in particular was very interesting and seems to make a lot of intuitive sense.
posted by splitpeasoup at 4:58 PM on April 14, 2017 [1 favorite]


I don't totally get why this is a big problem for NN. There are lots of algorithms which work great with high probability in normal situations but which fail terribly under very specific artificial conditions imposed by an adversary. I feel like in the rest of complexity theory we're moving away from getting stressed about worst-case. Why not here? I would think the right question is: how often does it correctly classify image + RANDOM noise? I'd guess almost always?
posted by escabeche at 5:03 PM on April 14, 2017 [6 favorites]
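
(A rough way to poke at escabeche's question, offered as a sketch rather than anything from the linked papers: compare the model's predictions under random noise of a given size against a gradient-based perturbation of the same size. The tiny untrained CNN and random tensors below are stand-ins for a real trained model and test set.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in classifier: a tiny untrained CNN (a real experiment would use a trained model).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 10),
)
model.eval()

images = torch.rand(16, 3, 32, 32)           # stand-in for real test images
labels = model(images).argmax(dim=1)         # treat the current predictions as "correct"
eps = 0.03                                   # perturbation budget (max change per pixel)

# Random perturbation of size eps.
random_noise = eps * torch.sign(torch.randn_like(images))
acc_random = (model(images + random_noise).argmax(dim=1) == labels).float().mean()

# Adversarial perturbation of the same size (fast gradient sign method):
# one gradient step chosen to *increase* the loss for the current label.
images.requires_grad_(True)
F.cross_entropy(model(images), labels).backward()
adv_noise = eps * images.grad.sign()
acc_adv = (model(images.detach() + adv_noise).argmax(dim=1) == labels).float().mean()

print(f"accuracy under random noise:      {acc_random.item():.2f}")
print(f"accuracy under adversarial noise: {acc_adv.item():.2f}")
```

(On real trained networks the usual finding is that random noise of this size barely moves the predictions, while the adversarially chosen version of the same size flips a large fraction of them; that asymmetry is essentially the paper's result.)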


. I wonder what an insect would look like, after a million years of coevolving to be undetected by the algorithms running in garden-tending robots to detect them

If the links are to be believed, they probably won't look much different to us.
posted by leotrotsky at 5:03 PM on April 14, 2017 [8 favorites]


I'm reading a textbook on AI written by cognitive neuroscientists (cf. Chomsky), and in the very first chapter they're explicit in making the case that neural networks are a fundamentally flawed approach, so stuff like this is part of that bigger narrative; it's been an ongoing divide between people in the field. The difference is that they make an information-theoretic counterargument, as opposed to the empirical one in this paper.
posted by polymodus at 5:16 PM on April 14, 2017 [7 favorites]


I don't totally get why this is a big problem for NN.

I'm inclined to agree. This is an interesting finding, but the "neural networks are broken" thing feels more like a problem of people not understanding that machine learning models are not foolproof. My job involves things like trying to predict when people will click on online advertisements. And then it involves explaining that any model I produce is going to be completely wrong the overwhelming majority of the time and we're using it to make money by beating out the naive strategy. But by the time this makes it to a sales guy, the story is that every time someone sees an ad, I know whether they're going to click on it. Obviously, there are applications where you're dealing with events substantially more common than clicks, where you expect your model to be right the vast majority of the time. But you still know it's not right all the time.

Of course, it would also be helpful if people understood "deep learning" wasn't some magical thing that will solve all their problems. People finally have the computing power to make use of some of the clever neural net ideas of the last 15-20 years and there's definitely stuff I want to take out for a spin in the hopes it works where other things have failed, but you better believe if I can get a logistic regression to work well, that's what I'm sticking in the ad server.
posted by hoyland at 5:29 PM on April 14, 2017 [17 favorites]


I don't think this necessarily represents a fundamental or even a deep flaw in neural networks, but I do think it suggests that the current networks are basically just classifying on a vote of local features (i.e., that they have learned how to do bag of words, albeit with better local features than before).
posted by Pyry at 5:35 PM on April 14, 2017 [3 favorites]


I think that the last link (hint of a solution) gets it right. These _are_ unnatural images that are tuned specifically to fool an existing model. But there's no specific reason why we can't train a NN to recognize at least this class of attack.

It does make one think about the parallels with optical illusions though. Human visual systems have a similar set of images that "break" us or make us see something that isn't there. Possibly it is inevitable that there will always be a way to break a recognition system, just because the model is so complex.
posted by sedna17 at 5:37 PM on April 14, 2017 [12 favorites]


As others are pointing out, this is basically an application of a pretty well-acknowledged flaw in NNs, and ML techniques in general: They really do seem almost intelligent, if all you care about is human-like successes. But if you're looking for human-like errors, they can start to be incomprehensible, even random.

The issue as I see it is with how we define "intelligence". One interpretation that's catching on is that these are algorithms which can make decisions independently. But where the results of human decision-making slip from "exactly right" to "pretty good" to "not great", an adversary can convince an ML algorithm to skip straight to "exactly wrong"— and with inputs that a human bystander will find unremarkable.

Maybe that won't be a big deal when we're classifying breeds of dogs, but one could imagine (as one of the links does) that the same techniques could be turned against an algorithm which classifies illegal internet traffic, explosive material in luggage, firearms held in the hand, or some other task for which an expert somewhere right now is showing off a slide with a bunch of graphed data and the title "Better than an untrained human".

I'm not saying that anything bad will happen, necessarily, but if it doesn't it'll be because we admit this is a pretty serious problem.
posted by emmalemma at 5:46 PM on April 14, 2017 [18 favorites]


These _are_ unnatural images that are tuned specifically to fool an existing model.

But theoretically we're training the neural network to recognize the same kinds of images as humans. So saying an image is unnatural when a human wouldn't know the difference seems counterproductive.
posted by dilaudid at 5:57 PM on April 14, 2017 [2 favorites]


Previously
posted by zinon at 6:18 PM on April 14, 2017


The "unnatural" modifications are artifacts of the particular form of network used to do image classification at present. The deep learning networks we've been seeing lately are structurally very rigid.

My ears pricked up when I saw the phrasing, though, because "adversarial" also means something else in the neural network world right now. There's an increasingly common technique where two networks are trained in parallel, one to evaluate a given input, and the other to craft an input that fools the other network...
posted by phooky at 6:24 PM on April 14, 2017 [7 favorites]
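
(For anyone who hasn't met that technique: a minimal sketch of the two-network setup phooky is describing, a generative adversarial network. Everything here, from the tiny MLPs to the 2-D toy "data" and the training constants, is made up for illustration, not anyone's published architecture.)

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, data_dim = 4, 2

# Generator maps random noise to fake "data"; discriminator scores real vs. fake.
generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, data_dim) * 0.5 + 2.0          # stand-in "real" data cluster
    fake = generator(torch.randn(64, latent_dim))

    # Discriminator step: label real samples 1, generated samples 0.
    d_loss = (bce(discriminator(real), torch.ones(64, 1))
              + bce(discriminator(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the just-updated discriminator call its fakes real.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print("generated sample mean:", generator(torch.randn(256, latent_dim)).mean(dim=0))
```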


Apparently this is old news (from 2014?), but here's a great summary of the paper: "Review on The Most Intriguing Paper on Deep Learning" (Mar 2015).

My ears pricked up when I saw the phrasing

It's clearer from this summary that they use "adversary" in a slightly different sense, more like a theoretical adversary as opposed to an actual attack on applications of neural networks. So I think the paper makes much more sense if it's read as dealing with theoretical issues, e.g.:

Deep neural networks learn input-output mappings that are fairly discontinuous to a significant extend [sic].

The explanation is that the set of adversarial negatives is of extremely low probability, and thus is never (or rarely) observed in the test set, yet it is dense (much like the rational numbers), and so it is found near virtually every test case.

By invoking continuity, density, etc., the result is basically a "neural net" analog of the halting problem (the existence of a weird counterexample that says something about the limits of a given computational model). Yes, the examples seem contrived from an applications perspective, but in the theoretical sense this is super interesting to know about (which this other blogger says as well). So as a piece of science, this result is an interesting blending of theory and experiment.
posted by polymodus at 6:27 PM on April 14, 2017 [3 favorites]


The link also mentions that an adversarial image found for one network might also cause misclassifications in other neural networks. That's kind of interesting, and if it's common it has some real-world implications. Because if you're using a neural network to, like, identify NSFW images on a forum, then it might be possible to make tools that generate images that will pass your NSFW detector.
posted by RustyBrooks at 6:30 PM on April 14, 2017 [3 favorites]


That is, you can take a neural network trained on a set of images and find an image, with small pixel values, that you can add to any other image that causes it to be misclassified with high probability. This universal adversarial perturbation is independent of the image you want the network to misclassify. What is even more worrying is that it is largely independent of the neural network being used.
Sounds like LSD to me, man.
posted by clawsoon at 6:41 PM on April 14, 2017
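
(A very loose sketch of the "universal perturbation" idea in that quote: optimize a single small pattern that degrades the model's predictions across a whole batch of images at once. This is a simplified gradient-ascent stand-in, not the algorithm from the paper the quote describes, and the untrained toy CNN and random images are placeholders.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

images = torch.rand(64, 3, 32, 32)
labels = model(images).argmax(dim=1)        # the model's own "clean" predictions
eps = 0.05                                  # max size of the shared perturbation

delta = torch.zeros(1, 3, 32, 32, requires_grad=True)
opt = torch.optim.SGD([delta], lr=0.01)

for step in range(200):
    # Maximize the loss of the original predictions, i.e. minimize its negative.
    loss = -F.cross_entropy(model(images + delta), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)             # keep the shared perturbation small

fooled = (model(images + delta).argmax(dim=1) != labels).float().mean()
print(f"fraction of predictions changed by the single shared delta: {fooled.item():.2f}")
```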


Does this mean Ray Kurzweil is going to STFU?

Fat chance.
posted by adam hominem at 6:44 PM on April 14, 2017 [5 favorites]


What I'm curious about is what happens if you take a bunch of these adversarial overlays and apply them randomly to the training data for a network - will it learn different heuristics for classifying the images that aren't susceptible to this type of attack?
posted by NMcCoy at 6:46 PM on April 14, 2017
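
(NMcCoy's suggestion is roughly what's now called adversarial training: keep regenerating perturbed copies against the current model and mix them into each training batch. A toy sketch, with a stand-in CNN, random data in place of a real dataset, and one-step perturbations:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
eps = 0.03

def fgsm(images, labels):
    """One-step adversarial perturbation of a batch, within an eps ball."""
    images = images.clone().requires_grad_(True)
    F.cross_entropy(model(images), labels).backward()
    return (images + eps * images.grad.sign()).detach()

for step in range(100):
    images = torch.rand(32, 3, 32, 32)              # stand-in training batch
    labels = torch.randint(0, 10, (32,))            # stand-in labels

    adv_images = fgsm(images, labels)               # craft attacks against the *current* model
    batch = torch.cat([images, adv_images])         # train on clean + adversarial copies
    targets = torch.cat([labels, labels])

    loss = F.cross_entropy(model(batch), targets)
    opt.zero_grad()                                 # also clears gradients left over from fgsm()
    loss.backward()
    opt.step()
```

(Whether the retrained network ends up robust to stronger or differently constructed attacks, rather than just to this one recipe, is exactly the open question.)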


If one incorporates the adversary into the network, then you get a deep convolutional generative adversarial network which learns by trying to fake itself out.
posted by ethansr at 6:56 PM on April 14, 2017 [6 favorites]


Usually (always?) neural networks learn to recognize images as matrices of pixels. I wonder if there's any other data type that more closely reflects how humans perceive images.
posted by dilaudid at 7:05 PM on April 14, 2017 [1 favorite]


dilaudid: Usually (always?) neural networks learn to recognize images as matrices of pixels. I wonder if there's any other data type that more closely reflects how humans perceive images.

One of the comments on the last article suggested converting images to vectorized form (e.g. Adobe Illustrator) instead of using pixel (e.g. Adobe Photoshop) images, suggesting that the emphasis on edges in vectorized images is closer to how our brains perceive objects. I don't know enough about cognitive science to judge this suggestion, but it certainly fits with stick figures showing up on cave walls as our earliest images and pointillism not being developed until a century ago.
posted by clawsoon at 7:26 PM on April 14, 2017 [2 favorites]


I came to say what sedna17 did about optical illusions. Humans make predictable mistakes with certain kinds of pictures and so do machines. The exact kinds of pictures are very different, and we can each trivially avoid the other's mistakes.

This is so unsurprising it's weird that at least some of the discussion thinks it's interesting or challenging? For as long as I can remember (early '80s or so), there've been people jokingly observing that an intelligent machine would have to make mistakes.

Happily the links are more sophisticated and interesting than "Oh, we expected these newer neural networks to mimic human thought, mistakes and all." This from the second one catches everything I just said:

In this sense, it may be important to distinguish the capabilities of CNNs from human abilities. The authors make this point from the outset, and the argument is reasonable. As both Michael I. Jordan and Geoff Hinton have recently discussed, deep learning's great successes have attracted a wave of hype. The recent wave of negative publicity illustrates that this hype cuts both ways.

Exactly.
posted by mark k at 9:21 PM on April 14, 2017 [2 favorites]


Human visual systems have a similar set of images that "break" us or make us see something that isn't there.

Given that the only ones that really fool my brain are of the "stare at the high contrast color and mess up your color perception" variety that exploit the physical nature of the cones in the human eye, I have little doubt that if image recognition neural networks share anything with human pattern recognition (and I believe they do, to a very limited extent, though I am no expert; I program things, but not NNs), they can indeed be trained not to be fooled by such exploits.

I understand and can see what the optical illusions are supposed to do, but I can also see them for what they are. Probably because I spent a lot of time as a kid investigating them. The downside is that it is damn near impossible for me to see those "3D noise" posters whose name I suddenly can't recall that were popular in the 90s.
posted by wierdo at 9:54 PM on April 14, 2017 [1 favorite]


Magic Eye posters?
posted by Autumnheart at 11:42 PM on April 14, 2017 [1 favorite]


In a nutshell, neural networks have their own concept of the uncanny valley.
posted by simra at 12:17 AM on April 15, 2017 [2 favorites]


I think that "computers make mistakes, surprise" is missing the point a little. It's about the unexpected (to most, anyway) method by which they can be fooled, and the potential implications of it.
posted by blue shadows at 1:02 AM on April 15, 2017 [5 favorites]


The downside is that it is damn near impossible for me to see those "3D noise" posters whose name I suddenly can't recall that were popular in the 90s.

Stereograms.
posted by jaduncan at 2:21 AM on April 15, 2017


So … hypothetically … let's say I would like to sabotage a product demonstration of a new police robot. Could I then use this technique to generate a sound that, when projected into the demonstration room, is indistinguishable to humans from ventilation white noise but perfectly fools the robot's sound-of-gun-hitting-the-floor neural network?
posted by rycee at 2:32 AM on April 15, 2017 [3 favorites]


No. Whatever the sound is, it's likely to sound weird as hell.
posted by jaduncan at 2:56 AM on April 15, 2017 [1 favorite]


We'll only have arrived at true AI when the output is:

Of course I see a sailboat.
...
Don't you see a sailboat?
...
...
Dave?

posted by rokusan at 3:49 AM on April 15, 2017 [2 favorites]


As a few of the greybeards here might remember, I worked for an AI interest way back in ye olde eighties, and my job consisted in large part of trying to break the latest builds by confusing them with ambiguities in syntax, grammar, and so on.

I am saddened that only today do I realize that the title on my business card could have been Adversary.

Oh, well. Replaced by robots it is.
posted by rokusan at 3:57 AM on April 15, 2017 [13 favorites]


"Could I then use this technique to generate a sound that, when projected into the demonstration room, is indistinguishable to humans from ventilation white noise but perfectly fools the robot's sound-of-gun-hitting-the-floor neural network?"

If I understand this correctly, I think that's got it backward, since the article describes generating false negatives rather than false positives.
posted by dbx at 6:54 AM on April 15, 2017


It's not making it fail to identify - it makes it identify something that isn't present.
posted by idiopath at 7:53 AM on April 15, 2017


The "optical illusion" interpretation is philosophically interesting, and there is a certain Hofstadterian sense in which any complex system must have a complex weakness, but I think it's an overburdened analogy. ISTM that illusions target our theory generating stack, making us come to beliefs about what we perceive that are not, on a second inspection, believable. We wrongly assume the subject is in motion, or is projecting toward us, or has the same or different coloration as another object. In cases where we are confused about what the subject is, it's often a paradox — we would argue that the image actually does have visual attributes of two objects, and it would be irrational to ignore that evidence, whether our first impression was correct or not.

Neural networks don't have any of that ability to come up with ideas about what they see. If our model is human vision, they're only the first layer or so, taking raw optical data and turning it into usable numbers. And the inputs that "fool" them are not what we could normally call "deceptive" images, unless we replace our entire epistemology with that of non-theory-generating automata. I mean, look at it— the picture of a car is a picture of a car. Whether the noise overlaid is random or malicious, we should still rationally call it noise, not information.

So I would say these aren't illusions as such, but hallucinatory basilisks. They don't trick the algorithms, they compel them to see something which isn't there. There's nothing really like that in humans — outside of, well, hallucinations — but the closest approximation I can think of is the visual blind spot. However much you try to reason your way around that part of your vision, what you see there is what it is. Your brain makes up the perfect lie long before you have a chance to form ideas about it.
posted by emmalemma at 8:55 AM on April 15, 2017 [5 favorites]


emmalemma: This may be down to semantics, and my own lack of understanding, but there are some optical illusions that AFAIK work at a very low perceptual level in humans. One example is motion illusions such as this, this, or this. Or take the checker shadow illusion. The description "they compel them to see something that isn't there" seems to fit as a description for what's going on in that image.
posted by Grimgrin at 10:04 AM on April 15, 2017 [1 favorite]


This paper used to be a lot more fun, but the idea is still interesting. Asking thousands of people whether a mishmash of noise could be an image of X allows you to reconstruct an "imaginary classifier" for X. In other words, if people look at static and you ask them whether the static looks like a ball, you might eventually collect enough information about ballness to generate a rough picture of a ball.
posted by ethansr at 10:54 AM on April 15, 2017 [2 favorites]
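
(A toy version of what ethansr describes, sometimes called reverse correlation or "classification images": average the noise patterns that got a "yes", subtract the ones that got a "no", and a rough template of the target emerges. The "observer" below is simulated rather than real survey data, and the disc template is an arbitrary stand-in for "ballness".)

```python
import numpy as np

rng = np.random.default_rng(0)
size = 32

# Hidden template the simulated observer is (noisily) matching against: a disc.
yy, xx = np.mgrid[:size, :size]
template = ((xx - size / 2) ** 2 + (yy - size / 2) ** 2 < (size / 4) ** 2).astype(float)
template -= template.mean()

yes_sum = np.zeros((size, size))
no_sum = np.zeros((size, size))
n_yes = n_no = 0

for trial in range(20000):
    noise = rng.standard_normal((size, size))
    # The observer says "looks like a ball" when the static happens to correlate
    # with their internal template (plus some decision noise).
    says_yes = (noise * template).sum() + rng.standard_normal() * 5 > 0
    if says_yes:
        yes_sum += noise
        n_yes += 1
    else:
        no_sum += noise
        n_no += 1

# The recovered "imaginary classifier": mean yes-noise minus mean no-noise.
recovered = yes_sum / n_yes - no_sum / n_no
print("correlation with hidden template:",
      float(np.corrcoef(recovered.ravel(), template.ravel())[0, 1]))
```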


I'm ok with mistakes!
posted by oceanjesse at 11:11 AM on April 15, 2017


Grimgrin, you're quite right to point out that there are motion illusions that don't really fit my model. I can only defer by saying they don't seem nearly as interesting or thought provoking in this context.

I think the "grey square" illusion is a crystallization of my argument. It's not a basic error; in fact it relies on my having a model for physical space complete with deep theories of light, shadow, and color. Under that model, my perception that the squares are different colors makes total sense. That those theories leave me incapable of even seeing the true colors of the pixel regions on my screen is partly a perplexing flaw in perception, but moreso a surprising demonstration of the fact that my visual system doesn't really perceive wavelengths of light, but something much more conceptual.

The reason I don't think that's semantics is because not only are neural nets not tricked in that way, they have no place to be tricked. There's nothing we could reasonably call a theoretical framework for their observations to be laden by. If there were, I think we'd find that their errors in judgment would tend to be much more defensible, similar to my excuse for misperceiving the colors on the checkerboard.

Now, if you can convince me by other means that that's not at all a rendering of a cylinder on a checkerboard, but in fact a more-or-less accurate reproduction of Washington Crossing the Delaware, I'll be forced to concede the point :)
posted by emmalemma at 11:41 AM on April 15, 2017 [2 favorites]


Usually (always?) neural networks learn to recognize images as matrices of pixels. I wonder if there's any other data type that more closely reflects how humans perceive images.

This is actually somewhat debatable, but at the level of the retina the best answer is probably center-surround receptive fields. Briefly, the retina contains ~100 million photoreceptor cells, which provide inputs to ~1 million retinal ganglion cells through a network of excitatory and inhibitory connections, such that (most) RGCs have a small region of the retina which excites them, and a ring-shaped surrounding region which inhibits them. An image which uniformly excites both the center and surround of an RGC's receptive field will fail to excite the RGC; only an image which partially covers the RGC's receptive field, differentially activating the center and surround, will drive it. This makes RGCs excellent edge detectors, which is one of the fundamental computational principles of the visual system. (Actual RGCs are more complex and varied than this simple description, of course, but this is a reasonable first-pass explanation.)

A slightly longer answer is that it can be somewhat dangerous to apply concepts like "data type" to biological neural networks. It encourages us to think of the representations used by the nervous system in terms of concepts familiar to us from digital computers, but while the brain is a computer, it is not digital. The more generic concept of "representation" is safer, but it is very important to keep the distinction between the kinds of representations used in analog computers like brains (and retinas) versus the kinds of representations used in digital computers very clear. A center-surround receptive field in an RGC is a representation in the sense that it formally describes the relationship between the structure of the visual stimulus and the physiological properties of the RGC, and how this relationship applies to the computational problem of vision. Intuitions about "data types" from digital computation as things which can be faithfully stored, transmitted, and manipulated tend to lead to fairly naïve misunderstandings about computation in biological neural systems. (See, for example, Ray Kurzweil.)

Some additional complexity is introduced by the fact that the retina and the visual system in general do not operate like cameras, capturing high-fidelity images for storing and processing, so even a concept like the center-surround receptive field seriously understates the complexity of the representations required for vision. For example, the retina is not uniform, and the size of receptive fields varies from very small near the center (the "fovea"), providing high acuity, to very large in the periphery. We are not generally conscious of the amount of dynamic scanning required to perceive even static images, but in general our normal vision depends on our ability to compare multiple "views" of the same visual scene by reorienting our eyes, moving the high-acuity foveal field across the visual scene. The computational principles behind this kind of temporal integration of visual information are much, much more complex, and still not very well understood.
posted by biogeo at 12:14 PM on April 15, 2017 [12 favorites]
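
(A quick numerical illustration of biogeo's center-surround description: a difference-of-Gaussians filter gives almost no response over uniform regions and a strong response at an edge. The sigmas and test image below are arbitrary demo values, not physiological ones; the blurring uses scipy.)

```python
import numpy as np
from scipy.ndimage import gaussian_filter

size = 64
image = np.zeros((size, size))
image[:, size // 2:] = 1.0                  # a simple vertical edge: dark half, bright half

# "Center" = narrow Gaussian blur, "surround" = wider Gaussian blur;
# their difference approximates a retinal ganglion cell's receptive field.
center = gaussian_filter(image, sigma=1.0)
surround = gaussian_filter(image, sigma=3.0)
response = center - surround

print("response in the uniform dark region:  ", round(float(response[32, 5]), 4))
print("response in the uniform bright region:", round(float(response[32, 60]), 4))
print("peak response near the edge:          ", round(float(np.abs(response[32, :]).max()), 4))
```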


I think the "grey square" illusion is a crystallization of my argument. It's not a basic error; in fact it relies on my having a model for physical space complete with deep theories of light, shadow, and color.

That is not in general how the grey square illusion is thought to work, unless the hard-wired structure of the early visual system is a "model for physical space complete with deep theories of light, shadow, and color." The grey square illusion is generally thought to occur because our visual systems are optimized for edge detection and contrast comparisons, with interior regions of low-spatial-frequency smooth variation being subject to "in-filling." Of course, the reason the early visual system has the structure and properties that it does is because it has been shaped by evolution in an environment in which the physics of light, shadow, and color apply, so in this sense it absolutely does have such a model of optical physics built in, but I'm not sure I would characterize this model in terms of "deep theories" of any sort.
posted by biogeo at 12:24 PM on April 15, 2017 [2 favorites]


One of the comments on the last article suggested converting images to vectorized form (e.g. Adobe Illustrator) instead of using pixel (e.g. Adobe Photoshop) images, suggesting that the emphasis on edges in vectorized images are closer to how our brains perceive objects.

I'm not as familiar with deep learning artificial neural networks as I'd like to be, but my understanding is that they generally just throw the images directly at the deep learning networks. I would not be at all surprised if it turns out that the adversarial images found here could be easily ignored by a network with a "hard-wired" retinal-biomorphic layer as its first layer, which would do more or less the same thing as this vectorization suggestion. Given that this more "engineered" solution is already known to work quite well at projecting natural images onto a useful set of basis vectors, it should be great at essentially "ignoring" the kind of perturbations used in these adversarial images.

But of course as others have pointed out, these findings are interesting not so much because "OMG ANNs are dumb!" but because they are exploiting an interesting quirk of ANN training and computation. So what I think would be really interesting to know is, is there a set of adversarial images for deep learning networks that use center-surround-type input filters, and what do those look like?
posted by biogeo at 12:38 PM on April 15, 2017 [3 favorites]
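
(One way to wire up the "hard-wired retinal-biomorphic first layer" biogeo wonders about: a convolution whose kernels are fixed difference-of-Gaussians filters, frozen rather than learned, feeding an ordinary trainable network. Whether this actually blunts these adversarial perturbations is the open question; this only sketches the plumbing, with made-up kernel sizes and sigmas.)

```python
import torch
import torch.nn as nn

def dog_kernel(size=7, sigma_c=1.0, sigma_s=2.0):
    """Center-surround (difference-of-Gaussians) kernel; sums to roughly zero."""
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    r2 = ax[:, None] ** 2 + ax[None, :] ** 2
    center = torch.exp(-r2 / (2 * sigma_c ** 2))
    surround = torch.exp(-r2 / (2 * sigma_s ** 2))
    return center / center.sum() - surround / surround.sum()

# Fixed "retina": 3 ON-center and 3 OFF-center filters, one per color channel.
retina = nn.Conv2d(3, 6, kernel_size=7, padding=3, bias=False)
with torch.no_grad():
    retina.weight.zero_()
    k = dog_kernel()
    for c in range(3):
        retina.weight[c, c] = k          # ON-center
        retina.weight[c + 3, c] = -k     # OFF-center
retina.weight.requires_grad_(False)      # never updated during training

# Ordinary trainable layers sit on top of the frozen front end.
model = nn.Sequential(
    retina,
    nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),
)

x = torch.rand(2, 3, 32, 32)
print(model(x).shape)                    # torch.Size([2, 10])
```

(Presumably nothing stops an attacker from searching for perturbations in whatever space the fixed filters pass through, which is biogeo's closing question: what would adversarial images for such a network look like?)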


Your point is taken as well, biogeo! To be clear, by a "deep theory" of light and shadow I don't mean to imply cognition— just that our brains apply a model which is suited to determining the shape and color of objects in consideration of lighting conditions, and which hasn't (yet) been fully replicated by computer vision. This, I'll grant, is semantics :)
posted by emmalemma at 12:46 PM on April 15, 2017 [2 favorites]


Can I paint these pixel patterns on my face to avoid detection by facial recognition systems?
posted by MtDewd at 7:15 PM on April 15, 2017 [1 favorite]


Can I paint these pixel patterns on my face to avoid detection by facial recognition systems?

Yes(ish)
posted by ethansr at 10:01 PM on April 15, 2017 [1 favorite]




This thread has been archived and is closed to new comments