Enhance 224 to 176. Move in. Stop. Pull out, track right. Stop.
September 1, 2016 9:42 AM   Subscribe

You know that thing cops do in movies and TV to enlarge small images that's impossible? Well now, with Python + Tensorflow, it's (sort of) possible to enhance images. It still won't recover information that's not there, but it'll make a pretty good guess.
posted by signal (60 comments total) 25 users marked this as a favorite
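The post's caveat — you can't recover information that isn't there — is easy to demonstrate. A toy sketch (not the linked project's code, just an illustration of the limit): pixelate an image by block averaging, zoom back in, and the within-block detail is gone for good; anything a network draws in its place is a guess.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))            # stand-in for a detailed photo

# Downsample 8x by block averaging, like a cheap camera or deliberate pixelation.
small = image.reshape(8, 8, 8, 8).mean(axis=(1, 3))

# Upsample back with nearest-neighbor repetition ("zoom in").
big = np.kron(small, np.ones((8, 8)))

# The enlarged image has the original's shape but none of its detail:
# every 8x8 block is flat, so the per-block variance collapses to ~zero.
block_var = image.reshape(8, 8, 8, 8).var(axis=(1, 3)).mean()
recon_var = big.reshape(8, 8, 8, 8).var(axis=(1, 3)).mean()
print(f"detail variance: original={block_var:.4f} enlarged={recon_var:.2e}")
```

What the neural net adds is plausible detail conditioned on its training faces, not the destroyed detail itself — which is the whole thread's point of contention.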
 
Obligatory: "Enhance."
posted by stannate at 9:58 AM on September 1, 2016 [3 favorites]


It needs more training to deal with big glasses, clearly.
posted by GuyZero at 9:58 AM on September 1, 2016 [1 favorite]


Higher resolution, but of different people. So yes, I guess that can be called enhanced.
posted by Kiwi at 9:58 AM on September 1, 2016


It needs more training to deal with Skrillex, clearly.
posted by mcstayinskool at 10:01 AM on September 1, 2016 [1 favorite]


"As the jury can see, the face of the accused is an exact match for that in the reconstructed enhancement of the crime scene video."
posted by straight at 10:03 AM on September 1, 2016 [18 favorites]


Someone should take that scene from Blade Runner and dub the word "Guess" over the word "Enhance".
posted by I-baLL at 10:04 AM on September 1, 2016 [7 favorites]


Yeah, but can it create a GUI in Visual basic to track the killer's IP address?
posted by Mayor West at 10:06 AM on September 1, 2016 [19 favorites]


It's a neural net. It can do that *while* it reconfigures the main deflector array to emit a stream of kremulon particles.
posted by ROU_Xenophobe at 10:08 AM on September 1, 2016 [10 favorites]


It needs more training to deal with Skrillex, clearly.

Who doesn't?

This is pretty impressive if it's real. My concern about these kinds of things being real is that someone will be arrested or have their name smeared based on an "enhanced" picture. In some of them there's enough difference that they look like different people.

I can't see how it could ever be legal evidence, but who knows?
posted by bongo_x at 10:08 AM on September 1, 2016 [2 favorites]


Yeah. This sort of stuff really blurs the line between recording and simulation. The data is missing, so the algorithm fills it in with some plausible details. At one level the photograph represents the camera's view of the actual scene, but at another it's just a plausible guess as to what the scene might have looked like, if the camera had been able to see it. Where is the line? If a photo has been treated in this way, can it be trusted to provide a record of what was really there? Once these algorithms improve, it may become difficult to tell whether this kind of reconstructive enhancement has been applied or not, casting the veracity of all photos into doubt.

Obviously, in many situations this is immaterial. Art photos and casual snaps don't have to draw meticulous boundaries between reality and fiction. But what about when it does matter, like in a courtroom setting? When this sort of technology becomes ubiquitous (which it will, because it solves one of the biggest, most persistent, most ubiquitous problems in photography and does it in a really slick way) what does that mean for photographic evidence, the historical record, scientific data, news footage, things like that?

On a broader level, what does it do to our collective memory of the past? Photo and video are the main ways that we catalog the things that happen to us. What does this do to that? How will that change? Will we even notice?
posted by Anticipation Of A New Lover's Arrival, The at 10:18 AM on September 1, 2016 [7 favorites]


I could see this stuff being legal evidence. The status quo doesn't account for this kind of techno-magic, since it doesn't really exist yet. Even if the right decision eventually gets made and this kind of manipulation is banned, it could be years or decades before that happens, and in the meantime a lot of people could get sent to prison. It may also eventually become impossible to tell whether this sort of "enhancement" has been done to a recording.

Mostly though, I could see this getting ignored and the status quo preserved simply because it is in the interests of the carceral state for that to happen. This kind of enhancement will make it easier to put people in prison, and we should not delude ourselves into thinking that "putting [some kinds of] people in prison" is not the main purpose of today's court system, at least in the U.S. There will be people lobbying to allow this stuff into evidence. They will be muddying the waters, creating the impression in the minds of technologically-inept decisionmakers that the enhancements do reflect reality without ever quite saying so, and whispering "all you have to do is leave things as they are, and our courts can use this powerful new tool to see that justice is done."

This tool, suitably refined, plays right into the hands of the surveillance state and the carceral state. For that reason alone I am sure that we will see it abused.
posted by Anticipation Of A New Lover's Arrival, The at 10:37 AM on September 1, 2016 [4 favorites]


The Classic Enhance
posted by oneswellfoop at 10:46 AM on September 1, 2016 [6 favorites]


I think there's a reason why we're all thinking of the creaking "enhance" TV trope when we see this. How often, these days, is inadequate pixel resolution a major problem that needs to be solved? In my personal photography, that used to be a big deal. Lately, when I've got lousy photos, it's usually a lighting problem that needs a completely different kind of fix.

Aren't security and dashboard cameras in a similar position, with higher resolution year over year? Is "this image is too pixelated" really the usual problem anymore?

Which is just to say: this is a super awesome demo, but I'm not sure its societal implications are either wonderful or terrible.
posted by gurple at 10:51 AM on September 1, 2016 [1 favorite]


It doesn't matter what the technology actually does. Prosecutors and District Attorneys will tell juries that it does magic, and juries will vote to convict based on that.
posted by Pope Guilty at 11:02 AM on September 1, 2016 [3 favorites]


I had thought the big camera makers offered forensic/legal-evidence models that marked original image files with digital signatures, specifically so that edits could be identified as such unequivocally. But I'm not able to find any reference to such a thing just now.

In principle, at least, I think such a thing would be possible. And that's "just" software, so it could be applied to the whole camera market pretty inexpensively.
posted by Western Infidels at 11:05 AM on September 1, 2016


Depends what you're trying to do, gurple. With things like security and dash cams, even recent ones, where there isn't a photographer behind them massaging things and making choices about what to focus on in the scene, you get a lot of poor-quality recordings of things that are far away, or not centered in the scene, or not lit well. A distant subject might look fine in an artistic photo, but is there enough detail in their face to convince a jury of their identity?

Even with artistic photos, resolution can still be a major concern. Maybe not so much in all genres, but think about something like wildlife photography where there's a strong emphasis on getting razor-sharp focus and tons of detail for subjects that are often really far away. Even with current pro-grade cameras and the best telephoto lenses, there are frustrating limits. Look closely at the cover of something like Audubon Magazine and you'll often see that the resolution of the photo has been pushed right to its limit—there's an impression of infinite detail, but if you look deeper it's not actually there. Anything that lets you bring a subject closer to the viewer is a boon. If you can be a little more aggressive with the ol' crop-and-zoom while still maintaining acceptable sharpness and detail, that's great.
posted by Anticipation Of A New Lover's Arrival, The at 11:08 AM on September 1, 2016 [2 favorites]


Also, a lot of the time the effects of poor lighting and/or exposure are functionally very similar to a lack of resolution. Data is missing, detail is lost. I am sure that if we can interpolate detail into a digital zoom, doing the same for a blown highlight or a blacked-out shadow is only a step away.
posted by Anticipation Of A New Lover's Arrival, The at 11:11 AM on September 1, 2016 [1 favorite]


A neural network is fundamentally a simulation of a sort of dual transient (or less transient, in the case of big RNNs) chaotic dynamical system, taking advantage of the fact that any dynamical system with a positive Lyapunov exponent will end up creating entropy, so you can have the backwards pass synchronize with some real data. So it's fundamentally a confabulation, no ifs, ands, or buts about it.

There exists some extraordinarily old and pretty meh work on picture confabulation and image reconstruction, especially in fractal compression (but lots in normal compression too), which is pretty interesting. There is a one-to-one correspondence between statistical inference methods and statistical compression methods (there are no non-statistical compression methods, deterministic as some may be).

With the FastFood methods and Unitary RNN and ACDC and things like that (if you have to bet huge amounts of money on an O(dn log n) algorithm versus an O(dn^2) algorithm, bet on the O(dn log n) one, although there is no method to get it to speed up depth yet), I think that neural network methods will end up as commonplace as compression, and as quotidian.
posted by hleehowon at 11:12 AM on September 1, 2016 [2 favorites]


oh, smeg
posted by jrishel at 11:13 AM on September 1, 2016 [1 favorite]


Neural networks are terrible things that will hurt people in the service of law enforcement.

I agree. We've been told point blank that PCA and factor analysis are non-starters to go to court with. That Bayesian inference is impossible to get accepted in court. That we have to continue using dumb old Student's t-tests, with all the well-known weaknesses and fragility they have, simply because that's the only statistical test that has precedent for successful convictions in law.

I can't imagine any prosecutor I've worked with ever wanting to take on neural net interpretations of photographic evidence.
posted by bonehead at 11:14 AM on September 1, 2016 [1 favorite]


hmm, dual isn't the right word for it, is it? Two simultaneous dynamical systems?
posted by hleehowon at 11:15 AM on September 1, 2016


It seems to me that, if the goal is to help humans (or computers) recognize faces, then the thing to do with any algorithm like this would be to do a big study where you use the software to do reconstruction and then see if that reconstruction helps people (or computers) pick the right face instead of similar faces.

If it does that consistently, then the magic is useful and not particularly evil. If it doesn't, then it's not useful and potentially all kinds of evil.

Study design would be super important, of course, but implementation would be super easy and cheap, with centralized standard datasets and with Mechanical Turk or something to provide the large numbers of humans.
posted by gurple at 11:18 AM on September 1, 2016 [1 favorite]
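The study gurple proposes is straightforward to sketch. Everything below is a stand-in — "faces" are random feature vectors and `enhance` is a placeholder for the model under test — but the harness shape is the point: measure top-1 identification accuracy with and without the enhancer, and only trust the tool if the enhanced number is reliably higher.

```python
import numpy as np

rng = np.random.default_rng(1)

gallery = rng.normal(size=(100, 32))          # 100 known identities
true_ids = rng.integers(0, 100, size=500)     # which identity each probe is
noise = rng.normal(scale=2.0, size=(500, 32))
probes = gallery[true_ids] + noise            # degraded "photos" of them

def identify(batch):
    """Nearest-neighbor matcher: guess the closest gallery identity."""
    d = ((batch[:, None, :] - gallery[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def enhance(batch):
    """Placeholder for the super-resolution model being evaluated."""
    return batch  # a real study would plug the network in here

baseline = (identify(probes) == true_ids).mean()
enhanced = (identify(enhance(probes)) == true_ids).mean()
print(f"top-1 accuracy: raw={baseline:.2f} enhanced={enhanced:.2f}")
```

If "enhancement" doesn't move that accuracy, the tool is inventing face-plausible detail rather than aiding identification — which is exactly the useful-versus-evil distinction gurple draws.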


This sort of stuff really blurs the line between recording and simulation. The data is missing, so the algorithm fills it in with some plausible details

Digital cameras have to do this already *to some extent*; pixel data is interpolated and reconstructed from samples provided by the sensors.
posted by a snickering nuthatch at 11:20 AM on September 1, 2016
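The interpolation that comment refers to is demosaicing: each sensor pixel records only one color, and the other two are filled in from neighbors. A rough sketch, assuming a simple RGGB Bayer layout and plain bilinear averaging (real camera pipelines are considerably fancier):

```python
import numpy as np

def demosaic_green(mosaic):
    """Fill in the green channel of an RGGB Bayer mosaic bilinearly."""
    h, w = mosaic.shape
    green = np.zeros((h, w))
    # In an RGGB layout, green samples sit where (row + col) is odd.
    gmask = (np.add.outer(np.arange(h), np.arange(w)) % 2) == 1
    green[gmask] = mosaic[gmask]
    # Each missing pixel is the average of its 4 green neighbors
    # (edges handled by reflection padding).
    p = np.pad(green, 1, mode="reflect")
    interp = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4
    green[~gmask] = interp[~gmask]
    return green
```

So about two-thirds of the color values in any digital photo are already interpolated guesses — the neural-net version just makes the guessing far more aggressive.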


casting the veracity of all photos into doubt

That can only be a good thing. Photos are already not 1:1 representations of nature; they're 2D images of a 3D world, plus their colors are not 1:1 matched with reality, etc etc. Perhaps the mistake was ever believing photos/video were admissible evidence.
posted by eustacescrubb at 11:31 AM on September 1, 2016 [1 favorite]


True eustacescrubb, but when we run into the limits of representation with traditional photographic techniques, it usually happens in a fairly predictable way that can be recognized and accounted for. If a neural network is just filling in details with its personal idea of what those details ought to be, that's a bit harder to manage. "No data" or even "systematically skewed data" is much easier to deal with than "plausible but fabricated data."
posted by Anticipation Of A New Lover's Arrival, The at 11:38 AM on September 1, 2016 [1 favorite]


I can't imagine any prosecutor I've worked with ever wanting to take on neural net interpretations of photographic evidence.
I tend to think you're right.

But the optimist in me says that something like this is pretty easy to defend against: run a couple of scaled-down photos of things you have high-res versions of through it and see what it comes up with. Maybe, oh... I dunno... a scaled-down photo of the prosecutor pretending that it's evidence? My guess is that "reasonable doubt" is pretty easy to create if you do that.

Ooooor I could just be naive and optimistic. Sadly I think that might be the case...
posted by -1 at 11:39 AM on September 1, 2016


This is great, as long as criminals stop and pose for a well-oriented headshot. I'm sure they'll comply.
posted by howling fantods at 11:44 AM on September 1, 2016


"Your honor, the report summary does indicate the Neural Net placed my client at the scene of the crime with 88% confidence, but if you look closely you'll see he's placed as the teller, not the suspect."
posted by Mr.Encyclopedia at 11:49 AM on September 1, 2016 [4 favorites]


This is great, as long as criminals stop and pose for a well-oriented headshot. I'm sure they'll comply.
That's kind of what worries me about it, TBH. The approach is obviously *very* training dependent ('cause that's how neural nets roll), and I could see the lack of training material being a serious issue. I mean... the DMV isn't about to be all like "OK, look at the camera. OK, good... now look away from the camera while pretending you're running from the police... OK good, now look to the side and pretend that you're concealing a stolen item..."

Current facial recognition is far from foolproof. Even the best modern systems get fooled with Dazzle makeup pretty easily (or at least the ones I've seen/read about). I can't imagine that this system would be any more immune. If anything it might be a bit more susceptible, since the lack of resolution in the intended use case would make it even harder to distinguish between makeup and face...
posted by -1 at 12:00 PM on September 1, 2016


If I get how these things work, isn't this pretty much "find the face(s) in the training database that most match the input"?

So, I dunno, it seems you'll get a lot of convictions of the people who posed for the training photos.
posted by zompist at 12:16 PM on September 1, 2016 [3 favorites]
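zompist's reading — "enhance" as lookup — can be caricatured in a few lines. Downscale every training face, find which one best matches the blurry input, and hand back that person's high-res face. (The actual project learns a network rather than doing an explicit lookup, but the failure mode is similar: the output is assembled from whoever was in the training set.)

```python
import numpy as np

def downscale(img, f=8):
    """Pixelate by averaging f x f blocks."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def enhance_by_lookup(blurry, training_faces):
    """Return the training face whose downscaled version best matches."""
    errs = [((downscale(face) - blurry) ** 2).mean() for face in training_faces]
    return training_faces[int(np.argmin(errs))]

rng = np.random.default_rng(2)
faces = rng.random((50, 64, 64))           # stand-in training set
blurry = downscale(faces[17])              # pixelated photo of person 17
result = enhance_by_lookup(blurry, faces)
print(np.array_equal(result, faces[17]))   # "recovers" a training face
```

Which is why the convictions joke lands: the sharpest-looking output is, by construction, somebody from the training data.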


So, I dunno, it seems you'll get a lot of convictions of the people who posed for the training photos.

Good enough reason right there to start scrubbing public photos of yourself from the Internets.
posted by sutt at 12:18 PM on September 1, 2016


One thing that I think is being somewhat elided here though is that this isn't really detail enhancement. It's just facial recognition and face swapping, with the new part being that there's an algorithm that chooses what face to swap in in order to provide the missing features. It's not going to show you the leaves on a tree or the facets in a bumblebee's eye or anything. Like a lot of fancy image-processing software of late, this is strictly faces-only and doesn't seem like it would easily generalize to photography in general.

Bit disappointing, really.
posted by Anticipation Of A New Lover's Arrival, The at 12:27 PM on September 1, 2016


I can't help wondering what happens if you feed it a picture of classic Mario...
posted by BinaryApe at 12:27 PM on September 1, 2016 [1 favorite]


It's just facial recognition and face swapping, with the new part being that there's an algorithm that chooses what face to swap in in order to provide the missing features.

Interesting, because that's one of my other concerns, but I didn't really understand what was happening. So what if a prosecutor decides to have the picture matched with a suspect's face to "prove" it was them?
posted by bongo_x at 12:37 PM on September 1, 2016 [1 favorite]


you can see how this works in Total Information Awareness Baltimore. they don't report the surveillance to court, but use it to gather admissible evidence after they have "identified" a suspect.
posted by ennui.bz at 12:40 PM on September 1, 2016


you can see how this works in Total Information Awareness Baltimore.

"This" is some python code that was uploaded 6 days ago to Github. The thing going on in Baltimore has nothing to do with this.
posted by tonycpsu at 12:42 PM on September 1, 2016


Judge: How does this algorithm work?
Prosecutor: Well it's just a bunch of matrices multiplied together
Judge: Where did the matrices come from?
Prosecutor: We trained them
Judge: Get out
posted by RobotVoodooPower at 1:08 PM on September 1, 2016 [1 favorite]


The new Jason Bourne has one of these "enhance" scenes early in the movie and I sighed so hard I nearly fell over. If my wife hadn't dragged me to the movie I would have walked out right there. As a photographer who uses cameras capable of producing extremely high resolution images these "enhance" scenes make me insane.
posted by photoslob at 1:17 PM on September 1, 2016


I remember this scene in Blade Runner. They were enhancing around corners . . . .
posted by birdhaus at 1:23 PM on September 1, 2016


Law enforcement already uses something similar to this--facial composite sketches. Just because it wouldn't be submitted as evidence in the courtroom doesn't mean it couldn't be useful.
posted by HighLife at 1:24 PM on September 1, 2016 [1 favorite]


and in Enemy of the State, they somehow had a recording with which they able to rotate the camera after the fact, and see things that were previously blocked.
posted by entropone at 1:27 PM on September 1, 2016 [1 favorite]


casting the veracity of all photos into doubt

the veracity of photos has been in doubt since Stalin
posted by atoxyl at 2:11 PM on September 1, 2016 [2 favorites]


Law enforcement already uses something similar to this--facial composite sketches.

This is a pretty smart comparison!
posted by atoxyl at 2:14 PM on September 1, 2016


"This" is some python code that was uploaded 6 days ago to Github. The thing going on in Baltimore has nothing to do with this.

if you want to see how "enhance" works wrt police procedure, Baltimore is a prime example. the article I linked to actually walks through exactly the hypothetical discussed in this thread: airplane based surveillance catches a murder, video sequence is "enhanced" to locate a suspect and track them, etc.

But then you run straight into the fact that someone wanted to hack together a tool for big brother to use... as some python code on github.
posted by ennui.bz at 3:28 PM on September 1, 2016


if you want to see how "enhance" works wrt police procedure, Baltimore is a prime example. the article I linked to actually walks through exactly the hypothetical discussed in this thread: airplane based surveillance catches a murder, video sequence is "enhanced" to locate a suspect and track them, etc.

I read the piece, but I don't remember anything like this tech, and it's a long piece. Which part are you talking about here?

But then you run straight into the fact that someone wanted to hack together a tool for big brother to use... as some python code on github.

What? Someone wanted to hack together some code because it does an interesting thing. That interesting thing can be used for good or for ill, like a lot of other technologies. The idea that the person who wrote this code wanted big brother to use it is absurd.
posted by tonycpsu at 3:40 PM on September 1, 2016 [1 favorite]


Every single one that doesn't end up a blurry smear looks to me like a different person. So for applications where you want to know who was really in this photo, I think it violates the rule-of-thumb that you don't show precision where you don't have the accuracy to back it up.


I think everyone's thinking of the TV procedural "enhance" scene because that's where our fiction has shown this sort of technology in use. Fiction doesn't always predict what tech will really be useful for.
posted by RobotHero at 4:28 PM on September 1, 2016 [2 favorites]


Except its guesses look basically nothing like the real photo. Just so everyone's clear, its output is the second picture from the right. The picture on the left is the blurred input. The picture on the far right is the real picture. The picture second from the left is a generic 'make this less pixellated'.

The woman on the bottom is especially glaring, as it's pretty clear from the pixellated and less-pixellated versions that the person is female, but its output looks like a man.

The second from the bottom looks like a different ethnicity in the output.

The third one down looks Asian in the less blurry photo, but also morphs ethnicity in the output to a 'whiter' looking version.

My guess would be the training data was mostly headshots of white men, so it's trained to turn everyone into white men.
posted by pravit at 5:09 PM on September 1, 2016


Proves that AIs are just speciecist, and think all of us humans look alike.
posted by kandinski at 5:34 PM on September 1, 2016


Dammit pravit, Not One Black Face (NOBF) is my line/meme!

This is obviously some 50's science fiction technology.
posted by djrock3k at 6:45 PM on September 1, 2016


"Sure, machines are cool, but can they make ungrounded biased assumptions with potentially dangerous consequences like we can???"
posted by speicus at 7:27 PM on September 1, 2016 [1 favorite]


Yeah, pravit, I was also struck by how it seemed to 'whiten' faces. According to the github the dataset used was Large-scale CelebFaces Attributes (CelebA) Dataset so go figure.
posted by Standard Orange at 9:26 PM on September 1, 2016


One thing I've always wondered. In theory you could take a blurry video and create a sharp image, because combining lots of information from many frames gives a lot more information about the image than what you see in a single frame (assuming there is movement in the video). I don't know if anyone is doing this though.
posted by eye of newt at 9:33 PM on September 1, 2016
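That intuition is sound — it's called multi-frame super-resolution, and it genuinely recovers information rather than guessing it, because shifted frames sample the scene at different offsets. A minimal 1-D sketch of the idealized core, with shifts assumed to be known exactly (real footage needs sub-pixel registration and noise handling):

```python
import numpy as np

rng = np.random.default_rng(3)
scene = rng.random(64)                     # high-res ground truth

# Four low-res "frames", each sampling the scene at a different offset.
frames = [scene[k::4] for k in range(4)]

# Shift-and-add: interleave the frames back into a high-res estimate.
recovered = np.empty_like(scene)
for k, frame in enumerate(frames):
    recovered[k::4] = frame

print(np.allclose(recovered, scene))       # exact, given known shifts
```

Unlike the face-hallucinating net in the post, the extra detail here really was captured — just spread across frames.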


When this sort of technology becomes ubiquitous (which it will, because it solves one of the biggest, most persistent, most ubiquitous problems in photography and does it in a really slick way) what does that mean for photographic evidence, the historical record, scientific data, news footage, things like that?

http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning
posted by flabdablet at 12:11 AM on September 2, 2016 [6 favorites]


Obligatory XKCD
posted by acb at 4:46 AM on September 2, 2016 [1 favorite]


So, this is like Deep Dream, only with white celebrities' faces instead of slugs and puppies?
posted by acb at 4:47 AM on September 2, 2016


http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning
Ah, yes, the subtitle for that story effectively being "don't use a library/algorithm without understanding what it's actually doing". So many wonderful classic software bugs have been introduced because people didn't read the documentation and made assumptions (Xbox using TEA, for another example).

But yeah, the takeaway for non-nerds is: computers are as reliable as the people who program them.
posted by -1 at 7:43 AM on September 2, 2016


Next-frame video prediction is shaping up to be pretty cool too.
posted by RobotVoodooPower at 8:48 AM on September 2, 2016


Does next-frame video prediction do as good a job as the motion-vector stuff built into ordinary video compression algorithms? It looks a hell of a lot more computationally intensive for what I suspect will be not much practical gain.

More interesting for real-world applications, I would have thought, would be a motion predictor following a feature extractor and object recognizer. Self-driving cars are an application that immediately springs to mind.
posted by flabdablet at 9:35 AM on September 2, 2016


I'm sort of curious if this NN technology is any better than tools like maximum entropy deconvolution, which is at least 20 years old. Are the nets adding or inferring additional information into the photos that's not there in the source, or are they simply maximizing the available info in the source image? If the latter, how are they better than a MED approach? If the former, how ground-truthed are they to get it right? And, if they are informed-guessers, I can't see that ever, ever being defensible as sole evidence for culpability.
posted by bonehead at 10:20 AM on September 2, 2016 [1 favorite]
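bonehead's distinction can be made concrete. Classical deconvolution — here Richardson-Lucy, a relative of maximum-entropy methods — only redistributes information already present in the blurry image: no training data, no invented detail. A 1-D sketch with a known blur kernel:

```python
import numpy as np

def richardson_lucy(observed, psf, iters=50):
    """Iteratively undo a known blur; converges toward the sharp signal."""
    psf_mirror = psf[::-1]
    est = np.full_like(observed, observed.mean())
    for _ in range(iters):
        conv = np.convolve(est, psf, mode="same")
        ratio = observed / np.maximum(conv, 1e-12)
        est = est * np.convolve(ratio, psf_mirror, mode="same")
    return est

psf = np.array([0.25, 0.5, 0.25])          # known blur kernel
sharp = np.zeros(32)
sharp[10] = 1.0
sharp[20] = 0.5
blurry = np.convolve(sharp, psf, mode="same")
restored = richardson_lucy(blurry, psf)
# The spikes come back out of the blur -- nothing was "guessed in".
print(restored.argmax())
```

That's the "maximizing the available info" case. The neural-net approach is the other case — an informed guesser — which is what makes it so much harder to defend as evidence.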


Not just Enemy of the State. In the famous Bladerunner enhance, at 2:02 Deckard manages to get the UI to pan left and alter depth, parallax and orientation to reveal a reclining figure previously obscured by an interposing object.

I remember being impressed with the nascent Internet/Usenet in the 1980s when the first Bladerunner FAQs were being drawn up and people argued deeply back and forth that either this was in fact an algorithmic extrapolation, or that the photo was just a 2D representation of a more complex 3D captured object, or that "allowing" Deckard to find an obviously impossible detail was how the actual Bladerunner Gaff fed his tame, unknowing Replicant salient or new investigation info within a context that made sense to him and would direct him along the path Gaff wanted.
posted by meehawl at 8:24 PM on September 2, 2016 [1 favorite]




This thread has been archived and is closed to new comments