3D Reconstruction from Accidental Motion
April 19, 2014 11:37 AM

Fisher Yu, a Princeton grad student, and David Gallup, a Google employee, have published a method for retrieving the 3D information of a scene from the small motion of the hands that occurs while taking video. They've given their paper a website that includes a video, the paper itself, and a dataset. One neat application of this is the ability to simulate short depth of field, a feature that has made it into the new Google Camera app.
posted by Maecenas (26 comments total) 16 users marked this as a favorite
The new camera app, which is total ass compared to the old one, IMO. The technique is both stupidly obvious and incredibly brilliant. As in "duh, parallax will allow you to calculate distance" (if you think of it), but who the hell would ever think of using the incredibly tiny amount of parallax involved in the random shake of your hands when taking a photo?

I'm actually a little bit surprised there's enough resolution, even in an 8MP or higher sensor, to resolve enough fine detail to calculate the parallax changes.
posted by wierdo at 11:41 AM on April 19, 2014 [3 favorites]
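
To put rough numbers on the resolution question above, here is a minimal sketch using the textbook pinhole relation disparity = f * B / Z. It is not the method from the paper. The sensor width (3264 px), the 60 degree field of view, and the 5 mm hand shake are all assumed values chosen for illustration, and the function names are hypothetical.

```python
import math

def focal_length_px(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for an ideal pinhole camera."""
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    return (image_width_px / 2.0) / math.tan(half_fov)

def disparity_px(f_px, baseline_m, depth_m):
    """Image shift in pixels of a point at depth_m when the camera
    translates sideways by baseline_m."""
    return f_px * baseline_m / depth_m

def depth_from_disparity(f_px, baseline_m, disparity):
    """Invert the same relation: depth in meters from a measured shift."""
    return f_px * baseline_m / disparity

# Assumed values, not taken from the paper or the thread.
f = focal_length_px(3264, 60.0)   # roughly an 8MP sensor, 3264 px wide
for z in (0.5, 2.0, 10.0):
    print(z, "m ->", round(disparity_px(f, 0.005, z), 2), "px")
```

Under these assumptions a point 2 m away shifts by several pixels and a point 10 m away by roughly one pixel, so the depth signal is small but not hopeless, especially if features are tracked to sub-pixel precision across many frames.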

posted by Thing at 11:47 AM on April 19, 2014

Very cool stuff. I was wondering what the actual applications would be, watching the video.

(And now I'm curious about the progress, if any, being made with photosynth, which seems sorta similar, in a vague way. Is it my imagination that photosynth no longer requires silverlight?)
posted by maxwelton at 11:55 AM on April 19, 2014

So as implemented in Camera, it doesn't actually rely on your random motion. You put it in a specific mode, called "Lens Blur," which prompts you to slowly raise your phone while keeping the subject centered after pressing the shutter button. If you're doing an extreme close-up, it's really hard to move it as slowly as it wants you to, but otherwise it's reasonably easy to capture as instructed.

The results are actually reasonably impressive for requiring zero extra hardware and only an extra second or so to capture the pic. It would be nice if it didn't require the user to do anything specific, so it would work even when things in frame are moving, but still quite a decent first showing.
posted by wierdo at 12:04 PM on April 19, 2014

Very cool tech! But unfortunately it's yet another step in the direction of, "You can be a complete moron and have absolutely no training, knowledge, or information about what, why and how photography/aperture works and still take photos that look like you do!"
posted by ReeMonster at 12:24 PM on April 19, 2014

Synthetic virtual focus is an amazing thing to play around with... here's some photos of Chicago taken in this style. You would think that once things went digital, photography was done, but it's nowhere close to that.

Marc Levoy wrote an iPhone app to do this years ago.
posted by MikeWarot at 12:32 PM on April 19, 2014 [1 favorite]

And that's unfortunate because why? Experts have to innovate.
posted by planetesimal at 12:33 PM on April 19, 2014 [6 favorites]

So it's actually doing enough work to not look like ass?

That instagrammy filter that was trying to simulate a short DOF sets my teeth on edge. My brain keeps trying to decode the scene and fails because it's being done totally wrong.
posted by wotsac at 12:33 PM on April 19, 2014 [1 favorite]

maxwelton: "I was wondering what the actual applications would be, watching the video. "

It would be great if this turned into, for example, being able to scan your kitchen with your phone during a remodel to size countertops, and sending the specs via email. The pessimist in me, however, thinks this will probably be used in drones first to map combat zones in real time.
posted by Big_B at 12:33 PM on April 19, 2014 [2 favorites]

A cursory search isn't telling me what the name of the episode was, but this reminds me of a bit in ST:TNG where the characters had video footage taken while surveying a planet and used it to create a 3D scene of the survey site on the holodeck. (IIRC, they used it to deduce that a chameleon-like alien had been standing there watching them while they worked.)

So, is this going to get applied to the Zapruder film and tell us who killed JFK? Not that, like, remodeling your countertops isn't exciting too.
posted by XMLicious at 12:34 PM on April 19, 2014 [2 favorites]

But unfortunately it's yet another step in the direction of, "You can be a complete moron and have absolutely no training, knowledge, or information about what, why and how photography/aperture works and still take photos that look like you do!"

Applying filters and effects is not at all the same thing as framing, composition, subject, and timing. No camera can divine the picture-taker's intent.
posted by Greg_Ace at 12:41 PM on April 19, 2014 [1 favorite]

So brilliant.

Some years ago my partner and I were sitting on a bench idly flipping crumbs of croissant to a mixed group of birds milling around our feet -- sparrows and pigeons with an outer edge of more cautious crows -- and I was thinking that the pigeons were dotting the smaller bits up pretty damned accurately for animals whose eyes were so far out on the sides of their heads they obviously couldn't have much real depth perception, but then it occurred to me that with the way they moved their heads, whipping them back and forth but with a pause at the endpoints, they could be comparing the two images formed by a single eye at the endpoints and extracting depth information that way. I was kind of crestfallen when somebody later told me that possibility had already been suggested.
posted by jamjam at 12:44 PM on April 19, 2014 [4 favorites]

cool, so can they use this to make 3D movies that don't make my eyes bleed?
posted by es_de_bah at 2:46 PM on April 19, 2014

I think this would only work on still scenes, but I guess that's usually what one is photographing anyway, eh? Also, I think this work is not really theoretically groundbreaking, but more like a nice app-lication of current computer-vision techniques.
posted by zscore at 3:23 PM on April 19, 2014 [1 favorite]

and only an extra second or so to capture the pic

An entire second? That sounds positively Victorian.
posted by acb at 4:48 PM on April 19, 2014 [2 favorites]

Lytro Looks to Refocus

Let's hope they don't set their sights on comedy.
posted by Greg_Ace at 7:17 PM on April 19, 2014

I think this work is not really theoretically groundbreaking

I think so too.
posted by surplus at 7:34 PM on April 19, 2014

can they use this to make 3D movies that don't make my eyes bleed?

This looks broadly similar to techniques that have been used in 3D conversion for a while. As far as I can tell, the novel part is making it something the average consumer can use. If it has the same weaknesses, it's going to have trouble with anything translucent or reflective.

They've made a lot of improvements to converted 3D, but I still find the "shoot in 2D and convert in post because it's cheaper than shooting in 3D" attitude annoying. Imagine if movies were routinely shot in black-and-white and colorized instead of shot in colour. It's always going to look slightly wrong.

On the other hand, if 3D makes your eyes bleed because of the delivery method, this is completely unrelated.
posted by RobotHero at 9:29 PM on April 19, 2014

This is technically impressive, and cute... but the blur they're applying is really unconvincing. It looks like some type of Gaussian blur plus some sort of extra pizzazz in the form of some filtering/tweaking.

It's like those apps that make a photo look like a fake polaroid with the border, and then mess with the contrast and stuff a bit.

I guess i do have some crow to eat though, because my comment on lytro when it came out was "wow, like $500 and it doesn't even take very nice pictures? Let me know when they stuff it in a phone as a bullet point feature."

Lytro Looks to Refocus as Smartphone Apps Find Ways to Mimic Its Signature Trick

Are they refocusing/"pivoting" on doing something completely different like going out of business? Because thus far all they've shown is an expensive camera that takes depressingly noisy/grainy low-res photos with unimpressive optics, and a required software suite that just eats CPU power.

Can someone explain if what's being done here is any different from what the lytro does? Because that always seemed like entirely software trickery to me (like narrow aperture plus effects), and if they're doing approximately the same thing but on a mobile CPU instead of 80% load with lots of lag on a desktop-class CPU, that's impressive. Although i guess lytro was doing it with a single exposure, or at least appearing to.

I also think there's some legitimate grumble to be had here with the whole "revolutionary awesome new feature in the latest google camera app update*

*only available on 4.4, which is only on select devices". Like, everyone always acts like android is sooo much more open and you get more for your money and in the way of software and such, but the way google is starting to "put the genie back in the bottle" with this type of stuff is a bit disconcerting. Like, how is this any different than if apple had introduced this with ios 8, but only on the iphone 6? i'm honestly impressed the nexus 4 got this, but how many recent phones got shafted?

Basically though, i can't wait for the play store/ios app store blatant ripoff of this that's 99 cents.
posted by emptythought at 9:49 PM on April 19, 2014

My understanding was the Lytro was trying to do it with hardware; thousands of tiny micro-lenses on the camera sensor. Which yeah, has the inherent trade-off that the more light you capture for this refocusing thing, the less light you're capturing for the regular photo image.

This is interpreting the frames before and after the actual photo to find the depth. So the actual photo can have all the light, but the shallow depth of field is all interpolated, not based on actual light captured from reality. So this is to Lytro as frame interpolation is to a slow-motion camera.

I expect Lytro would get more convincing results where there is translucent material, reflective material, or the wrong kind of motion at the time of the photo. But because Lytro requires actual hardware, and that hardware results in trade-offs in resolution and noise, this is going to be widely adopted and Lytro is going to be a niche product.
posted by RobotHero at 11:18 PM on April 19, 2014

What the Lytro does is somewhat different. This technique uses parallax to build a depth map, which is then applied to one frame of the captured image, and used to selectively blur areas to simulate depth of field. The Lytro, and other plenoptic cameras, capture vector information about the direction, as well as intensity and color, of light at each point in the image. This allows software to construct images with different focal points and depths of field from the data. It's not just "doing it in hardware," because the "it" is quite different in much the same way that raytracing is different from rasterization.

One of the advantages of plenoptic cameras is actually better low light performance, because you can use very large apertures that would result in focusing problems with a traditional camera. Another advantage is that it is faster than traditional image capture, because the camera never needs to focus, while this technique is slower, because you need to capture multiple frames.
posted by Nothing at 8:58 AM on April 20, 2014 [2 favorites]
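
The "depth map, then selective blur" step described above can be sketched roughly as follows. This is a toy illustration, not Google's Lens Blur implementation: it uses a plain box blur on a grayscale image stored as nested lists, and the names `blur_radius`, `synthetic_dof`, `strength`, and `max_radius` are made up for the example. The one piece of real optics is that a thin lens's circle of confusion scales with the difference in inverse depth, which is what the radius function mimics.

```python
def blur_radius(depth, focus_depth, strength=5.0, max_radius=4):
    """Blur radius in pixels, growing with the difference in inverse depth
    between this pixel and the chosen focal plane (hypothetical scaling)."""
    defocus = abs(1.0 / depth - 1.0 / focus_depth)
    return min(max_radius, int(round(strength * defocus)))

def synthetic_dof(image, depth, focus_depth, strength=5.0, max_radius=4):
    """Blur each pixel of a grayscale image by a depth-dependent box filter.

    image and depth are equally sized lists of rows; depth is in meters.
    Pixels at focus_depth get radius 0 and are copied unchanged.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            r = blur_radius(depth[y][x], focus_depth, strength, max_radius)
            total, count = 0.0, 0
            # Average over the (2r+1) x (2r+1) window, clamped at the borders.
            for yy in range(max(0, y - r), min(height, y + r + 1)):
                for xx in range(max(0, x - r), min(width, x + r + 1)):
                    total += image[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out
```

A simple gather blur like this lets in-focus foreground bleed into the blurred background at depth edges, which is one reason naive fake bokeh can look wrong. A more careful version would weight or mask samples by depth.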

Yeah I should have emphasized more, the Lytro is creating an image based on the real light it really captured. This is using the motion to create simulated geometry. And then using that geometry to simulate the shallow focus. With the Lytro, when you change the focus in post, it's more akin to cropping; you're deciding which light to use and which to ignore.

But I expect the general public will not care about that difference, the same way a lot of people now think tilt-shift lens = gradient blur.

If this thing hadn't used the simulated geometry to create a simulated shallow focus, I don't think anyone would have made the comparison. They could have added fake fog to the image, or little Kilroys looking over objects in the background, using the same simulated geometry. But fake shallow focus has more marketing potential.
posted by RobotHero at 10:35 AM on April 20, 2014

Is that a Syncro back there in example #3?
posted by lagomorphius at 11:12 AM on April 20, 2014

Fake shallow focus is only one of the things that you can pull off with this. If you have enough post-capture CPU capability, you can figure out the camera pose in each frame to sub-pixel resolution, and be on your way to super-resolution capabilities.

I've experimented with it a bit, a few years ago... here are the results. Unfortunately the new Flickr redesign makes it hard to see the results properly. 8(
posted by MikeWarot at 7:56 AM on April 21, 2014 [1 favorite]
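
For the super-resolution idea above, one classic approach is "shift-and-add": once each frame's sub-pixel offset is known, its samples are dropped onto a finer grid and averaged. The sketch below is a 1D toy version under the assumption that the offsets are already known exactly. It is not necessarily what was used for the linked results, and `shift_and_add` and its parameters are illustrative names.

```python
def shift_and_add(frames, offsets, scale):
    """Merge low-resolution 1D frames onto a grid `scale` times finer.

    frames:  list of equal-length lists of samples
    offsets: per-frame shift in low-res pixels (assumed known, e.g. from
             the estimated camera pose)
    scale:   integer upsampling factor
    Returns a list with the averaged value per fine-grid cell, or None
    where no frame contributed a sample.
    """
    n = len(frames[0]) * scale
    total = [0.0] * n
    count = [0] * n
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            j = int(round((i + offset) * scale))  # nearest fine-grid cell
            if 0 <= j < n:
                total[j] += value
                count[j] += 1
    return [t / c if c else None for t, c in zip(total, count)]
```

For example, two 4-sample frames offset by half a pixel interleave into one 8-sample signal. With real photos the hard parts are estimating the offsets and undoing the blur of each pixel's footprint, which this sketch ignores.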

I've experimented with it a bit

Ooh, cool. Do you remember what algorithm you were using?
posted by Maecenas at 10:43 AM on April 21, 2014
