

2D => 3D
June 14, 2006 4:38 PM

Carnegie Mellon researchers have created a program that can automatically generate a 3-D model from a single photograph, using machine learning. Take a look at this high-res comparison of original and generated images, also demonstration animations and downloadable videos (with executables). [via /. see also: a little on human 3d perception at everything2, groovy dragon illusion]
posted by MetaMonkey (42 comments total) 9 users marked this as a favorite

 
Note: pending the resolution of a licensing issue with Mathworks, I cannot post the distribution libraries required to run these executables, unless you own Matlab 7.3 for Linux or Matlab 7.04 for Windows.

That sounds like a bad thing for people wanting to mess with this. Which I do.
posted by puke & cry at 4:48 PM on June 14, 2006 [1 favorite]


Also, love that dragon.
posted by puke & cry at 4:52 PM on June 14, 2006 [1 favorite]


See also the Campanile movie from Paul Debevec's SIGGRAPH 1997 presentation.
posted by Rhomboid at 4:53 PM on June 14, 2006


Even when Efros and Hebert assigned Hoiem to use machine learning techniques to teach visual context to a computer two years ago, they regarded it primarily as a learning exercise for their student. "We didn't believe it would work," Efros said.

Heh.
posted by smackfu at 4:56 PM on June 14, 2006


Just wait until this makes its way to first-person shooter level-builder kits. We'll have a digital version of the whole freaking world.
posted by brain_drain at 5:01 PM on June 14, 2006


Oh, this is cool. Nice.
posted by blacklite at 5:03 PM on June 14, 2006


i was thinking the same thing brain_drain. i was also idly speculating how this could be used in the porn market.

Because you know someone out there is going to try to figure out how to do just that.

Neat tech either way. i seem to remember seeing a more rudimentary version of this a couple of years ago, i'm glad to see the project has gotten as far as it has.
posted by quin at 5:08 PM on June 14, 2006


So let's say you were to acquire Matlab 7.04 for Windows. It would come with the required libraries to run this, yes?
posted by puke & cry at 5:11 PM on June 14, 2006 [1 favorite]


I'm with quin and brain_drain too. Porn and video games always seem to push these things ahead, don't they?
posted by danb at 5:11 PM on June 14, 2006


I'm waiting to see this applied to Google Earth.
posted by adamrice at 5:12 PM on June 14, 2006 [1 favorite]


I want the web interface that allows me to upload images for processing.
posted by NationalKato at 5:12 PM on June 14, 2006


I'm totally making that dragon right now.
posted by bigmusic at 5:14 PM on June 14, 2006


Wow, a lot of 3D artists in the game biz are gonna be on a soup line when they get this perfected.

There will still be work for people making stuff up from scratch, of course, but any sort of "realistic" game will want to use this to speed up production. We model a lot of stuff like cars, houses, barrels, bathroom fixtures, furniture, contemporary weaponry, etc., and some of these things can take many hours, even a week or more to do depending on complexity. Environments can take several weeks to build.

This sounds like it would speed that process up a great deal.

It would also be a real help for architects and engineers, to create models of existing buildings or landscapes, from which they can create their 3D models of whatever they're designing.

Most impressive.
posted by zoogleplex at 5:21 PM on June 14, 2006


From the Dragon Illusion WMV when just about to describe and show how it works: "If we move far enough around to the side, we suddenly realize..." then the video cuts out. Argh!
posted by effwerd at 5:31 PM on June 14, 2006


There will still be work for people making stuff up from scratch, of course, but any sort of "realistic" game will want to use this to speed up production. We model a lot of stuff like cars, houses, barrels, bathroom fixtures, furniture, contemporary weaponry, etc., and some of these things can take many hours, even a week or more to do depending on complexity. Environments can take several weeks to build.

The thing is this is relying on straight verticals, level ground and predictable horizons to do everything. I don't see how the technology can be extended to more detailed stuff. If you look at the VRML files all it's doing is dividing the scene into 5 or 6 polygons, which admittedly is quite impressive, but it's still a very small step towards modeling actual objects automatically.
posted by cillit bang at 5:36 PM on June 14, 2006
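
To make the "level ground and predictable horizon" point above concrete, here is a minimal sketch of the textbook single-view relation between an image row and distance along flat ground. It is not the CMU program's method; the function name, the pinhole-camera model, the level optical axis, and the example numbers are all assumptions made for illustration.

```python
def ground_depth(y_pixel, y_horizon, focal_px, camera_height_m):
    """Distance along the ground to the point imaged at row y_pixel.

    Assumes (hypothetically) a pinhole camera with a level optical axis
    over perfectly flat ground. Image rows increase downward, so ground
    points lie below the horizon row (y_pixel > y_horizon).
    """
    offset = y_pixel - y_horizon
    if offset <= 0:
        # At or above the horizon there is no ground-plane intersection.
        raise ValueError("pixel is at or above the horizon")
    # Similar triangles: depth / camera_height = focal_length / offset
    return focal_px * camera_height_m / offset


# Example with made-up numbers: horizon at row 400, camera 1.6 m high,
# focal length 800 px. A ground pixel at row 600 is about 6.4 m away.
distance = ground_depth(600, 400, 800, 1.6)
```

The relation only holds for points that really are on the ground, which is why anything that is not a flat ground plane or an upright wall is so much harder to place.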


The video works ok for me effwerd, try downloading it again maybe.
posted by MetaMonkey at 5:37 PM on June 14, 2006


I'm just amazed to see actual, real-live vrml files! :P
posted by delmoi at 5:40 PM on June 14, 2006


So let's say you were to acquire Matlab 7.04 for Windows. It would come with the required libraries to run this, yes?

Yes, or someone with the correct version of Matlab could give you the MCR. I think that's over 100MB but it's still better than the gig or so that Matlab weighs in at.

Unfortunately I have the Unix version of Matlab, and only 7sp2 at that. I'll have to see if this works tomorrow though, it certainly looks very interesting.
posted by Olli at 5:43 PM on June 14, 2006


The video works ok for me effwerd, try downloading it again maybe.

I tried a couple of times, MetaMonkey, no joy. Cuts out at the same exact time. Then again, I'm on a Mac and since updating QuickTime I've been having some general systemwide weirdness going on, so who knows. I think of it as incentive to print the PDF and make it myself.
posted by effwerd at 5:46 PM on June 14, 2006


Well, this is impressive, but let's be realistic here. We have two eyes for a reason: determining visual context from a single image is hard. Also, this is not a breakthrough for modeling at all. It's not very accurate, and there is already better technology out there to do the 'automatic modeling': just use a stereoscopic camera or, better yet, a camera with a built-in z-buffer. They do exist already, but are very hard to google for because 'camera' is also used in 3-D rendering terminology.
posted by delmoi at 5:47 PM on June 14, 2006


did you try the 'high res' version?
posted by delmoi at 5:47 PM on June 14, 2006


As far as games go, I imagine the technology would be most handy for games featuring an element of simulation - flight, car racing, war games, fps' etc. A lot of games already spend a lot of time recreating cities, so this will likely mean a lot more of that sort of thing.

I'm also looking forward to taking virtual strolls around random photos; people's neighborhoods and so forth, which shouldn't be too hard to generate between Google Images and Flickr, and maybe a little extra effort.

My guess is this, and its subsequent refinements, will be quite a big deal for all sorts of VR/simulation/modelling things we haven't yet considered. I imagine it could be handy for robots, particularly moving ones - anyone know any different?
posted by MetaMonkey at 5:48 PM on June 14, 2006


I was thinking something along the lines of cillit bang. This seems like it would be pretty adept at modeling stuff that is relatively predictable. Anything asymmetrical to any noticeable degree might not fare so well. Still pretty cool.
posted by edgeways at 5:49 PM on June 14, 2006


bwahaha, I love the background music they're using.
posted by delmoi at 5:50 PM on June 14, 2006


This reminds me of Metacreations Canoma. In that program you had to model the geometry and then apply the picture(s) to the scene. It was actually pretty cool, but it got lost in the Metacreations shuffle of early 2000. Most decent modeling packages come with Camera Mapping now, anyway, so it's not like you couldn't do this now - in other words, I don't foresee a work shortage for 3D artists.
posted by hoborg at 5:51 PM on June 14, 2006


The one thing they seem to get significantly wrong is the roof, which is just missing from the overhead view. Makes sense: the computer doesn't "know" that buildings have roofs. Cool stuff.
posted by nebulawindphone at 6:41 PM on June 14, 2006


Pretty cool, but I hope they come up with a stand-alone version for those of us who don't have the specialized software required to run this.
posted by Alexandros at 6:42 PM on June 14, 2006


Now that science has "proven" that Leonardo's most famous subject was a contralto, can Efros & Hebert give us a peek at the bootè della Gioconda?
posted by rob511 at 6:55 PM on June 14, 2006


It's pretty impressive that it was able to create that from just a still. On closer inspection the geometry is not that accurate and is very simple, but it's a start.

There are already many applications that can create geometry from moving images which are surprisingly accurate. I suspect this can recognise simple shapes based on lighting?

I've always thought technology like in boujou would be great in games - it's just that at the moment it is very slow. A combination of the two technologies might be interesting.
posted by phyle at 7:19 PM on June 14, 2006


I wonder what happens if you feed it an Escher drawing.
posted by zixyer at 7:35 PM on June 14, 2006 [1 favorite]


Well whaddya know, it's a file called MCRInstaller.exe, and it appears to be the runtime required!

Gotta love Google. ;)
posted by Vamier at 7:43 PM on June 14, 2006


I saw a shop once selling the equivalent to that dragon illusion - big framed golden "pictures" of Buddha that appeared to turn to face you. Again, they were negative "cut outs" like the dragon, but quite effective.
posted by Jimbob at 8:08 PM on June 14, 2006


I thought about googling the installer name but then I looked at something shiny and was distracted. Is that the right version, Vamier? Or does it matter?

Olli said you need the correct version of matlab with the MCR but that could mean the correct version of matlab that has the MCR and not the correct version of the MCR.
posted by puke & cry at 8:29 PM on June 14, 2006 [1 favorite]


BLADE RUNNER! BLADE RUNNER!
posted by Meccabilly at 12:26 AM on June 15, 2006


We don't need to find him to study him now!

"BLADE RUNNER! BLADE RUNNER!"
Science imitates science fiction
posted by D J Robertstein at 4:24 AM on June 15, 2006


The dragon cut-out is awesome. Can't wait 'til the housekeeper comes next week. She'll drop one in her bloomers! I put it on a display shelf in the kitchen. LOL!
posted by Goofyy at 4:53 AM on June 15, 2006


Take a look at this high-res comparison of original and generated images

Aw, man. I have to be back there in a week.
posted by ludwig_van at 5:30 AM on June 15, 2006


This seems like a really important advance for robotics, too, doesn't it?

If they get this working in real time it would probably be all they need to fix computer 'vision'. You have two cameras making two versions of the 3D world from their perspectives, use the parallax to adjust it and then continue to adjust as the robot moves.
posted by empath at 5:33 AM on June 15, 2006
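
For anyone curious how two cameras and parallax give depth, as the comment above suggests, here is a minimal sketch of the standard rectified-stereo relation, depth = focal length × baseline / disparity. It is not taken from the CMU work; the function name, the assumption of two identical side-by-side cameras with matched image rows, and the example numbers are illustrative only.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth of a point seen at column x_left in the left image and
    column x_right in the right image.

    Assumes (hypothetically) two identical, rectified pinhole cameras
    separated horizontally by baseline_m, so a point shifts only along
    the image row between the two views.
    """
    disparity = x_left - x_right  # the parallax shift, in pixels
    if disparity <= 0:
        # Zero disparity means the point is effectively at infinity;
        # negative disparity means the match is wrong.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


# Example with made-up numbers: cameras 10 cm apart, focal length 800 px.
# A 20-pixel shift between the views puts the point about 4 m away.
depth = depth_from_disparity(420, 400, 800, 0.1)
```

Nearby points shift a lot between the two views and distant points barely move, so the estimate gets coarser with distance; that is one reason to keep refining it as the robot moves.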


I thought about googling the installer name but then I looked at something shiny and was distracted. Is that the right version, Vamier? Or does it matter?

I have no idea whether it's the right version or not. :) I *just* installed it and will report on my success/non-success...

It's probably not, since it's looking for mclmcrt72.dll, which doesn't seem to exist on my machine. *Le Sigh*

That's an older runtime, and hey, I thought I was being all smarmy-smart-ass, and it turns out that I was. :P A smarmy smart-ass, that is.
posted by Vamier at 6:34 AM on June 15, 2006


Good stuff, but I'm a little disappointed the rendered pics of the building are noticeably "off" given how simple that shape is. Still, this is an advance, and will hopefully be tweaked and improved in a timely fashion so that we can create virtual neighborhoods out of (single) still photos of real ones, or more easily (I would expect) out of video imagery, using a process like tweening in animation (this corner in this frame becomes this corner in this frame 100 frames later - follow it through).

I have a photo-and-video documentation project I did a few years back that was predicated on this technology emerging, so as you can imagine I've been getting somewhat impatient.
posted by soyjoy at 9:39 AM on June 15, 2006


I just talked to a CMU grad student who is working on a web based interface for this program. I'll let you know when it's done.
posted by delmoi at 12:49 PM on June 15, 2006


CMU is a sad place :(
posted by ludwig_van at 5:01 PM on June 15, 2006



