

Internet Image Montage
October 5, 2009 9:00 AM

"We present a system that composes a realistic picture from a simple freehand sketch annotated with text labels." PhotoSketch takes simple drawings and makes composite photos.
posted by Pope Guilty (44 comments total) 28 users marked this as a favorite

 
Holy cow, that is AMAZINGLY AWESOME!
posted by misha at 9:03 AM on October 5, 2009


Wow. Does it actually work? What I mean is, does anyone here want to volunteer to test it? I would but I'm at work...
posted by scrutiny at 9:07 AM on October 5, 2009


The highlight of the video for me is that they snuck in a picture of a shark jumping out of the ocean to bite at a helicopter.

No but seriously, this looks much too good to be true. I would be very interested to see some larger examples.
posted by Mizu at 9:08 AM on October 5, 2009


Just think of all those b3tanauts and /b/tards who will be cruelly put out of work by this technology...
posted by benzo8 at 9:10 AM on October 5, 2009 [1 favorite]


borked already
posted by leotrotsky at 9:13 AM on October 5, 2009


But will it work with all the hand drawn porn you made when you were like, eleven?
posted by uandt at 9:14 AM on October 5, 2009


> Just think of all those b3tanauts and /b/tards who will be cruelly put out of work by this technology...

This looks shopped. I can tell from some of the pixels and from seeing quite a few shops in my time.
posted by ardgedee at 9:16 AM on October 5, 2009 [4 favorites]


If this thing can understand my drunken napkin doodles we're all in serious trouble.
posted by rokusan at 9:18 AM on October 5, 2009 [1 favorite]


Alternate link? If it works, it's going to take Rule 34 into terrifying new realms of possibility.
posted by permafrost at 9:20 AM on October 5, 2009 [3 favorites]


Can’t watch the video at work, but I’d like to see the composites in a larger format. Also, I would like to see some really absurd compositions done so we can see if they can really combine anything. Does the video satisfy?
posted by Think_Long at 9:25 AM on October 5, 2009


No but seriously, this looks much too good to be true.

Yeah, there are several super hard/impossible problems they are claiming to solve with this thing. State of the art image recognition can barely figure out where people's faces are in a given image, and yet somehow this system can not only figure out which item in an image search result is a person/sheep/fish/bear/sailboat but then seamlessly edit that particular part of the image out and paste it onto a completely different background. Seems fake to me.
posted by burnmp3s at 9:26 AM on October 5, 2009


damn, that is geeky cool
posted by kuatto at 9:35 AM on October 5, 2009


The video is on Vimeo too, if the Chinese webserver dies: link
posted by smackfu at 9:38 AM on October 5, 2009


This looks shopped. I can tell from some of the pixels and from seeing quite a few shops in my time.

Considering the whole application is just automated photoshopping, it should look 'shopped. That doesn't mean it's faked.
posted by chundo at 9:50 AM on October 5, 2009


chundo, I'd like you to meet my old friend sarcasm.
posted by exogenous at 9:55 AM on October 5, 2009 [1 favorite]


The binaries are available to download. A quick try reveals it requires some .dll's from OpenCV. The .ini is a little cryptic. Wish I had more time to play with this (work). One thing I don't get is, why does it go through the trouble of OCRing the text labels? Why not just put X,Y positions and tags in the .ini? Anyways.
posted by hanoixan at 9:56 AM on October 5, 2009


One of the other authors has the SIGGRAPH paper posted: PhotoSketch: Internet Image Montage

yet somehow this system can not only figure out which item in an image search result is a person/sheep/fish/bear/sailboat but then seamlessly edit that particular part of the image out and paste it onto a completely different background.

The paper covers this and it's pretty clever. They admit it's a hard general problem, but when you have every image on the internet available, you don't need a general algorithm: you just need one that works well in a specific situation, and a way to filter the images to limit it to that situation. Like cutting out an object on an arbitrary background is hard, but on a solid background is easy, so just use images with a solid background.
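
To make the "just use images with a solid background" idea concrete, here is a rough sketch of one way such a filter could look. This is not the paper's method: the border-uniformity heuristic, the function names, and the tolerance value are all assumptions made up for illustration, and an image is assumed to be a plain grid (list of rows) of (r, g, b) tuples.

```python
def border_pixels(image):
    """Collect the pixels on the outer edge of a grid of (r, g, b) tuples."""
    top, bottom = image[0], image[-1]
    inner_rows = image[1:-1]
    sides = [row[0] for row in inner_rows] + [row[-1] for row in inner_rows]
    return list(top) + list(bottom) + sides


def has_solid_background(image, tolerance=20):
    """Hypothetical heuristic: treat the background as solid when every
    border pixel stays within `tolerance` of the mean border color."""
    border = border_pixels(image)
    n = len(border)
    mean = tuple(sum(p[c] for p in border) / n for c in range(3))
    return all(abs(p[c] - mean[c]) <= tolerance
               for p in border for c in range(3))


# A white 5x5 image with a red blob in the middle: easy to cut out.
white, red, black = (255, 255, 255), (200, 30, 30), (0, 0, 0)
studio_shot = [[white] * 5 for _ in range(5)]
studio_shot[2][2] = red

# A black/white checkerboard: busy all the way to the edges.
cluttered = [[white if (x + y) % 2 == 0 else black for x in range(5)]
             for y in range(5)]

candidates = {"studio_shot": studio_shot, "cluttered": cluttered}
usable = [name for name, img in candidates.items() if has_solid_background(img)]
```

With a filter like this you throw away most search results, which only works because there are so many results to begin with.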

Also they use humans to filter out the crap results.

Also at the end they say the system is still pretty limited and improving it is really hard.

I agree that I wouldn't have been surprised if it was a fake video, but the paper is pretty convincing.
posted by smackfu at 9:58 AM on October 5, 2009 [2 favorites]


chundo: "This looks shopped. I can tell from some of the pixels and from seeing quite a few shops in my time.

Considering the whole application is just automated photoshopping, it should look 'shopped. That doesn't mean it's faked.
"

You may want to google the phrase you quoted...
posted by benzo8 at 9:59 AM on October 5, 2009


I think he's explaining it to me in that video, but all I can hear is "Magic magic magic. Magic magic. Magic magic magic."
posted by lucidium at 10:05 AM on October 5, 2009 [8 favorites]


Apologies for not being up on my fark memes.
posted by chundo at 10:05 AM on October 5, 2009


Well I don't know how you expect to get along in this world if you don't know your "I can tell from some of the pixels" from your "That would not kill Dracula!"
posted by Naberius at 10:19 AM on October 5, 2009 [1 favorite]


The tech reminds me a little of this video: Seam Carving for Content-Aware Image Resizing.
posted by lazaruslong at 10:23 AM on October 5, 2009


Also they use humans to filter out the crap results.

And I think after skimming that paper, the users also tweak the final images in cases where the system screwed up in the automatic blending or composition. So it's a combination of brute force image recognition and many hints from the user on which ones are correct. That makes a lot more sense than the automated sketch -> perfect image process that the brief description on the site suggests. I would actually really like to see more details on how the users interact with the system, because it does sound neat even if it does rely on a lot of human filtering.
posted by burnmp3s at 10:23 AM on October 5, 2009


In fact, if they combined the two technologies, grabbing the photos intelligently along with a general background palette, they could use seam carving to expand the background and perhaps make it work even better.
posted by lazaruslong at 10:26 AM on October 5, 2009


I have the binaries and they seem functional but as mentioned above, the documentation sucks and it seems rather buggy at present. Definitely needs more work, but I guess I needed something to hack on...
posted by anigbrowl at 10:33 AM on October 5, 2009


Amazing stuff.

And after the earth becomes an arid, overpopulated, resource-stripped rock, we'll all be very appreciative of our digital fantasy worlds.

>it's going to take Rule 34 into terrifying new realms of possibility.

For better or for worse, the leverage provided by computers makes the subtle mash-up quality embedded in all art (content X of context 1 applied to context 2, so as to create content Y) utterly explicit and seductively simple. Why draw a line, when a thousand people have already done it, and done it perfectly well-- why not just use that time to copy-paste, and then look for more pretty designs, concepts, and commentaries to find and clone and stack? It's less about The Collective, than the Collector. At this rate, it'll soon be quite difficult to even pretend you're being original.

Art's summum bonum is an animated gif of the internet fucking the internet.

I'm sure I read that someplace.
posted by darth_tedious at 10:54 AM on October 5, 2009


ARGH SIGGRAPH presenters, why must you make me feel so dumb? This is awesome.
posted by GuyZero at 11:14 AM on October 5, 2009


       o    o    o
Me--> /|\  /|\  /|\  <-- My buddy Lenin
      / \  / \  / \
            ^-- [NOT-TROTSKY]

posted by qvantamon at 11:31 AM on October 5, 2009 [8 favorites]


I wasn't able to watch the video with sound on, or read any of the in-depth technical papers. Are they able to tackle the issue of lighting (i.e. disparate images will be lit from different angles, looking strange when combined into a single frame)?
posted by Edgewise at 11:45 AM on October 5, 2009


I don't think they tackle lighting at all aside from having a few human-mediated steps they sort of gloss over.
posted by GuyZero at 11:51 AM on October 5, 2009


(i.e. disparate images will be lit from different angles, looking strange when combined into a single frame)

They present a bunch of images, and the user has to pick. It looks like using this program will still take a lot of work (I would bet they probably have you try to at least point out where in the found image the actual object you want is), but it's a lot less work than sitting there in Photoshop and trying to draw out the contours with the selection tool.
posted by delmoi at 1:52 PM on October 5, 2009


This is insanely cool. I can't wait to play with it.
posted by flatluigi at 2:03 PM on October 5, 2009


> it's a lot less work than sitting there in Photoshop and trying to draw out the contours with the selection tool.

In design, the biggest expense in using stock images can be in picking the images. It's not difficult, it's time consuming. And time is money. A tool that can, say, go through the hundreds of millions of photos in the Corbis and Getty collections, pick the twelve photos that most resemble men in blue shirts sitting in office chairs, legs crossed with ankle on knee, holding up a pen, can have immediate commercial value. If it can go a step further by roughing in the composition you provide and keep an audit on the images used for later invoicing, it's an app with huge potential. And it doesn't matter if its compositing skills suck; that's what the studio's production staff are for.
posted by ardgedee at 2:06 PM on October 5, 2009


1. Combine this in with Amazon's Mechanical Turk
2. Automagically add witty caption
3. Post on blog with Google Ads
4. PROFIT.
posted by blue_beetle at 2:29 PM on October 5, 2009 [2 favorites]


A tool that can, say, go through the hundreds of millions of photos in the Corbis and Getty collections, pick the twelve photos that most resemble men in blue shirts sitting in office chairs, legs crossed with ankle on knee, holding up a pen, can have immediate commercial value.

And I'm sure that's the very first thing new users will attempt with this

[cut to the Weird Science trailer]
posted by mecran01 at 4:06 PM on October 5, 2009


blue_beetle: That just might be the first profit meme that lacks questions and could actually work.
posted by lazaruslong at 4:42 PM on October 5, 2009


The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet. Although online image search generates many inappropriate results...

I predict that any sketch which includes goats, lemons, or girls with cups is going to raise a few eyebrows.
posted by Ritchie at 6:14 PM on October 5, 2009


So have I missed it or is there no way to play with this myself?
posted by If only I had a penguin... at 7:06 PM on October 5, 2009


There's a download on the web page, called something like "binary zip". But it's more of a proof of concept for the math and algorithms in the paper, rather than something normal people would want to run.
posted by smackfu at 8:01 PM on October 5, 2009


lazaruslong: "The tech reminds me a little of this video: Seam Carving for Content-Aware Image Resizing."

Which has an update to the original findings. Basically, they claim they were DOIN. IT. RONG. Also they apply it to video with some interesting effects.
posted by pwnguin at 9:34 PM on October 5, 2009 [4 favorites]


Smells like hoax to me. The claims about photo sifting for appropriate images seem unrealistic.
posted by Jimmy Havok at 11:41 PM on October 5, 2009


This will make it much easier to form conspiracy theory photoshops. I can't wait to see Joseph Stalin signing Obama's Kenyan birth certificate.
posted by mccarty.tim at 7:52 AM on October 6, 2009


Jimmy Havok, I think the sifting mostly involves removing "unusual" images based on rough colour and other properties, rather than doing any sort of intelligent image recognition.
posted by lucidium at 10:16 AM on October 6, 2009


From the paper:

"Our filtering for content consistency is inspired by [Ben-Haim et al. 2006]. Background images with the same content often have similar appearance. For example, beach images often have yellow sand and blue sky; meadow images have green grass. If we cast the discovered images into an appearance feature space, images with similar content typically cluster together. We assume the biggest cluster is formed by images with consistent content, matching the label. In our implementation, we use histograms in LUV color space as image features."
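
A toy sketch of that "keep the biggest cluster" idea, with heavy caveats: it uses coarse RGB histograms instead of the LUV histograms the paper mentions, and a simple greedy grouping that I picked for brevity, not whatever clustering the authors actually use. The function names, the bin count, and the threshold are all made up for illustration, and each image is assumed to be a non-empty flat list of (r, g, b) tuples.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalized coarse RGB histogram of a list of (r, g, b) pixels (0..255)."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_distance(h1, h2):
    """1 minus histogram intersection: near 0 for matching color
    distributions, 1.0 for distributions with no bucket in common."""
    overlap = sum(min(v, h2.get(bucket, 0.0)) for bucket, v in h1.items())
    return 1.0 - overlap


def biggest_cluster(images, threshold=0.5):
    """Greedy grouping: each image joins the first cluster whose first
    member is close enough, else starts a new one. Returns the indices
    of the largest cluster."""
    clusters = []  # list of (leader_histogram, member_indices)
    for i, pixels in enumerate(images):
        hist = color_histogram(pixels)
        for leader, members in clusters:
            if histogram_distance(leader, hist) < threshold:
                members.append(i)
                break
        else:
            clusters.append((hist, [i]))
    return max(clusters, key=lambda c: len(c[1]))[1]


sand, sky, grass, night = (230, 200, 120), (80, 150, 240), (40, 160, 60), (10, 10, 30)
search_results = [
    [sand] * 50 + [sky] * 50,   # beach
    [grass] * 100,              # a meadow that matched the label "beach"
    [sand] * 60 + [sky] * 40,   # beach
    [sand] * 40 + [sky] * 60,   # beach
    [night] * 100,              # a dark photo
]
keep = biggest_cluster(search_results)
```

Here the three sand-and-sky images end up in one group and the meadow and the dark photo are dropped, without anything that looks like object recognition.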
posted by lucidium at 2:24 PM on October 6, 2009



