Face2Face: Real-time Face Capture and Reenactment of RGB Videos We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. - Stanford Computer Graphics Laboratory
Lo and Behold: Reveries of the Connected World - "With interviewees ranging from Elon Musk to a gaming addict, Werner Herzog presents the web in all its wildness and utopian potential in this dizzying documentary." (via)
This experiment explores how to generate little romantic stories about images, using neural-storyteller, a recently published experiment by Ryan Kiros.
The Deep Mind of Demis Hassabis - "The big thing is what we call transfer learning. You've mastered one domain of things, how do you abstract that into something that's almost like a library of knowledge that you can now usefully apply in a new domain? That's the key to general knowledge. At the moment, we are good at processing perceptual information and then picking an action based on that. But when it goes to the next level, the concept level, nobody has been able to do that." (previously: 1,2) [more inside]
Fish on Wheels, a short video in which a goldfish drives around the room.
Dr. Steve Mann, a pioneer of wearable computing, relates his computer-vision-aggravated assault by McDonald's employees.
I See What You Did There: Software Uses Video to Infer Game Rules and Achieve Victory Conditions. A French computer scientist has constructed a system that successfully divines the rules to simple games just by using video input of human players at work.
Gestus is a moving image processing framework that uses computer vision techniques to explore the artistic possibilities of the vector as a symbolic form.
Inside Google's Age of Augmented Humanity. Wade Roush of Xconomy interviews Google researchers working on speech recognition, machine translation, and computer vision. [CEO Eric] Schmidt talked about "the age of augmented humanity," a time when computers remember things for us, when they save us from getting lost, lonely, or bored, and when "you really do have all the world's information at your fingertips in any language"—finally fulfilling Bill Gates' famous 1990 forecast. This future, Schmidt says, will soon be accessible to everyone who can afford a smartphone—one billion people now, and as many as four billion by 2020.... It's not that phones themselves are all that powerful, at least compared to laptop or desktop machines. But more and more of them are backed up by broadband networks that, in turn, connect to massively distributed computing clouds (some of which, of course, are operated by Google). "It's like having a supercomputer in your pocket," Schmidt said in Berlin. "When we do voice translation, when we do picture identification, all [the smartphone] does is send a request to the supercomputers that then do all the work."
A visualization of all the nouns in the English language arranged by semantic meaning. [NSFW words included!] [more inside]