Kyle McDonald Explains FaceTracker
FaceTracker is an example of a complex technique that builds on top of a series of computer vision, image processing, and machine learning functions in order to achieve its result. Here's an interview with Kyle McDonald, an artist and researcher in New York with a background in computer science and philosophy. He released FaceOSC, a tool for prototyping face-based interaction. Kyle has a growing body of work that uses face tracking in an artistic context, notably Face Substitution.
This is truly awesome. It's difficult to imagine where this technology will be in just a couple years. Seems conceivable we'll be able to impersonate virtually anyone we want. And THAT has some serious comedic, and disturbing, potential. Thank you!
Excellent. I've been playing with the awesome openFrameworks, a C++ framework for developing interactive apps, and OpenCV, a computer vision library, in some AR work I've been doing.

It's really wonderful seeing the cooperation between the engineering and artistic communities that has arisen lately due to things like this, Make and Arduino-like physical computing.

Wonderful future ahead of us.
Seems conceivable we'll be able to impersonate virtually anyone we want.

On camera, as long as you don't have to speak. I don't think that voice synthesis is progressing at anywhere near the rate that graphics is.
The software is very cool but I am not impressed on the "art" angle. Hopefully somebody will come up with a clever concept using this and change my mind.

The Media Art world has a tendency to get all excited about some fancy-pants new technology every year. This is then followed by successive waves of gee-whiz demos and me-too blog posts, all of which serve to show how amazing and expressive we'll all be in the future. The narrative is usually about making the technology (and by extension the clever technician/artist) into a sort of hero.

If I had a dollar for every drippy Kinect-Wii-Arduino-touchscreen project I've seen...
Well, mr. ersatz, you gotta realize, in the 1990s, they were talking in similar excited tones about Full Motion Video. It was the holy grail, the Big Deal, when computers could actually take a video signal and run it full screen, like a real movie! Whole game companies were founded on the idea of selling FMV games, and quickly folded, because they sucked. (see: "A Fork In The Tale".)

But then, quietly, when nobody was really noticing anymore, it happened. Computers got fast enough to actually play back video. And then it got hardware accelerated, and an average desktop is now perfectly capable of playing back numerous video streams at the same time. And it did end up being terrifically important. See: YouTube.

So, they're probably getting excited about this stuff too early, but it doesn't mean they're wrong to be excited at all. And, at least if video is any indicator, when stuff like this really does become mainstream and routine, people will barely even notice. They'll just integrate it into their lives like the capability was always there.
That was horrifying.
On camera, as long as you don't have to speak. I don't think that voice synthesis is progressing at anywhere near the rate that graphics is.

Actually ... (previously)
That's certainly better voice synthesis than we've had in the past, but we still have films where the visuals are already 100% computer graphics but the voice audio tracks remain 100% recordings of human voices (albeit with filtering and audio engineering), as we have had for a couple of decades now.
films where the visuals are already 100% computer graphics but the voice audio tracks remain 100% recordings of human voices.

Yeah, about that. I wonder if it has something to do with the nature of visual representation vs audio representation, and the symbols within. Like for example, if you want to visually make something look younger, then you squish and squash and make it like a baby. Where if you want something to sound younger, you just pitch up the voice and use a different vocabulary.

Also, I think that the amount of effort you need to do a visual trick is much more than for an audio trick; see the War of the Worlds radio broadcast vs. the War of the Worlds movie. The requirements for creating something out of whole cloth are much higher with visuals. With audio, you just tell the person to imagine something.

I would like to see CG environments for people that see through echolocation though.
It's not just a matter of tricks; we actually don't have the technology to do completely CG voice synthesis that sounds any better than a Doctor Who baddie. That Vocaloid thing that Adamsmasher links to sounds pretty good, but it seems to be limited to singing and, like other techniques, is based on sampling and applying transformations to real human voices. It seems to me that we don't have the audio or voice equivalent of raytracing: a physics model to simulate the important parts necessary to produce a reasonable facsimile of reality.

The echolocation idea is cool, though, and would probably be much easier to accomplish—at least by filtering sampled sounds—than voice synthesis from scratch would be. I know that there are lots of audio games but I don't know if any incorporate that effect.
@Malor: I agree totally about computer vision becoming mainstream and commonplace.

My problem is that lots of people do cool geeky experiments and call it "art," but the "artistic message," if you will, always seems to come out as something like "wow! isn't this new technology fancy and aren't we a fancy bunch of humans for inventing it! In the future nobody shits and everyone updates their checkbooks by dancing their name in front of a TV, etc."

Artistically, I want some kind of story or polemic or insane vision or at least something more transcendent and critical than the same old myth of technological progress. Media Art needs to stop imitating the tech demos of Silicon Valley.

But maybe I'm just a grumpy old insider.
