Something something Terminator something Predator
April 4, 2011 4:54 PM

Zdenek Kalal, a PhD student at the University of Surrey in England, has developed the coolest object tracking software you'll see this week.
posted by auto-correct (88 comments total) 54 users marked this as a favorite
 
I want this software. You see, I have a cat....
posted by orthogonality at 4:57 PM on April 4, 2011 [6 favorites]


...who wants this for a birthday present?
posted by louche mustachio at 5:02 PM on April 4, 2011 [2 favorites]


This is awesome. (In the original sense, because holy shit this could be used for pure evil.)
posted by oddman at 5:03 PM on April 4, 2011 [4 favorites]


Just show it a picture of Osama and let it do its thing.
posted by vidur at 5:04 PM on April 4, 2011 [3 favorites]


If my cat gets this, he won't need opposable thumbs. Doomed I say.
posted by arcticseal at 5:05 PM on April 4, 2011 [1 favorite]


There's some impressive examples at the very end, so I recommend watching the whole thing.
posted by auto-correct at 5:06 PM on April 4, 2011


Heh. 13 years ago, when I graduated from my CS bachelor's program, my senior project was some object-tracking software.

This makes that look like a tiny stack of punchcards.
posted by gurple at 5:08 PM on April 4, 2011 [3 favorites]


Just show it a picture of Osama Julian Assange and let it do its thing.
posted by clarknova at 5:09 PM on April 4, 2011


This has been Historic Moments in the Formation of the Total Police State. I'm Emmanuel Goldstein. Thanks for being watched.
posted by Horace Rumpole at 5:12 PM on April 4, 2011 [48 favorites]


Just show it a picture of the Wisconsin Democratic House Members, and let it do its thing.
posted by anthill at 5:17 PM on April 4, 2011


I think I watched too many 80s SF movies about merciless robot killers not to find this creepy.
posted by Artw at 5:18 PM on April 4, 2011 [3 favorites]


(Needs more OCR font and scanlines, mind)
posted by Artw at 5:18 PM on April 4, 2011 [1 favorite]


This kid is either the next bajillionaire in the making or the next Evil Genius with his Volcanic Lair and Predator-enabled sharks.
posted by Kitteh at 5:22 PM on April 4, 2011




The guy calls it a freaking PREDATOR. Come on. If this guy isn't seeking either (a) evil military investors or (b) interplanetary alien hunting investors, then I'd be pretty surprised.
posted by jabberjaw at 5:25 PM on April 4, 2011 [3 favorites]


This shit is scary! It takes Big Brother to whole new levels. Pretty cool, too.
posted by mareli at 5:25 PM on April 4, 2011


The war machine applications are obvious. But just imagine the UI possibilities: touchless devices that use the touch metaphors in iOS and similar devices...
posted by Blazecock Pileon at 5:28 PM on April 4, 2011 [1 favorite]


Imagine how various types of physical disabilities could be overcome with this! AWESOME!
posted by Ron Thanagar at 5:30 PM on April 4, 2011 [2 favorites]


(Holy cow. Kalal and Natalie Portman have the same eyes and lips.)

Just show it a picture of Osama and let it do its thing.


Could it have told the difference between Saddam and his lookalikes? Because Conan found the spitting image of Osama driving a cab in Fresno a few years ago.
posted by anniecat at 5:32 PM on April 4, 2011


It's a shame he's called it "Predator", since it biases the discussion to violent and hostile uses. It's a motion tracker that learns. It's pure awesome. It's fundamental research that will help enable a whole new world of computer interaction. Think Kinect, for one.

The hidden problem with work like this is that the cameras often need to be very high quality and the lighting needs to be very good. I don't know enough about this research to say whether that's an issue for his Predator.
posted by Nelson at 5:34 PM on April 4, 2011 [2 favorites]


I suspect the fact that it'll as happily follow a photo as a person may actually be a bit of a drawback to its merciless killing machine capabilities.
posted by Artw at 5:34 PM on April 4, 2011 [1 favorite]


It's a shame he's called it "Predator", since it biases the discussion to violent and hostile uses.

Also "Terminator" would be way more fitting.
posted by Artw at 5:35 PM on April 4, 2011


orthogonality said:

I want this software. You see, I have a cat....

The compiled code is available online:

Zdenek Kalal (Personal Website)

Tracking-Learning-Detection (TLD) aka Predator (Project Page)

Direct Links to Downloads:

TLD v1.0 compiled demo (Windows)
TLD data set (634MB)
Face-TLD compiled demo
Face Detector

According to his website, he is not releasing the source code yet:

"We have received hundreds of emails asking for the source code ranging from practitioners, students, researchers up to top companies. The range of proposed projects is exciting and it shows that TLD is ready to push the current technology forward. This shows that we have created something "bigger" than originally expected and therefore we are going to postpone the release of our source code until announced otherwise. Thank you for understanding."
posted by stringbean at 5:35 PM on April 4, 2011 [8 favorites]


It's a shame he's called it "Predator", since it biases the discussion to violent and hostile uses.

Callitninja! callitninja! callitninja!
posted by anniecat at 5:36 PM on April 4, 2011


So now, instead of just chess, computers are going to kick the human race's ass at hide-and-seek, too?
posted by mr_crash_davis at 5:37 PM on April 4, 2011 [1 favorite]


So, and this could be completely wrong, is it basically trying to find a match to a stored image, and if the area where the previous match was changes, it adds the changed image to its store and looks for that as well? That seems like it might have some practical limitations.
posted by Artw at 5:37 PM on April 4, 2011
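
A rough Python sketch of the guess above: match against stored images, and when the matched region has changed a bit, store the changed version too. This is not the TLD/Predator code. The function names, the 1-D "frames", and the `accept`/`learn` thresholds are all made up for illustration.

```python
# Hypothetical sketch: template matching with a growing template store.
# NOT the TLD/Predator algorithm, only the idea guessed at in the comment.

def sad(a, b):
    """Sum of absolute differences between two equal-length patches."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_match(frame, templates):
    """Return (position, distance) of the best match of any stored template."""
    best_pos, best_dist = None, float("inf")
    for t in templates:
        for pos in range(len(frame) - len(t) + 1):
            d = sad(frame[pos:pos + len(t)], t)
            if d < best_dist:
                best_pos, best_dist = pos, d
    return best_pos, best_dist

def track(frames, first_template, accept=6, learn=2):
    """Track an object through 1-D frames, storing new appearances.

    accept: largest distance still counted as the object (assumed value).
    learn:  distance above which the matched patch is stored as a new
            appearance (assumed value).
    """
    templates = [list(first_template)]
    positions = []
    for frame in frames:
        pos, dist = best_match(frame, templates)
        if dist > accept:
            positions.append(None)  # object treated as lost in this frame
            continue
        positions.append(pos)
        if dist > learn:  # appearance drifted: remember the new look
            templates.append(list(frame[pos:pos + len(first_template)]))
    return positions, templates

frames = [
    [0, 0, 9, 9, 9, 0, 0, 0],  # object as first seen
    [0, 0, 0, 9, 8, 7, 0, 0],  # moved right, slightly changed
    [0, 0, 0, 0, 9, 8, 7, 0],  # moved again, same changed look
    [0, 0, 0, 0, 0, 0, 0, 0],  # object gone
]
positions, templates = track(frames, [9, 9, 9])
```

One practical limitation shows up even in a toy like this: whatever sits in the matched region gets stored, so a scheme this naive could also absorb an occluder or background as a "new appearance".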


The follow-spot operators at the theatre I run just got a little nervous after watching this.
posted by Isosceles at 5:37 PM on April 4, 2011 [3 favorites]


I always knew Andy Kaufman would be back, but I gotta say this joke is way over my head.
posted by cmoj at 5:38 PM on April 4, 2011


Also, the music they should have used in the background of the YouTube video should have been this little song by The Police.
posted by anniecat at 5:40 PM on April 4, 2011 [1 favorite]


The war machine applications are obvious. But just imagine the UI possibilities: touchless devices that use the touch metaphors in iOS and similar devices...

Evoluce releases Kinect-based 'Win & I' gesture interface for Windows 7
posted by Artw at 5:41 PM on April 4, 2011 [1 favorite]


If a person sat down in the same subway car as me, and tracked me the way this program does, I would take it as an action of utter hostility.
posted by benito.strauss at 5:41 PM on April 4, 2011 [2 favorites]


Hello, Skynet!
posted by rtha at 5:46 PM on April 4, 2011


This could be a big boon to get over or increase the uncanny valley in AI and robotics. Or did someone already say that?
posted by Max Power at 5:47 PM on April 4, 2011


When I heard it described I was full of apprehension but once I saw the red computer dot implacably superimposed on a human eye I felt a lot better.
posted by DU at 5:47 PM on April 4, 2011 [22 favorites]


Skeptical that this works outside the lab. iPhoto can find faces too, but only because you have a small number of possibilities and reference objects. Take an object, calculate a few significant polygons, build a 3-D model of those relationships and start scanning your photos for those items. If you have >100 faces and a high likelihood that at least one of those 100 is in the scanned image then matching is much less complicated than when you have millions of candidates and millions of null samples. Also matches become much more fuzzy when you have fewer pixels to measure. I'm not too worried about the big brother level yet. I'll bet he gets a big grant to keep his research going.
posted by humanfont at 5:50 PM on April 4, 2011
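
The point about small albums versus millions of candidates is base-rate arithmetic, and Bayes' rule makes it concrete. The sketch below is a back-of-the-envelope calculation only. The hit rate, false alarm rate, and both priors are invented numbers, not measurements of iPhoto or of Predator.

```python
def match_precision(prior, hit_rate, false_alarm_rate):
    """P(person really present | matcher reports a match), by Bayes' rule.

    prior:            chance the person is in the image before looking
    hit_rate:         P(match reported | person present)
    false_alarm_rate: P(match reported | person absent)
    """
    true_pos = prior * hit_rate
    false_pos = (1.0 - prior) * false_alarm_rate
    return true_pos / (true_pos + false_pos)

# Assumed, illustrative numbers: the same matcher in two settings.
album = match_precision(prior=0.5, hit_rate=0.9, false_alarm_rate=0.01)
street = match_precision(prior=1e-6, hit_rate=0.9, false_alarm_rate=0.01)

print(f"photo album:   {album:.3f}")
print(f"street camera: {street:.6f}")
```

With these made-up rates, a reported match in the album setting is almost always right, while in the one-in-a-million setting nearly every reported match is a false alarm, even though the matcher itself has not changed.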


Shit, I did a doubletake when he produced the photosheet with his picture on it and it immediately identified him.

THEY KNOW WHO WE ARE.
posted by Kraftmatic Adjustable Cheese at 5:50 PM on April 4, 2011


orthogonality: "I want this software. You see, I have a cat..."

So I've read on reddit ;)
posted by symbioid at 5:56 PM on April 4, 2011 [1 favorite]


If I held stock in a graphics tablet maker I'd be thinking of selling. That ability to recognize and track finger poses in 3D is going to make working at a tablet seem tedious.
posted by Hardcore Poser at 5:56 PM on April 4, 2011 [5 favorites]


I want to see how it deals with dazzle makeup.
posted by Western Infidels at 5:56 PM on April 4, 2011 [4 favorites]


Oh. Cool. It's nice to see that Pavel Andreivich Chekov was able to go to grad school after his military service was over.
posted by schmod at 5:59 PM on April 4, 2011 [3 favorites]


Please keep this away from marketers. Please keep this away from marketers. Please keep this away . . .
posted by quadog at 6:00 PM on April 4, 2011


Please keep this away from marketers. Please keep this away from marketers.

There is, right at this moment, a marketer laying pen to paper on a new metric for tracking attention to ads. There is a guy in a military lab looking at adding this capability to a drone somewhere. There's a police captain, a state senate member, thinking how great it would be to add this capability to our security cameras, and coupling it to the sex offender registry. Then adding in a database of known felons, any people who've served time, and eventually anyone with a minor drug offense. Eventually, everyone - because "for the children", "nothing to hide".

And somewhere, someone is contemplating ways to help someone with a disability, and lamenting the fact that they'll probably never get the funds to do it.

Technology can do wonderful and awful things, and sometimes I wish we'd stop making it until people grow the fuck up and stop trying to hurt each other with it.

And on a pure nerd level, I still think this is wicked awesome.
posted by mrgoat at 6:19 PM on April 4, 2011 [9 favorites]


Aww, it can identify pandas!

What could go wrong?!?
posted by Short Attention Sp at 6:20 PM on April 4, 2011


I am going to invest in a company that makes burqas. And then I will need the aid of a heavy hitter PR firm to make Man-Burqa completely acceptable to the American male. Maybe change the name to ImpersonaSack or something. Masks, masks will also be good.

Not that I am not already immensely bothered by being completely submersed in a surveillance culture from the second I leave the subdivision, but this sort of thing makes me want to just start smashing every camera I can reasonably reach. Ugh.
posted by adipocere at 6:21 PM on April 4, 2011 [2 favorites]


The gang at facebook is already looking at all your photos to build a profile of your brand preferences and un-documented peer relationships. As well as behavior profiles and stuff related to your geographic locations embedded in those photos you upload, or were uploaded of you. Your friends even help verify the matches.
posted by humanfont at 6:23 PM on April 4, 2011 [1 favorite]


*Rocks back and forth repeating "Technology is value-neutral ... technology is value-neutral ..."*
posted by ZenMasterThis at 6:23 PM on April 4, 2011 [4 favorites]


I suspect the fact that it'll as happily follow a photo as a person may actually be a bit of a drawback to its merciless killing machine capabilities.

Just wait til it's combined with a stereoscopic camera, a Kinect-type depth sensor, or both.
posted by jedicus at 6:28 PM on April 4, 2011


And somewhere, someone is contemplating ways to help someone with a disability, and lamenting the fact that they'll probably never get the funds to do it.

I didn't hear anything about source availability in the video... but when the Kinect hackers inevitably download or home-brew an open-source implementation of this thing, you'll see it used to control wheelchairs and robot buddies. Actually, said robot buddies might be the same unmanned vehicles you mentioned; military overspending means there'll be surplus.

Silver lining?
posted by LogicalDash at 6:30 PM on April 4, 2011


This is nice and all, but I really don't know what distinguishes it from the other very good object tracking software in the world, also written by CS grad students. I don't mean to diminish the work at all, because it clearly is solid research, but it doesn't exist in a vacuum, nor does it strike me as head and shoulders better than other computer vision research. The most recent paper linked to on the site showed good, but not stellar performance in surveillance situations, for instance. Of course, for those of you who are creeped out by this, the widespread nature of this type of research probably won't make you any more comfortable.
posted by Schismatic at 6:32 PM on April 4, 2011 [6 favorites]


I am going to invest in a company that makes burqas.[...]masks will also be good.

If your masks & burqas company takes off, I'd be sure to also put some money into a company doing Gait Detection.
posted by fings at 6:36 PM on April 4, 2011 [1 favorite]


Just imagine the possibilities if this had been around to prevent Frederick Douglass's Augustus Washington Bailey's felony theft of himself!
posted by orthogonality at 6:37 PM on April 4, 2011


While face/person tracking is the scariest implication of this technology, I think the biggest application will be in overlaying the digital on top of the actual, aka Augmented Reality. AR has been possible for a while, and some people have done stuff with it, but current methods aren't super accurate and are also slow in terms of processing time.

Most computer vision stuff uses Haar-like feature detection, which is kind of more a brute force method where the attributes of certain patterns ( the human face, the human eye, a playing card ) are stored in memory as what are called Haar Cascades. The algorithm generally works but it takes a lot of manual labor ( teaching the computer ) to generate cascades that are useful.

From the looks of this, there is no need to 'teach' the computer what these cascades look like - it learns this itself, and it can do it on a face by face or object by object basis. This is probably leaps and bounds beyond current object recognition technology, and it terrifies me no less than it inspires my curiosity as to how sublimely awesome the future might be.
posted by localhuman at 6:40 PM on April 4, 2011 [2 favorites]
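
For anyone wondering what a Haar-like feature actually computes, here is a minimal Python sketch of the standard trick from Viola-Jones style detectors: build an integral image once, then get any rectangle sum in four lookups and compare adjacent rectangles. A cascade is a chain of classifiers built on many such features, and that training step is not shown. The function names and the tiny test image are made up for illustration, and none of this is taken from the Predator code.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over all rows < y and columns < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half minus right half.

    w is assumed to be even so both halves have the same size.
    """
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Bright on the left, dark on the right: a strong vertical edge.
edge = [[9, 9, 1, 1],
        [9, 9, 1, 1]]
flat = [[5, 5, 5, 5],
        [5, 5, 5, 5]]

edge_response = two_rect_feature(integral_image(edge), 0, 0, 4, 2)
flat_response = two_rect_feature(integral_image(flat), 0, 0, 4, 2)
```

The edge patch gives a large response and the flat patch gives zero, which is the whole job of one feature. A detector combines many of them at different positions and scales.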


Evoluce releases Kinect-based 'Win & I' gesture interface for Windows 7

I've played around with the Kinect and it's fun, but I'm not sure it would have the accuracy to provide a good user interface for someone a foot or two away from a laptop screen. The proof is in the shipped code, but this "Predator" approach looks like it could potentially do a better job of doing what I'm talking about.
posted by Blazecock Pileon at 6:47 PM on April 4, 2011


I really don't know what distinguishes it from the other very good object tracking software in the world, also written by CS grad students.

I guess that's why the FPP calls it the "coolest object tracking software you'll see this week".
posted by vidur at 6:54 PM on April 4, 2011


It's only Monday
posted by stbalbach at 7:08 PM on April 4, 2011 [1 favorite]


adipocere: "I am going to invest in a company that makes burqas. And then I will need the aid of a heavy hitter PR firm to make Man-Burqa completely acceptable to the American male. Maybe change the name to ImpersonaSack or something. Masks, masks will also be good."

Like a scramble suit?
posted by mkb at 7:10 PM on April 4, 2011


Eventually, this could be the base for a near-perfect waiting assassin. Put it in say, several ATM machines near the target's home, or where he has been known to use them, each with a hidden gun inside, and wait for the target to get up good and close, get a confirmation by an actual human operator, and well, what's done is done.

Think about how often you get into spaces where you are waiting for periods of a minute or so, just standing around. Elevators, ATMs, bathroom stalls, cabs, lines for a street vendor, bus stops, etc.

Operators are standing by, with a clean-up team on speed dial.
posted by chambers at 7:13 PM on April 4, 2011


The thing I found most impressive was when the panda turned around and it kept tracking. So it's not just looking for a match of the face (or the originally presented side). Cool.
posted by TheShadowKnows at 7:14 PM on April 4, 2011


If this thing helps my parents figure out how to attach a file to an email, I'm all for it.
posted by jimmythefish at 7:58 PM on April 4, 2011 [2 favorites]


The thing I found most impressive was when the panda turned around and it kept tracking.

A better name for it would have been ASS TRACKER 2000.
posted by jimmythefish at 8:00 PM on April 4, 2011


If your masks & burqas company takes off, I'd be sure to also put some money into a company doing Gait Detection.

So this is how wheelchair accessibility will hit the mainstream.
posted by spaceman_spiff at 8:30 PM on April 4, 2011


Yes, but how well does it fare at Where's Waldo?
posted by Godspeed.You!Black.Emperor.Penguin at 8:36 PM on April 4, 2011


I'll bet those guys at Google who made the Gmail Motion April Fools gag are feeling pretty stupid right now, huh?
posted by schmod at 8:38 PM on April 4, 2011


Mix this with autonomous quadrocopters that can play pass the fricken ball and skynet's fricken next.
posted by porpoise at 8:39 PM on April 4, 2011 [1 favorite]


Also, perhaps Butler could have used this to find the basket. :(
posted by Godspeed.You!Black.Emperor.Penguin at 8:40 PM on April 4, 2011 [1 favorite]


Skeptical that this works outside the lab.

Did you watch the video all the way to the end?

It seems to me that the learning aspect of this directly ameliorates the lighting and angle sensitivity problems of traditional recognition approaches: since it has a video containing a continuous sequence, it can take another snapshot whenever the lighting or angle changes too much.

The core of his research, judging from his papers, is not the tracking or recognition itself, but the use of physical continuity ("structural constraints") to automatically generate a family of detectors which detect the same object under the varying conditions seen in a video. I'm guessing you could apply the technique to non-visual tracking/recognition problems as well— gesture recognition? acoustic tracking of submarines? artificial immune systems? spam detection?

how great it would be to add this capability to our security cameras, and coupling it to the sex offender registry

Actually that isn't what this is good for. Recognition of specific people out of a large population is a very different problem. (And a well-studied one.) Zdenek's software seems to use face recognizers as a module, but I don't think the TLD techniques would make chambers' AssassinATM any more effective than one using pre-existing techniques.

Mix this with autonomous quadrocopters

According to his webpage, that's actually a FAQ… the zeitgeist, it is quad-rotored.

I love that one of the demos he describes in his poster is tracking a specific actor throughout one episode of The IT Crowd.
posted by hattifattener at 8:44 PM on April 4, 2011
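
A minimal Python sketch of the continuity idea described above, as I read it from the comment: the object can only be in one place per frame, so detector hits that agree with the tracked position can be kept as new positive examples, and hits far away from it as negatives. The function names and the `near`/`far` overlap thresholds are hypothetical, and this is not taken from Kalal's papers or code.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def label_by_continuity(tracked_box, detections, near=0.6, far=0.2):
    """Split detector hits into new training examples using the track.

    near: minimum overlap with the track to count as a positive (assumed)
    far:  maximum overlap with the track to count as a negative (assumed)
    Hits in between are ambiguous and left unlabeled.
    """
    positives, negatives = [], []
    for box in detections:
        overlap = iou(tracked_box, box)
        if overlap >= near:
            positives.append(box)
        elif overlap <= far:
            negatives.append(box)
    return positives, negatives

tracked = (10, 10, 20, 20)
hits = [(11, 10, 20, 20),    # nearly on top of the track
        (100, 100, 20, 20),  # elsewhere in the frame
        (20, 10, 20, 20)]    # partial overlap, ambiguous
positives, negatives = label_by_continuity(tracked, hits)
```

Nothing here is specific to images, which fits the guess that the same labeling trick could carry over to other tracking problems.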


Minority Report computers... Anyone? Anyone?
posted by stratastar at 8:49 PM on April 4, 2011


I am developing better equipment than this, called the Autonomous Location & Identification LENS.

Or: ALI-LENS.

Let the battle commence!
posted by tumid dahlia at 8:51 PM on April 4, 2011 [1 favorite]


Man, I'm getting old. As I watched that, part of my brain was going "interesting technology," but a much louder part was going "He's a child, I tell you, a child."
posted by CheeseDigestsAll at 10:15 PM on April 4, 2011


Minority Report computers... Anyone? Anyone?

The Kinect hackers are already there.
posted by auto-correct at 10:20 PM on April 4, 2011 [2 favorites]


I will be waiting to see how this enhances chat roulette, myself.
posted by maxwelton at 11:40 PM on April 4, 2011


This is probably leaps and bounds beyond current object recognition technology.

In a word, no. This face tracking paper was presented at ICIP 2010 as a poster (generally the papers that 'wow' the community are given oral presentations), and the TLD approach that it is based on wasn't even published to a conference proper, but to a workshop (workshops are where techniques that aren't quite ready for broader exposure, or that are heavily specialized and of limited interest to the broader community, are published). This suggests that this is fairly average research (not an insult, most research is, by definition).
posted by Pyry at 11:57 PM on April 4, 2011


I audibly holyfucked at the sheet of photos. Is anyone running the software yet? There's a scenario I am wondering if it works on. Let's say it has been learning my face, and I raise my newspaper to cover my face. Now it's learned that the newspaper image is part of my image? So if someone else entered the scene, and I left, and they raised their newspaper, it would recognise "me" and when they lowered it start adding their features to my recognition profile. Or would it use some form of percentage hits to realise that the image now presented does not to a sufficient extent match my original profile?
posted by Iteki at 3:46 AM on April 5, 2011


we are going to postpone the release of our source code until announced otherwise.

Well, so much for that. He's going to patent and market this, and the scientific research community as a whole will suffer.
posted by Old'n'Busted at 4:32 AM on April 5, 2011


God forbid he make some money from the awesome thing he created.
posted by Aizkolari at 4:36 AM on April 5, 2011 [3 favorites]


Actually that isn't what this is good for. Recognition of specific people out of a large population is a very different problem.

It picked his face out of about 20 others in the video. So, maybe not good for it, but certainly capable.
posted by mrgoat at 5:40 AM on April 5, 2011


"If you want a picture of the future, imagine a computer tracking a boot stomping on a human face in real time, forever." - George Orwell
posted by blue_beetle at 6:01 AM on April 5, 2011


Source code has been released.
posted by SweetJesus at 7:17 AM on April 5, 2011 [1 favorite]


It's cool it tracks his face, and recognizes it from a sheet of photos, but can it tell them apart? On "Burn Notice", Michael Westen fooled a facial recognition device by holding up a photo of the Hotel Manager, and TV doesn't lie.
posted by Uther Bentrazor at 8:02 AM on April 5, 2011


This thread makes me keenly aware of the gap between what the true state of the art in signals processing is and what people think it is. Most of the Orwellian stuff that is causing concern here like automatic facial recognition and tracking is seriously old hat. Hell, there are commercial versions of that technology in widespread use at casinos and in Wal*Mart. There is an endless array of behavior tracking stuff out there today. It's not much good to blanket the globe with high resolution video surveillance sats if you can't do anything with the raw data.
posted by Lame_username at 8:09 AM on April 5, 2011 [1 favorite]


I want to see how it deals with dazzle makeup.

It might mistake you for a hostile warship, and send missiles incoming.
posted by FatherDagon at 8:26 AM on April 5, 2011 [1 favorite]


I am one of those hard working CS graduate students who works in this field. I saw this link a couple days ago and my friends and I were making fun of the name (the technology is solid). The summary of the conversation is that in future all cool computer vision projects should have a cool name.

I have only glanced at the papers but they are on my to read pile. From what I could glean the real innovation here is that the tracker is more or less a vanilla boosted decision tree where the decision tree is learned on-line versus off-line. The technology is cool, but aside from the on-line learning component there is very little here that we couldn't do before the paper was published. Boosted/cascaded decision trees have been a staple of computer vision for at least ten years. (E.g. here is a face tracker I coded up in like a day for class (shameless self link).)

Speaking as a former defense contractor these kind of videos are great, as six months down the road there will be half a dozen military SBIR grant solicitations trying to apply the technology (versus funding the primary R&D).
posted by kscottz at 9:08 AM on April 5, 2011 [1 favorite]
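
To make the on-line versus off-line distinction concrete, here is a small Python sketch of an ensemble of decision stumps whose weights are updated one example at a time. It swaps in the weighted majority update as a stand-in, which is much simpler than real on-line boosting and is not what the papers do. The class name, the `beta` penalty, and the toy features are all invented for illustration.

```python
def make_stump(index, threshold):
    """A decision stump: votes +1 if features[index] > threshold, else -1."""
    return lambda features: 1 if features[index] > threshold else -1

class OnlineStumpEnsemble:
    """Weighted vote over fixed stumps, reweighted as examples arrive.

    Stand-in technique: the weighted majority update, not on-line boosting.
    """

    def __init__(self, stumps, beta=0.5):
        self.stumps = stumps
        self.weights = [1.0] * len(stumps)
        self.beta = beta  # multiplier applied to a stump that errs (assumed)

    def predict(self, features):
        score = sum(w * s(features) for w, s in zip(self.weights, self.stumps))
        return 1 if score >= 0 else -1

    def update(self, features, label):
        """Learn from one labelled example, with no pass over old data."""
        for i, stump in enumerate(self.stumps):
            if stump(features) != label:
                self.weights[i] *= self.beta

# Toy stream in which only feature 1 actually predicts the label.
ensemble = OnlineStumpEnsemble([make_stump(0, 5), make_stump(1, 5)])
before = ensemble.predict((9, 1))
for features, label in [((9, 1), -1), ((1, 9), 1), ((8, 2), -1)]:
    ensemble.update(features, label)
after = ensemble.predict((9, 1))
```

An off-line learner would need the whole labelled set up front. Here each frame's examples adjust the vote as they arrive, which is the part the comment singles out as new.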


It could make sure that baby's face is not in the pillow for too long.
posted by dracomarca at 11:54 AM on April 5, 2011


I love that he ominously ends on Thank you very much, and see you later.
posted by Amanojaku at 12:01 PM on April 5, 2011


He GPL'd it.

I just want to know, is the Software Racist?
posted by roboton666 at 10:16 PM on April 5, 2011


errr....Here is the link

I do wonder the ramifications of releasing the code with a GPL license, I mean, does this mean everyone has to share the evil ring with one another?
posted by roboton666 at 10:19 PM on April 5, 2011


It's a shame he's called it "Predator", since it biases the discussion to violent and hostile uses.

Also "Terminator" would be way more fitting.



He could always call it S.T.A.L.K: Surveillance/Tracking Algorithm-Learning from Kalal

Yes, I want my share
posted by youhavetoreadthistwice at 2:23 AM on April 7, 2011


Updated source code link
posted by joshwa at 11:13 PM on April 8, 2011

