Well this is Bleeping Creepy...
March 17, 2019 6:16 PM

The Government Is Using the Most Vulnerable People to Test Facial Recognition Software. Research shows that any one of us might end up helping the facial recognition industry, perhaps during moments of extraordinary vulnerability.

Through a mix of publicly released documents and materials obtained through the Freedom of Information Act, we’ve found that the Facial Recognition Verification Testing program depends on images of children who have been exploited for child pornography; U.S. visa applicants, especially those from Mexico; and people who have been arrested and are now deceased. Additional images are drawn from the Department of Homeland Security documentation of travelers boarding aircraft in the U.S. and individuals booked on suspicion of criminal activity.
posted by Homo neanderthalensis (18 comments total) 13 users marked this as a favorite
 
I was on a high from that kid chess player and now this.

JFC what is wrong with people? That is officially enough internet for today.
posted by greermahoney at 6:34 PM on March 17, 2019 [6 favorites]


This is an extremely vague article that links to a bunch of PDFs. That isn't to say that it doesn't have meat on its bones, but rather that it's written in a way that obscures a lot of that meat in favor of blanket statements.
posted by Going To Maine at 7:53 PM on March 17, 2019


(you may also interpret this as a complaint about reading PDFs on mobile, which is fair.)
posted by Going To Maine at 8:04 PM on March 17, 2019 [1 favorite]


FWIW, international arrivals facilities throughout the US are about to switch from a kiosk-based "coupon" system for passport control to biometrics, primarily facial recognition. It increases processing speed and reduces staffing for CBP, so implementation is fairly inevitable.
posted by q*ben at 9:50 PM on March 17, 2019


This article is somewhat incoherent, and given the inaccuracies I can identify, I am skeptical of its overall accuracy.

The Child Exploitation Image Analytics program—which is a data set for testing by facial recognition technology developers—has been running since at least 2016 with images of “children who range in age from infant through adolescent” and the majority of which “feature coercion, abuse, and sexual activity,” according to the program’s own developer documentation.

This is flat wrong. The Child Exploitation Image Analytics program is not a dataset. It is literally a project that uses images of child porn to train algorithms intended to ... identify images of child porn and the children's faces in them. It's not even made clear that this is the same thing as the CHEXIA program the article referred to earlier.

And this:
pointing to human subject review and regulations is a deflection from conversations about data ethics ... we should be focusing on more regulation at every step in the process. This regulation cannot come from "standards bodies" unfit for the purpose. Instead, policies should be written by ethicists ...

You want NIST to make their own decisions about the ethics of each dataset they use, or you want them to follow the ethics codified as 'human subject review and regulations' by a separate group? Pick one.
posted by the agents of KAOS at 10:18 PM on March 17, 2019 [9 favorites]


It ~~increases processing speed and reduces staffing for CBP~~ builds a database of every single face entering or exiting the US, so implementation is fairly inevitable.

FTFY.

China is already doing this. It was unnerving to be photographed on arrival in Shanghai and there's no way to opt out. And of course I'd forgotten my cvdazzle.
posted by chavenet at 3:35 AM on March 18, 2019 [3 favorites]


On my last trip overseas, I was fingerprinted and photographed in South Korea. I don't think the U.S. is an outlier in that form of data collection any more.
posted by ardgedee at 4:48 AM on March 18, 2019


I have only traveled out of the US once in the last ten years, a trip to Calgary about a year ago. I wasn’t photographed on arrival (at either end) but when I was leaving Calgary I was processed through an electronic kiosk that did take a photo. I’m not sure what the purpose was since it appeared to only capture my chin, neck, and chest; the bulk of my face was out of the frame. That doesn’t seem like it would be ideal if the intent was to match it against my passport photo.
posted by nickmark at 8:25 AM on March 18, 2019


I flew through Toronto in October and went through a similar kiosk. You're supposed to move the camera so it focuses on your face.
posted by Autumnheart at 8:39 AM on March 18, 2019


This is flat wrong. The Child Exploitation Image Analytics program is not a dataset. It is literally a project that uses images of child porn to train algorithms intended to ... identify images of child porn and the children's faces in them. It's not even made clear that this is the same thing as the CHEXIA program the article referred to earlier.

Thank you, TAOKAOS - yes, this dataset is extremely tightly controlled. I was involved in a Child Exploitation initiative for several years in a technical role (support, deployment, language localization, LEO training) - it was very, very heavily regulated, and non-LEOs had no access to any form of data (textual or otherwise).

At the time (~10 years ago), the image analysis software was very rudimentary - it mainly attempted to identify duplicate images by detecting scaling, rotation, and cropping, so as to reduce the manual review of images by LEOs (i.e. instead of having to review 100,000 images a day, perhaps only 50,000 of those were unique).
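
For the curious, here is a minimal sketch of the kind of perceptual-hash trick behind that sort of duplicate detection - a simple "average hash" in Python. This is purely illustrative (not the software from that initiative), and the hash_size and threshold values are made-up parameters; real forensic tools use far more robust hashes plus explicit handling for rotation and cropping.

```python
# Minimal average-hash sketch for near-duplicate image detection.
# Illustrative only: production forensic tools use sturdier perceptual
# hashes (e.g. DCT-based) and handle rotation/cropping explicitly.
from PIL import Image

def average_hash(path, hash_size=8):
    # Shrink to a tiny grayscale thumbnail; this discards scale and fine detail.
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(img.getdata())
    avg = sum(pixels) / len(pixels)
    # One bit per pixel: is it brighter than the mean?
    return [1 if p > avg else 0 for p in pixels]

def hamming(a, b):
    # Number of differing bits between two hashes.
    return sum(x != y for x, y in zip(a, b))

def probably_duplicates(path_a, path_b, threshold=5):
    # A small Hamming distance means the two files are very likely the same
    # picture, even after rescaling or recompression.
    return hamming(average_hash(path_a), average_hash(path_b)) <= threshold
```

Two near-identical files (rescaled, recompressed) land within a few bits of each other, which is how you collapse 100,000 images a day down to the unique ones that actually need human review.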
posted by jkaczor at 9:58 AM on March 18, 2019


Hmmm - from reading the linked document regarding Child Exploitation Image Analytics, as a participant you would send your SDK/API to NIST...

But then, if your algorithm were cloud-based, the images would be transferred to your servers for analysis... Nothing in that document, nor in the "Application to participate", discusses data integrity, security, caching/cache cleanup, or the ability of your servers to retain images sent for processing.

Yikes... Now I am worried; we had far more "data governance" rules in place previously... This is bad.
posted by jkaczor at 10:12 AM on March 18, 2019


You're supposed to move the camera so it focuses on your face.

Right - but it doesn’t seem to care if you do that or not, was what I meant. Like a lot of security theater measures it doesn’t seem effective at achieving the implied purpose and leaves one wondering what the actual purpose is.
posted by nickmark at 10:19 AM on March 18, 2019 [2 favorites]


and leaves one wondering what the actual purpose is

Well - it's just like traditional theater - it's about making the participants feel "something", be that good/bad, scared/safe. And all the "pomp and ceremony" helps distract from what is happening behind the curtain.
posted by jkaczor at 10:34 AM on March 18, 2019 [1 favorite]


Jkaczor, that document specifies that the tests will be conducted offline. There's no uploading to your servers.
posted by the agents of KAOS at 12:23 PM on March 18, 2019 [1 favorite]


(Section 1 defines the testing environment, down to the hardware and OS that they will be run against.)
posted by the agents of KAOS at 12:29 PM on March 18, 2019


Right - but it doesn’t seem to care if you do that or not, was what I meant. Like a lot of security theater measures it doesn’t seem effective at achieving the implied purpose and leaves one wondering what the actual purpose is.

Well, the customs official matches your face with your passport, so yes, the picture-taking before you get to a person seems unnecessary. Theoretically, facial recognition software could perform this role, but I'm not sure how it replaces what seems to be the main function of that official (at least at the Canada-US border), which is to make sure you're really not entering either country to claim asylum.
posted by MetalFingerz at 12:32 PM on March 18, 2019


document specifies that the tests will be conducted offline

Ok, I missed that in my cursory review of the document (a highlighted section notes 56 logical CPUs available), and specifically section 1.19.3, which disallows any network access. So, at the end of the day, the dataset is completely safe - only the algorithms and their fit/function/performance are being evaluated.
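
To make that concrete - and this is a hypothetical sketch, not NIST's actual API (the vendor_sdk module and make_template call are names I made up) - the evaluation amounts to running the submitted library on an air-gapped box and letting only derived results out:

```python
# Hypothetical offline-evaluation sketch (NOT NIST's actual API; the
# "vendor_sdk" module and make_template() are made-up names). The submitted
# library runs on an air-gapped machine: it sees pixel data but has no
# network, so images cannot be shipped off for "cloud" processing.
from pathlib import Path

import vendor_sdk  # the submitted algorithm, installed locally on the test box

def evaluate_offline(image_dir: str, results_path: str) -> None:
    rows = []
    for image_path in sorted(Path(image_dir).glob("*.png")):
        template = vendor_sdk.make_template(image_path.read_bytes())
        rows.append(f"{image_path.name},{len(template)}")
    # Only derived results - never the images themselves - leave the sandbox.
    Path(results_path).write_text("\n".join(rows) + "\n")
```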
posted by jkaczor at 6:48 PM on March 18, 2019


Second day I've ejected before reading this fully. I can't even.
posted by I'm always feeling, Blue at 8:39 PM on March 18, 2019




This thread has been archived and is closed to new comments