In Case You Aren't Paranoid Enough About Social Media & Privacy
October 9, 2015 9:23 AM

"One broader implication of this is that no one should take the NSA seriously when they say they are only collecting “metadata” on whom someone contacts, rather than the content of the communication. Social network metadata is incredibly powerful." How to tell whether a Twitter user is pro-choice or pro-life without reading any of their tweets
posted by COD (47 comments total) 25 users marked this as a favorite
 
There is no such thing as metadata.
There is, only and forever, data.

Ian Welsh: This Is Why I Always Give the Benefit of the Doubt to Left-wing Opponents of the Regime [contains leaked TS COMIN REL FVEY]
posted by the man of twists and turns at 9:26 AM on October 9, 2015 [12 favorites]


"There is no such thing as metadata.
There is, only and forever, data."

Metadata is a type of data. There is such a thing.
posted by I-baLL at 9:37 AM on October 9, 2015 [8 favorites]


See also the post about finding Paul Revere
posted by scodger at 9:37 AM on October 9, 2015 [4 favorites]


I see GamerGate continues to be a catch-all for "people who are just the worst" - probably a further breakdown would be possible between people who are pro life Gamergaters due to an obligation to all shitty conservative movements and people who are pro life Gamergaters because they are shitty trolls who get a rise out of people by being randomly misogynistic, but at this point who the fuck cares?
posted by Artw at 9:42 AM on October 9, 2015 [4 favorites]


Metadata is data about data.
posted by anifinder at 9:42 AM on October 9, 2015 [5 favorites]


whoa. sobering article. A lot to think about here.
posted by Annika Cicada at 9:42 AM on October 9, 2015 [3 favorites]


But this doesn't make me more paranoid. If someone saw a mailing list for NOW and said "I bet that most of these subscribers are pro-choice!" they'd be right. And I would not be outraged that someone who saw me on that list, or saw a NOW pamphlet or Ms. magazine in my stack of mail, would infer "you are pro-choice." That would not be surprising for them to assume.

Likewise if someone saw me at a table handing out pamphlets about feminist topics, they would be able to infer "probably pro-choice." Or saw a bumper sticker on my car.

Twitter is public. You get on it and you are getting on a tiny digital soapbox in a worldwide town square.

(I did enjoy the correlation between prochoice people and "fuck" and "cats" vs. between prolife people and "Jesus" and "football.")
posted by emjaybee at 9:51 AM on October 9, 2015 [20 favorites]


Metadata is like Uber for your data.
posted by blue_beetle at 9:55 AM on October 9, 2015 [6 favorites]


It is interesting, though, to see how things we post on nominally unrelated topics are often such strong indicators of other things. Stuff like "Catholic" makes a lot of sense, being opposed to abortion and contraception is officially part of Catholic dogma despite 98% of sexually active Catholic women using contraception at least once and the abortion rate among Catholic Americans being identical to the abortion rate of non-Catholic Americans.

But "fuck" and "cats" being strong indicators of pro-choice beliefs is one of those linkages I'd never have imagined would exist. Likewise "writer". To me it's a wonderful surprise to find that such things correlate.

The strong correlation between gamergate and anti-choice beliefs is not even slightly surprising, I've long held that gamergate is essentially the awakening of movement conservatism among gamers and so far every bit of data available to date supports that hypothesis.

Still, while on the one hand such things are somewhat ominous, I'm enough of a data geek that I find my joy in the analysis and connections far exceeds my concern about the NSA or others using those connections. I figure the NSA doesn't really need to resort to such measures to find people on their enemies list, not with every modern CISCO router containing an NSA back door.
posted by sotonohito at 9:58 AM on October 9, 2015 [3 favorites]


MetaFilter: data about data.
posted by Fizz at 10:04 AM on October 9, 2015 [4 favorites]


But "fuck" and "cats" being strong indicators of pro-choice beliefs is one of those linkages I'd never have imagined would exist. Likewise "writer". To me it's a wonderful surprise to find that such things correlate.

Well, these do sound like my people.
posted by Artw at 10:09 AM on October 9, 2015 [25 favorites]


Well, filter about filter.
posted by maxsparber at 10:09 AM on October 9, 2015 [10 favorites]


Doggone it, quit calling it "pro-life". It's pro-coat-hanger or pro-birth.
posted by notsnot at 10:10 AM on October 9, 2015 [18 favorites]


Remarkably, these social network data are so powerful that in some cases, someone’s profile provides no further information. For example, if you know the stances of two of the people a tweeter is following, knowing their gender and whether they describe themselves as Catholic does not improve your prediction of whether they will be pro-life or pro-choice. Put another way, so polarized is the social network structure that even very basic, obvious characteristics stop mattering if we know who your friends are.

One broader implication of this is that no one should take the NSA seriously when they say they are only collecting “metadata” on whom someone contacts, rather than the content of the communication. Social network metadata is incredibly powerful.

posted by a lungful of dragon at 10:11 AM on October 9, 2015 [4 favorites]
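
The claim quoted above, that knowing the stances of a few accounts someone follows is already enough to guess their own, can be illustrated with a toy majority vote. This is only a sketch: it is not the article's actual model, and the function name `predict_stance`, the account names, and the labels are all made up for illustration.

```python
from collections import Counter

def predict_stance(followed, known_stances):
    """Guess a stance by majority vote over followed accounts.

    followed: iterable of account names the user follows
    known_stances: dict mapping account name -> stance label
    Returns the most common label among followed accounts whose stance
    is known, or None when there is no such account or the vote is tied.
    """
    votes = Counter(known_stances[a] for a in followed if a in known_stances)
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tied vote: no prediction
    return ranked[0][0]

# Hypothetical data: three accounts with known stances.
known = {"acct_a": "pro-choice", "acct_b": "pro-choice", "acct_c": "pro-life"}

# The user never tweeted about the topic; only their follow list is used.
print(predict_stance(["acct_a", "acct_b", "acct_x"], known))
```

Note that nothing the user wrote is read here; the guess comes entirely from the follow list, which is the "metadata" the thread is arguing about.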


"not with every modern CISCO router containing an NSA back door." Please point me to somewhere that I can read up on that, please?
posted by Annika Cicada at 10:16 AM on October 9, 2015


There's possible backdoors in the code.

There's also this:

"To avoid NSA, Cisco delivers gear to strange addresses"
posted by I-baLL at 10:21 AM on October 9, 2015


Pro-forced-birth.
posted by symbioid at 10:22 AM on October 9, 2015 [11 favorites]


The problem is that metadata is an abstract term for something people have experience with.

Metadata is the outside of a sealed envelope. Maybe you can't get inside the envelope, but it's enough to build some strong hunches.
posted by mccarty.tim at 10:24 AM on October 9, 2015 [6 favorites]


We leave foot prints, even when we think we're just mice no one pays attention to.

There are cats looking, always. Anything online is and will be tracked and traced. That we can't truly opt out needs to be perhaps the most basic lesson of net literacy and awareness.

I really do think no one in the major corporations like twitter intends these sorts of outcomes, but when they design APIs and public-facing metadata, like follower and following lists, who knows what use a third party may do with it, market research, amateur sociology, whatever.
posted by bonehead at 10:38 AM on October 9, 2015 [2 favorites]


"Metadata is the outside of a sealed envelope. Maybe you can't get inside the envelope, but it's enough to build some strong hunches."

I think the problem is that people don't realize that there's an outside of an envelope. Like exif data in an image, most people don't know that it's there.
posted by I-baLL at 10:40 AM on October 9, 2015 [4 favorites]


I think there is a difference between "Hey you can do some really cool shit w/metadata!" and "We, as a country, purposely build systems whose sole purpose is to aggregate, store and analyze metadata for 'National Security' purposes".

I find myself a bit concerned at the concept that "hey if you're on twitter, you're on a public soapbox anyways!"

What if it's not twitter? What if it's, say, Livejournal? What if you, as an end user on LJ, take care to use a pseudonym and not put any public details about interests and all your posts are "Friends only". It's not a "public soapbox" in this case. But it still is irrelevant, because we're online and it's our fault no matter how secure we want to be. If you want security sneakernet is the only way to go. And I agree ultimately, that is the only way to be secure.

But that's a de jure/de facto reality, not a necessary consequence of how things intrinsically are. What I mean is, to say "well hey you're in public, you should expect this" is putting the cart before the horse and only amplifies the problems we have, because it tolerates this idea that if we're in public, no matter how private the content of the communication, the metadata is fine and dandy and up for grabs and don't like it, get off.

And I find I'm dealing with the ideals I have and facing the realities. I guess I worry that the way these things are spoken as if mere virtue of the factuality of such things means it HAS to be that way. That these aren't things we can control in any way. And maybe they aren't. And so, fine. But maybe they are, but we keep telling ourselves these things as if they're just how they are so we sit down and shut up and accept it.

At the same time, I get it. I don't know how you could actually prevent people from scraping publicly accessible data. Demand a social network has no mass metadata API?

In some sense, I suppose, this would be necessary. Thinking about it now, it's interesting how the needs of the modern capitalist society (vis a vis advertising) nicely coincide with the needs of the national security state. Big Data is the puppet behind the scenes. In order to build a broad profitable social network, one needs to create APIs that track the users in order for advertisers to profit and sell more targeted advertising. These are the kinds of tools that the Three Letter Agencies love.

Hmm.
posted by symbioid at 10:47 AM on October 9, 2015 [1 favorite]


Do people who use Twitter or Facebook or whatever like knowing the metadata about themselves? Who follows them and who they follow? Is it important that other people can see who your friends are and who you think is worth paying attention to? That's all this particular example is looking at.

It's creepier when companies do things we don't expect "dumb" computers to be able to do, like automatic face tagging. But we do need to come to grips with the fact that software is already really good at recognizing people and places and only going to get better. Things aren't the same as they were even five years ago, and the rate of change is quickening, I think.

From a certain perspective, a sense of "creepiness" is futureshock.
posted by bonehead at 11:03 AM on October 9, 2015


The fact that the erosion of privacy is more cultural than legal at this point says that Huxley predicted the future better than Orwell. It didn't take fascists to get us to report our locations and people we associate with, it took free and convenient games and apps. The NSA's dragnet smells more Orwellian, but private companies and individuals operating within the law can slurp up tons of data for abuse and/or profit using regular old APIs.
posted by mccarty.tim at 11:15 AM on October 9, 2015 [4 favorites]


Bonehead, considering Klout, SEO, and ThinkUp are tools people take seriously, I'd say there is a market for your own metadata.
posted by mccarty.tim at 11:19 AM on October 9, 2015


I think there is a difference between "Hey you can do some really cool shit w/metadata!" and "We, as a country, purposely build systems whose sole purpose is to aggregate, store and analyze metadata for 'National Security' purposes".

Oh, absolutely. I still find the NSA creepy, but it's not because identifying information about me is out there, but because they are collecting that data even though they have no reason to and holding on to it indefinitely for purposes that could end with me or mine in some black-ops holding cell if the wrong person wants to get us.

But I don't really think I can move around and act in the world totally anonymously, or even that I'd want to. What I'd prefer is to rein in this ridiculous security state that has carte blanche to disappear me or at least ruin my life for no good reason.
posted by emjaybee at 11:19 AM on October 9, 2015 [1 favorite]


While the government you selected does not trust you, consider using methods of communication that are not that well-suited for mass-surveillance

Wait, wait, wait. The methods of communication we're talking about aren't well suited to mass surveillance. That's why it took them decades and billions of dollars of tax money to do it.

If we want to fix things, we have to stop trying to blame the people who get illegally spied on by parts of their government that don't operate out in the light of day for being spied on.
posted by atbash at 11:30 AM on October 9, 2015 [4 favorites]


Pro-forced-birth.

It's all about the punishment. Someone has to be punished for having sex that for whatever reason resulted in a pregnancy, and someone has to be punished for being convicted of a capital crime that they may not have committed.
posted by fuse theorem at 11:31 AM on October 9, 2015 [1 favorite]


One broader implication of this is that no one should take the NSA seriously when they say they are only collecting “metadata” on whom someone contacts, rather than the content of the communication. Social network metadata is incredibly powerful.

This is the key to the article. By cross-referencing from a bunch of sources, people can be profiled more or less with ease (keeping in mind that ease was probably at the cost of some hundred millions worth in IT and computing power).


Also, I see GamerGate as a cartoon fart in that graph. How appropriate.
posted by lmfsilva at 11:58 AM on October 9, 2015 [1 favorite]


If you want security sneakernet is the only way to go.

The floppy disks with the digests of my latest social network posts are in the mail.
posted by cosmic.osmo at 12:12 PM on October 9, 2015 [2 favorites]


keeping in mind that ease was probably at the cost of some hundred millions worth in IT and computing power.

Sort of. One could argue that this comment was made with hundreds of millions of dollars worth of IT, computing power, and infrastructure investment. But it's costing me less than pennies. I doubt the researchers spent any considerable amount of money on this (indeed, I'd bet their time was worth far more than any monetary investment in this research).
posted by el io at 12:23 PM on October 9, 2015


"The fact that the erosion of privacy is more cultural than legal at this point..."

Privacy has always been more cultural than legal.

Legal protections for privacy tend to exist where there's an imbalance of power such that cultural protections of privacy are insufficient. As a rule, almost all of the recent concerns about privacy (putting the NSA issue aside) have everything to do with public speech and activity that is by definition public and not private but, as a practical matter, in the past has been unavailable because it just wasn't worth the effort for most people for most purposes. But no longer -- in many cases, it's trivial.

This is a big problem because that practical barrier was doing most of the work of shielding us from the implications of what it means to say that something is "public" and not legally protected by privacy laws. Conversely, it also shielded us from the implications of what it would mean to legally protect speech and acts that have historically been thought to be public. We didn't need to grapple with this. Now we do.

Even so, a big portion of these issues can be and, I predict, will be solved when culture catches up to this changing reality and social norms of behavior make a lot of stuff, like casual amateur internet sleuthing, socially taboo.

The problem, though, is what I wrote in my second paragraph -- powerful interests that won't be dissuaded by social taboo. The most prominent among these, aside from government, are those who represent the interests of capital: marketing and employers. And you can see, especially in Europe, historically how abuses by those interests have resulted in the prevalence of privacy laws. So, too, the natural progression now with this internet-era metadata collection and profiling -- it should become, and probably will become, legally restricted. Not so much because some random person can figure out all sorts of stuff about you by your social network affiliations and the like, but because marketers and employers are doing this and making everyone's lives hellish.

That said, I have to admit that I have a limited amount of optimism about this for the US. We have the liability here of combining two different powerful American cultural biases -- a bias toward business and advertising that doesn't really respect privacy and a bias toward technological innovation that tends toward a blind embrace of "if we can collect this data, we should!"

I sometimes feel that nothing in recent American history has brought us closer to the dystopia of Pohl's classic The Merchants' War than the rise of the internet. The cultural forces have always been there -- it's just that it took a lot of effort and the exercise of all that effort tended to strike people as pretty skeevy. But now it takes a lot less effort and the methods don't have the stigma of sneaking around in someone's trash (literally or metaphorically). Instead, it's all shiny and technological and the future and that somehow makes it more palatable -- or, at least, it makes it seem more inevitable -- and so we're mostly just going along with it. But we oughtn't.

We're not going to get very far in an effort to obscure this data. It will remain practically available in the same way that taking someone's mail out of their mailbox is easy and available. We'll have to rely mostly upon social taboo to keep people from doing this casually -- but much more importantly we need to have laws with very stiff penalties that prevent this being done by powerful institutions. In other words, the efforts to insist on technological or broad structural solutions that will reverse this trend are futile and, more importantly, counter-productive. They take attention and effort away from focusing on where both the most damage is done and where targeted efforts to prevent this damage are most likely to be successful.
posted by Ivan Fyodorovich at 12:40 PM on October 9, 2015 [4 favorites]


Also, I see GamerGate as a cartoon fart in that graph. How appropriate.

Daughter in law's friend just got dox'd after shouting her abortion, followed of course by the death threats etc.
Not a cartoon.

Doggone it, quit calling it "pro-life". It's pro-coat-hanger or pro-birth.

Seems more like pro-coat-hanger or pro-surgical-suite.
posted by Alter Cocker at 12:51 PM on October 9, 2015 [3 favorites]


And now it's 2024,
Knock-knock at your front door
It's the social media secret police
They have come for your uncool niece
posted by entropicamericana at 1:02 PM on October 9, 2015 [2 favorites]


Not a cartoon.

But certainly a fart.

(and to be honest, I feel kind of insulted that after the months of endless bullshit GGers are churning out, anyone here, of all places, would think that the "cartoon" was the bit that I was implying.)
posted by lmfsilva at 1:07 PM on October 9, 2015


These statistical results are meaningless though because the sample set is self-selectedly vocal or engaged about this issue. It's nonsense.
posted by mary8nne at 1:24 PM on October 9, 2015 [1 favorite]


These statistical results are meaningless though because the sample set is self-selectedly vocal or engaged about this issue. It's nonsense.

Most people who are political are engaged in various issues. The point is not that the ones who are vocal can be identified, the point is, if you're on Twitter, your views can be predicted even if you've never posted on a subject by someone with access to the connection graph.
posted by CheeseDigestsAll at 2:15 PM on October 9, 2015


and to be honest, I feel kind of insulted that after the months of endless bullshit GGers are churning out, anyone here, of all places, would think that the "cartoon" was the bit that I was implying.

Now that I can see the grimace on your face lmfsilva, I apologize.
Please understand that I listened to a deeply anguished phone call from daughter in law just the other night, so I sort of jumped.
I still don't have any insight at all into the minds of those who threaten other people anonymously and then run away professing christian morality. (NO spell check, christian is lower case). I have seen this shit for decades and have no reaction except "fucking coward".
posted by Alter Cocker at 2:18 PM on October 9, 2015


"Metadata is the outside of a sealed envelope. Maybe you can't get inside the envelope, but it's enough to build some strong hunches."

> I think the problem is that people don't realize that there's an outside of an envelope. Like exif data in an image, most people don't know that it's there.

The "outside of a sealed envelope" idea is a great, easy way to describe metadata, and I will be using that some time in the future.

As for EXIF data, that scares me because there is so much that can be tucked away inside EXIF. Even if most is photo-specific, there is still plenty of personally identifying information. The story of John McAfee being located with EXIF data from a photo in a Vice article is well known, but I first thought of a musician who I won't name. I recently downloaded a free album from them, which included a photo they described as being from their new home. The photo didn't give much for internet sleuths to work from, except there was the location information, automatically tagged on the photo. I wanted to shout on music forums "hey, I think I know where [blank] lives! Check out the photo of [blank] in their new free release!" but I thought that would be rude, especially for a big-name artist. It seems no one else has looked into the tags on that file, or if they have, they haven't posted about it on public forums, so I'll let it die there, hoping they move before anyone digs in and shares what they found.

The weird/uncomfortable thing is that this information will still float around for a while, given the popularity of the artist and the coverage of this free release. Even if they remove the photo, or replace the photo with an anonymized file, the original is still shared on torrent sites and the like, "archived" by fans.
posted by filthy light thief at 2:18 PM on October 9, 2015 [1 favorite]
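
For anyone wondering how little work the location tag takes to use: EXIF stores GPS latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The sketch below only does that arithmetic, using the standard library. It does not read a real file, the function name `dms_to_decimal` is made up here, and the coordinates are invented for illustration.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert EXIF-style degrees/minutes/seconds to decimal degrees.

    dms: three (numerator, denominator) pairs, as EXIF rationals
    ref: 'N', 'S', 'E' or 'W'; south and west become negative
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value
    return float(value)

# Invented example values, not taken from any real photo.
lat = dms_to_decimal(((40, 1), (30, 1), (0, 1)), "N")
lon = dms_to_decimal(((74, 1), (15, 1), (0, 1)), "W")
print(lat, lon)  # 40.5 -74.25
```

Paste a pair like that into any map site and you have a spot on the ground, which is why stripping the tags before sharing a photo matters.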


Anybody follow the link to the Intercept piece from Welsh's article? Made me seriously consider the possibility that every online community of significant population (including this one) has been "compromised" by the NSA et al. when topics relevant to them come up. Or it doesn't even need to be the NSA; it can be "merely" corporate sock-puppets, etc. I did already think that one was basically participating in the modern economy and therefore trackable and crackable or off-grid and non-participating, but I think I may have always underestimated the interest Big Data and the NSA took in the activities of online names.

Deus Ex posited that you could replicate the functionality of god and the gods (observation and judgement) with data mining algorithms. I used to find that just philosophically interesting. Today, it seems like anybody with anything to hide at all should find it relevant.

I will say this - I have always looked down on social media, if you consider Facebook et al. social media and metafilter not social media - I always thought that it wasn't really social. I was wrong. I looked on social interaction as more of a conversations-you-have thing, but I wasn't carefully considering all the negative aspects of it - shaming, gossip, arguing. Facebook et al. replicate those functions perfectly. And enshrine it in perpetuity, either online or in backup storage, for sufficiently powerful interests to peruse.

Short of people abandoning social media en masse, I don't really see a way forward where this doesn't just continue the way it has been. Governmental actions aside, people's first impulse is to share freely. For the 10% seriously concerned about these issues, it can lead to problems. For the other 90%, it's not a problem until it's a problem. Or until the internet hate groups find a reason to start ruining your life, even though you took precautions. Which reminds me, now that enough time has passed since my last move, I need to scrub my online information from the most obvious locations that have it up without my opt-in.
posted by Strudel at 2:23 PM on October 9, 2015


(no need to apologize, Alter Cocker. It's just that it wasn't the first time I was accused of downplaying or sympathizing with GG here, and that whole "movement" is something I'd very much rather keep on record as being some of the most toxic, inexcusable bullshit I've seen, and it somehow keeps finding new ways of getting worse)

Made me seriously consider the possibility that every online community of significant population (including this one) has been "compromised" by the NSA et al. when topics relevant to them come up.
It has been a running joke with a friend of mine when we say something that out of context looks terrorist-y, the other says "there goes another red light in Utah!". At this point, I assume everything that gets POSTed on the internet is ripe for interception and storage.
posted by lmfsilva at 3:21 PM on October 9, 2015


it's interesting how the needs of the modern capitalist society (vis a vis advertising) nicely coincide with the needs of the national security state

If Graeber is to be believed, markets and states have gone hand in hand from the beginning, so it's not surprising, really.
posted by Steely-eyed Missile Man at 4:12 PM on October 9, 2015 [1 favorite]


What "metadata"? The optical splitters on the network copy *all* the data. The envelope metaphor is inappropriate. All you're sending is a postcard, readable by everyone who cares to.
posted by mikelieman at 6:52 PM on October 9, 2015


#ShoutYourAbortion thread: End the silence.
posted by homunculus at 10:46 PM on October 9, 2015


CheeseDigestsAll: "These statistical results are meaningless though because the sample set is self-selectedly vocal or engaged about this issue. It's nonsense.

Most people who are political are engaged in various issues. The point is not that the ones who are vocal can be identified, the point is, if you're on Twitter, your views can be predicted even if you've never posted on a subject by someone with access to the connection graph."

Nonsense may be overstating the problem, but you are still predicting the views of those who have never posted on the subject based on those who have.

What I mean is, what they've really demonstrated is they can predict what stance someone will express on Twitter based on who they are connected to. But I could easily suppose that a lot of people are less likely to be vocal about their stance if they feel surrounded by people expressing the opposite stance.
posted by RobotHero at 10:24 AM on October 10, 2015




How the Military Uses Twitter Sock Puppets to Control Debate and Suppress Dissent

Why is Twitter Censoring Our Timelines Without Consent?

Appears zooko is quitting twitter due to the censorship, probably more to follow.
posted by jeffburdges at 1:29 PM on October 21, 2015


Probably best not to believe anything ever that you read on Breitbart, especially on any subject that may be a GamerGate bugbear.
posted by Artw at 2:26 PM on October 21, 2015 [2 favorites]




This thread has been archived and is closed to new comments