Internet journalism and invasive surveillance
May 29, 2015 9:30 AM

 
The real question is, what can advertisers tell from our metafilter history?
posted by The Devil Tesla at 9:31 AM on May 29, 2015


We have cameras?
posted by maryr at 9:44 AM on May 29, 2015 [8 favorites]


But this isn’t the Holy Grail of my surveillance capability. What I’d do next is: create a world for you to inhabit that doesn’t reflect your taste, but over time, creates it. I could slowly massage the ad messages you see, and in many cases, even the content, and predictably and reliably remake your worldview. I could nudge you, by the thousands or the millions, into being just a little bit different, again and again and again. I could automate testing systems of tastemaking against each other, A/B test tastemaking over time, and iterate, building an ever-more perfect machine of opinion shaping.

Hyperbole aside, this is the real issue, and not because I'm worried about how my behavior is modified (I consider myself highly inner-directed), but because of how our culture is modified.
posted by Ickster at 9:46 AM on May 29, 2015 [3 favorites]
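
A back-of-the-envelope sketch of the automated "A/B test tastemaking" loop the quoted piece describes: an epsilon-greedy bandit that keeps serving whichever message variant moves the measured response most. The variant names and the simulated response rates below are entirely made up.

```python
# Minimal sketch (hypothetical variants and rates): keep showing whichever
# message "works", occasionally exploring the others, and iterate.
import random

VARIANTS = ["message_a", "message_b", "message_c"]
shows = {v: 0 for v in VARIANTS}
wins = {v: 0 for v in VARIANTS}          # "win" = the response we wanted to produce

def pick_variant(epsilon=0.1):
    """Mostly exploit the best-performing variant, occasionally explore."""
    if random.random() < epsilon or not any(shows.values()):
        return random.choice(VARIANTS)
    return max(VARIANTS, key=lambda v: wins[v] / shows[v] if shows[v] else 0.0)

def record(variant, responded):
    shows[variant] += 1
    if responded:
        wins[variant] += 1

# Simulated audience: variant C happens to "work" slightly better.
true_rates = {"message_a": 0.05, "message_b": 0.06, "message_c": 0.09}
for _ in range(10_000):
    v = pick_variant()
    record(v, random.random() < true_rates[v])

print({v: round(wins[v] / shows[v], 3) for v in VARIANTS if shows[v]})
```

Run long enough, the loop converges on whichever message reshapes behavior best, which is the whole point of the article's worry.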


maryr: "We have cameras?"

Fiiiiiinnnnnnnnnee. I'll put my pants back on....
posted by Samizdata at 9:49 AM on May 29, 2015 [2 favorites]


We have cameras?

Of course not.



Hey! Is that a new shirt?
posted by Thorzdad at 9:55 AM on May 29, 2015


Adblock and Privacy Badger, the latter by the Electronic Frontier Foundation, help one opt out of this whole mess.

Privacy Badger, btw, tells me that this MeFi page has a (blocked) quantserve tracking cookie somewhere on it that tracks my browsing history across different websites.
posted by AGameOfMoans at 9:58 AM on May 29, 2015 [4 favorites]


The pisser, of course, is websites that refuse to work unless you un-block tracking and ad scripts. I've landed on sites that wouldn't deliver content until things like Google Analytics scripts were re-enabled.
posted by Thorzdad at 10:06 AM on May 29, 2015 [3 favorites]


Maybe once a week I run across a site that I actually need or want to look at that refuses to run. For that I have a second, unblocked browser in which I open a private window - see what I need to see and then quit the browser, which deletes all cookies. It is a very minor inconvenience when balanced against being spied upon.
posted by AGameOfMoans at 10:16 AM on May 29, 2015 [1 favorite]


The thing about this piece is that it doesn't just apply to journalists. It basically applies to just about anybody who works on making any piece of the web tick. The cross-site analytics, the metrics, the data-hoarding are everywhere in e-commerce, any kind of content-oriented undertaking, social anything - it's essentially universal, and the pressure on technical / marketing / sales / customer service / management people to enable it is both overwhelming and the easiest thing in the world to capitulate to.

Last year I quit a job that I deeply loved, at a place I'd poured years of my life into, partially in explicit protest of this kind of behavior. Lately I work for people who, though conscientious and considerably more self-aware about their own marketing practices and so forth, rely on just as much of the enabling infrastructure. You can refuse to support the machinery or you can participate in this economy, but it's brutally difficult to do both.
posted by brennen at 10:30 AM on May 29, 2015 [8 favorites]


This article is compelling, but it's still not spelling out step-by-step _why_ regular people should worry that this information is being stored and used. Yes, no real security for political activists (her newest article on that site), yes, maybe "predictably and reliably remake your worldview", but that's not really spelling it out. How do these become _real_ for people?

I guess what is needed is a compelling and imaginative story, possibly non-fiction, possibly fiction, that makes it clear how step 1 leads, very probably, to step 4 (unhappiness and/or worse life for you and your children). Is anybody writing _that_?
posted by amtho at 10:41 AM on May 29, 2015


I ended up killing an internal project that I was really excited about in part because I had no idea how to keep it from becoming this kind of mess once it had shipped and was in the hands of random, unprincipled strangers. I feel really fortunate that it was peripheral to my primary work, and I had the luxury of axing it without having to quit my profession and wait tables. It's a really tangly, frustrating system.
posted by verb at 10:41 AM on May 29, 2015 [12 favorites]


The idea of shaping a person by shaping their browsing was a minor plot point in Stross' book Rule 34. In his case, an algorithm was looking for potential terrorists and trying to reshape them into non-terrorists.
posted by sotonohito at 11:14 AM on May 29, 2015 [2 favorites]


Perhaps this is false nostalgia, but it still bugs me that we seem to have all the building blocks for making computing safe for average people but we seem to lack a critical mass to push them together into formal projects.

I am not sure that we have the building blocks. Or I guess maybe I'm not sure that the blocks we do have can presently be arranged in a way that overcomes the vast array of perverse incentives and pathologies (both accidental and borne of deliberate malice) driving most of the people capable of arranging them.

We have a lot of what it'd take to build a safe(r) network, but the architecture of just about every existing system is pretty bad on security and pretty susceptible to the cumulative elimination of privacy. I mean, what do we do about the fact that the tech driving a lot of our problems here is something as fundamental as the relational database?
posted by brennen at 11:18 AM on May 29, 2015


Why would we want to do anything about that, brennen?
posted by LogicalDash at 11:21 AM on May 29, 2015


Why would we want to do anything about that, brennen?

I broadly agree with Norton's conclusions about privacy and surveillance, and I think that a lot of the problem falls out of fundamental tech that's become really basic to our economy (like the RDBMS) over the last 20-30 years, and that this makes the problem harder to solve than it would be if we were just talking about the economic incentives around a narrow slice of web tech in isolation.
posted by brennen at 11:29 AM on May 29, 2015 [1 favorite]


A Big Data Breakup Album
posted by almostmanda at 11:41 AM on May 29, 2015


I guess I don't see why the RDBMS per se is part of the problem. It does make it easier to process surveillance data, in the same sense that it makes it easier to process large amounts of data generally.
posted by LogicalDash at 11:44 AM on May 29, 2015


RDBMSes generally apply security at the entity (table) level rather than the row level. To access any SSN, you must be able to access all SSNs.
posted by blue_beetle at 11:58 AM on May 29, 2015 [1 favorite]


Lately I work for people who, though conscientious and considerably more self-aware about their own marketing practices and so forth, rely on just as much of the enabling infrastructure. You can refuse to support the machinery or you can participate in this economy, but it's brutally difficult to do both.

@pmarca retweets...
-"98% of people who have ever paid for @SlackHQ are still paying for it." ~ @stewart
-"We're doing well. We have $300m in the bank & we've been growing 5%/week for 70 straight weeks." Stewart Butterfield, CEO of Slack [mefi's own #1 employer]

bruces:*
In the startup world, you work hard and you move fast in order to make other people rich.

Other people. Not you.

You're a small elite of very smart young people who are working very hard for an even smaller elite of mostly Baby Boomer financiers so they can buy national governments, shut the governments down, destroy the middle class and the nation-state.

That's been going on a long time. It's not something you invented; that's a historical development. There's a lot of reasons that the nation-state's got to go. There's a lot of reasons why a middle class is in the way.

But that's what you do. That will be the judgment of history for your startup culture. They're going to say that the twenty-teens were all about that:

“It was a tacit allegiance between the hackerspace favelas of the startups and off-shored capital & tax avoidance money laundries. And what were they doing? They were building a globalized network society.”

And that's what's coming next: an actual globalized network society. You're routing around it from the bottom while they climb over it from the top, but you're both aimed in the same direction. That's why you're in tacit allegiance, whether you know it or not.

And right now everybody lives the way that people used to live under empires in colonial states. We're all auto-colonialized by the austerity. That's your big dragon. That's your actual dragon. Not, like, the little tactical dragon. That's the BIG DRAGON. And you know it's the big dragon because you're part of it. You're actually its brain and its nervous system.

You.

And as long as you are making rich guys richer, you are not disrupting the austerity. You are one of its top facilitators.
I guess what is needed is a compelling and imaginative story, possibly non-fiction, possibly fiction, that makes it clear how step 1 leads, very probably, to step 4 (unhappiness and/or worse life for you and your children). Is anybody writing _that_?

this is water by ramez naam (via)
posted by kliuless at 12:00 PM on May 29, 2015 [27 favorites]


RDBMSes generally apply security at the entity (table) level rather than the row level. To access any SSN, you must be able to access all SSNs.

In all the databases I've used you can at least apply different permissions to one table. The table in question might actually be a view. That might be the only way that some particular third party can access your SSN data.

What I'm saying is I don't see the relevance.
posted by LogicalDash at 12:25 PM on May 29, 2015
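
For what it's worth, here's a minimal sketch of the view-based restriction LogicalDash describes, using Python's sqlite3 (SQLite has no GRANT system, so imagine the SELECT permission being granted on the view rather than the base table in a real RDBMS like PostgreSQL or MySQL). Table and column names are hypothetical.

```python
# A view exposes only the non-sensitive columns; in a real deployment a third
# party would be granted SELECT on the view, not on the base table.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, ssn TEXT)")
db.executemany("INSERT INTO customers (name, city, ssn) VALUES (?, ?, ?)",
               [("Alice", "Portland", "123-45-6789"),
                ("Bob", "Omaha", "987-65-4321")])

db.execute("CREATE VIEW customers_public AS SELECT id, name, city FROM customers")

print(db.execute("SELECT * FROM customers_public").fetchall())
# [(1, 'Alice', 'Portland'), (2, 'Bob', 'Omaha')]  -- no SSNs in sight
```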


this is water by ramez naam (via)

Wow. Thanks for that, kliuless.
posted by straight at 12:28 PM on May 29, 2015 [2 favorites]


What I'm saying is I don't see the relevance.

Take a hypothetical but fairly typical mid-sized web retailer, customer base in the hundreds of thousands, annual revenue somewhere in the $10-50 million range, in business for somewhere around a decade, running a hacked-up derivative of some common shopping cart stack. Assuming a not-too-unusual amount of continuity in the way this business operates on the web, figure 8-10 years' worth of customer account and order data, stashed in MySQL or PostgreSQL: A bunch of logins, a bunch of e-mail addresses, a bunch of mailing addresses and line items and payment records.

My point is basically: Look at all that data and its unintended consequences. Databases (forget the specifics of the relational bit, I'm talking about big collections of data you can query for purposes unanticipated by the people originally storing the data) usually don't exist in the first place to facilitate surveillance as such. The people working for our hypothetical web store aren't malicious or creepy. They just want to ship packages full of things to people who paid for them, handle customer service calls, know what's in their inventory, pay taxes, pass audits, etc. etc.

And yet: All of that data can be correlated to other data, and piped to / correlated by third party services (analytics, e-mail marketing, re-targeting of ads on other sites, fraud prevention, identifiers of users on common social networks, whatever deeply disturbing ways your cell phone provider is selling you out) where it can leak into other databases. And then before you know it x-random vast corporate entity has a pretty good working model of your interactions with all sorts of little web stores and journalistic outlets and not-so-little social networks and governments and so on down the line.

Friendly little web store's database is sort of like the basic problem with dragnet surveillance, writ small: Whatever the rationale for its creation, the possibilities for its eventual use are combinatory and nearly impossible to constrain as long as the data exists and is relatively well-formed.

Databases aren't inherently bad. They're these amazingly useful tools, and probably an inevitable use of computers with long-term storage, just like diaries and ledgerbooks and file cabinets are inevitable uses of paper and ink. If SQL and friends didn't exist, it would be necessary to invent something like them. And yet they're part and parcel of all the ways that we're kind of fucked right now. Which is a big part of why this is a hard problem.

(See also: HTTP cookies, web request logging, credit-card transactions, client-side scripting, location services on mobile devices...)
posted by brennen at 12:56 PM on May 29, 2015 [7 favorites]
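
A toy illustration of the correlation step brennen is pointing at: two datasets collected for innocuous reasons, merged on nothing more than a shared e-mail address. Every record below is invented.

```python
# Two databases built for routine business purposes...
store_orders = [
    {"email": "jane@example.com", "item": "pregnancy test", "zip": "97201"},
    {"email": "sam@example.com",  "item": "guitar strings", "zip": "68102"},
]
newsletter_signups = [
    {"email": "jane@example.com", "topics": ["parenting", "local politics"]},
]

# ...and one dict lookup later, the friendly little web store's record and the
# newsletter record have become a single profile.
by_email = {row["email"]: dict(row) for row in store_orders}
for row in newsletter_signups:
    by_email.setdefault(row["email"], {}).update(row)

print(by_email["jane@example.com"])
# {'email': 'jane@example.com', 'item': 'pregnancy test',
#  'zip': '97201', 'topics': ['parenting', 'local politics']}
```

Scale the same join up across analytics vendors, ad networks, and data brokers and you get the "pretty good working model" described above.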


forgot to mention that Cathy O'Neil is writing a book -- partially as a result of mefi's own Jordan Ellenberg (escabeche) -- called Weapons of Math Destruction about: "how big data is being used as a tool against the poor, against minorities, against the mentally ill, or against public school teachers... on the ways big data increase inequality and further alienate already alienated populations."*

also btw...
The Guy Who Worked For Money by Benjamin Rosenbaum
posted by kliuless at 2:56 PM on May 29, 2015 [5 favorites]


The thought occurred to me just now: what if you made a plug-in that did not just block tracking info ... but instead scrambled it in such a way that the tracking agencies would end up with a huge db of junk data .... hmmmm.....
posted by AGameOfMoans at 3:17 PM on May 29, 2015 [1 favorite]


The thought occurred to me just now: what if you made a plug-in that did not just block tracking info ... but instead scrambled it in such a way that the tracking agencies would end up with a huge db of junk data .... hmmmm.....


Not a terrible idea. The issue is that the data already isn't clean, and they still find ways to track people. There's only so much you can do on an Internet you don't own.
posted by The Devil Tesla at 3:58 PM on May 29, 2015
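
For the curious, a rough sketch of the junk-data idea, in the spirit of existing add-ons like TrackMeNot: periodically fire decoy requests so that any profile built from your traffic fills with noise. The topic list and the example.com endpoint below are placeholders, not a real tracker.

```python
# Minimal decoy-traffic sketch (hypothetical topics and endpoint).
import random
import time
import urllib.parse
import urllib.request

DECOY_TOPICS = ["lawnmower repair", "opera tickets", "keto recipes",
                "vintage modems", "birdwatching", "tax law"]

def send_decoy():
    query = urllib.parse.quote(random.choice(DECOY_TOPICS))
    url = f"https://example.com/search?q={query}"
    try:
        urllib.request.urlopen(url, timeout=5).read(0)
    except OSError:
        pass  # decoys are fire-and-forget; failures don't matter

if __name__ == "__main__":
    while True:
        send_decoy()
        time.sleep(random.uniform(30, 300))  # irregular timing looks less robotic
```

As The Devil Tesla notes, trackers already cope with messy data, so this is arguably more protest than protection.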


You know what threw me for a loop was seeing ads related to [traditionally very private mental health topic that affects me] while browsing the web at work. There's no mystery as to how they could do this - I was actually logged into the same Google account in both locations - but what the fuck?

On the other hand the fact that the vast majority of ads that, say, Youtube wants to show me are for audio gear and software - a correct identification of my hobbies - is something I'm perfectly happy with, in itself. That's the kind of mutually satisfactory example ad people use to defend their trade. Problem is the underlying techniques are way too powerful and pervasive to trust that they will only be used in benign ways.
posted by atoxyl at 5:53 PM on May 29, 2015 [1 favorite]


Another problem is that the people doing a lot of it are, because they are human, also at times extraordinarily incompetent and blinkered. The tools are powerful, but they're being used on the wrong things and produce unintended consequences.

I take heart because the ads I'm shown so consistently don't fit me despite my generally cavalier attitude toward money, catholic tastes, and willingness to buy online. Either I already bought it or there is no way in hell I'd buy it.
posted by Peach at 7:21 PM on May 29, 2015 [1 favorite]


Related: Twitter is collecting a list of the apps you have installed on your phone (someone caught them last year doing this, but now they have a FAQ about it).
posted by RobotVoodooPower at 7:45 PM on May 29, 2015 [2 favorites]


Yeah, as far as my experience goes with this stuff server-side, most people want aggregate, optimize-our-service performance and utility data, and treat PII (personally identifying information) as a kind of toxic waste they desperately want to avoid dealing with.

Sites usually don't want PII any more than they want your passwords or credit card on file: it makes them liable, very liable, for misbehaving employees and security breaches. They need to build and pay for a huge army of support staff for dealing with individuals, rather than computer sanitized aggregate demand, aggregate interest, aggregate money.

Everyone wants the aggregate info, and to target aggregate demographics with ads, and to receive aggregate money from aggregate buyers. Virtually nobody wants personal details of users - only a handful of properties do. You can tell which ones, because they use terms like "identity" and provide login/authentication services.
posted by ead at 11:42 AM on May 30, 2015


(And, of course, the handful of sneaky ad companies that turn identity trackers into aggregate demographics. But that's the thing: they're selling a service that analyzes, categorizes, aggregates and blinds the customer to PII, because that's what the customer actually wants.)
posted by ead at 11:44 AM on May 30, 2015
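
A minimal sketch of the aggregate-only pattern ead describes: the backend keeps bucketed counts for reporting and deliberately never stores who the individual visitors were. Field names are hypothetical.

```python
# Raw events go in, coarse counts come out; identities are dropped on the floor.
from collections import Counter

def bucket(event):
    """Reduce a raw request to coarse, non-identifying dimensions."""
    return (event["country"], event["device"], event["page_section"])

aggregate = Counter()

def record(event):
    aggregate[bucket(event)] += 1   # the count goes up; the event itself is discarded

# Simulated traffic -- in a real pipeline these would be parsed requests.
record({"country": "US", "device": "mobile", "page_section": "checkout",
        "email": "jane@example.com"})   # the e-mail never reaches storage
record({"country": "US", "device": "mobile", "page_section": "checkout",
        "email": "sam@example.com"})

print(aggregate)
# Counter({('US', 'mobile', 'checkout'): 2})
```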


I really disagree that "nobody wants personal details of users". In my experience, plenty of organizations do, and are gathering it already. They just haven't really figured out how to use it yet. It's being worked on, though. Nieman Lab published a pretty compelling/revolting vision of what this might look like just yesterday.
posted by spudsilo at 12:23 PM on May 30, 2015 [1 favorite]


That article is a great example of what I'm talking about. The hypothetical NPR program or NYT article the device is delivering to the user is being selected by an identity-agnostic, dimension-reduced model of user interest. NYT isn't going to nefariously catalogue the fact that John Smith is browsing want ads again and that he's the sort of person who likes stories about British sitcoms while eating toast but not in the shower. Because they don't want to know; it's too much detail and not safe for them to handle. Some information about identity may leak through to NYT, but primarily they're going to negotiate with an aggregate traffic source / user identity modeling and clustering 3rd party (Apple, ad networks, etc.) for higher-quality targeting of their outgoing articles; higher-quality matching of articles to ads, of ads to abstract/latent user dimensions, or of articles to paying subscribers; and metrics like a certain number of seconds of confirmed attention or interactions per billion licensed snippet deliveries, etc.
posted by ead at 2:21 PM on May 30, 2015


(Of course it depends on the site; if they're in the business of cultivating a strong sense of individual user identity -- Twitter is a fine example -- they will do more of this. But in general cultivating personal relationships with users is super expensive and dangerous, and well outside most sites' reach. Hence all the login with Facebook, pay with Stripe, monitor traffic with Google stuff. People outsource contact with user identities to protect themselves, limit costs.)
posted by ead at 2:27 PM on May 30, 2015
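
And a toy version of the "identity-agnostic, dimension-reduced model of user interest" ead mentions: an anonymized session-by-topic matrix reduced to a couple of latent dimensions via SVD, with targeting scored against those dimensions rather than against named people. All numbers below are made up.

```python
# Latent "taste" dimensions from anonymized engagement data; no identities anywhere.
import numpy as np

# Rows are anonymous sessions, columns are topic engagement scores
# (e.g. sitcoms, want ads, recipes, gadgets).
engagement = np.array([
    [5, 0, 1, 0],
    [4, 1, 0, 0],
    [0, 5, 0, 4],
    [0, 4, 1, 5],
], dtype=float)

U, s, Vt = np.linalg.svd(engagement, full_matrices=False)
k = 2                                   # keep just two latent "taste" dimensions
session_factors = U[:, :k] * s[:k]      # each session as a point in taste-space
topic_factors = Vt[:k, :]               # each topic as a point in the same space

# An article (or ad) gets scored against the latent dimensions, not a person.
article_topics = np.array([0, 1, 0, 1], dtype=float)   # hypothetical topic mix
scores = session_factors @ (topic_factors @ article_topics)
print(scores)   # higher score = better match for that (anonymous) session
```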


I guess what is needed is a compelling and imaginative story, possibly non-fiction, possibly fiction, that makes it clear how step 1 leads, very probably, to step 4 (unhappiness and/or worse life for you and your children). Is anybody writing _that_?

I don't know if this meets your threshold, but from this report the soon-to-be-grandfather seemed pretty upset.

No word on how the mother-to-be reacted to all this, but I'm having a hard time imagining it was a purely pleasant experience to have the surveillance-industrial complex breaking this news to your family.

Also, mind, this was before the massive data breach at the retailer in question, which brought about the resignation of its CEO.

So, the situation for private industry is very much like that of the military-police-surveillance complex: Not only are they collecting and using this data in squicky ways, they can't seem to keep it from malefactors (or whistleblowers they prefer to cast as malefactors).
posted by one weird trick at 7:32 PM on June 2, 2015




This thread has been archived and is closed to new comments