Once Again, You're The Product
February 25, 2015 2:28 AM

It’s 2015—when we feel sick, fear disease, or have questions about our health, we turn first to the internet. According to the Pew Internet Project, 72 percent of US internet users look up health-related information online. But an astonishing number of the pages we visit to learn about private health concerns—confidentially, we assume—are tracking our queries, sending the sensitive data to third party corporations, even shipping the information directly to the same brokers who monitor our credit scores.
posted by chavenet (57 comments total) 24 users marked this as a favorite
 
Incognito mode & using a trusted proxy should be the default for all interactions on the internet that don't prohibit them.
posted by lalochezia at 2:47 AM on February 25, 2015 [2 favorites]


This is astonishing like the sun coming up in the morning.
posted by Wolfdog at 3:01 AM on February 25, 2015 [8 favorites]


confidentially, we assume

Who's this "we" the writer speaks of? No one who uses Facebook or Google should have any reason to assume personal data is treated confidentially online.
posted by Gelatin at 3:11 AM on February 25, 2015 [1 favorite]


Note to self: When needing a "scary info-graphic", make it black, make it look like an octopus or the cold dark of the depths of outerspace.
posted by HuronBob at 3:12 AM on February 25, 2015 [5 favorites]


Isn't it better to flood them with info? Like search for every illness and disease known to mankind? Sort of garbage in, garbage out?
posted by TWinbrook8 at 3:25 AM on February 25, 2015 [4 favorites]


This is why we have private/incognito modes on browsers. You can go and use proxies or VPNs or someone else's WiFi as well, if you like, but even if you don't I don't think there's much commercial value in trying to use what information sticks across incognito sessions. That may change if everyone starts doing it, but it's very noisy so you'll have to work harder to get the signal up.

In all these cases, I feel the most useful analysis is to ignore whether some actor or the other is being moral or illicit or evil or whatever, instead trying to ascertain whether there's a reasonable chance of it being commercially useful to them - and then work out what one's proportionate response should be.
posted by Devonian at 3:29 AM on February 25, 2015


This DOES explain why I'm getting targeted ads for loose cat poop.
posted by HuronBob at 3:31 AM on February 25, 2015 [4 favorites]


Absurd overconfidence (again/still) in what is laughably referred to as "data".

I do such searches for friends; family; pets; friends of friends; livestock; diseases, disorders, and drugs I've only just heard of from teevy ads; complete strangers profiled in news stories; and historical figures. Really, you'll learn nothing about *my* health by collecting this kind of kipple.
 
posted by Herodios at 3:48 AM on February 25, 2015 [19 favorites]


Incognito mode & using a trusted proxy
...
This is why we have private/incognito modes on browsers.

This is false confidence. Nobody needs to set cookies in your browser when they can fingerprint you via your user-agent string, query settings via JavaScript, use Flash "super cookies" that the browser can't control, or (if you're on Verizon Wireless) the unique advertising identifier that the network itself sets for you.

How about stop blaming the victims here and telling the shithead advertisers to knock it off?
posted by indubitable at 4:17 AM on February 25, 2015 [75 favorites]
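
A minimal sketch of the cookieless fingerprinting described in the comment above, assuming a tracker can read a handful of browser-reported attributes. The attribute names and values are hypothetical, and real fingerprinting scripts collect far more signals; the point is only that hashing stable attributes gives the same identifier on every visit, with nothing stored in the browser.

```python
import hashlib


def fingerprint(attributes: dict) -> str:
    """Hash browser-reported attributes into one stable identifier."""
    # Sort the keys so the same attributes always give the same digest.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values for illustration; none of these come from the thread.
first_visit = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "screen": "1920x1080x24",
    "timezone": "UTC-5",
    "fonts": "Arial,Calibri,Comic Sans MS",
}
# Same browser later, in a private window with cookies cleared:
# the attributes the page can read are unchanged.
second_visit = dict(first_visit)
# A different machine that differs only in its installed fonts.
other_browser = {**first_visit, "fonts": "Arial,Calibri"}

print(fingerprint(first_visit) == fingerprint(second_visit))   # same identifier
print(fingerprint(first_visit) == fingerprint(other_browser))  # different identifier
```

Private mode clears stored state, but it does not change what the browser reports about itself, which is why the identifier survives.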


I wonder what my insurance company makes of me having head pigeons, crotch mice, and butt drama?
posted by louche mustachio at 4:25 AM on February 25, 2015 [21 favorites]


Indeed, privacy mode and proxies are not even enough to stop web sites from being able to identify you anyway.
posted by Poldo at 4:32 AM on February 25, 2015 [3 favorites]


I will RTFA, I promise, and I don't intend this as a derail, but I asked about something similar three years ago, and would love to hear more current thinking on it: "What effect do my Ask MetaFilter searches on behalf of others have on data miners' sense of who I am?... Do these searches add to the noise and help camouflage searches about topics that are personally relevant? Do they contribute to a risky-looking profile?"

I do not know enough about this topic to answer those questions. I do a ton of Googling on behalf of Askers on topics that are *not personally relevant.* So does that extra searching do or change anything in terms of the data trail I'm generating?
posted by MonkeyToes at 4:46 AM on February 25, 2015 [2 favorites]


Ha!! Good thing I've always been a curious fan of Grey's Anatomy, House MD, ER, and all those shows....
posted by ipsative at 4:47 AM on February 25, 2015 [1 favorite]


Yeah, I'm fairly convinced that nothing short of an encrypted proxy service will help.

I'm running Chromium without Flash and with AdBlock, Ghostery and Privacy Badger installed, and had been feeling fairly smug about my privacy savvy.

In the last few months I've noticed more and more that sites suggest content to me that only makes sense based on what I've looked at on unrelated sites. I've been noticing it often enough that it's unlikely to be coincidence.
posted by Ickster at 4:59 AM on February 25, 2015 [1 favorite]


Indeed, privacy mode and proxies are not even enough to stop web sites from being able to identify you anyway.

That was depressing. Thanks to my large-ish font collection alone, my browser is basically 1 in 5,059,655. Even the user agent, which I thought would be not very unique (Chrome on Windows), has enough unique information to be 1 in 5000.

Maybe it's time to pick a VPN with multiple international outlets and only browse from a snapshotted VM with a bog-standard install of Ubuntu. But there's relatively few users of Ubuntu, compared to OS X & Windows, so that would probably be easily tracked as well.
posted by honestcoyote at 5:04 AM on February 25, 2015
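
The "1 in N" figures quoted above are often restated as bits of identifying information, which is just the base-2 logarithm of N. A small sketch of that arithmetic, using only the two numbers from the comment (the function name is made up for illustration):

```python
import math


def surprisal_bits(one_in_n: float) -> float:
    """Bits of identifying information carried by a 1-in-N trait."""
    return math.log2(one_in_n)


# The two figures quoted in the comment above.
print(round(surprisal_bits(5_059_655), 2))  # roughly 22.27 bits
print(round(surprisal_bits(5_000), 2))      # roughly 12.29 bits
```

Bits from separate traits only add up when the traits are independent, so these two numbers should not simply be summed.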


I also wouldn't be terribly surprised if your ISP was building a profile of you based on your browsing history, search queries, etc. and selling it off. Mine, for example, has no qualms about breaking DNS by redirecting queries to non-existent domains to their own ad-laden "search" page.

Tor would help immensely here, but it's going to be a (losing) arms race with the panopticon people if we keep having to rely on technological solutions rather than just making this shit illegal.
posted by indubitable at 5:08 AM on February 25, 2015


Like ipsative, I watch a lot of medical shows and google conditions I haven't heard of. I also teach biology, and even though I'm an ecologist, students ask me questions about every medical condition they've ever heard of on a regular basis. And yes, I usually google that for them. Sometimes, I even use what I learn to add an example to a lecture. Conditions I've googled lately that I do not have: Tay-Sachs, achondroplasia, Huntington's, sickle cell, albinism, tetrachromatism, cystic kidneys, Angelman's, pancreatitis, Turner's, etc.
posted by hydropsyche at 5:08 AM on February 25, 2015


You might want to dump Adblock for uBlock, at least to better avoid ads. Still won't do much to keep you from being identifiable, though.
posted by Poldo at 5:13 AM on February 25, 2015


Yeah, I google a lot of medical conditions that I'm curious about but don't have. If that's reflected in my credit score, then I'm pretty sure my credit score thinks I'm already dead.
posted by ArbitraryAndCapricious at 5:18 AM on February 25, 2015 [5 favorites]


Encrypted browsing has become the tin-foil hat of the digital age.
posted by HuronBob at 5:21 AM on February 25, 2015 [2 favorites]


The only ways to stop the bastards being bastards are to (a) induce some sort of religious conversion to ethical behaviour, (b) create a strong, effective, active, intrusive regulatory environment, (c) widespread education and motivation of users to eschew interaction with bastards, (d) poison the sea in which the bastards swim.

(d) is, as far as I can see, the only workable approach. It's not a 'stop blaming the victims' thing: in any case, blame is useless. I don't blame bastards for acting like bastards. Bastards gonna bastard. I don't blame users for not tooling up for the fight: users just want to do what users just want to do.

Forget blame. It's a very floppy strand of spaghetti to take to a knife fight. Treat this as an engineering task in signal processing, or as a classic arms race, or as an exercise in economics.

Ok, so how does one work out what is actually going on? There is a world of difference between a proof-of-concept 'look, they can fingerprint your browser so you're fucked' test and commercial deployment. Is there widespread deployment? Who's doing it, and how can you find out?

Taking the arms race approach:

1. Characterise the enemy. Develop tests that give some indication of what actions they are taking and when it's happening.
2. Develop defences. They don't have to be passive; they can include various forms of fuzzing.
3. Repeat

For example, what could track identity across sessions if you spin up a standard clean install of a mainstream OS (or something that looks like one) in a VM and destroy it every time? Can Virtualbox (or whatever) be induced to keep state across that? I doubt it. How about traffic analysis - if you were the only person on the Net doing this, there's no doubt you could be tracked that way, so can this be productised and deployed widely? (On preview, roughly what honestcoyote said.) Can a browser with a built-in VM approach together with a Tor-like multihop proxy approach (to defeat ISP snooping) be created and can enough people be persuaded to use it?

I think the latter half of that question is the harder to answer: perversely, I think if you did it as a Kickstarter for a $15 'privacy box' where all of the hard work was done by software you'd happily give away if you could, you'd get to critical mass faster.
posted by Devonian at 5:34 AM on February 25, 2015 [2 favorites]


I wonder what my insurance company makes of me having head pigeons, crotch mice, and butt drama?

Yeah, it's too bad The Colbert Report is off the air. Dr. Steve could've done a segment of Cheating Death wherein he encourages The Nation to participate in a simultaneous orgy of googling spurious diseases.

Our nation turns its lonely eyes to you . . .
 
posted by Herodios at 5:36 AM on February 25, 2015 [3 favorites]


I don't blame bastards for acting like bastards. Bastards gonna bastard.

WTF? This is the entire reason for a system of laws and the criminal justice system.

For example, what could track identity across sessions if you spin up a standard clean install of a mainstream OS (or something that looks like one) in a VM and destroy it every time? Can Virtualbox (or whatever) be induced to keep state across that? I doubt it. How about traffic analysis - if you were the only person on the Net doing this, there's no doubt you could be tracked that way, so can this be productised and deployed widely?

Do you have any idea how absurd this sounds? Nobody who's not already highly informed and determined is going to jump through these kinds of hoops to browse the web. You know, regular people?
posted by indubitable at 5:48 AM on February 25, 2015 [2 favorites]


Treat this as an engineering task in signal processing, or as a classic arms race, or as an exercise in economics.

Engineering solutions are fine for lots of problems, but I think the biggest problem with 20th century American society was that we tried to solve tons of socioeconomic problems with engineering solutions, and to some extent we still try to do that. We try to fight obesity with drugs and pseudoscientific diets rather than getting people healthy food; we try to surveil our way to security instead of fixing our dysfunctional foreign policy. Sometimes sociological problems need sociological solutions.
posted by thegears at 5:49 AM on February 25, 2015 [11 favorites]


Meanwhile, Ghostery counted 12 trackers on Vice's article…
posted by los pantalones del muerte at 5:55 AM on February 25, 2015 [11 favorites]


You might want to dump Adblock for uBlock, at least to better avoid ads. Still won't do much to keep you from being identifiable, though.

I do wish that some of the effort currently put into ad blocking was instead put into 'shady tracking service' blocking.
posted by Lanark at 6:03 AM on February 25, 2015


I do wish that some of the effort currently put into ad blocking was instead put into 'shady tracking service' blocking.

These are mostly the same things, actually. For example, from uBlock's rules match report on TFA:
vice.com*/mb_tracker.html
||cloudfront.net/atrk.gif?
||scorecardresearch.com^
posted by indubitable at 6:12 AM on February 25, 2015 [1 favorite]
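
The three filter lines quoted above use Adblock Plus style syntax: `||` anchors a rule to the start of a hostname (or any of its subdomains), `^` stands for a separator character or the end of the URL, and `*` is a wildcard. A deliberately simplified matcher, assuming only those three features, shows why one rule list catches both ads and trackers. This is a sketch, not uBlock's actual engine, and the test URLs are hypothetical.

```python
import re


def filter_to_regex(rule: str):
    """Translate a small subset of Adblock-style filter syntax to a regex."""
    anchored = rule.startswith("||")
    body = rule[2:] if anchored else rule
    parts = []
    for ch in body:
        if ch == "*":
            parts.append(".*")                # wildcard: any run of characters
        elif ch == "^":
            parts.append(r"(?:[^\w\-.%]|$)")  # separator character, or end of URL
        else:
            parts.append(re.escape(ch))       # everything else is literal
    pattern = "".join(parts)
    if anchored:
        # "||" matches at the start of the hostname or after any subdomain label.
        pattern = r"^[a-z][a-z0-9+.\-]*://(?:[^/?#]*\.)?" + pattern
    return re.compile(pattern)


# The rules quoted in the comment above.
RULES = [
    "vice.com*/mb_tracker.html",
    "||cloudfront.net/atrk.gif?",
    "||scorecardresearch.com^",
]
COMPILED = [filter_to_regex(rule) for rule in RULES]


def is_blocked(url: str) -> bool:
    """True if any rule matches the request URL."""
    return any(regex.search(url) for regex in COMPILED)


# Hypothetical request URLs for illustration.
print(is_blocked("http://b.scorecardresearch.com/beacon.js"))
print(is_blocked("http://www.metafilter.com/"))
```

The blocker only sees request URLs, so it cannot tell an ad from a tracking beacon; whatever host or path is on the list gets dropped.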


CBS 60 Minutes segment on defending privacy. Opting out of data brokers is described as a full time job.
posted by Brian B. at 6:13 AM on February 25, 2015


The ad I just got on this page while logged out on my phone? Hawking "public record search."
posted by JauntyFedora at 6:21 AM on February 25, 2015 [1 favorite]


We all need to adopt an obfuscation strategy. The best way to fight databases is to fill them with crap. I'm going to code up a little bot that just Googles items from a dictionary all day long, when my machine is idle. The dictionary will contain known diseases, and consumer items (luxury, ordinary, and durable goods). To prime the pump, I have just manually searched for "shingles", "dropsy", "ovarian cancer", "African sleeping sickness", and "boogie woogie flu".
posted by thelonius at 6:28 AM on February 25, 2015 [6 favorites]
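
A rough sketch of the planning half of the bot described above: pick terms from a dictionary and space them out with random gaps. It deliberately sends nothing anywhere; actually issuing the searches is left out. The disease terms are the ones from the comment, while the consumer goods, function name, and gap lengths are assumptions for illustration.

```python
import random

# Terms from the comment above.
DISEASES = [
    "shingles",
    "dropsy",
    "ovarian cancer",
    "African sleeping sickness",
    "boogie woogie flu",
]
# Hypothetical consumer items; the comment does not list any.
GOODS = ["wristwatch", "dish soap", "refrigerator"]


def chaff_schedule(count, rng, min_gap=30.0, max_gap=600.0):
    """Return (delay_seconds, query) pairs for `count` decoy searches."""
    terms = DISEASES + GOODS
    # Random gaps so the decoys do not arrive on an obvious fixed timer.
    return [(rng.uniform(min_gap, max_gap), rng.choice(terms)) for _ in range(count)]


for delay, query in chaff_schedule(3, random.Random()):
    print(f"wait {delay:.0f}s, then search for {query!r}")
```

Whether uniform random picks like these would survive filtering on the other end is exactly what later comments in the thread dispute.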


WTF? This is the entire reason for a system of laws and the criminal justice system.


Unfortunately, they're only concerned with poor bastards.
posted by louche mustachio at 6:36 AM on February 25, 2015 [4 favorites]


The problem with amateur "crapflood" strategies is that you are likely just some person reading a website who does something else for a living, whereas some of our nation's brightest math and computer science graduates gather and filter this kind of data as a full time job. It's likely that your gesture will be meaningless because they've figured out clever ways to screen out exactly this kind of thing.
posted by indubitable at 6:37 AM on February 25, 2015 [3 favorites]


Well, if they're so smart, why does Amazon keep showing me ads for things I just bought? I already have that thing!
posted by thelonius at 6:49 AM on February 25, 2015 [13 favorites]


As always, The Onion got there pretty quickly (May 2000): "Internet Opens Up Whole New World Of Illness For Local Hypochondriac."
posted by resurrexit at 6:54 AM on February 25, 2015 [2 favorites]


Isn't it better to flood them with info? Like search for every illness and disease known to mankind? Sort of garbage in, garbage out?

I'm really sympathetic to this point of view. I think it's spooky and unsettling when Amazon at my desk seems to know about my Google search from my tablet. Which it does.

But it seems likely to me that this sort of interference will soon be considered illegal computer hacking, if it isn't so considered already. Not that there is a specific law against it, but what you're suggesting really is a malicious attempt to interfere with the operation of a business, with a computer system you don't own, etc. The best way to accomplish the feat would be through a proxy or a browser plugin that automatically makes randomish requests, but that approach also makes the intent to mess with a rich entity's computer network pretty undeniable.

Even if it isn't considered illegal, it may not do any good. The ad networks are trying to send you ads you'll click on; if you attempt to "poison the well" by clicking on everything, they can't lose. Maybe you'll feel better, but they won't change their ways.
posted by Western Infidels at 7:00 AM on February 25, 2015


Ghostery and Disconnect Me are two browser extensions that help protect against some of them. Ghostery tracks and optionally disables all trackers. Metafilter isn't so bad in that it only has 3: Google Analytics, ChartBeat, and Quantcast. I got 11 on the Vice article, including 3 ad tracking beacons. I still run into occasional problems where Ghostery blocking breaks the page; sometimes the Javascript errors it generates force the rest of the page to stop running. But mostly it's harmless. OTOH it doesn't provide any visible benefit like an ad blocker does.

I'd love to read a detailed technical explanation of how this tracking works vs. SSL. I know the basic Referer header doesn't get sent if the referring page is an https URL; the old trick of tracking google searches with Referer mostly doesn't work now. But there are a lot more complicated ways a hostile page like Vice or CDC or the like can leak your personal data to third parties, and I'm guessing in practice they do leak data even if SSL is involved. I'd just like to read a precise explanation of all the ways they do.
posted by Nelson at 7:29 AM on February 25, 2015
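
On the Referer point above, a small sketch of the default rule in the HTTP/1.1 spec (RFC 7231), assuming no site-specific referrer policy is in play: the header is withheld only when an https page requests a plain http resource. The function name and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def sends_referer(from_url: str, to_url: str) -> bool:
    """Default rule: drop Referer only on an https -> http downgrade."""
    downgrade = (
        urlsplit(from_url).scheme == "https"
        and urlsplit(to_url).scheme == "http"
    )
    return not downgrade


# Hypothetical pages for illustration.
page = "https://health.example/conditions/shingles"
print(sends_referer(page, "http://tracker.example/pixel.gif"))   # withheld
print(sends_referer(page, "https://tracker.example/pixel.gif"))  # sent
```

Under that default, a tracker that is itself served over https still receives the full address of the page that embedded it, so SSL on the health site alone does not stop this particular leak.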


Isn't it better to flood them with info? Like search for every illness and disease known to mankind? Sort of garbage in, garbage out?

That's a great idea - someone needs to write some software that does random google searches for this type of thing during computer downtime.
posted by Salvor Hardin at 7:31 AM on February 25, 2015


The best way to accomplish the feat would be through a proxy or a browser plugin that automatically makes randomish requests

You want TrackMeNot. It's perfectly legal so far.
posted by hat_eater at 7:32 AM on February 25, 2015 [3 favorites]


Privacy Badger blocked 14 trackers on TFA, and uBlock stopped 17 requests. So 14 trackers and 3 ads I assume?
posted by COD at 7:37 AM on February 25, 2015


One thing concerning TrackMeNot: you have to manually edit the list of terms you don't want to show up in your search profile.
posted by hat_eater at 7:39 AM on February 25, 2015


I am Spartacus! And I have crotch mice!
posted by gimonca at 7:42 AM on February 25, 2015 [1 favorite]


I'd love to read a detailed technical explanation of how this tracking works vs. SSL.

TLS introduces all new, even harder to avoid trackers like the one baked right into the HSTS standard.

Privacy Badger blocked 14 trackers on TFA, and uBlock stopped 17 requests.

Really? Wow, mine must be sitting downstream of NoScript and only finding what NoScript doesn't deal with.
posted by indubitable at 7:53 AM on February 25, 2015 [1 favorite]
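
A sketch of how the HSTS trick mentioned above can store an identifier, assuming a tracker controls a set of subdomains. Each bit of the ID is written by having the browser remember (or not remember) an HSTS policy for one subdomain; on a later visit the tracker requests each subdomain over plain http and sees which ones the browser silently upgrades to https. The browser's HSTS memory is modelled here as a plain set, and the domain and function names are hypothetical.

```python
def hosts_to_pin(user_id: int, bits: int, domain: str) -> set:
    """Subdomains that should send an HSTS header to encode `user_id`."""
    # One subdomain per bit; only the 1-bits get an HSTS policy.
    return {f"bit{i}.{domain}" for i in range(bits) if (user_id >> i) & 1}


def read_back(hsts_store: set, bits: int, domain: str) -> int:
    """Rebuild the ID from which subdomains the browser upgrades to https."""
    return sum(1 << i for i in range(bits) if f"bit{i}.{domain}" in hsts_store)


# Stand-in for the browser's remembered HSTS hosts after the first visit.
browser_hsts = hosts_to_pin(0b101101, 8, "tracker.example")
print(sorted(browser_hsts))
print(read_back(browser_hsts, 8, "tracker.example"))  # 45, i.e. 0b101101
```

Nothing here is a cookie, so clearing cookies leaves the remembered HSTS hosts, and the identifier, in place.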


The best way to accomplish the feat would be through a proxy or a browser plugin that automatically makes randomish requests

The problem with that approach is that if everyone did it, the internet would slow to a crawl with all those random requests.
posted by Lanark at 7:54 AM on February 25, 2015


I got an expensive mail promo for pre paid cremation yesterday. Maybe those big data algorithms know something I do not.
posted by bukvich at 9:35 AM on February 25, 2015 [2 favorites]


"I wonder what my insurance company makes of me having head pigeons, crotch mice, and butt drama?"

"Yeah, it's too bad The Colbert Report is off the air. Dr. Steve could've done a segment of Cheating Death wherein he encourages The Nation to participate in a simultaneous orgy of googling spurious diseases.

Our nation turns its lonely eyes to you . . ."

posted by Herodios


Last Week Tonight with John Oliver could happily figure out a way to explain this and get people to tweet about it.
posted by ZeusHumms at 9:56 AM on February 25, 2015 [1 favorite]


Also, this explains some of the suggestions that LinkedIn makes to me.
posted by ZeusHumms at 9:57 AM on February 25, 2015


Data talks; Anybody can look up anything on the Internet for any reason. You're reading an article that references some disease, or news of a distant relative's disease, you look it up. Means nothing.

But money walks. Credit card purchases, mailing addresses, magazine subscriptions, all those hoary markers are a much more conspicuous indication of your interest trends, lifestyle, etc. Who ever spilled time and tears on all that for the 50 years they've been in vogue?

Perspective! It's the Prime Beef of what people themselves pour into the online cookpots like Frattie's Facebook that is amazing ... not the picked-over bones strewn about in their wanderings.
posted by Twang at 10:32 AM on February 25, 2015


Well, if they're so smart, why does Amazon keep showing me ads for things I just bought? I already have that thing!

Why does Twitter persist in shoving "Promoted" spam at me despite the fact that I block/report Every. Single. One. I. See. Hundreds of them now.
posted by sidereal at 10:33 AM on February 25, 2015 [1 favorite]


Joke's on you, data miners: I'm a hypochondriac!
posted by Eyebrows McGee at 10:49 AM on February 25, 2015 [4 favorites]


I love how often advertisers think I want to buy the stuff I clicked through to from askme. No, I don't want those boots, messenger bag, dresses, or jewelry - but nice try!
posted by ldthomps at 11:27 AM on February 25, 2015 [1 favorite]


I google a lot of really absurd things, and get to a lot of really odd things from sites like this very one we all are on. So my advertising tends to be a bit - well, "less than targeted."

For about a week or so, I was getting advertisements that actually said "Looking for dead bodies on everest?" and would link to some random store.

Anyways, as others have said, this sort of wholesale information gathering is not defensible via incognito or proxies. It is very trivial to confuse the advertising with input, though - I would love to see a summary of what all I am targeted for.
Subject is interested in:
Katzenklavier
Dead bodies on everest
Exploding head syndrome
Life expectancy of hobbits residing permanently on space stations
posted by MysticMCJ at 12:50 PM on February 25, 2015 [1 favorite]


Once in a while, I take a photo of my most recent Google searches on my phone, and they're... something:

Quiet power introverts
Caffeine content coffee ice cream
NPR streaming
Clip cat nails
Burrito cat towel how
Schefflera
What does being loved feel like
NPR stream listen
Petsmart nail clipping cat cost

I read somewhere that "search is your psyche laid bare" (was that here on Metafilter maybe?) and it's so true. It is both terrifying and awesome that all of this stuff is being tracked. I myself cringe at my search lists. It's the only thing that I am protective of my phone over - my boyfriend will grab it and I'll say, panicked, "Don't look at my search history!!" because I don't want anyone to know what is going on in my head. I don't even know - those photos of lists confound me. What is Schefflera and why was I Googling it on my phone? Who knows? Google probably knows better than I do what I was up to, because it knows what I clicked and how long I spent on each page and it knows what I was doing directly beforehand (probably looking at something on Metafilter about plants) - and so do any third parties they sold that data to. But I don't know!

We live in interesting times. Thirty years ago, you had to go ask another person that was trained to help you find information if you wanted to get information from the Internet, and they would perform a search for you using sophisticated terms and Boolean operators and all that. Now we can just grab the computers that we carry around in our pockets - the computers that are tracking our every movement, the computers that have cameras and microphones that can easily be turned on and monitored even when they aren't on, and we just throw a few words in to Google or say them aloud and bam, there's the answer, and as a bonus, the fact that you performed that search has now been logged and tracked and will be used to develop a better profile of you so that you can be sold things. And we don't think about that; we don't realize that there is a profound cost to having ready access to whatever information we want whenever we want it, because the cost is invisible. It's an unbelievable time. It is the future.
posted by sockermom at 1:11 PM on February 25, 2015 [6 favorites]


The Butlerian Jihad may happen ahead of schedule.
posted by double block and bleed at 2:47 PM on February 25, 2015


I'm waiting for doctors to make full use of this.

'Ah yes, Mr. jamjam; we've been expecting you -- in fact you're here at our clinic almost two weeks beyond the ETA window for the average patient with your pattern of searches. Some cognitive issues there, I'm sorry to say.

No, no -- no need for tests or your history (as if we didn't know it better than you do!), we already know what's wrong with you from your searches and other site visits: you have tuberous sclerosis.

There's no cure and not much treatment, and we have to make our money up front since we won't be seeing you on a continuing basis -- not that we would be anyway with your prognosis. That will be $1500 please, plus a $200 surcharge for having to circumvent your feeble anonymising protocols.'
posted by jamjam at 7:16 PM on February 25, 2015 [1 favorite]


I recognize that there's a downside to believing this, but I think that Big Data basically has no flipping idea what I'm [dildo] about, and they're kidding themselves if they think they can make me buy stuff based on an algorithmic assessment of my internet behaviour. And based on the reporting I've seen about other people's behaviour, they're trying to model nonsense, and their bosses keep paying for it because they're hoping [Cave Johnson] to cash in. The reason advertizing works now is because people are (mostly) all the same, and [ISIL] slicing up a bigger data pie in [Mormon] infinite ways isn't going to tell them anything new. "A difference is a difference that makes a difference" and all that. Bugger.
posted by sneebler at 9:21 PM on February 25, 2015


The problem with amateur "crapflood" strategies is that you are likely just some person reading a website who does something else for a living, whereas some of our nation's brightest math and computer science graduates gather and filter this kind of data as a full time job. It's likely that your gesture will be meaningless because they've figured out clever ways to screen out exactly this kind of thing.

Yea, this is an arms race, and we're going to lose. Even people who work in the industry are fucked. Because while we might want to hide our info and have a good understanding of what's getting picked up and how... there are those aforementioned very smart people getting paid borderline hilarious amounts of money, like enough that in more rural areas or the midwest their kids would be the "rich" kids at the high school, to stop you from blocking them and track you more effectively. Every time i hear about some new method like the verizon supercookie thing i go "fucking SERIOUSLY!?!" and then realize, why am i surprised?

Arming up is not the solution to winning this, that's like trying to fight the US army with legally available weapons. You're going to get hit with a missile from a drone above the clouds 10 minutes in.

This stuff needs to be regulated, and there needs to be gigantic fines for doing it. But that's about as likely to happen as BP getting punished for the oil spill and dumping all that corexit.

I don't know what the solution is, but trying to out-obfuscate them isn't it. And with stuff like what verizon was doing, there is pretty much no solution outside of "use a VPN" at which point you're still on the internet, in a normal browser, and all the other methods are still functional. You can develop a giant stack of shit to try and cover up what you're doing, but there's always going to be some new outrageous method that fucks you over. That, or you've gotten to the point that even if they can only vaguely track you server side, you're leaving such a weird footprint from everything you've done that now you're the albino deer.
posted by emptythought at 4:45 AM on February 26, 2015


Isn't it better to flood them with info? Like search for every illness and disease known to mankind? Sort of garbage in, garbage out?

That's a great idea - someone needs to write some software that does random google searches for this type of thing during computer downtime.


Yes, since the NSA revelations, I've been waiting for someone to launch a "CHAFF app" which randomly sends messages or makes searches which contain NSA keywords or encrypted packets of nonsense along with encrypted real data. We've all got enough WiFi bandwidth to do this, and it would just be small amounts of data that would in no way be accused of choking the infrastructure like bittorrent. It really needs to be included with a massively used app like gmail though or only the paranoid will be identifying themselves.
posted by guy72277 at 1:07 AM on March 3, 2015




This thread has been archived and is closed to new comments