Cookieless Monster
August 28, 2013 10:59 PM

Cookieless Monster: Exploring the Ecosystem of Web-based Device Fingerprinting [pdf]. From the 2013 IEEE Symposium on Security and Privacy, this article examines "how web-based device fingerprinting currently works on the Internet. By analyzing the code of three popular browser-fingerprinting code providers, we reveal the techniques that allow websites to track users without the need of client-side identifiers [i.e. cookies]."

A November 2010 Wall Street Journal article (Race Is On to 'Fingerprint' Phones, PCs) provides a neat summary of the matter: "It might seem that one computer is pretty much like any other. Far from it: Each has a different clock setting, different fonts, different software and many other characteristics that make it unique. Every time a typical computer goes online, it broadcasts hundreds of such details as a calling card to other computers it communicates with. Tracking companies can use this data to uniquely identify computers, cellphones and other devices, and then build profiles of the people who use them."
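
As a rough illustration of the idea in that summary, here is a minimal Python sketch that folds a handful of reported details into one stable identifier. The attribute names and values are invented for the example, and this is not the method of any provider examined in the paper; it only shows why no cookie is needed when the same details are sent on every visit.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a dict of browser-reported attributes into a short identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical attribute values, invented for this example.
visit_one = {
    "user_agent": "ExampleBrowser/1.0",
    "timezone_offset_minutes": -600,
    "screen": "1920x1080x24",
    "fonts": ["Eurostile", "Peignot", "Times New Roman"],
}
visit_two = dict(visit_one)  # same machine, later visit, no cookie stored

# Identical details give an identical identifier on both visits.
same_device = fingerprint(visit_one) == fingerprint(visit_two)
# A machine with a different font list gets a different identifier.
other_device = fingerprint({**visit_one, "fonts": ["Times New Roman"]})
```

Nothing is stored on the client here: the identifier is recomputed from whatever the browser reports each time.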

And from the pdf: "...browser extensions that are available for users who wish to spoof the identity of their browser...all fail to completely hide the browser's true identity. This incomplete coverage not only voids the extensions but, ironically, also allows fingerprinting companies to detect the fact that user is attempting to hide, adding extra fingerprintable information."
posted by paleyellowwithorange (33 comments total) 32 users marked this as a favorite
 
May as well chuck a link to this here: panopticlick, so you can see how unique your browser footprint is.

There's a good recent article on Cookieless Web Tracking Using HTTP's ETag, which may or may not be related to the linked pdf, and is relatively non-technical.
posted by Mezentian at 11:39 PM on August 28, 2013 [7 favorites]
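
For anyone wondering how an ETag could stand in for a cookie, here is a toy Python sketch of the general mechanism. It is not taken from the linked article, the TrackingServer class is hypothetical, and no real network traffic is involved; it assumes only the standard HTTP behaviour where a caching browser sends a stored ETag back in an If-None-Match header.

```python
import uuid

class TrackingServer:
    """Toy server that abuses ETag revalidation as an identifier."""

    def __init__(self):
        self.visits = {}  # ETag value -> number of requests seen

    def handle(self, request_headers: dict) -> tuple[int, dict]:
        etag = request_headers.get("If-None-Match")
        if etag in self.visits:
            # Returning browser: its cache echoed back the tag we issued.
            self.visits[etag] += 1
            return 304, {"ETag": etag}
        # New browser: issue a unique tag for the cache to store.
        etag = '"' + uuid.uuid4().hex + '"'
        self.visits[etag] = 1
        return 200, {"ETag": etag}

server = TrackingServer()
status_first, headers = server.handle({})  # first visit, empty cache
# A caching browser revalidates by sending the stored tag back.
status_second, _ = server.handle({"If-None-Match": headers["ETag"]})
```

In this sketch, clearing the browser cache would drop the stored tag, so the next request would look like a new visitor.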


These discrepancies, combined with other weaknesses found in less thorough user-agent-spoofing extensions (see Section V), can uncover not only that the user is trying to hide, but also that she is using Torbutton to do so.

It's not meant to be hard to determine whether someone's browsing through Tor. There's a limited number of exit nodes. Detection by websites is not a bug.

Other than that, interesting. Explains why a crappy webform I was trying to fill out managed to kick me out despite having changed my User-Agent...
posted by BungaDunga at 11:41 PM on August 28, 2013


Panopticlick sez: Your browser fingerprint appears to be unique among the 3,332,983 tested so far.

Oh crap, it's my weird choice of system fonts, isn't it? Well, if you want to take away my Peignot and Eurostile, you'll have to pry them from my cold, dead c-drive!
posted by oneswellfoop at 11:47 PM on August 28, 2013 [1 favorite]


Avoiding leaking information through browsers - which are basically interpreters for arbitrary public code - is an extremely hard problem. Defending against attacks that target Javascript functionality, bugs, and behavior to identify browser versions seems like it's completely insolvable, for example. Developing WebGL applications is hindered by the fact that sending graphics card, driver, and even available memory information gives away too many bits of info, but various quirks can be exploited to get this information anyhow.

If that weren't enough, a variation on the old CSS history detection hack using :visited has resurfaced, now using requestAnimationFrame timing to identify sites you've visited.
posted by lantius at 11:51 PM on August 28, 2013 [1 favorite]


now using requestAnimationFrame timing to identify sites you've visited.

And to read text inside a cross-domain iframe- fairly laboriously, but still.
posted by BungaDunga at 12:06 AM on August 29, 2013


Tracking companies can use this data to uniquely identify computers, cellphones and other devices, and then build profiles of the people who use them.

The NSA has of course been doing this for years and it's nice to see the government's investment into basic R&D become a new job-creating industry for private enterprise.
posted by three blind mice at 12:19 AM on August 29, 2013


Things like this are why I smile when I read people saying "Oh, I use RSA encryption with a 65536-bit key, triple-triple-DES and ROT13 everything twice!" Because that's like a prisoner keeping a diary written in a private code: yes it's probably secure but the authorities know everything you do and everybody you talk to and the extent of their knowledge is only constrained by their lack of interest.
posted by Joe in Australia at 1:18 AM on August 29, 2013 [3 favorites]


panopticlick: "Your browser fingerprint appears to be unique among the 3,333,434 tested so far."

next visit: "Your browser fingerprint appears to be unique among the 3,333,445 tested so far."

No changes to browser between visits.
posted by fredludd at 2:46 AM on August 29, 2013


uh, not to put too fine a point on things fredludd, but I think that's the idea.

They know it's you.
posted by hobo gitano de queretaro at 2:52 AM on August 29, 2013 [3 favorites]


ROT13 everything twice!"

I ROT13'd this twice using LEETkey.
Now we can talk in private.
Like our very own cone of silence.
posted by Mezentian at 2:53 AM on August 29, 2013


Mobile / tablet platforms are locked down enough that this fingerprinting doesn't seem to work terribly well- at least for iOS.

Strange that less freedom can mean more freedom.
posted by jenkinsEar at 2:55 AM on August 29, 2013 [2 favorites]


They know it's you.

I misunderstood them, then. I thought they were saying, "Gosh, we've never seen this browser before."
posted by fredludd at 3:20 AM on August 29, 2013


> The NSA has of course been doing this for years...

How do you know this?
posted by ardgedee at 3:35 AM on August 29, 2013


I ROT13'd this twice using LEETkey.
Now we can talk in private.
Like our very own cone of silence.


What do I need to do to read this?
posted by Joe in Australia at 3:41 AM on August 29, 2013 [2 favorites]


So... no way to combat, if not defeat, this sort of thing, then? Other than not using a computer, that is.
posted by InsertNiftyNameHere at 3:43 AM on August 29, 2013


Actually not so hard. Use a browser inside a virtual machine with a vanilla installation of your operating system, and use a fresh virtual machine each time you log on. That way the only thing that persists between sessions is your IP address.
posted by Joe in Australia at 3:50 AM on August 29, 2013 [1 favorite]


VMWare is the new burner phone.
posted by fullerine at 4:31 AM on August 29, 2013 [3 favorites]


The few times I've been to that Panopticlick website, I've always wished it would have a little note at the end saying:

"By removing fonts x and y and deleting cookie z you will suddenly be lost in a crowd of 14,231 people".
posted by Static Vagabond at 5:08 AM on August 29, 2013 [14 favorites]


Actually not so hard. Use a browser inside a virtual machine with a vanilla installation of your operating system, and use a fresh virtual machine each time you log on. That way the only thing that persists between sessions is your IP address.

Just don't screw up and visit any site that you've previously been fingerprinted on a different machine.
posted by T.D. Strange at 5:13 AM on August 29, 2013


Within our dataset of several million visitors, only one in 175,471
browsers have the same fingerprint as yours.

Currently, we estimate that your browser has a fingerprint that conveys
17.42 bits of identifying information.
lynx on Debian Wheezy
posted by hardcode at 5:25 AM on August 29, 2013 [1 favorite]
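
The two Panopticlick figures quoted in the comment above line up if the "bits" are read as the base-2 logarithm of the one-in-N number. That reading is an assumption about how the site computes it, but it matches the quoted values. A quick Python check:

```python
import math

def surprisal_bits(one_in_n: float) -> float:
    """Bits of identifying information when 1 in `one_in_n` browsers match."""
    return math.log2(one_in_n)

# Figure quoted in the comment above: one in 175,471 browsers.
bits = surprisal_bits(175_471)
print(f"{bits:.2f} bits")  # 17.42 bits

# A fingerprint unique among N tested browsers carries at least log2(N) bits.
unique_bits = surprisal_bits(3_332_983)
```

By the same arithmetic, the "unique among 3,332,983" results earlier in the thread correspond to a bit over 21 bits.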


It's surprising to me there isn't a plugin to hide plugin details (or is that pointless?). Even as a web developer I don't see why I need to know a user's browser plugins (yes I can think of advantages to it but you'd have to have a lot of time and budget on your hands to differentiate). Can't the browser just say, "Yes I accept PDFs, mp3s ..." rather than

Microsoft Office 2010; Office Authorization plug-in for NPAPI browsers

Why the hell do I need to be broadcasting that?
posted by yerfatma at 5:40 AM on August 29, 2013


>It's surprising to me there isn't a plugin to hide plugin details (or is that pointless?).

this, plus why no addon to disable or obfuscate the browser features that assist with identification (fonts, screen, etc).

There are so many other little tells that leak identity information anyway (the above, chrome phone home, safebrowsing, geo.enabled, ???) that I suspect that private mode is a joke, and all it does is clear your local cache. The world still knows what you did yesterday and last summer.

I doubt that you can trust anyone to provide you with a robust private browser. With all the shenanigans going on nowadays, you have to wonder if they'd be allowed to do so. So, accept that all your dark secrets aren't really :)
posted by w.fugawe at 6:46 AM on August 29, 2013


...because your boss just wants to click on his web Outlook icon and see the document without having to deal with any of those confusing [Save file....] boxes that gave him a virus the last time?
posted by Orb2069 at 6:46 AM on August 29, 2013


Yeah, don't confuse the so-called stealth or privacy modes offered by some browsers with, y'know, actual stealth or privacy. The only thing they do is avoid leaving a trail on your own computer, in case someone checks out your history or cache. They don't obscure anything whatsoever in terms of network traffic and activity.
posted by George_Spiggott at 7:49 AM on August 29, 2013


Mobile / tablet platforms are locked down enough that this fingerprinting doesn't seem to work terribly well- at least for iOS. Strange that less freedom can mean more freedom.

Well, more privacy, which is an overlapping but not equivalent notion. But this is probably the most fruitful avenue to pursue for any developer who wants to cater to privacy concerns: a standard browser distribution with good functionality which is not customizable, or which can sandbox any customizations you make in order to work with a given site, to only expose those customizations to those sites.

In other words, trusted sites that need some nonstandard settings and plugins could be whitelisted to expose those differences if needed, but by default sites see a browser fingerprint that's indistinguishable from a million others.

There are plugins that offer slivers of this functionality, and Firefox natively lets you keep things like flash and java turned off until you need them. Flash cookies are particularly nasty; I strongly recommend you get and configure the BetterPrivacy plugin or some equivalent. Having said that, I do all that and more and apparently panopticlick still finds me wonderfully distinctive.
posted by George_Spiggott at 8:13 AM on August 29, 2013 [1 favorite]


I'm having a hard time understanding how my Firefox, Chrome, and Safari browsers are each broadcasting unique signatures according to Panopticlick, but for different reasons. (Firefox seems to have peculiar browser plugin details, while Safari and Chrome report peculiar system fonts? Seems ... fishy.)

And now I try again with Firefox and it tells me

only one in 1,667,250 browsers have the same fingerprint as yours.

Wait, a few seconds ago I was unique among the 3.2 million+ browsers they'd seen. Either they're double-counting me (stupid) or one of you is my secret doppelganger. (Can we meet up?)
posted by RedOrGreen at 8:34 AM on August 29, 2013


Browser fingerprinting is nothing new, but this paper is the first time I've seen evidence that it's actually being used in practice. If you read the fine article you'll see they examine three commercial libraries: BlueCava, Iovation, and ThreatMetrix. This takes the technique beyond the realm of some theoretical thing that bad people might play with to an everyday, easy to apply technique being used in the real world. By ad networks, no doubt. (It's a bit crazy that an academic paper is basically them reverse engineering commercial methods. But I'm glad it's written.)

This paper doesn't talk about who specifically is using these grey-hat technologies. There's a bit of that in the cited paper Third-Party Web Tracking: Policy and Technology (Jonathan R. Mayer and John C. Mitchell, Stanford University, 2012).

I'm a little skeptical of how useful the enumeration of fonts and plugins is; those things change on a weekly time scale, right? But maybe that's enough, and of course with enough extra data you can track the user through some changes.
posted by Nelson at 8:34 AM on August 29, 2013
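
On the point about tracking a user through some changes: one way it could work, sketched in Python, is to compare font and plugin lists by overlap instead of requiring an exact match. The profile names, the item lists, and the 0.6 threshold are all made up for illustration, and nothing here is claimed to be what the commercial libraries do.

```python
def jaccard(a: set, b: set) -> float:
    """Share of items common to both sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def best_match(observed: set, known: dict, threshold: float = 0.6):
    """Return the known profile id most similar to `observed`, or None."""
    scored = [(jaccard(observed, items), pid) for pid, items in known.items()]
    score, pid = max(scored, default=(0.0, None))
    return pid if score >= threshold else None

# Hypothetical stored profiles: sets of font and plugin names.
known = {
    "profile-a": {"Arial", "Eurostile", "Peignot", "Flash 11.8", "Java 7"},
    "profile-b": {"Arial", "Helvetica", "QuickTime 7"},
}
# Same machine a week later: one plugin updated, everything else unchanged.
later = {"Arial", "Eurostile", "Peignot", "Flash 11.9", "Java 7"}
match = best_match(later, known)  # overlap with profile-a is 4 of 6 items
```

An exact-match fingerprint would treat the updated plugin as a brand new browser; the overlap score still ties it to the earlier profile.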


I'm a little skeptical of how useful the enumeration of fonts and plugins is

Hmm, maybe that's the easiest fix: instead of trying to hide your details, have a plugin (or OS-level logic) that reports a random set of fonts each time.
posted by yerfatma at 8:51 AM on August 29, 2013
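
A minimal Python sketch of the random-fonts idea, assuming a made-up list of installed fonts and a simple hash over the reported list. The seeds are fixed only so the example is repeatable; a real tool would draw fresh randomness per session.

```python
import hashlib
import random

# Hypothetical installed fonts, invented for this example.
INSTALLED_FONTS = ["Arial", "Eurostile", "Peignot", "Times New Roman", "Verdana"]

def reported_fonts(rng: random.Random, keep: int = 3) -> list:
    """Report a random subset of the installed fonts for this session."""
    return sorted(rng.sample(INSTALLED_FONTS, keep))

def font_fingerprint(fonts: list) -> str:
    """Hash the reported font list into a short identifier."""
    return hashlib.sha256(",".join(fonts).encode("utf-8")).hexdigest()[:16]

# Each session draws its own subset of the installed fonts.
sessions = [reported_fonts(random.Random(seed)) for seed in range(20)]
distinct = {font_fingerprint(fonts) for fonts in sessions}
```

Since every reported subset is still drawn from the real list, a rare font can keep turning up across sessions, so this changes the hash without necessarily hiding the machine.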


This chrome install is uniquely identifiable (not really surprising tbh). My more-paranoid firefox install with noscript is 1 in ~21k, mostly thanks to noscript.

I had a vanilla winxp install in vmware that wiped the disk on boot that I tried to use a few desktops ago, should probably revisit that, though ideally with a debian box that just pretends to be XP.
posted by Skorgu at 9:46 AM on August 29, 2013


The HTML Living Standard explicitly addresses the plugin issue:
The fewer plugins are represented by the PluginArray object, and of those, the more that are hidden, the more the user's privacy will be protected. Each exposed plugin increases the number of bits that can be derived for fingerprinting. Hiding a plugin helps, but unless it is an extremely rare plugin, it is likely that a site attempting to derive the list of plugins can still determine whether the plugin is supported or not by probing for it by name (the names of popular plugins are widely known). Therefore not exposing a plugin at all is preferred. Unfortunately, many legacy sites use this feature to determine, for example, which plugin to use to play video. Not exposing any plugins at all might therefore not be entirely plausible.
So, as usual, it's maintaining compatibility with legacy sites that ends up being the rub. They do go on to explicitly suggest that plugins never return anything other than a minimal list of mime-types and a name - none of this long arbitrarily complex version string nonsense.

They also mark all of the "fingerprinting vectors" in the spec with a big swirly fingerprint. There are a lot of them.
posted by lantius at 12:47 PM on August 29, 2013 [1 favorite]


Easy to avoid by constantly making small changes to the browser cfg - automated with a plugin of course, that itself is visible.
posted by stbalbach at 1:18 PM on August 29, 2013


Worth noting that the Chrome "incognito" mode does nothing against these methods. It still looks unique to Panopticlick.
posted by pashdown at 11:19 AM on September 2, 2013


Per the Chromium developers it's a non-design goal:
Note that the purpose of incognito (which is explained in new incognito windows) is to prevent *local* traces of what you do, not mask your identity from remote servers.
Presumably that's due to all the difficulties above with breaking sites by changing what browser properties can be enumerated.
posted by lantius at 12:38 AM on September 4, 2013



