The Social Network Will Be Monetized
September 4, 2010 7:10 PM

Social Networks and Data Mining: Where it is and Where it's Going
Telecoms operators naturally prize mobile-phone subscribers who spend a lot, but some thriftier customers, it turns out, are actually more valuable. Known as “influencers”, these subscribers frequently persuade their friends, family and colleagues to follow them when they switch to a rival operator. The trick, then, is to identify such trendsetting subscribers and keep them on board with special discounts and promotions. People at the top of the office or social pecking order often receive quick callbacks, do not worry about calling other people late at night and tend to get more calls at times when social events are most often organised, such as Friday afternoons. Influential customers also reveal their clout by making long calls, while the calls they receive are generally short. Companies can spot these influencers, and work out all sorts of other things about their customers, by crunching vast quantities of calling data with sophisticated “network analysis” software. Instead of looking at the call records of a single customer at a time, it looks at customers within the context of their social network.
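One way to picture the call-length signal described above is a minimal sketch like the following. It is not the operators' actual "network analysis" software; the call records, the names, and the scoring rule (mean outgoing call length divided by mean incoming call length) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical call records: (caller, callee, duration in seconds).
CALLS = [
    ("ann", "bob", 600),
    ("ann", "cat", 540),
    ("bob", "ann", 45),
    ("cat", "ann", 30),
    ("bob", "cat", 120),
    ("cat", "bob", 110),
]


def influence_scores(calls):
    """Score each subscriber by mean outgoing / mean incoming call length.

    A ratio well above 1 matches the pattern in the quoted text: long calls
    made, short calls received. This is an illustrative heuristic only.
    """
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for caller, callee, seconds in calls:
        outgoing[caller].append(seconds)
        incoming[callee].append(seconds)

    scores = {}
    # Only score subscribers who have calls in both directions.
    for person in outgoing.keys() & incoming.keys():
        scores[person] = mean(outgoing[person]) / mean(incoming[person])
    return scores


scores = influence_scores(CALLS)
top = max(scores, key=scores.get)  # subscriber with the highest ratio
```

A real system would presumably combine several signals (callback speed, time of day, who follows whom to a rival operator) across the whole call graph; this sketch covers only the call-duration one.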
posted by Weebot (20 comments total) 24 users marked this as a favorite
Hrm. Excellent fodder for the paranoid. Banks are using social network information coupled with IRS records to make loans? Emails are being scanned by the Army Criminal Investigation Command? Police are planning their deployment based on Twitter and Facebook?

It looks like John Poindexter's Total Information Awareness came to pass after all.
posted by hippybear at 7:40 PM on September 4, 2010

Every once in a while, when it comes to data mining or stuff like net neutrality, people will say stuff like "Banks aren't interested in data mining and knowing all about you!" or "ISPs aren't interested in doing special deals for content companies." Or whatever.

It's a really stupid argument to make, because the fact that they aren't interested now doesn't mean they won't be in the future.

The other thing that comes up is the whole "How exactly is this a problem?" argument, where because someone can't figure out exactly how this hurts them, it's a non-issue. But the fact is most people just find this kind of thing creepy.

And secondly, if call patterns are being mined to give some people special deals, that means, uh, you might not get them and would therefore be paying more for phone service because, according to the algorithm, no one cares whether or not you're getting good service.

(When it comes to banks, there have actually been cases where banks have raised people's interest rates because of where they shopped. I.e., if you shop at the same stores as people who are more likely to default on their loans, then your interest rates go up.)
posted by delmoi at 7:47 PM on September 4, 2010 [2 favorites]

In some companies, e-mails are analysed automatically to help bosses manage their workers. Employees who are often asked for advice may be good candidates for promotion, for example.

How would/do people react if they knew this was going on? My workplace already has a bifurcated communications culture, partially because people thought more routine monitoring was going on and they didn't want the sort of backchannel IM snark that goes on during phone conferences made discoverable in the sanctioned messaging system.

Seems like this sort of analysis would just encourage even more of that kind of behavior, for more kinds of communication people would have been perfectly comfortable having "on the record" in the past. I can see it making certain people clam up for fear of being sorted into the "asker" rather than the "knower" tier, for instance.
posted by mph at 7:54 PM on September 4, 2010 [1 favorite]

I always thought it was a riot how so many revolutionary types go out and get an email account at Somehow, having all the uppity types' emails all at one easy-to-keep-track-of server sounded like a big early Christmas present for The Man.
posted by dunkadunc at 8:06 PM on September 4, 2010

This reminds me of a tech talk I attended when I was still at UC Berkeley.

These guys were presenting on what their platform could do by pulling together data from loads of different resources to make connections that wouldn't otherwise have been seen. They demoed it being used to find a hypothetical terrorist cell by looking at Facebook, phone records, and credit card receipts or something.

I remember thinking that it sounded great as long as it was being used to catch actual bad guys, but it also struck me as something with a huge potential for {mis,ab}use. It's interesting to see that this sort of thing is much more widespread than most of us usually think.
posted by spitefulcrow at 8:10 PM on September 4, 2010

most people just find this kind of thing creepy

And the rest probably will eventually, because this is nothing compared to what they'll know about you in 15 years. But the genie is out of the bottle. I think everybody is going to have to learn to live under a digital microscope. The best we can hope for is secure anonymity whenever and however it applies.

In some companies, e-mails are analysed automatically to help bosses manage their workers. Employees who are often asked for advice may be good candidates for promotion, for example.

I find it hard to believe both that companies are doing that and that it actually works very well. Until a computer can read and understand an email as well as a human, any manager who bases promotions on this information isn't very good. Unless the computer just helps the manager eavesdrop more efficiently.
posted by Camofrog at 8:14 PM on September 4, 2010

I read Idoru recently, and it's basically all about data-mining.

Being the Economist, they seem to be taking a pretty rosy view of the issue without looking towards bad possibilities. I'd sort of like some more technical details.
posted by codacorolla at 8:14 PM on September 4, 2010

It's a really stupid argument to make, because the fact that they aren't interested now doesn't mean they won't be in the future.

You mean like, maybe, using all the helpfully tagged public photos on Facebook, Flickr and so on to get people's faces into a biometrics database?
posted by dunkadunc at 8:32 PM on September 4, 2010

I always thought it was a riot how so many revolutionary types go out and get an email account at

Well, if the email is kept secure, it's not a problem. It's not like they need all the email to be on one server at all. "The man" is going to have a lot easier time getting email from major providers than from niche sites. I don't know anything about, but if they have good security and privacy policies, it's better than signing up with some huge company.

Anyway, this particular thing has nothing to do with email, but rather

And the rest probably will eventually, because this is nothing compared to what they'll know about you in 15 years. But the genie is out of the bottle.

So pass some laws and cram it back in. There's no reason why phone companies should be allowed to, for example, data mine your phone records to determine a 'social network' based on who you call and whether or not you're important.

I read Idoru recently, and it's basically all about data-mining.

What? It's been a while since I read that (like a decade and a half), but I thought it was about (spoilers)
posted by delmoi at 9:34 PM on September 4, 2010

What? It's been a while since I read that (like a decade and a half), but I thought it was about (spoilers)

Laney, the main character, works for SlitScan, which is a news entertainment show that mines several thousand databases for "nodal points" to embarrass celebrities and public figures. "All about" was a bit of an overstatement, but it's one of the major themes.
posted by codacorolla at 9:53 PM on September 4, 2010

I seriously wonder what a law banning data mining would look like. If the definition is too broad, it outlaws basic functions necessary for setting prices or improving network functionality. (Ex: Ok, AT&T, you're not allowed to make marketing decisions based on aggregate calling data?) If it's too narrow, it rules out specific cases of data mining which a crafty company will quickly find a way to circumvent (around codes which no one would want to actually enforce). (Ex: No giving discounts to people based on their average incoming versus outgoing call durations.)
posted by kaibutsu at 3:56 AM on September 5, 2010 [1 favorite]

How about we just make it mandatory to release the source of your data mining software? That way, anyone who wants to can change their behavior to look the best according to your metrics.
posted by LogicalDash at 4:12 AM on September 5, 2010 [1 favorite]

We can catch terrorists *AND* show appropriate ad content? Truly these are the salad days.
posted by RobotVoodooPower at 6:59 AM on September 5, 2010

LogicalDash: What about learning algorithms? (Also the idea that "anyone who wants to" and - wait, that was sarcasm, wasn't it? Sneaky.)

Delmoi: How will your law differentiate between good uses of data mining and bad uses? Or between telcos giving deals to their best customers and mom'n'pop outfits giving deals to their best customers? (Both are essentially the same thing, IMO, at very different scales). And how will you stop the offshoring of data mining?
posted by Leon at 8:06 AM on September 5, 2010

I can't think of anyone that would be super-useful to more than law enforcement, but of course they've had your tagged driver's license photo for ages.

One photo for every 5-10 years, one profile. Probably not very useful when it comes to automated recognition. But a corpus of 30, possibly even 30 every year or two? Well, now, that'd be pretty helpful to my evil plan: start with a home and/or retail security company. Go to your customers and say "wouldn't it be nice to know who is coming into your home/store?" It would, wouldn't it? Is that magazine salesman really a thief, a drug dealer, or worse, a child molester? Is that young woman who came into your store just now a serial shoplifter? We'll combine the latest in facial recognition technology with our up-to-date personal information databases and let you know! And we'll share retail profile information with you as well -- you'll have a chance to offer tailored in-store promos to your valued guests! Of course, you'll be sending back information to us as part of the agreement. Really, the Facebook corpus is just a bootstrap. Pretty soon we'll have a good idea of where most people are at any given moment, as well as which homes and businesses they tend to visit and when. But there will be some holes -- there's a lot of space between businesses and homes. People spend a lot of that in the car, so, to fill in that gap, we could invest in road cameras, maybe even public area CCTV like they have in England. We'll already have a good relationship with law enforcement, which will be curious about the home/residence information, so we'll have the political relationships in place to get in the door there.

Of course, it might not even be that hard, given the way people are starting to gravitate towards augmented reality with mobile devices. It really depends on whether somebody decides to do what I've outlined above before or after the time everyone's sporting eyewear with built-in displays and cameras.

The tagged corpus isn't the only way to bootstrap, of course, but it's a good one. The panopticon is coming, and I know Metafilter will be unsurprised by this, but it won't be Big Brother that builds it, it'll be Big Business.
posted by weston at 10:12 AM on September 5, 2010 [4 favorites]

Hmm, am I the only one who read, jumped to all the same big-brother-business conclusions as most people here, and started scheming ways to get in on the action?
posted by signal at 4:46 PM on September 5, 2010

One of the things that occurred to me about the addresses is that it's getting harder and harder for small providers to send email without it being marked as spam. I have a domain address that I've owned for ten years and the idea of just giving it to Google to run mail is looking increasingly appealing, in large part because major providers are looking askance at my server more and more often even though we're reasonably careful about spam. But I forward my domain address to Gmail, which means it picks up all the spam I get after ten years of the same address. If I were running the mail through Google Apps, Google would deal with all that for me.

I don't see why anyone would run a solo server any more unless they had a very specific reason to do it, because it's such a pain in the ass. And that just pushes more and more people into the view of the business panopticon. Creepy as it is, most of us will end up doing it because it's less hassle to stop being a curmudgeon and give up our privacy.
posted by immlass at 8:24 AM on September 6, 2010

The major email providers have special logins for law enforcement. For example, here is a document from Microsoft for law enforcement about their service:,_24_Feb_2010
posted by Joe Chip at 2:44 PM on September 8, 2010

Over the past week I've started graduate studies in information and library science, and I've come back to this article with some new ideas. One of the things to get my mind going was an article called “Information as Thing”. It basically covers what is and is not information, and how we act upon it, and what it means to create collections of information that may be useful. While not related directly to datamining exactly, it is a good overview of Information Science, and where I'm drawing a lot of ideas for what I'm about to write – I'd recommend it for anyone interested in the field.


It seems to me that there is a certain chain of events by which reality becomes information, and by which that information is then turned into knowledge and acted upon.

To make a document is to record an event. Not all documents are exactly records of events. The ones that we're most familiar with are images and videos and stories. We're also familiar with accounts of experiments, which are events conducted under controlled and examined circumstances. Fiction is a representation of reality as it is, or as it may be, or as it could be – in a certain sense a synthesis of events and imagination. Models could also be considered documents, although they're more often documents that are meant to be representations of things rather than records of events.

I envision it like this:

Reality → Event (a discrete chain of infinitely small sub-events tied together in some way) → Observation (sometimes) → Documentation & Storage (more rarely) → Access (even more rarely) → Understanding → Action/Use

When you get to the last stage of use the cycle loops back in on itself (I guess it's a pretty recursive system all around, since the very acts of observation and documentation affect reality). Using information to do something creates its own impact on reality, and its own chain of events, and more documentation that might be acted upon.

Computers make things that are exceedingly hard for humans accessible in nanoseconds, such as reproducing text, storing lots of information with great fidelity, and using mathematical logic to make connections and decisions. Before the age of computers (maybe more accurately, before the age of networked computers) the gap between the event occurring and the use of the documentation was very large, and very elastic. It could be anywhere from seconds to decades, and in only a very basic sense was it automated.

Consider a credit card transaction (I worked in fraud for a credit card company, and I've written about it here before). In reality you want to buy lunch, so you pull out your credit card and it's swiped. The data travels through the electricity of the wires between you and the processing center, is routed to the proper authentication server, numbers are crunched in a database, and then a reply is sent back that allows you to take your food. This happens in a few seconds. Almost instantly a computer acts on the “knowledge” (I don't know if you can use that term for a computer) to make decisions. It can check every purchase you've ever made against a complex algorithm and try to determine if this is a legitimate transaction. It can add this purchase to your others, and decide whether to raise your credit limit the next time it's able. It can route anything it doesn't understand to a human component of the system. In essence the space between an event and when the event is transformed into data and acted upon is minuscule.
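As a rough sketch of the decision loop described above, and not how any real issuer does it: the `Account` fields, the `authorize` function, and the z-score cutoff below are all made up for illustration. The point is only the shape of the thing, where one purchase is checked against history, odd cases go to a human, and the approved purchase becomes data for the next decision.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class Account:
    # All fields are hypothetical; a real issuer tracks far more.
    credit_limit: float
    balance: float = 0.0
    history: list = field(default_factory=list)  # past purchase amounts


def authorize(account, amount, z_cutoff=3.0):
    """Return "approve", "decline", or "review" for one purchase."""
    if account.balance + amount > account.credit_limit:
        return "decline"
    if len(account.history) >= 5:
        mu = mean(account.history)
        sigma = pstdev(account.history)
        # Amounts far above the usual range go to a human analyst.
        if sigma > 0 and (amount - mu) / sigma > z_cutoff:
            return "review"
    # The approved purchase becomes data for the next decision.
    account.history.append(amount)
    account.balance += amount
    return "approve"


acct = Account(credit_limit=2000.0, history=[12.0, 9.5, 14.0, 11.0, 10.5])
lunch = authorize(acct, 13.0)    # ordinary purchase
odd = authorize(acct, 900.0)     # far outside the usual amounts
over = authorize(acct, 5000.0)   # would exceed the credit limit
```

Here the lunch is approved and recorded, the 900.0 purchase is routed to review, and the 5000.0 purchase is declined outright.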

A few points for a rambly post:

As datamining becomes more effective, it will obviously affect reality more dramatically.

Whereas before data was largely synthesized into knowledge (slowly and imperfectly) and then acted on by human components, knowledge is becoming instead a commodity that can be hoarded, connected, and then acted upon immediately either by computers directly, or by computers issuing commands to human agents.

More things are becoming documented events. Social media makes small comments, images, videos, and passing emotions that would normally go unobserved by others, and certainly undocumented, into documented objects that can be linked and examined in a database. Also, in a way more directly related to commerce, anything that is handled by a computer is stored, and can be mined for information, which can then be connected through machine intelligence into actionable knowledge.

If you want to live a normal, first-world life, then you don't have much choice but to be part of this system.

As far as I can tell, and according to that Economist special about a year ago, minable information is increasing exponentially.

Google's business model is based on the fact that even the act of accessing the information becomes a documented event which can then be mined for connections that can also be acted on.

So what are the effects of this hyper-realized chain of event-to-analysis-to-action-to-event?

How does the feedback loop of data-causing-event-causing-data affect the reality that we live in?
How much of this is under our control... what about those weird spikes from high frequency trading programs?

In the case of self-volunteered information, what happens to the psychology of the individual, and how does the data that they see change the data that they give?

That's a lot of words, and I'm not sure if anyone is still reading this thread, but forgive me: I fucking love this stuff. This is the future, and I'm excited to be studying it.
posted by codacorolla at 1:27 PM on September 13, 2010

Oops, probably help if I actually linked the journal article I was talking about:
posted by codacorolla at 6:32 PM on September 13, 2010


This thread has been archived and is closed to new comments