# Information doesn't want to be scale free

April 23, 2009 10:48 PM Subscribe

"the scale-free network modeing paradigm is largely inconsistent with the engineered nature of the Internet..." For a decade it's been conventional wisdom that the Internet has a scale-free topology, in which the number of links emanating from a site obeys a power law. In other words, the Internet has a long tail; compared with a completely random network, its structure is dominated by a few very highly connected nodes, while the rest of the web consists of a gigantic list of sites attached to hardly anything. Among its other effects, this makes the web highly vulnerable to epidemics. The power law on the internet has inspired a vast array of research by computer scientists, mathematicians, and engineers.
According to an article in this month's Notices of the American Math Society, it's all wrong. How could so many scientists make this kind of mistake? Statistician Cosma Shalizi explains how people see power laws when they aren't there: "Abusing linear regression makes the baby Gauss cry."

Uh, are you talking about the

Of course the

And "long tail" is a hype term relating to the popularity and accessibility of products that satisfy niche tastes. It's basically a marketing term, thought up by wired editor Chris Anderson (who named his blog after it). Saying that the web has a "long tail" makes sense. There are lots of niche sites out there that are only possible because of the low distribution costs online.

Saying that the "internet" has a long tail makes sense if you're talking about the content on the internet -- mostly the web, but it doesn't make much sense if you're talking about the

Basically "the web" and "the internet" represent different things, which seems to have been confused in the text of this FPP, making it somewhat incoherent.

posted by delmoi at 11:28 PM on April 23, 2009 [8 favorites]

*or the or the***internet***. Site and links are part of the***web****web**and it's the internet that is engineered. The link structure of the web isn't 'engineered' at all.Of course the

*Internet*is susceptible to 'epidemics' but how could the*web*be suseptable to epidemics? That makes no sense at all, but you say it is. What would a 'web epidemic' even mean? Rickrolling?And "long tail" is a hype term relating to the popularity and accessibility of products that satisfy niche tastes. It's basically a marketing term, thought up by wired editor Chris Anderson (who named his blog after it). Saying that the web has a "long tail" makes sense. There are lots of niche sites out there that are only possible because of the low distribution costs online.

Saying that the "internet" has a long tail makes sense if you're talking about the content on the internet -- mostly the web, but it doesn't make much sense if you're talking about the

**infrastructure**of the internet... Well It's not even clear what that would mean.Basically "the web" and "the internet" represent different things, which seems to have been confused in the text of this FPP, making it somewhat incoherent.

posted by delmoi at 11:28 PM on April 23, 2009 [8 favorites]

Yeah, your first link talks about the internet, in terms of routers and infrastructure, and your second link, which you label "the internet has a scale-free topology" is actually talking about the web. In other words, they talk about completely different things, and don't contradict each other.

posted by delmoi at 11:36 PM on April 23, 2009

posted by delmoi at 11:36 PM on April 23, 2009

*Applause*

At last something congruent to my own experience: Fitting a line to your log-log plot by least squares is a bad idea. After long years the prophet has finally spoken. Lines are not equivalent to power law.

I've been saying this all along.

But in a world of lines, the squiggle is seen as an aberration.

Rise with me squiggles and overthrow the tyranny of the least sum of squares. How dare they say that we are regressing to the mean?

Also - Chronbach's alpha is a fraud and a misguided construct, alien to efforts foreign to the pursuit of pure data analysis. Give me an honest unidimensional latent construct and I will give you the philosopher's stone.

If the numbers cannot stand on their own, what do they stand for other than proof of certain numeric tyranny?

Google Ron Paul.

posted by isopraxis at 11:41 PM on April 23, 2009 [6 favorites]

At last something congruent to my own experience: Fitting a line to your log-log plot by least squares is a bad idea. After long years the prophet has finally spoken. Lines are not equivalent to power law.

I've been saying this all along.

But in a world of lines, the squiggle is seen as an aberration.

Rise with me squiggles and overthrow the tyranny of the least sum of squares. How dare they say that we are regressing to the mean?

Also - Chronbach's alpha is a fraud and a misguided construct, alien to efforts foreign to the pursuit of pure data analysis. Give me an honest unidimensional latent construct and I will give you the philosopher's stone.

If the numbers cannot stand on their own, what do they stand for other than proof of certain numeric tyranny?

Google Ron Paul.

posted by isopraxis at 11:41 PM on April 23, 2009 [6 favorites]

The subject of this post is the link to a PDF which is on the text "it's all wrong". That paper in its definition of terms states that it's talking about the network topology of the internet: the things that have IP addresses and the connective nodes that traffic passes through between them. The other links are simply explanatory. (But are a nice selection of explanatory links IMO.)

posted by XMLicious at 11:45 PM on April 23, 2009 [1 favorite]

posted by XMLicious at 11:45 PM on April 23, 2009 [1 favorite]

Um, is there a layman's version? The last math class I took was in tenth grade. I really really tried to understand those articles, but it just kinda went

"And obviously if you renoberate the skelberon, you'll framit the P-sphere and end up piklinizing the whole thing."

posted by Scattercat at 11:48 PM on April 23, 2009 [1 favorite]

*whoooosh*over my head. It*sounds*like it's a very interesting idea, but about two paragraphs in they start writing sentences that read (to me) like:"And obviously if you renoberate the skelberon, you'll framit the P-sphere and end up piklinizing the whole thing."

posted by Scattercat at 11:48 PM on April 23, 2009 [1 favorite]

Math is usually used by science via matching something in the real world up to a mathematical model. From what's in these links, it looks like scientists have been recording enormous amounts of data about all of the paths through the internet that all of the computers on the internet use to talk to each other, and have been concluding from samples of that data that the behavior of the internet matches an abstract model that obeys this "power law" mathematical behavior.

But according to the .pdf paper and the "baby Gauss" link all of these scientists making the "power law" analogies and claims have been wrong because they've been sloppy about how they were analyzing the huge amounts of data and the internet does not match a model that would obey a power law.

(I haven't said what a power law is there because I'm not sure I totally understand how it's being applied in this context.)

posted by XMLicious at 11:57 PM on April 23, 2009

But according to the .pdf paper and the "baby Gauss" link all of these scientists making the "power law" analogies and claims have been wrong because they've been sloppy about how they were analyzing the huge amounts of data and the internet does not match a model that would obey a power law.

(I haven't said what a power law is there because I'm not sure I totally understand how it's being applied in this context.)

posted by XMLicious at 11:57 PM on April 23, 2009

So, should any theoretician who decides to model internet usage be required to go obtain a Cisco certification or something before doing so?

The criticism seems to be about the statistical methods used to decide a power law was reflected in the data, and the eagerness to establish a new instance of a power law in "nature", rather than anything having to do with computers. The American Math Society author also is saying that part of the problem is the sheer volume of data being analyzed and that it's good this kind of mistake has been encountered in a field like this first, rather than in medicine or something with more repercussions.

But I don't understand most of what's said very deeply at all so maybe

**escabeche**or someone else can flesh out or correct my interpretation.

posted by XMLicious at 2:04 AM on April 24, 2009

*Chronbach's alpha is a fraud and a misguided construct*

Chronbach would a cool name for a time traveler. Cronbach is what you are looking for.

I understand the london underground map doesn't completely accurately reflect reality either.

This is really just an issue of understanding your data and the limits of your analysis. They all have shortcomings.

posted by srboisvert at 2:07 AM on April 24, 2009

The paper details what type of web/network data was used to assess the power law relatioship:

"In particular, the distributions for the HTTP connections, earthquakes, web links, fires, wealth, web hits, and the metabolic network cannot plausibly be considered to follow a power law; the probability of getting by chance a fit as poor as the one observed is very small in each of these cases and one would have to be unreasonably optimistic to see power-law behavior in any of these data sets."

The paper does a great job demolishing the power law assumptions made in many disciplines and presents code for easy testing by future power law devotees. This in itself is worthy of the post without the Internet angle.

posted by benzenedream at 2:19 AM on April 24, 2009

"In particular, the distributions for the HTTP connections, earthquakes, web links, fires, wealth, web hits, and the metabolic network cannot plausibly be considered to follow a power law; the probability of getting by chance a fit as poor as the one observed is very small in each of these cases and one would have to be unreasonably optimistic to see power-law behavior in any of these data sets."

The paper does a great job demolishing the power law assumptions made in many disciplines and presents code for easy testing by future power law devotees. This in itself is worthy of the post without the Internet angle.

posted by benzenedream at 2:19 AM on April 24, 2009

*"And obviously if you renoberate the skelberon, you'll framit the P-sphere and end up piklinizing the whole thing."*

Glayvin!

posted by Horace Rumpole at 4:45 AM on April 24, 2009 [1 favorite]

Don't confuse the paint and the canvas with the art.

Nevertheless, the art is constrained by the properties of the materials. It is liberated by the genius of the artists. It is appropriate to discuss the interrelationship between the two, and it is a poor artists who does not understand his palette.

posted by Xoebe at 5:43 AM on April 24, 2009

Nevertheless, the art is constrained by the properties of the materials. It is liberated by the genius of the artists. It is appropriate to discuss the interrelationship between the two, and it is a poor artists who does not understand his palette.

posted by Xoebe at 5:43 AM on April 24, 2009

the putative power law being discussed in the first pdf is the probability that a router on the backbone of the internet will be connected to a given number of other routers. i.e. if x is the number of routers connected to a selected router, then if you choose a router from the backbone randomly the power-law would say that the probability that your chosen router is connected to x other routers is proportional to x

posted by geos at 5:55 AM on April 24, 2009

^{-n}where n is likely between 1 and 2.posted by geos at 5:55 AM on April 24, 2009

In case anyone was looking to me for help: I'm a pure mathematician. So the way I understand the problem like this is as follows: someone hands me a large graph, which is to say a set of nodes connected by edges -- or perhaps a random sample of data drawn from this graph. (The word "graph" to mathematicians is probably better translated into non-math English as "network.") And I'm supposd to guess things about the statistics of the graph, or maybe even about its origin. For instance: if I'm a node on the graph, the number of other nodes to which I'm connected is called my

Notice I didn't say anything about the internet, or the web. (As delmoi rightly points out, I don't really know the difference between these two things and mucked up the distinction in the FPP.) That's because, as I said, I'm a pure mathematician; my tendency is to immediately abstract away the physical features of the problem in order to model it by something I can really get my hands on. I think the point of view taken by the applied mathematicians who wrote the Notices article is that

posted by escabeche at 6:09 AM on April 24, 2009 [10 favorites]

*valency*. And I can ask: what proportion of nodes have valency 1? What proportion have valency 2? And so on, and so on. If I can tell you the answers to*all*these questions I've told you what's called the*probability distribution*on the valency. And the probability distribution on valency gives you interesting clues as to how the graph was generated; it's a kind of "signature" of the graph. A completely random graph (in the Erdos-Renyi sense, if you are really going to be precise about it) will give you what's called a Poisson distribution. But other graphs give you a probability distribution known as a power law, in which the number of vertices of valency N is proportional to N^{-alpha} for some exponent alpha. This whole story is part of what one might call probabilistic graph theory.Notice I didn't say anything about the internet, or the web. (As delmoi rightly points out, I don't really know the difference between these two things and mucked up the distinction in the FPP.) That's because, as I said, I'm a pure mathematician; my tendency is to immediately abstract away the physical features of the problem in order to model it by something I can really get my hands on. I think the point of view taken by the applied mathematicians who wrote the Notices article is that

*people like me are part of the problem.*posted by escabeche at 6:09 AM on April 24, 2009 [10 favorites]

Heh, I actually worked with one of these guys (and met another one), and the graduate research group I was in was devoted to this sort of fight.

A few quick thoughts. "long tail" is a term that's been around for a long time. It refers to the difference between a Gaussian probability distribution, for which very large deviations pretty much never happen (and therefore is amenable to description in terms of "average" behavior) and power-law probability distributions, which decay slowly enough that very large events do (rarely) happen, and "average" behavior can violently fail to capture the dynamics.

The reason the scale-free crowd is wrong, is not just that they're finding power laws (which are present in many phenomena). It's that they say A) there's a power law, and B) therefore this network is scale-free, C) therefore it has certain underlying properties/weaknesses, etc. They're totally wrong about B) and C). The classic example that John Doyle (one of the authors) likes to complain about is this. Their analysis of these "connections" produces an insane and easily popularized conclusion.

posted by Humanzee at 6:14 AM on April 24, 2009 [2 favorites]

A few quick thoughts. "long tail" is a term that's been around for a long time. It refers to the difference between a Gaussian probability distribution, for which very large deviations pretty much never happen (and therefore is amenable to description in terms of "average" behavior) and power-law probability distributions, which decay slowly enough that very large events do (rarely) happen, and "average" behavior can violently fail to capture the dynamics.

The reason the scale-free crowd is wrong, is not just that they're finding power laws (which are present in many phenomena). It's that they say A) there's a power law, and B) therefore this network is scale-free, C) therefore it has certain underlying properties/weaknesses, etc. They're totally wrong about B) and C). The classic example that John Doyle (one of the authors) likes to complain about is this. Their analysis of these "connections" produces an insane and easily popularized conclusion.

posted by Humanzee at 6:14 AM on April 24, 2009 [2 favorites]

*In case anyone was looking to me for help: I'm a pure mathematician.*

Isn't saying this sort of like the time you went to the doctor's office and while you are waiting your doctor comes in and tells everyone: "In case anyone was looking to me for help: I'm the king of antarctica."

the basic idea is that you are trying to count the number of routers each router on the internet is connected to (hence traceroute), and since you can't count this for every router on the internet, not even close, you end up doing some hinky stuff and messing up the error analysis.

posted by geos at 6:18 AM on April 24, 2009

I should clarify. Scale-free means there are no length scales in a system. As in, if you were to magnify it and zoom in, you wouldn't know the difference. Fractals are scale free. Engines are not. If you zoom in on an engine, maybe you're looking at a piston. It doesn't look like a whole engine.

Also, they don't hate mathematicians. They hate physicists. Specifically, those who believe in self-organized criticality, and then go dabbling about in fields they don't understand. There are a few physicists out there who want to claim that the brain is scale-free, which is laughable, and sort of how I wound up in neuroscience. I ultimately got out of that fight, not because I think self-organized criticality has any value, but because it's day is over, and no one cares.

posted by Humanzee at 6:19 AM on April 24, 2009

Also, they don't hate mathematicians. They hate physicists. Specifically, those who believe in self-organized criticality, and then go dabbling about in fields they don't understand. There are a few physicists out there who want to claim that the brain is scale-free, which is laughable, and sort of how I wound up in neuroscience. I ultimately got out of that fight, not because I think self-organized criticality has any value, but because it's day is over, and no one cares.

posted by Humanzee at 6:19 AM on April 24, 2009

*Um, is there a layman's version?*

Here's a layman's analogy: A lot of people looked at the number of cars running through America's highway system and saw that highways running through New York City had an astronomical number of cars running through them compared to the number of cars running through the highways in northern Vermont.

From this observation, they erroneously concluded that the physical highway systems

*themselves*had a structure where the links on the highways from large cities had an astronomically larger number of links to other destinations than highway systems in smaller cities.

posted by deanc at 8:30 AM on April 24, 2009 [1 favorite]

The Notices of the AMS article doesn't seem to reference this interesting editorial by Mitzenmacher. It does cite this article by Fabrikant, et al. but doesn't quote the best part (highlighted in the Mitzenmacher article):

posted by mhum at 8:50 AM on April 24, 2009 [3 favorites]

"Power laws ... have been termed 'the signature of human activity'..."^{1}

^{1}"They are certainly the product of one particular kind of human activity: looking for power laws..."posted by mhum at 8:50 AM on April 24, 2009 [3 favorites]

*From this observation, they erroneously concluded that the physical highway systems themselves had a structure where the links on the highways from large cities had an astronomically larger number of links to other destinations than highway systems in smaller cities.*

Don't they? Both in terms of the raw number of exits, and in terms of the different highways that can be connected to, highways from large cities have more links.

posted by effugas at 8:58 AM on April 24, 2009

Linguistics has fallen pray to the quest for power-laws too. Several linguistic networks or relationships have either been found to have power-laws or, with the proper amount of squinting and cutting off of annoying distribution tails, were found to have power-laws.

Take for example the classic Zipf's law. Zipf noticed that word usage in an English text follows a power-law effect. The frequency of a word is inversely proportional to its rank. For example, in a given collection of documents, the most frequent word "the" will occur roughly twice as often as the word "of", and "the" will occur three times as often as "and", the third most frequent word.

This is a true power-law. The problem was that a lot of people started hypothesizing it was related to some innate feature of language production. But random texts exhibit the same feature. Other studies have shown similar results.

There's some controversy over this, and it doesn't mean there isn't some evolutionary basis for the observed power-laws seen in some lingustic networks. But Zipf's law wasn't the only observation of power-laws in linguistic. Syntactic networks, semantic networks, co-occurrence.. the list goes on, and almost all of them were found to have power-law properties. Like other fields, linguistics (or at least corpus linguistics/computational linguistics) fell prey to a scientific fad.

posted by formless at 9:32 AM on April 24, 2009

Take for example the classic Zipf's law. Zipf noticed that word usage in an English text follows a power-law effect. The frequency of a word is inversely proportional to its rank. For example, in a given collection of documents, the most frequent word "the" will occur roughly twice as often as the word "of", and "the" will occur three times as often as "and", the third most frequent word.

This is a true power-law. The problem was that a lot of people started hypothesizing it was related to some innate feature of language production. But random texts exhibit the same feature. Other studies have shown similar results.

There's some controversy over this, and it doesn't mean there isn't some evolutionary basis for the observed power-laws seen in some lingustic networks. But Zipf's law wasn't the only observation of power-laws in linguistic. Syntactic networks, semantic networks, co-occurrence.. the list goes on, and almost all of them were found to have power-law properties. Like other fields, linguistics (or at least corpus linguistics/computational linguistics) fell prey to a scientific fad.

posted by formless at 9:32 AM on April 24, 2009

I guess I've misunderstood what "The Long Tail" was referring to, probably because I hadn't read any of the stuff related to it. My reaction was always, "Doesn't

posted by XMLicious at 3:21 PM on April 24, 2009

*every*unbounded continuous probability distribution have at least one long tail?" (Which, now that I think about it, isn't necessarily true, but hopefully you understand what I mean.)posted by XMLicious at 3:21 PM on April 24, 2009

One way to look at it: let's say you have a Gaussian distribution (bell curve) for some random number X. You can ask, what's the average value of X, what's the average value of X^2, etc. And it's all well defined, and behaves like you'd expect. If you have a power law, that isn't true. If the tail looks like P ~ 1/X^2, the average value of X^2 is infinity. That is, you have a finite average value, but the standard deviation is formally infinite. The "formally" is important, because in reality, every system has some maximum possible event size. But basically, power-laws tend to saturate that event size. You see lots of litte events, and occasional HUGE events. That's the "long" or "heavy" tail, and it means that when dealing with power laws, you can't just describe things in terms of the mean. Well, you can, but it will bite you.

posted by Humanzee at 5:11 PM on April 24, 2009

posted by Humanzee at 5:11 PM on April 24, 2009

Woopsie. if it goes like 1/X^2, even the average value is infinite. A distribution with P~1/X^q will have a finite average and infinite variance if 2 < q <= 3. Anyhoo, no matter how steep (and things like 1/X^3 or steeper are rather rare) eventually some "moment" of X is infinite.

...

I know, no one cares. I'm OCD about stuff like this, and only just refrained from cutting off my pinky to atone for shaming myself.

posted by Humanzee at 5:38 PM on April 24, 2009

...

I know, no one cares. I'm OCD about stuff like this, and only just refrained from cutting off my pinky to atone for shaming myself.

posted by Humanzee at 5:38 PM on April 24, 2009

Oh, okay then, perhaps a hunch of mine is correct: I was thinking that maybe something obeying a power law is essentially asymptotic to both the x-axis and the y-axis graphically. I had decided I was wrong about this because the graphs of the Pareto distribution I was looking at appeared to only necessarily be asymptotic to the x-axis. (Unless I'm misunderstanding what you're saying about infinite average values, or misremembering what a random variable is.)

Or... wait just a second... does it even more specifically mean that the graph approaches the x-axis "faster" than it approaches the y-axis? Er, I probably try to think about things too visually anyways, that's what my professor said back in the day.

posted by XMLicious at 5:44 PM on April 24, 2009

Or... wait just a second... does it even more specifically mean that the graph approaches the x-axis "faster" than it approaches the y-axis? Er, I probably try to think about things too visually anyways, that's what my professor said back in the day.

posted by XMLicious at 5:44 PM on April 24, 2009

All power-law distributions must have cut-offs at the low end, because otherwise they couldn't be normalized (they'd have infinite area under the curve, i.e. infinite probability). So they aren't asymptotic to the y-axis.

They either have a sharp cut-off (minimum allowed size) and therefore terminate and never touch the y-axis, or they deviate from power-law at low event sizes ( a "soft" cut-off) e.g.

P(X) ~ (X + L)^-P (L is the low-size cut-off)

This causes them to turn over (like a bell) and hit the y-axis (at a small finite value). So basically, they look like bell curves, but they approach zero probability much more slowly.

Practically speaking, they also have large size cut-offs (because no physical system can have arbitrarily large events). But that isn't a mathematical necessity.

posted by Humanzee at 6:59 PM on April 24, 2009 [2 favorites]

They either have a sharp cut-off (minimum allowed size) and therefore terminate and never touch the y-axis, or they deviate from power-law at low event sizes ( a "soft" cut-off) e.g.

P(X) ~ (X + L)^-P (L is the low-size cut-off)

This causes them to turn over (like a bell) and hit the y-axis (at a small finite value). So basically, they look like bell curves, but they approach zero probability much more slowly.

Practically speaking, they also have large size cut-offs (because no physical system can have arbitrarily large events). But that isn't a mathematical necessity.

posted by Humanzee at 6:59 PM on April 24, 2009 [2 favorites]

*All power-law distributions must have cut-offs at the low end, because otherwise they couldn't be normalized (they'd have infinite area under the curve, i.e. infinite probability)*

I don't think that's right. The integral of x^(-p) from 1 to infinity converges if p > !.

posted by twoleftfeet at 8:49 PM on April 25, 2009

You've just described a "hard" minimum-size cut-off of 1. Sometimes this is natural, sometimes it isn't. There may be events described by a power-law distribution that have size less than one, in which case the cut-off should be less than one (or it should be "soft"). But there must be a cut-off that is greater than zero, because no power-law has a finite integral from zero to one and a finite integral from one to infinity.

posted by Humanzee at 5:00 AM on April 26, 2009

posted by Humanzee at 5:00 AM on April 26, 2009

« Older The effect of adding another zero | "The best I can do is to live a good life and... Newer »

This thread has been archived and is closed to new comments

posted by fallingbadgers at 11:03 PM on April 23, 2009