Belevitch considered a wide class of more or less well-behaved statistical distributions (normal or whatever), performed a functional rearrangement that expresses frequency as a function of rank (with items ordered by decreasing frequency), and then did a Taylor expansion of the resulting formula. Belevitch's lovely result is that "Zipf's Law" follows directly as the first-order truncation of the Taylor series. Furthermore, "Mandelbrot's Law" (which seems even more curious and mysterious to most people) follows immediately as the second-order truncation. ("Pareto's Law" lies in between Zipf and Mandelbrot, with a different slope than the 45-degree curve.) There is nothing magical or mystical about it!
Jim Horning once asked me about a possible connection with the 80-20 rule. My response was this: ...
* In 36,299 occurrences of English words (Miller et al.), the most frequent 18% of the words account for over 80% of the word occurrences. That's close to the so-called 80-20 rule.
* In over 11 million occurrences of German words (Kaeding -- fascinating book, incidentally), the most frequent 0.6% of the words account for over 75% of the word occurrences, which is in some sense roughly 20 times more skewed than the so-called 80-20 rule. Perhaps the wider skewing is due to the fact that conjugated forms and declined forms (such as the most frequent der, die, das, etc.) are counted as different words, which linguistically, of course, they are.
Both of these language-statistics studies closely follow Zipf-Mandelbrot all the way down to the tails. But the parameters are slightly different. Thus, the supposed 80-20 split does not in any way follow directly from Z-M. It could be 80-20, or 99-1, or worse!
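To make that last point concrete, here is a minimal Python sketch. It assumes the commonly quoted Zipf-Mandelbrot form, in which the frequency of the word at rank r is proportional to 1/(r + q)^s. The function name zm_top_share, the vocabulary size, and the parameter values are all made up for illustration; none of them are fitted to the Miller or Kaeding data.

```python
def zm_top_share(n_words, top_fraction, s=1.0, q=0.0):
    """Share of all occurrences covered by the most frequent words.

    Assumes frequency at rank r is proportional to 1 / (r + q) ** s
    (an assumed Zipf-Mandelbrot form; s and q are illustrative parameters).
    """
    # Unnormalized frequencies for ranks 1..n_words, most frequent first.
    weights = [1.0 / (r + q) ** s for r in range(1, n_words + 1)]
    # Number of distinct words in the "top" group (at least one).
    k = max(1, round(top_fraction * n_words))
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical vocabulary size and parameter choices, for illustration only.
    for s, q in [(0.8, 0.0), (1.0, 0.0), (1.0, 2.7), (1.2, 0.0)]:
        share = zm_top_share(10_000, 0.20, s=s, q=q)
        print(f"s={s}, q={q}: top 20% of words cover {share:.1%} of occurrences")
```

The top-20% share moves with the exponent s, the offset q, and the vocabulary size, so under these assumptions an 80-20 split is just one of many splits the same family of curves can produce.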