How to tell correlation from causation - "The basic intuition behind the method demonstrated by Prof. Joris Mooij of the University of Amsterdam and his co-authors is surprisingly simple: if one event influences another, then the random noise in the causing event will be reflected in the affected event."
At the Far Ends of a New Universal Law
The law appeared in full form two decades later, when the mathematicians Craig Tracy and Harold Widom proved that the critical point in the kind of model May used was the peak of a statistical distribution. Then, in 1999, Jinho Baik, Percy Deift and Kurt Johansson discovered that the same statistical distribution also describes variations in sequences of shuffled integers — a completely unrelated mathematical abstraction. Soon the distribution appeared in models of the wriggling perimeter of a bacterial colony and other kinds of random growth. Before long, it was showing up all over physics and mathematics. “The big question was why,” said Satya Majumdar, a statistical physicist at the University of Paris-Sud. “Why does it pop up everywhere?”
Hyperreal numbers: infinities and infinitesimals - "In 1976, Jerome Keisler, a student of the famous logician Tarski, published this elementary textbook that teaches calculus using hyperreal numbers. Now it's free, with a Creative Commons copyright!" (pdf—25mb :) [more inside]
Network Theory Overview - "The idea: nature and the world of human technology are full of networks! People like to draw diagrams of networks. Mathematical physicists know that in principle these diagrams can be understood using category theory. But why should physicists have all the fun? This is the century of understanding living systems and adapting to life on a finite planet. Math isn't the main thing we need, but it's got to be part of the solution... so one thing we should do is develop a unified and powerful theory of networks." (via ;)
According to statistician Aki Vehtari of Aalto University in Finland, the chance that today, December 25th, is your birthday is diminished by 20%. The likelihood that your birthday is actually February 14th is 5% higher than chance. [more inside]
While you were watching one of the exciting snow-bound football games yesterday, the thought may have occurred to you: If I were a coach, would I go for it on this 4th down? This bot from the New York Times will tell you, and maybe even add a little attitude to the answer, which is usually much more aggressive than NFL coaches are.
This is one example of a phenomenon I noticed throughout this chart: natural rival franchises tend to have similar numbers of goon seasons. This would suggest that goon employment may be (in some instances) localized arms races between rivals, whose cyclical number of goons tends to reflect the other’s in some perverted game of Mutually Assured Terrible Hockey (MATH)... We also have a team like Detroit near the bottom of the list, with only 8.5 goon seasons in their history. Since 1985-86, the Wings have only had 4.5 goon seasons. They’ve only had 2 goon seasons since 1988-89. Coincidentally, they’ve been pretty damned swell at winning hockey games since that time. The Evolution of Goon Culture in the NHL
The Nature of Computation - Intellects Vast and Warm and Sympathetic: "I hand you a network or graph, and ask whether there is a path through the network that crosses each edge exactly once, returning to its starting point. (That is, I ask whether there is a 'Eulerian' cycle.) Then I hand you another network, and ask whether there is a path which visits each node exactly once. (That is, I ask whether there is a 'Hamiltonian' cycle.) How hard is it to answer me?" (via) [more inside]
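To make the contrast in that quote concrete, here is a minimal Python sketch, not taken from the book. It assumes an undirected simple graph stored as a dict of neighbour sets, and the function names are made up for illustration. The Eulerian question is answered by a quick check (every degree even, all edges in one connected piece), while the Hamiltonian question is answered here only by trying every ordering of the nodes, which blows up factorially.

```python
from itertools import permutations


def has_eulerian_cycle(adj):
    """True if the undirected graph has a closed walk using every edge once.

    adj maps each node to the set of its neighbours (hypothetical format).
    """
    # Condition 1: every node must have even degree.
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        return False
    # Condition 2: all nodes that touch an edge lie in one connected piece.
    with_edges = [v for v, nbrs in adj.items() if nbrs]
    if not with_edges:
        return True  # no edges at all: the empty walk qualifies
    seen, stack = set(), [with_edges[0]]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adj[v] - seen)
    return all(v in seen for v in with_edges)


def has_hamiltonian_cycle(adj):
    """Brute force: try every ordering of the nodes (factorial time)."""
    nodes = list(adj)
    if len(nodes) < 3:
        return False
    first, rest = nodes[0], nodes[1:]
    for order in permutations(rest):
        cycle = (first, *order, first)
        if all(b in adj[a] for a, b in zip(cycle, cycle[1:])):
            return True
    return False


# Two triangles sharing node 0: Eulerian, but not Hamiltonian.
bowtie = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
# Complete graph on four nodes: Hamiltonian, but not Eulerian (odd degrees).
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}

print(has_eulerian_cycle(bowtie), has_hamiltonian_cycle(bowtie))
print(has_eulerian_cycle(k4), has_hamiltonian_cycle(k4))
```

The two example graphs are chosen so that each has one kind of cycle but not the other; the brute-force search is only practical for a handful of nodes.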
The year was 1945. Two earthshaking events took place: the successful test at Alamogordo and the building of the first electronic computer. Their combined impact was to modify qualitatively the nature of global interactions between Russia and the West. No less perturbative were the changes wrought in all of academic research and in applied science. On a less grand scale these events brought about a [renaissance] of a mathematical technique known to the old guard as statistical sampling; in its new surroundings and owing to its nature, there was no denying its new name of the Monte Carlo method (PDF). -N. Metropolis
Conceptually talked about on MeFi previously, some basic Monte Carlo methods include the Inverse Transform Method (PDF) mentioned in the quoted paper, Acceptance-Rejection Sampling (PDFs 1,2), and integration with and without importance sampling (PDF).
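For a feel of the methods named above, here is a minimal Python sketch. It is not drawn from the linked PDFs: the target distributions (an exponential, and the density 6x(1-x) on [0, 1]), the integrand x^2, the proposal density 2x, and all function names are assumptions picked to keep each example to a few lines.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse transform: F(x) = 1 - exp(-rate * x), so x = -ln(1 - u) / rate."""
    u = rng.random()                      # u in [0, 1), so 1 - u is in (0, 1]
    return -math.log(1.0 - u) / rate


def sample_by_rejection(rng):
    """Acceptance-rejection for the density f(x) = 6x(1 - x) on [0, 1].

    Proposal: uniform on [0, 1]; the envelope constant 1.5 is the maximum of f.
    """
    while True:
        x = rng.random()
        u = rng.random()
        if u * 1.5 <= 6.0 * x * (1.0 - x):
            return x


def integrate_plain(f, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n


def integrate_importance(f, n, rng):
    """Importance sampling with proposal density g(x) = 2x on (0, 1].

    Draws from g by inverse transform (x = sqrt(1 - u)), then averages f(x) / g(x).
    """
    total = 0.0
    for _ in range(n):
        x = math.sqrt(1.0 - rng.random())  # x in (0, 1], so g(x) > 0
        total += f(x) / (2.0 * x)
    return total / n


rng = random.Random(0)
n = 20000
print(sum(sample_exponential(2.0, rng) for _ in range(n)) / n)   # near 1/2
print(sum(sample_by_rejection(rng) for _ in range(n)) / n)       # near 1/2
print(integrate_plain(lambda x: x * x, n, rng))                  # near 1/3
print(integrate_importance(lambda x: x * x, n, rng))             # near 1/3
```

The importance-sampled estimate of the integral of x^2 tends to scatter less than the plain one here, because the proposal 2x puts more draws where the integrand is large.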
An "Exciting Guide to Probability Distributions" from the University of Oxford: part 1, part 2. (Two links to PDFs)
Larry Gonick is a veteran American cartoonist best known for his delightful comic-book guides to science and history, many of which have previews online. Chief among them is his long-running Cartoon History of the Universe (later The Cartoon History of the Modern World), a sprawling multi-volume opus documenting everything from the Big Bang to the Bush administration. Published over the course of three decades, it takes a truly global view -- its time-traveling Professor thoroughly explores not only familiar topics like Rome and World War II but the oft-neglected stories of Asia and Africa, blending caricature and myth with careful scholarship (cited by fun illustrated bibliographies) and tackling even the most obscure events with intelligence and wit. This savvy satire carried over to Gonick's Zinn-by-way-of-Pogo chronicle The Cartoon History of the United States, along with a bevy of Cartoon Guides to other topics, including Genetics, Computer Science, Chemistry, Physics, Statistics, The Environment, and (yes!) Sex. Gonick has also maintained a few side projects, such as a webcomic look at Chinese invention, assorted math comics (previously), the Muse magazine mainstay Kokopelli & Co. (featuring the shenanigans of his "New Muses"), and more. See also these lengthy interview snippets, linked previously. Want more? Amazon links to the complete oeuvre inside! [more inside]
"Value-added modeling is promoted because it has the right pedigree -- because it is based on 'sophisticated mathematics.' As a consequence, mathematics that ought to be used to illuminate ends up being used to intimidate." John Ewing, president of Math for America and former executive director of the American Mathematical Society, criticizes the "value-added modeling" approach used as a proxy for teacher quality, most famously in a Los Angeles Times story that called out low-scoring teachers by name. A Brookings Institution paper says value-added modeling is flawed but the best measure we have of teacher value, arguing that the metric's wide fluctuations from year to year are no worse than those of batting averages in baseball. (Though the weakness of that correlation is mostly a BABIP issue.) Can we assign a numerical value to teacher quality? If so, how?
Google is known to ask the following question in job interviews: In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country? Think you know the answer? If so, Steve Landsburg may be willing to bet you up to $5000. [more inside]
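If you would rather experiment than bet, the stopping rule is easy to simulate. This is a minimal Python sketch under assumptions the puzzle leaves unstated: each birth is independently a boy with probability 1/2, and the country is just a fixed number of families. The function name is made up. It only tallies boys and girls in one simulated population; it does not settle how "proportion" in the question ought to be read.

```python
import random


def simulate_country(n_families, rng, p_boy=0.5):
    """Each family has children until its first boy; returns (boys, girls)."""
    boys = girls = 0
    for _ in range(n_families):
        while True:
            if rng.random() < p_boy:
                boys += 1      # first boy: this family stops
                break
            girls += 1         # a girl: this family tries again
    return boys, girls


rng = random.Random(42)
boys, girls = simulate_country(100_000, rng)
print(boys, girls, girls / boys)
```

Every family ends with exactly one boy, so the boy count always equals the number of families; the girl count is the part that varies from run to run.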
"Normal" human pregnancies last 40 weeks, right? Well, no; they can vary quite a bit by the mother's race, age, number of previous children, family history of delivering early or late, home state, work habits, and even the fetus' HLA type. So where does that "40 week" thing come from? Oh, dear. So check out this super-nerdy pregnancy statistics website, from an engineer mom who is collecting data from the public (see the raw data and auto-generated graphs, and read the FAQ about the survey, with more cool graphs). Looking for day-by-day probabilities on when that baby's due? This would be your stats table with daily prediction (adjust dates at top of page as needed). Of course, you could always shut up your constantly inquiring relatives and friends another way.
Measure-theoretic probability: Why it should be learnt and how to get started. The clickable chart of distribution relationships. Just two of the interesting and informative probability resources I've learned about, along with countless other tidbits of information, from statistician John D. Cook's blog and his probability fact-of-the-day Twitter feed ProbFact. John also has daily tip and fact Twitter feeds for Windows keyboard shortcuts, regular expressions, TeX and LaTeX, algebra and number theory, topology and geometry, real and complex analysis, and beginning tomorrow, computer science and statistics.
Kaggle hosts competitions to glean information from massive data sets, a la the Netflix Prize. Competitors can enter free, while companies with vast stores of impenetrable data pay Kaggle to outsource their difficulties to the world population of freelance data-miners. Kaggle contestants have already developed dozens of chess rating systems which outperform the Elo rating currently in use, and identified genetic markers in HIV associated with a rise in viral load. Right now, you can compete to forecast tourism statistics or predict unknown edges in a social network. Teachers who want to pit their students against each other can host a Kaggle contest free of charge.
Interested in teaching yourself some statistics? Here is an excellent online and interactive statistics textbook developed at UC Berkeley, and also used at CUNY, UCSC, SJSU, and Bard. Here is the syllabus for the course at Berkeley. And here are some insightful reflections from the professor on developing Berkeley's first fully approved online course.
Dynamic Linear Models are gaining in popularity. The approach has applications in Economics, Biology, and Pharmaceuticals, and is rooted in State Space Modeling, which with Kalman Filtering (paper, breakdown [warning: long]) was used in the Apollo program. There exists an R package, and both a short doc and a really great (read: worth buying) book (sorry, not a download, but here's chapter 2) by Giovanni Petris, Sonia Petrone, and Patrizia Campagnoli with its own little website.
Oddly obsessive statistical analysis of the rates paid by customers of legal Nevada prostitutes, broken down by sex act, attractiveness, body type and presence or absence of a jacuzzi, among other topics. [via Cynical-C, who swears he found it by accident]
The Logic of Diversity "A new book, The Wisdom of Crowds [..:] by The New Yorker columnist James Surowiecki, has recently popularized the idea that groups can, in some ways, be smarter than their members, which is superficially similar to Page's results. While Surowiecki gives many examples of what one might call collective cognition, where groups out-perform isolated individuals, he really has only one explanation for this phenomenon, based on one of his examples: jelly beans [...] averaging together many independent, unbiased guesses gives a result that is probably closer to the truth than any one guess. While true — it's the central limit theorem of statistics — it's far from being the only way in which diversity can be beneficial in problem solving." (Three-Toed Sloth)
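The jelly-bean explanation in that quote is simple to try out. Here is a minimal Python sketch, not from the linked essay: it assumes guesses are independent draws from a normal distribution centred on the true count, and the numbers (850 beans, a spread of 200, 1000 guessers) and the function name are invented for illustration.

```python
import random
import statistics


def crowd_vs_individual(truth, noise_sd, n_guessers, rng):
    """Compare the error of the averaged guess with a typical single guess."""
    guesses = [rng.gauss(truth, noise_sd) for _ in range(n_guessers)]
    crowd_error = abs(sum(guesses) / len(guesses) - truth)
    # Median absolute error stands in for "a typical individual guess".
    typical_error = statistics.median([abs(g - truth) for g in guesses])
    return crowd_error, typical_error


rng = random.Random(0)
crowd_error, typical_error = crowd_vs_individual(850, 200.0, 1000, rng)
print(crowd_error, typical_error)
```

Note how much the sketch leans on the guesses being independent and unbiased; with a shared bias the average inherits it, which is part of why the quote calls this only one way diversity can help.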
Hey, kids! Statistics is cool! (Amazing introduction to the concept of estimation, and error computing.)