67 Years of Lego Sets
I started to wonder how Legos evolved from the sets I remember from my childhood to what they are today. As an analyst, I turned to data for answers. I used Plotly and Mode Python Notebooks to explore the data.
"We collected data on each of the 127,000 Meetup events created in the United States since 2002 and analyzed this data to understand what people care about across the country. What we found confirmed several city stereotypes: the Bay Area is the home of tech, New York is the epicenter of fashion, and D.C. reigns supreme in multiculturalism. We also looked into what Meetup data tells us about the other homes of tech, and the cities most interested in music, and finding love."
“The rise of the misinformed is now the largest obstacle for success for journalists today (outside the concerns that relate to publishing). If people don't trust the news, you don't have a news business.” Thomas Baekdal writes a strategic analysis for media companies to earn their readers’ trust, looking at data from PolitiFact to understand how misinformation spreads and what journalists can do to stop it.
Dating Historic Images A key to using clues in photos to narrow down the date of construction for historic vernacular architecture, from University of Vermont's Landscape Change digital image project. [more inside]
The Next Rembrandt Can you create a "new" Rembrandt "painting" via data analysis? This project gives it a try.
In the ISS there are two Astro Pi computers, Ed and Izzy, equipped with Sense HATs, two different camera modules (visual and IR), and stored in rather special cases. They are now running code written by UK school children - the winners of a competition. The data will be feeding back soon! [more inside]
291 diseases and injuries + 67 risk factors + 1,160 non-fatal complications = 650 million estimates of how we age, sicken, and die
As humans live longer, what ails us isn't necessarily what kills us: five data visualizations of how we age, sicken, and die. Causes of death by age, sex, region, and year. Heat map of leading causes and risks by region. Changes in leading causes and risks between 1990 and 2010. Healthy years lost to disability vs. life expectancy in 1990 and 2010. Uncertainties of causes and risks. From the team for the massive Institute for Health Metrics and Evaluation Global Burden of Diseases, Injuries, and Risk Factors Study 2010. [more inside]
Peter Turchin is a Professor of Mathematics, and of Ecology and Evolutionary Biology at the University of Connecticut. For the last nine years, he's been taking the mathematical techniques that once allowed him to track predator–prey cycles in forest ecosystems, and using them to model human history -- a pattern identification process he calls cliodynamics. The goal of cliodynamics (or cliometrics) is to turn history into a predictive, analytic science. By analysing historical records on some of the broad social forces that shape transformative events in US society (economic activity, demographic trends and outbursts of violence), he has come to the conclusion that a new wave of internal strife is already on its way, and should peak around 2020. [more inside]
A corpus analysis of rock harmony [PDF] - The analyses were encoded using a recursive notation, similar to a context-free grammar, allowing repeating sections to be encoded succinctly. The aggregate data was then subjected to a variety of statistical analyses. We examined the frequency of different chords and chord transitions ... Other results concern the frequency of different root motions, patterns of co-occurrence between chords, and changes in harmonic practice across time. More information, analysis, and explanation here.
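To make the "recursive notation" idea concrete, here is a minimal sketch in Python. It is not the paper's actual notation or code: the rule format, the `expand` helper, and the toy song are all hypothetical, chosen only to show how named sections that refer to other sections can be expanded into a flat chord sequence and then counted.

```python
from collections import Counter


def expand(symbol, rules):
    """Recursively expand a symbol into a flat list of chord labels.

    Any symbol without a rule is treated as a terminal (a chord).
    """
    if symbol not in rules:
        return [symbol]
    chords = []
    for token in rules[symbol]:
        chords.extend(expand(token, rules))
    return chords


# Hypothetical toy song (an assumption for illustration, not corpus data).
# Repeated sections are written once and referenced by name.
rules = {
    "Verse": ["I", "IV", "I", "V"],
    "Chorus": ["IV", "V", "I"],
    "Song": ["Verse", "Verse", "Chorus", "Verse", "Chorus"],
}

song = expand("Song", rules)

# Frequency of each chord, and of each chord-to-chord transition.
chord_counts = Counter(song)
transitions = Counter(zip(song, song[1:]))

print(chord_counts.most_common())
print(transitions.most_common(3))
```

The song is stored as three short rules rather than eighteen chords, which is the kind of succinctness the abstract describes. Chord and transition frequencies then fall out of the expanded sequence.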
Mining the Mother of all Data Dumps We now have a relatively massive haul of digital data from the OBL strike. Several forensic toolkits are in use in the private and public sectors, some commercially available and some open-source. Best practices include inventorying all the sources, cloning the sources so as not to damage pristine data, recovering any partial or damaged content, making the cloned sources read-only, adhering to standards for legally admissible tools, and documenting everything. There is an excellent source titled Digital Forensics and Born-Digital Content from the Council on Library and Information Resources [pdf, Resource Shelf]. But what to do next*? [more inside]
It has applications in health care, pharmaceuticals, facial recognition, economics/related areas, and of course, much much more. Previously, MeFi discussed controversial homeland security applications, and the nexus between social networking and mobile devices that further contributes to the pool. With plenty to dig into, let's talk Data Mining in more detail. [more inside]
Word Spectrum; SearchClock; Digg Rings; Bible Cross-references: the gorgeous analytical visualizations of Chris Harrison. [more inside]
High performance kart racing is frequently misunderstood to be bumper-car-like "fun park" or "trailer park" karting in the US. [more inside]