What is a work of art in the age of $120,000 art degrees? A new report (PDF) by activist collective BFAMFAPhD laments the shrinking job prospects and growing debt burden for art school graduates. [more inside]
What Will It Take to Run a 2-Hour Marathon? (Warning: data viz, annoying design)
JPMorgan Chase Says More Than 76 Million Accounts Compromised in Cyberattack [New York Times]
"The breach is among the largest corporate hacks, and the latest revelations vastly dwarf earlier estimates that hackers had gained access to roughly 1 million customer accounts."
Search for word usage in movies and television over time.
Movies and television shows often reflect cultural trends of the time they are made in. Even movies that take place during the past or future can say something about the present through metadata or production style. Using the Bookworm platform, Benjamin Schmidt, an assistant professor of history at Northeastern University, provides a tool that lets you see trends in movie and television dialogue.
Folger Shakespeare Library Releases 80,000 Images for Creative Common Use. The Folger Shakespeare Library announced yesterday that they have released the contents of their Digital Image Collection under a Creative Commons Share-Alike (CC-BY-SA) license. The full database can be accessed here.
data.gov.in : the Indian counterpart of the US data.gov, features 10280 resources in 3215 catalogs for public perusal. There's a visualization gallery charting developments like village electrification or domestic air traffic or sales of automobiles. And also a community section featuring apps offering mobile access to some of the data.
You are a particle physics researcher. Particle Clicker is a resource accumulation game in the same mould as Cookie Clicker - but this time with particle physics research, academics, and funding. Click repeatedly on the Collider to generate data. Turn data into research to gain funding and increase your reputation. Spend your funding on Human Resources and Upgrades - don't forget to buy beer to keep your research students happy, and coffee to keep them awake! [more inside]
A link to Good Magazine's infographics. Some of my favorites: How powerful is your passport, Being bike friendly in America, What foods are most susceptible to food fraud. [more inside]
Only 15 million, riiiight. A data experiment out of Florida State University maps the location of 1 million of the 15 million publicly available online images tagged with the word "cat." Using a supercomputer and the map coordinates imbedded in their metadata, I Know Where Your Cat Lives shows where each image was taken, to within an estimated 7.8 meters accuracy. [more inside]
“For the past 105 days, I've been tracking everything about myself.” Anand Sharma shows the progress of his life through a beautifully designed site. [more inside]
Datashine: Census is a site from UCL's Big Open Data: Mining and Synthesis project which provides an easy interface to map UK population data. [more inside]
How to write 225 words per minute. With a pen. Dennis Hollier, in the Atlantic, writes about Gregg shorthand, a piece of analog data-compression technology now largely forgotten and probably forever unequalled.
“But what shall we dream of when everything becomes visible?” Virilio replies: “We’ll dream of being blind.”
Clarity Campaign Labs invites you to use TargetSmart U.S. voter data to discover, via seven yes/no/don't care questions, What town matches my politics? Business Insider uses it to determine the most liberal and conservative towns in each state.
An interactive visualization of Boston's subway system in February. With it, you can see where trains on the red, blue, and orange lines were at any moment on February 3, in space and along their paths between stations, among many other things. [more inside]
Randy Olson is conducting an analysis of chess since 1850. What's the advantage of playing white? Are games getting longer? What openings have fallen in and out of vogue? Are chess players becoming less focused on capturing pieces?
The SmartMime whale tracker lets you know where Hawaii's diverse population of whales are right now (not actually in real time, but based on migration data).
A checklist for those making graphs from Stephanie Evergreen and Ann Emery. This is a useful tool for teaching scientists and others some of the rules of data presentation in graph form.
Not everyone agrees on the best methods for raising kids. That becomes apparent when you examine the results from the 2010-2014 World Values Survey — 82,000 adults across 54 countries were surveyed to gain a better understanding of what they consider most important when raising a child, whether or not they were parents themselves. PBS NewsHour has an interactive quiz you can take to show which country has values closest to yours as well as a widget to compare the values of any two countries. You can see all the data in this google docs spreadsheet.
Meat Atlas: facts and figures about the animals we eat
Equaldex: the collaborative LGBT knowledgebase! A crowd-sourced, verified, beautifully presented representation of equal rights (and how they are specifically denied) for LGBT folks. [via reddit]
Sony just announced that cassette technology might be the future, with a device that can hold 185 terabytes on one tape (that's three Blu-ray discs' worth of data per square inch).
MetaFilter is well acquainted with numbers stations (previously with previouslies inside of that). Well, they may just have migrated to YouTube. [more inside]
Nik Freeman has created a map, based on census data, to illustrate the 47% of the United States where nobody lives.
How Americans Die - a visual tour through surprising trends in mortality among Americans in the last several decades
Ben Goldacre, The Guardian: "Today we found out that Tamiflu doesn't work so well after all. Roche, the drug company behind it, withheld vital information on its clinical trials for half a decade, but the Cochrane Collaboration, a global not-for-profit organisation of 14,000 academics, finally obtained all the information. Putting the evidence together, it has found that Tamiflu has little or no impact on complications of flu infection, such as pneumonia." [more inside]
# of seasons × # of episodes per season × runtime of episode = total for 1 TV show. Repeat for more TV shows = total time. Tiii.me lets you select the name of a tv show, the number of seasons you've watched, and tells you how much of your life you've spent watching that show. Add more shows and it will keep a running total for you. [more inside]
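The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula, not Tiii.me's actual code; the function names and the sample show figures are made up for the example.

```python
# Hypothetical helpers illustrating: seasons x episodes per season x runtime.

def show_minutes(seasons: int, episodes_per_season: int, runtime_minutes: int) -> int:
    """Total minutes spent on one show."""
    return seasons * episodes_per_season * runtime_minutes


def total_minutes(shows) -> int:
    """Running total across several (seasons, episodes, runtime) tuples."""
    return sum(show_minutes(*show) for show in shows)


def format_duration(minutes: int) -> str:
    """Render a minute count as days, hours, and minutes."""
    days, remainder = divmod(minutes, 24 * 60)
    hours, mins = divmod(remainder, 60)
    return f"{days}d {hours}h {mins}m"


# Made-up viewing history: (seasons watched, episodes per season, runtime in minutes).
watched = [(5, 22, 22), (2, 10, 60)]
print(format_duration(total_minutes(watched)))
```

Real shows rarely have the same episode count every season, so this is a rough estimate in the same spirit as the formula.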
Each week, the Internet Archive's tumblr account is completely transformed by a digital resident along a theme of their choosing. [more inside]
In a new exhibition titled Beautiful Science: Picturing Data, Inspiring Insight, the British Library pays homage to the important role data visualization plays in the scientific process. The exhibition can be visited from 20 February until 26 May 2014, and contains works ranging from John Snow's plotting of the 1854 London cholera infections on a map to colourful depictions of the Tree of Life. In a Nature Video, curator Johanna Kieniewicz explores some of the beautiful examples of visualizations that are exhibited. [more inside]
Care data is an ambitious attempt to use data to improve the care of patients in the UK. It uses the scale of the NHS dataset to give epidemiologists and medical researchers access to large datasets to improve research. And now it's been thrown into disarray by the responsible body selling the information to insurance companies and even more .... [more inside]
Music Machinery presents a map of each U.S. state's most distinct favorite band or recording artist, as well as an app for playing with the data.
PLOS’ New Data Policy: Public Access to Data "PLOS has always required that authors make their data available to other academic researchers who wish to replicate, reanalyze, or build upon the findings published in our journals. In an effort to increase access to this data, we are now revising our data-sharing policy for all PLOS journals: authors must make all data publicly available, without restriction, immediately upon publication of the article. Beginning March 3rd, 2014, all authors who submit to a PLOS journal will be asked to provide a Data Availability Statement, describing where and how others can access each dataset that underlies the findings." Openscience.org also have a primer on why open science data is important.
Probably to no one's surprise, Southern California leads the nation in the number of pleasant days per year (mean temperature between 55° F and 75° F, no precipitation). How does your city stack up?
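The "pleasant day" definition is simple enough to try on your own city's weather records. Below is a minimal sketch, assuming daily data as (mean temperature in °F, precipitation in inches) pairs and treating the 55 and 75 degree bounds as inclusive; both the data layout and the inclusive bounds are assumptions, not details from the linked analysis.

```python
# Hypothetical helpers; the input format and inclusive bounds are assumptions.

def is_pleasant(mean_temp_f: float, precipitation_in: float) -> bool:
    """A day counts as pleasant if the mean temperature is 55-75 F and it stayed dry."""
    return 55.0 <= mean_temp_f <= 75.0 and precipitation_in == 0.0


def count_pleasant_days(days) -> int:
    """Count pleasant days in an iterable of (mean_temp_f, precipitation_in) pairs."""
    return sum(1 for temp, precip in days if is_pleasant(temp, precip))


# Made-up sample week, not real weather data.
sample = [(60.0, 0.0), (80.0, 0.0), (70.0, 0.1), (55.0, 0.0), (75.0, 0.0)]
print(count_pleasant_days(sample))
```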
"Exploring Gender Bias in Listening. Do men listen to different music than women do? Anecdotally, we can think of lots of examples that point to yes – it seems like more of One Direction’s fans are female, while more heavy metal fans are male, but let's take a look at some data to see if this is really the case." An examination of music listening data from Paul Lamere of The Echo Nest.
Curious about which sport has the best odds of a male or female High School or College player going pro? OSMguy has a data visualization for that. [Via Tableau's Viz of the Day]
You don’t want your privacy: Disney and the meat space data race
The bands are even uniquely colored and monogrammed with your family members’ names so that they won’t get switched up. Why? Because they don’t want their database to get confused and think that you, a 45-year-old man, rode the teacups instead of your little son Timmy. This is one of the first examples I’ve seen of physical design (e.g., monogramming and coloring) for the sake of digital data purity.
If ever there was a testimony to the importance big data has achieved in business it’s this: We will now shape our physical world to create better streams of digital information.
How a Math Genius Hacked OkCupid to Find True Love
“I think that what I did is just a slightly more algorithmic, large-scale, and machine-learning-based version of what everyone does on the site,” McKinlay says. Everyone tries to create an optimal profile—he just had the data to engineer one. [more inside]
"The IPython Notebook is a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document". It can be installed fairly easily with Anaconda or on Amazon EC2. Various interesting notebooks are to be found at the official Notebook Viewer site. Another collection of interesting notebooks on many topics. [more inside]
If you use Netflix, you've probably wondered about the specific genres that it suggests to you. Some of them just seem so specific that it's absurd. Emotional Fight-the-System Documentaries? Period Pieces About Royalty Based on Real Life? Foreign Satanic Stories from the 1980s? ... Through a combination of elbow grease and spam-level repetition, we discovered that Netflix possesses not several hundred genres, or even several thousand, but 76,897 unique ways to describe types of movies.
At the core of good science and engineering is the careful and respectful treatment of data. We calibrate our instruments, scrutinize the algorithms we use to process the data, and study the behavior of the models we use to interpret the data or simulate the phenomena we may be observing. Surprisingly, this careful treatment of data often breaks down when we visualize our data.
Get Data [SLYT]
A few months ago there was a list of links to classic video game emulators posted. Very recently, I'm pleased to report, those links all came true. The Internet Archive bespoke upon aforementioned consoles, computers, and mileposts on our way to the tech utopia of today, (seriously, where's my flying car?) and they asked us to do something: Imagine every computer that ever existed, literally, in your browser. And it was so. I have absolutely no affiliation with jscott, btw. Thought I should disclose that.
The Guardian reports on new rules designed to curb the transfer of data to the US, with fines running into billions. [more inside]
GaMuSo is an application of BioGraph-based data mining to music, which helps you get recommendations for other musicians. Based on 140K user-defined tags from last.fm that are collected for over 400K artists, results are sorted by the "nearest" or most probable matches for your artist of interest (algorithm described here). [more inside]
Fraudulent & hoax manuscripts submitted to academic journals typically present false findings by real authors. This time, however, the paper contains real (and previously unpublished) results... by fake authors. (via retractionwatch) [more inside]
The data analysis group that used Facebook and set-top TV data to help Barack Obama win the latest election is taking its talents to the private sector. (SL NYTimes)