“Word Embedding Models let us take a stab formalizing an interesting counterfactual question: what would the networks of meaning in language look like if patterns that map onto gender did not exist?”
What's the Difference Between Data Science and Statistics? Not long ago, the term "data science" meant nothing to most people, even to those who worked with data. A likely response to the term was: "Isn't that just statistics?" These days, data science is hot. The Harvard Business Review called data scientist the "Sexiest Job of the 21st Century." So what changed? Why did data science become a distinct term? And what distinguishes data science from statistics?
A hive plot (slides) is a beautiful and compelling way to visualize multiple, complex networks, without resorting to "hairball" graphs that are often difficult to qualitatively compare and contrast.
OpenCPU provides a RESTful interface to the popular open-source statistical package R, enabling the user to perform calculations and create publication-quality or web-embeddable visualizations via standard web requests.
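As a rough illustration of what "standard web requests" can look like here, the sketch below builds a POST request that calls an R function through OpenCPU and asks for the result as JSON. It assumes the public demo server at https://cloud.opencpu.org and the `/ocpu/library/{package}/R/{function}/json` endpoint pattern; the helper names `build_call` and `call_r` are hypothetical, made up for this example, and your own server's address may differ.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public OpenCPU demo server. Substitute your own host if needed.
BASE_URL = "https://cloud.opencpu.org/ocpu"


def build_call(package, function, args, base_url=BASE_URL):
    """Return (url, body) for calling an R function and requesting JSON output.

    Hypothetical helper: it only assembles the request, it sends nothing.
    """
    url = f"{base_url}/library/{package}/R/{function}/json"
    # Each argument is JSON-encoded and sent as an ordinary form field.
    fields = {name: json.dumps(value) for name, value in args.items()}
    body = urllib.parse.urlencode(fields).encode("ascii")
    return url, body


def call_r(package, function, args, base_url=BASE_URL, timeout=30):
    """POST the call to the server and decode the JSON response (needs network)."""
    url, body = build_call(package, function, args, base_url)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Inspect the request for stats::rnorm(n = 5, mean = 10) without sending it.
    url, body = build_call("stats", "rnorm", {"n": 5, "mean": 10})
    print(url)
    print(body.decode("ascii"))
```

With network access and a reachable server, `call_r("stats", "rnorm", {"n": 5})` would send the request and return the decoded JSON. Keeping request construction separate from the network call makes it easy to check what will be sent before sending it.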
Dataists give their hopes and dreams for data, data tools, and data science in 2011. Already, Google has provided Google Refine (previously) to help clean your datasets, while great visualizations can be created with online tools or by combining R (great posts previously) with ggplot2, GGobi, and even Google Motion Charts With R (already built into Google Spreadsheets). Need data? Needlebase helps non-programmers scrape, harvest, and merge data from the web. Or, if you’re introspective, Your Flowing Data and Daytum provide tools to measure and chart details of your own life.
R is quickly becoming the programming language for data analysis and statistics. R (an implementation of S) is free, open-source, and has hundreds of packages available. You can use it on the command line, through a GUI, or in your favorite text editor. Use it with Python, Perl, or Java. Sweave R code into LaTeX documents for reproducible research.