"The first journalist to attempt reporting on the Wikileaks cables was David Leigh of The Guardian. The material arrived as a single 1.7GB CSV file containing 251,287 U.S. diplomatic cables from 1966 to 2010. If you’ve ever tried to open a 1.7GB file, you know you probably can’t. Microsoft Word and Excel will plain refuse. Windows Notepad and Mac TextEdit will try, but slow to a crawl." At Opennews Source, Jonathan Stray has written a helpful beginners' guide to dealing with large amounts of documents
for journalists and interested lay people.
posted by MartinWisse on Mar 17, 2014
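The large-file problem in the quote above (a 1.7GB CSV that desktop editors cannot open) is usually handled by streaming the file one record at a time instead of loading it whole. The following is a minimal sketch of that idea and is not taken from Stray's guide. The function names `iter_rows` and `summarize`, the default column index, and the UTF-8 encoding are assumptions made for illustration, as is the raised field-size limit (long cable bodies could exceed the `csv` module's default).

```python
import csv
from collections import Counter
from typing import Iterator

# Assumption: individual fields (e.g. a full cable body) may be larger than
# the csv module's default field size limit, so raise it to ~10 MB.
csv.field_size_limit(10_000_000)


def iter_rows(path: str, encoding: str = "utf-8") -> Iterator[list[str]]:
    """Yield one parsed CSV record at a time; the file is never fully loaded."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, newline="", encoding=encoding) as handle:
        yield from csv.reader(handle)


def summarize(path: str, column: int = 0) -> tuple[int, Counter]:
    """Count records and tally the values found in one (hypothetical) column."""
    total = 0
    values: Counter = Counter()
    for row in iter_rows(path):
        total += 1
        if column < len(row):
            values[row[column]] += 1
    return total, values
```

Calling `summarize("cables.csv", column=0)` (the file name is hypothetical) would return the number of records and a tally of the first column's values while holding only one record in memory at a time.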
With an ocean of new statistical information available, the NBA could be on the verge of understanding the value of every single movement on the court.
posted by antonymous on Feb 8, 2014
tries to contribute to the understanding of human sexuality through a Big Data approach. Studies (PDF
(maps the evolution of word frequencies in the titles of porn videos).
posted by motdiem2 on Jan 30, 2014
"We live in a world where digital information is exploding
. Some 90% of the world’s data was generated in the past two years. The obvious question is: how can we store it all? In Nature Communications today, we, along with Richard Evans from CSIRO, show how we developed a new technique to enable the data capacity of a single DVD to increase from 4.7 gigabytes up to one petabyte (1,000 terabytes). This is equivalent of 10.6 years of compressed high-definition video or 50,000 full high-definition movies."
posted by Blazecock Pileon on Jun 20, 2013
advances UC Berkeley’s mission to make sense of big data and to use new technology to document and maintain endangered languages as critical resources for preserving cultures and knowledge. [...] it can also provide clues to how languages might change years from now."
posted by batmonkey on Feb 11, 2013