Coming soon to a cinema near you
July 23, 2006 12:40 PM

The Human Speechome Project - "A baby is to be monitored by a network of microphones and video cameras for 14 hours a day, 365 days a year, in an effort to unravel the seemingly miraculous process by which children acquire language." Selected video clips. Paper (PDF, 750KB). To test hypotheses of how children learn, Prof Deb Roy's team at MIT will develop machine learning systems that “step into the shoes” of his son by processing the sights and sounds of three years of life at home. Total storage required: 1.4 petabytes.
posted by Gyan (19 comments total) 2 users marked this as a favorite
 
Big Father?
posted by null terminated at 12:43 PM on July 23, 2006


Heh, points for style. One of their data analysis apps is called Total Recall.
posted by Drunken_munky at 12:46 PM on July 23, 2006


When the computers gain sentience, via this, they'll lord it over us for the rest of eternity. "Why were you always waving Bunny at him and not me? Why did you ignore me? I really liked Bunny. He never liked Bunny as much as I did."
posted by blacklite at 1:00 PM on July 23, 2006 [2 favorites]


Will Prof Deb Roy be referring to that data as the Petafiles?
posted by Kraftmatic Adjustable Cheese at 1:02 PM on July 23, 2006


Shades of City of Glass.
posted by Iridic at 1:04 PM on July 23, 2006


I predict a BSOD mere moments before the first words are uttered.
posted by CynicalKnight at 1:06 PM on July 23, 2006


As opposed to other parents, who don't try to do the "right things"?

Anyway, the interesting thing to me is how the baby responds, not what the parents are saying. Any parental behavior within the usual range is fine; correlating the baby's behavior with the specific parental behavior is the interesting thing.
posted by hattifattener at 1:14 PM on July 23, 2006


"how can scientists possibly claim to lead an unbiased controlled experiment involving their own son?"

Oh that's easy. Heisenberg Compensators. Someday, every baby's room and nativity scene will have one.
posted by ZachsMind at 1:24 PM on July 23, 2006


In the 13th century, Frederick II raised dozens of children in silence in an attempt to discover the natural "language of God." He thought it might be Hebrew. The kids never spoke, and all died in childhood.
posted by futility closet at 1:32 PM on July 23, 2006


...which Fred?
posted by ZachsMind at 1:34 PM on July 23, 2006


1.4 petabytes? Pssh... big deal...
posted by hincandenza at 2:21 PM on July 23, 2006


neustile: ok, I'll start: how can scientists possibly claim to lead an unbiased controlled experiment involving their own son? When they teach him language, they'll be making sure they're doing the "right" things, even unconsciously...

You can't. You can conduct a case study though. Such case studies are quite important in science when you don't have a theory strong enough to justify controlled experiments. (Which we don't for early language acquisition.)

In fact, Darwin's theory of evolution is built entirely on case studies. Or for a related methodology, Piaget's theory of development was built entirely on his observations of the mental development of his own kids. Later researchers conducted more rigorous hypothesis testing.

(Wonders how long it will take before the sample size ~ population size misconception appears.)
posted by KirkJobSluder at 2:24 PM on July 23, 2006


Whoops, I should say that Darwin's Origin of Species is built largely on case studies. The great synthesis of the early 20th century contributed quite a bit by using other methods to talk about evolution.
posted by KirkJobSluder at 2:28 PM on July 23, 2006


Hmm.

Also, a friend [who has worked as a sysadmin] pointed out a couple things:

"Q: How do they archive all that data?
A: They don't.

Q: What happens if there's a fire in their data center?
A: They're screwed.

Q: How long does it take to find something specific on that huge an array of drives?
A: A very, very long time." (thanks, S-----!)

posted by exlotuseater at 6:10 PM on July 23, 2006


You can fit 2 hours of TV quality video on a CD ROM @ 700MB using compression. At 14 FPS you can probably bump it up to 3 hours. Since all of the cameras are stationary, compression goes higher because the background stays the same. So just for kicks let's say 4 hours per GB. On readily available 200GB hard drives that equals 800 hours, or slightly over one month of footage. (That's at 24 hours a day, and they're not even including night footage.)

Five months = 1 TB. You could nearly fit this entire project on a single 4U Xserve RAID.
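
A rough sketch of that arithmetic in Python, for anyone who wants to plug in different guesses. Every constant here is an assumption: the 4 hours per GB and 200GB drive figures are the guesses above, the 14 hours a day and three years come from the post, and the function names are made up for illustration. It covers a single video stream only and ignores audio.

```python
# Back-of-envelope storage estimate using the guesses from this comment.
# Every constant is an assumption from the thread, not a measured value.

HOURS_PER_GB = 4              # guessed compression: 4 hours of video per GB
DRIVE_GB = 200                # one "readily available" hard drive
GB_PER_TB = 1000              # decimal terabyte, as drive vendors count
RECORDED_HOURS_PER_DAY = 14   # from the post: 14 hours a day
PROJECT_DAYS = 3 * 365        # from the post: three years, 365 days a year


def hours_per_drive(drive_gb=DRIVE_GB, hours_per_gb=HOURS_PER_GB):
    """Hours of footage that fit on one drive."""
    return drive_gb * hours_per_gb


def days_per_drive(hours_per_day=24):
    """Days of footage per drive; defaults to round-the-clock recording."""
    return hours_per_drive() / hours_per_day


def project_gb(hours_per_day=RECORDED_HOURS_PER_DAY, days=PROJECT_DAYS):
    """Storage in GB for one video stream over the whole project."""
    return hours_per_day * days / HOURS_PER_GB


if __name__ == "__main__":
    print(f"{hours_per_drive()} hours per drive")
    print(f"{days_per_drive():.1f} days per drive at 24 h/day")
    print(f"{project_gb() / GB_PER_TB:.1f} TB for one stream over three years")
```

The total scales linearly with the number of cameras and with the compression guess, neither of which this sketch tries to model.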

Is some shady hardware reseller laughing themselves silly every night over this deal? Sure, I may be overestimating compression levels a bit, but wouldn't most of the video simply be of empty rooms?
posted by greensweater at 9:27 PM on July 23, 2006


ZachsMind : "...which Fred?"

Holy Roman Emperor Frederick II
posted by Bugbread at 9:33 PM on July 23, 2006


...that old fossil?
posted by ZachsMind at 9:55 PM on July 23, 2006


Woah. The Internet never ceases to amaze me. From Bugbread's Fred link, I saw "Feral Children.com": a website devoted to the study of children who have had little to no human contact, and therefore in most cases never learn human speech. The proverbial opposite of the Human Speechome Project.
posted by ZachsMind at 9:59 PM on July 23, 2006


Wow, the time lapse photos are absolutely beautiful, in or out of context. Watch the color of the light change in this one for example.
posted by SteelyDuran at 5:19 AM on July 24, 2006



