Iterated learning using YouTube
April 2, 2013 8:09 AM

"What happens if you repeatedly run Kafka's Metamorphosis through YouTube's auto-transcription? Structure emerges!" via Sean Roberts
posted by knile (18 comments total) 13 users marked this as a favorite

 
This is precisely my kind of geekery.
posted by Peevish at 8:26 AM on April 2, 2013 [1 favorite]


I might be reading particular words more clearly...

You can hear this clearly in the first iteration. The character's name is "Gregor Samsa". He reads it as "Gregor Samsahh...aahn" which is apparently transcribed to "greatest ramps and".
posted by DU at 8:32 AM on April 2, 2013


That's really fascinating. Thanks!
posted by Drexen at 8:34 AM on April 2, 2013


YouTube turns Kafka into Joyce. Interesting.
posted by yoink at 8:41 AM on April 2, 2013 [1 favorite]


YouTube turns Kafka into Joyce. Interesting.

Which is appropriate, given that they tend to sit next to each other on the Fiction shelf.
posted by Rustic Etruscan at 8:46 AM on April 2, 2013


What's happened to me, speedboat?!
posted by heyforfour at 9:14 AM on April 2, 2013


Of course, there’s another source of bias in this apart from the YouTube transcription algorithm: me! I might be reading particular words more clearly or interpreting the prosody in a particular way.

I definitely think this is a big and possibly fatal shortcoming of this experiment. Automatic transcription has problems with rapid, fluent speech. Humans don't, because we have a mental model of the language and meaning that makes speech massively redundant and overspecified. On the other hand, reading a nonsensical text without regard for meaning is a lot more difficult for a human, who will tend to over-enunciate to disambiguate words that normally would be disambiguated at the utterance/syntactic level. The likely result is that once meaning/syntax breaks down, we make language easier for YouTube to transcribe.
posted by Nomyte at 9:29 AM on April 2, 2013 [6 favorites]


"One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that he was sitting in a room, different from the one you are in now."
posted by aparrish at 9:33 AM on April 2, 2013 [4 favorites]


As Gregor Samsa awoke from a night of orgasm after orgasm download free ebook!
posted by Nomyte at 9:36 AM on April 2, 2013 [2 favorites]


monstrous bremen spoke! (with richard cohen like segments...)
posted by ennui.bz at 10:36 AM on April 2, 2013


I definitely think this is a big and possibly fatal shortcoming of this experiment.

Not so sure it's a shortcoming. Auto-transcription involves both computers and humans. These results point out the tensions between what language means to a human and what it means to a computer.
posted by Peevish at 10:47 AM on April 2, 2013


YouTube turns Kafka into Joyce. Interesting.

I think we're going to discover that Joyce is some kind of literary fixed point; if you iterate enough, anything will become Joyce, in the same inevitable way that a saltine turns to mush if you leave it in your mouth too long.
posted by qxntpqbbbqxl at 10:52 AM on April 2, 2013 [1 favorite]


Which is appropriate, given that they tend to sit next to each other on the Fiction shelf.

So is this where I get to complain about going to used book stores to see if they have any new Pynchon and always having to scan past multiple copies of The Fountainhead?
posted by benito.strauss at 11:47 AM on April 2, 2013


Knowing a little bit about automated transcription, I was a bit skeptical about the conclusions he was drawing. However, the interesting part of the experiment is how iteration tells you something about a bottleneck, not (conclusively) about human language. That it tends towards efficiency in the same way that humans might is not entirely a coincidence, but it doesn't really tell you too much about human language independent of humans' explicit understanding of said language. :-)

If you repeatedly iterate over a speech recognizer, you're going to get closer and closer to an optimally scoring set of words, modulated by how consistently you pronounce things. Since recognizers align on phonemes, the phoneme length will remain close to the same (there's always error :-)). They also tend to bias towards co-articulation within words rather than between words, unless the words are a fairly common sequence -- hence the words becoming longer in a fixed time frame and fewer errors happening after each iteration.

And, in fact, he makes these observations. The fact that it isn't a surprise to anyone who's worked on a speech recognizer makes the results banal, but the method of getting to those results is still interesting.
posted by smidgen at 11:58 AM on April 2, 2013 [1 favorite]
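
For anyone who wants to poke at the "iterate until it settles" idea from the comment above, here is a minimal toy sketch. It is not how YouTube's recognizer works: the vocabulary `VOCAB`, the function names, and the use of string similarity in place of acoustic scoring are all assumptions made up for illustration. Each pass snaps every word to its closest vocabulary entry, and the loop stops once a pass changes nothing.

```python
import difflib

# Hypothetical vocabulary standing in for a recognizer's lexicon (an assumption).
VOCAB = ["greatest", "ramps", "and", "morning", "dreams", "monstrous", "woke"]


def toy_recognize(words, vocab=VOCAB):
    """Map each word to its most similar vocabulary entry.

    String similarity is a stand-in for acoustic scoring; a real recognizer
    works on audio and phonemes, not spellings.
    """
    out = []
    for w in words:
        # cutoff=0.0 means the best-scoring entry is always returned.
        match = difflib.get_close_matches(w, vocab, n=1, cutoff=0.0)
        out.append(match[0] if match else w)
    return out


def iterate_to_fixed_point(words, max_iters=20):
    """Re-run the toy recognizer on its own output until nothing changes."""
    history = [list(words)]
    for _ in range(max_iters):
        nxt = toy_recognize(history[-1])
        if nxt == history[-1]:
            break  # fixed point: another pass would give the same text
        history.append(nxt)
    return history


if __name__ == "__main__":
    for generation, text in enumerate(iterate_to_fixed_point(["gregor", "samsa", "awoke"])):
        print(generation, " ".join(text))
```

Because this toy has no speaker in the loop, it settles after a single pass. In the actual experiment a human re-reads each transcript aloud, which adds fresh variation every generation, so any convergence there would be gradual rather than immediate.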


As the number of iterations increases, all YouTube Videos converge towards:

I like big butts and I can not lie
You other brothers can't deny
That when a girl walks in with an itty bitty waist
And a round thing in your face
You get sprung, wanna pull out your tough
'Cause you notice that butt was stuffed
Deep in the jeans she's wearing
I'm hooked and I can't stop staring
Oh baby, I wanna get with you
And take your picture
My homeboys tried to warn me
But that butt you got makes me so horny
Ooh, Rump-o'-smooth-skin
You say you wanna get in my Benz?
Well, use me, use me
'Cause you ain't that average groupie
I've seen them dancin'
To hell with romancin'
She's sweat, wet,
Got it goin' like a turbo 'Vette
I'm tired of magazines
Sayin' flat butts are the thing
Take the average black man and ask him that
She gotta pack much back
So, fellas! (Yeah!) Fellas! (Yeah!)
Has your girlfriend got the butt? (Hell yeah!)
Tell 'em to shake it! (Shake it!) Shake it! (Shake it!)
Shake that healthy butt!
Baby got back!

posted by Naberius at 2:08 PM on April 2, 2013 [2 favorites]


The compression ratio vs. generation curve was very... approximate.
posted by flyingfox at 3:25 PM on April 2, 2013 [2 favorites]
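
For reference, a compression ratio per generation can be computed along these lines. This is only a sketch: the linked post's exact compressor and definition are not given in this thread, so zlib and "compressed bytes divided by original bytes" are assumptions, as is the function name.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size for the UTF-8 bytes of text.

    Lower values mean the text is more repetitive, i.e. more compressible.
    The choice of zlib at level 9 is an assumption, not the article's method.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot compute a ratio for empty text")
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    # Made-up sample strings, one per hypothetical generation.
    generations = [
        "one morning as gregor samsa was waking up from anxious dreams",
        "greatest ramps and greatest ramps and greatest ramps and",
    ]
    for i, text in enumerate(generations):
        print(i, round(compression_ratio(text), 3))
```

On very short strings the compressor's fixed overhead dominates, which may be one reason a curve built this way looks noisy; longer transcripts give a steadier number.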


All I can say is it would be nifty if the site would load.

The connection has timed out

The server at www.replicatedtypo.com is taking too long to respond.
posted by Samizdata at 5:27 PM on April 2, 2013


Justin Quillinan has done a follow-up using synthesized speech.
posted by knile at 2:42 AM on April 8, 2013




This thread has been archived and is closed to new comments