I Am Sitting In A Video Room 1000
June 2, 2010 7:32 AM

The 'photocopy effect' applied to YouTube: what happens when you record a video, upload it to YouTube, download it, and re-upload it over and over and over? After a while, it starts to look like this [30x] and this [175] and this [240] and this [750]. Inspired by Alvin Lucier's experimental composition I Am Sitting in a Room, discussed here and here.
posted by sswiller (70 comments total) 54 users marked this as a favorite

The 750 is terrifying: Monsters from the Id! Cool.
posted by digaman at 7:35 AM on June 2, 2010

i could have saved this guy a week of his life with 5 minutes of after effects
posted by nathancaswell at 7:35 AM on June 2, 2010 [3 favorites]

It gets so awful at 800 and 1000.
posted by piratebowling at 7:37 AM on June 2, 2010

Seriously, my cat freaked out.
posted by piratebowling at 7:37 AM on June 2, 2010 [2 favorites]

Cool project. I am Sitting in a Room is one of my favorite compositions. Nice to see that YouTube is being used to accomplish a similar effect.
posted by cloeburner at 7:37 AM on June 2, 2010

Or you could just drop acid and watch a Max Headroom marathon.
posted by Blazecock Pileon at 7:39 AM on June 2, 2010 [6 favorites]

I love it.
posted by insectosaurus at 7:41 AM on June 2, 2010

Enjoyed the 750. Modern art.
posted by nickyskye at 7:41 AM on June 2, 2010

Number 1000 makes me feel like a Predator with a bad prostate taking a leak in a crowded men's room.
posted by Mayor Curley at 7:41 AM on June 2, 2010 [9 favorites]

Somebody has too much time on their hands.

Still awesome though.
posted by valkyryn at 7:42 AM on June 2, 2010

I like this.
posted by everichon at 7:42 AM on June 2, 2010

Someone should make an email plug-in where the text quality of a message gets progressively worse (faint, broken, hard to make out) depending on how far down the CC chain you are.
posted by mazola at 7:43 AM on June 2, 2010 [22 favorites]

For me, "I am Sitting in a Room" is one of those things that I never heard of until I heard of it (at the end of last year in Paul Morley's Words and Music) but now I can't get away from it. Not that I mind, it's just Internet weirdness. Thanks for this.
posted by yerfatma at 7:46 AM on June 2, 2010

The obvious difference is that I am sitting in a room uses a natural environment, and the acoustic properties of that environment. So that each iteration has minimal recursive electronic effects and is in a strange way distilling the properties of that room. This, on the other hand, is distilling the properties of youtube's video upload converter.

While the room had a normal lowpass quality and a natural resonance, the video converter has a lossy fft compressing the audio (turning it into a trickling cluster of sine waves with less and less relation to the original) and a lossy video compression that averages regions of the image recursively over time and over the space of the image (in more and more evident rectangular regions).
posted by idiopath at 7:47 AM on June 2, 2010 [4 favorites]

I like this very much. Degrading fascinates me. Is there a name for the "watery" sound his voice has turned into in clip #1000?
posted by Termite at 7:52 AM on June 2, 2010 [2 favorites]

Awesome. Thanks.
posted by nevercalm at 7:52 AM on June 2, 2010

there's this guy...
who sits in a room...
posted by flapjax at midnite at 7:54 AM on June 2, 2010

Metafilter: where the text quality of a message gets progressively worse.
posted by chavenet at 7:54 AM on June 2, 2010 [2 favorites]

Reminds me of this.
posted by Sticherbeast at 7:54 AM on June 2, 2010

This is fantastic.
posted by shakespeherian at 7:56 AM on June 2, 2010

But of course, a lot of what you're hearing in I am sitting in a room is tape hiss and recording artifacts too. It would be interesting if someone rerecorded it with equipment that wasn't available in the 60s.
posted by roll truck roll at 7:57 AM on June 2, 2010 [1 favorite]

There is a maxim that once something is digitized, it can be infinitely reproduced without any loss in quality. Yet, anyone who has used video on-demand sites like Youku and Surfthechannel to watch TV shows and movies has seen digital copies that are watermarked, subtitled, out-of-synch, etc.
posted by sswiller at 7:57 AM on June 2, 2010

I have copied and pasted this comment 1000 times. It distills the natural properties of the clipboard.
posted by echo target at 8:05 AM on June 2, 2010 [6 favorites]

I like how it's called "the photocopy effect" but was inspired by a sound recording.
posted by DU at 8:09 AM on June 2, 2010

In a comment under #1000, the maker states, "not to mention that YouTube changed .flv formats/versions during the middle of this project, somewhere in the 600-700s. It would have ended much differently I'm sure."

Would that have a lot to do with the more extreme WTFness of the later ones?
posted by Sys Rq at 8:11 AM on June 2, 2010

Number 1000 makes me feel like a Predator with a bad prostate taking a leak in a crowded men's room.

Can someone get Robert Rodriguez on the line? I've got the best DVD-extra idea for Predators...
posted by griphus at 8:11 AM on June 2, 2010

Of course it can be reproduced exactly. The people at those video sites expend a whole bunch of CPU time to go out of the way to degrade the quality (to save storage / bandwidth costs).

The "watery" sound may have some other name, but it is, descriptively speaking, a spectral decimation and blur via recursive application of lossy fft and poor quality ifft. The fft takes the time domain (position of speaker / compression of air) and gets a composite derivative form (N "bins", each with a single pure frequency and an amplitude and/or phase - typically the number of bins is between 256 and 4096, and the algorithm usually needs the size to be a power of two for efficiency reasons) spaced evenly in time (usually with an envelope and overlap). According to Fourier, every complex function (for example an audio waveform) can be simplified into a composite of pure (sine) functions. So at each very brief moment, some "window" of the input is analyzed, and the N loudest frequency components are found (with some degree of error), and their amplitudes / phases are stored. The reason to do this is that the raw data of music or a human voice tends to fluctuate all over the place in ways that standard digital compression does not handle well. But the frequency components are much more orderly, and are much more amenable to data compression.
posted by idiopath at 8:13 AM on June 2, 2010 [7 favorites]

sswiller: I see that you're a "multimedia consultant", so you've got to be kidding, right? Anything digital can be infinitely reproduced without loss in quality—that's practically the definition of "digital". You could take any one of these video files and copy it as many times as you like, and every copy would be 100% identical to the original.

Uploading a video to YouTube isn't (just) copying, though. Their video converter deliberately alters the data to achieve a desired effect—namely, compression. The watermarks are also put there deliberately. The poor syncing, while not intentional, is just the video software doing what its (unskilled) operator told it to do. In every case, the artifacts aren't an incidental side effect of making a copy—the computer was specifically instructed to modify the file.

You have to want and try to make a lossless copy before you can expect it to happen.
posted by ixohoxi at 8:13 AM on June 2, 2010 [6 favorites]

I lke hhoiw it'ss cllaed "he ptocoopy efhfect" ut ws iansspid b a sond reccordig.
posted by ennui.bz at 8:14 AM on June 2, 2010 [4 favorites]

Oops, wasn't quite done yet, was I. The ifft is the inverse of the fft, and involves synthesizing a number of sine waves based on the stored freq / amp / phase data, and adding the results together.
posted by idiopath at 8:15 AM on June 2, 2010
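
A minimal sketch of the analyse / keep-the-loudest / resynthesise round trip idiopath describes, not YouTube's actual codec. The window size, the number of bins kept, the test signal, and the per-generation `shift` (which makes the windows land on different samples each pass) are all assumptions chosen for illustration, and a naive DFT stands in for a real fft.

```python
import cmath
import math

FRAME = 32   # samples per analysis window (a power of two, as is typical)
KEEP = 6     # loudest bins stored per window; everything else is discarded

def dft(x, sign):
    """Naive O(n^2) DFT; sign=-1 analyses, sign=+1 resynthesises (unscaled)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def lossy_frame(frame):
    """Analyse one window, keep only the KEEP loudest bins, resynthesise."""
    bins = dft(frame, -1)
    loudest = sorted(range(len(bins)), key=lambda k: abs(bins[k]), reverse=True)[:KEEP]
    kept = [bins[k] if k in loudest else 0 for k in range(len(bins))]
    return [v.real / len(frame) for v in dft(kept, +1)]

def generation(signal, shift):
    """One encode/decode pass; `shift` rotates the signal so windows land differently."""
    rotated = signal[shift:] + signal[:shift]
    out = []
    for i in range(0, len(rotated), FRAME):
        out.extend(lossy_frame(rotated[i:i + FRAME]))
    return out[-shift:] + out[:-shift] if shift else out

def energy(signal):
    return sum(v * v for v in signal)

# Made-up "voice": three sine partials that do not line up with the window size.
original = [math.sin(0.21 * t) + 0.5 * math.sin(0.83 * t) + 0.25 * math.sin(1.9 * t)
            for t in range(256)]

signal, energies = original, [energy(original)]
for gen in range(1, 21):
    signal = generation(signal, shift=(5 * gen) % FRAME)
    energies.append(energy(signal))

for gen in (0, 1, 5, 10, 20):
    print(f"generation {gen}: energy {energies[gen]:.3f}")
```

Each pass can only throw spectral content away, so the signal's energy never goes up from one generation to the next; what survives is an ever smaller cluster of sine components.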

Derivative (pun intended) but oh so awesome.
posted by Hutch at 8:21 AM on June 2, 2010

ixohoxi, I am speaking to the paranoia that content owners express when a digital recording is leaked. I am suggesting that the Internet can degrade a work over time unless it is securely hosted.
posted by sswiller at 8:23 AM on June 2, 2010

Awesome. I always wondered how they produced the graphics for Out Of This World.

So???ne ??ou?d m??e ?n e?a?? ?lug-?? wh?r? ?he ??xt q?a???y of ? ?e?sa?? ?et? p?og?e??i?ely w?rs? (?a?nt, b??ken, ??rd t? m??? ?ut) ??pe?d??g on ho? ?ar d??n ?he ?C c?ain y?? a?e.

The Nethack email client implemented this decades ago.
posted by cortex at 8:28 AM on June 2, 2010 [7 favorites]

I am suggesting that the Internet can degrade a work over time unless it is securely hosted.

A given copy of a work can seed a degraded branch out in the wild, yes. This is nothing new: tape traders understood the implications of generational decay decades ago. The difference is that nobody has to concede ownership of the original pristine tape to one person, and creating pristine, bit-perfect duplicates is trivial stuff. There's not a whole lot to say here other than "letting people make lossy copies of stuff produces lossy copies of stuff". Yes, given, and?
posted by cortex at 8:31 AM on June 2, 2010 [2 favorites]

sswiller: "when a digital recording is leaked."

AKA available for playback on a standard computer? The thing is that only getting analog copies out there makes the problem quite a bit worse - make it too hard to rip the compressed audio / video, and they will capture it uncompressed and recompress it, make that too hard to do and they will grab an analog copy, re-digitize it, and compress that. Really, the more effective your DRM is, the worse the illicit copies will be, and the worse your stuff will look to the majority of your audience (aka the people who have not yet decided to pay for your junk).
posted by idiopath at 8:43 AM on June 2, 2010

*starts tape recorder*
Whew. it's cold in here.
*rewinds, listens for ghosts*
posted by Hardcore Poser at 8:45 AM on June 2, 2010

That's pretty damn cool ... and it works a lot better than multi-generational dubbing with analog video (something I tried once, and you can only go about 4 gens deep before all sync is lost)
posted by Relay at 8:46 AM on June 2, 2010

I tried to create a similar setup to I am sitting in a room digitally. After just a very few iterations, it was clear that the signal was getting overwhelmed with hard-drive and screen backlight noise.
posted by scruss at 8:53 AM on June 2, 2010

Someone should make an email plug-in where the text quality of a message gets progressively worse (faint, broken, hard to make out) depending on how far down the CC chain you are.

I would suggest using a translation tool, changing the message into another language and back to the original, once for each layer of CC.
posted by filthy light thief at 8:56 AM on June 2, 2010 [1 favorite]

"letting people make lossy copies of stuff produces lossy copies of stuff". Yes, given, and?

I just think it's neat.
posted by sswiller at 8:58 AM on June 2, 2010

approaching this from a more snarky angle, this guy is about 12-18 months too late... the experimental film and video world just went through a "let's play with video compression" phase. i saw a bunch of work in the last two years (light industry, paper rad) that used compression artifacts. sorry guy, it's been banished to the "played out" list for the next few years.
posted by nathancaswell at 9:05 AM on June 2, 2010

I just think it's neat.

Fair enough, I do too. The structure of your original comment ("There is a maxim...yet...") suggested you were arguing against the truth of that maxim, against the idea that digital media can be trivially and pristinely duplicated, which would be a pretty silly argument to try and make.

I guess your intent was more along the lines of "while lossy reproduction can be (and is constantly) done, it's interesting to see generational loss is still in play as well", in which case, yeah, high five. That there's still a system that regularly produces subpar qualities for reasons other than fundamental scarcity of the original media is an interesting result of the larger economic/technological picture of popular media distribution.
posted by cortex at 9:07 AM on June 2, 2010

lossy fft and poor quality ifft

More like DCT, quantization, and iDCT. /pedant

A proper quality integer-based DCT/iDCT encoder pair can actually be run over, and over, and over, with the same quantization settings, and suffer no quality loss. A few years ago I ran the Lena image through ImageMagick, compressing and uncompressing it 1000 times over. Example original image, image after 1000 compressions. When you start varying the quantization settings, though, things start getting ugly.

The big difference is that video and audio codecs use a lot more tricks to compress their data, increasing the likelihood of rounding errors. They include running through a psymodel.
posted by zsazsa at 9:09 AM on June 2, 2010 [1 favorite]
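
The quantization point above can be sketched on its own, leaving the DCT/iDCT out entirely (an assumption made to keep the example short). The coefficient values and the step sizes 10 and 7 are made up for illustration.

```python
def quantize(coeffs, step):
    """Snap each coefficient to the nearest multiple of `step` (the lossy part)."""
    return [step * round(c / step) for c in coeffs]

def generations(coeffs, steps):
    """Re-quantize once per entry in `steps`, feeding each result into the next pass."""
    out = list(coeffs)
    for step in steps:
        out = quantize(out, step)
    return out

def error(a, b):
    """Total absolute difference between two coefficient lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

coeffs = [312, -47, 23, -11, 8, -5, 3, 1]    # made-up transform coefficients

once = quantize(coeffs, 10)
same = generations(coeffs, [10] * 1000)       # 1000 passes, identical settings
varied = generations(coeffs, [10, 7] * 500)   # 1000 passes, alternating settings

print(same == once)   # identical settings: nothing is lost after the first pass
print(error(coeffs, once), error(coeffs, quantize(coeffs, 7)), error(coeffs, varied))
```

With a fixed step, every value is already a multiple of that step after the first pass, so later passes change nothing. With alternating steps, this toy run ends up further from the original than a single pass at either step would.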

Thanks Sticherbeast, I had heard that sampled by DJ Shadow in Endtroducing and had always wondered where it came from.

On second thought it may have been Psyence Fiction, but either way...
posted by daHIFI at 9:23 AM on June 2, 2010

Ho-hum, Looks like another excuse for the rebirth of figurative painting.

Is the media still the message these days?
posted by vhsiv at 9:29 AM on June 2, 2010

s/media/medium
posted by vhsiv at 9:34 AM on June 2, 2010

The ambient-avant-whatever group Tribes of Neurot (side-project of psych-sludge-metal masters Neurosis) did a couple of recording releases using a similar setup to the "I'm Sitting In A Room" methodology. 'A Resonant Sun' was a release of the Neurosis album 'A Sun that Never Sets' with each track being more and more re-dubbed using Lucier's technique. The tracks in the beginning sound just like the original album, and they progressively become distilled down to these pure tonal elements as the iterations increase.

Their 'Tape Decay' series had them taking analog tape recordings and playing them continuously for a full year, while making interim dubs each season to see how the wearing of the players eroded the tapes themselves. Not quite the same, but another interesting interpretation of how forms of erosion can affect recordings.
posted by FatherDagon at 9:56 AM on June 2, 2010 [2 favorites]

In the early Eighties I used to make art by applying the Photocopy effect with real photocopy machines! See, the early color machines weren't all that great, so the X-Acto knife edges of my collages would disappear by a second generation copy. The fourth copy was about right. More than that and it would get a little too artsy.

These days, of course, the photocopy machines are too good to use for this purpose.

As far as the video "I Am Sitting in a Room," it is interesting, but doesn't do anything for me on an aesthetic/emotional level.
posted by kozad at 10:03 AM on June 2, 2010

Well, sure, but you just know that the team on CSI could take iteration #1000, run it through their desktop computers, and before the next commercial break comes along they'd have a 1080p version of the original that would allow them to read the titles on all the books on the shelves, analyze the reflections of the light in the room to pinpoint the exact geographical location where the video was recorded, and enhance the background noise to pull out an incriminating discussion of where the body should be buried.
posted by djwudi at 10:08 AM on June 2, 2010 [3 favorites]

reminds me of the disintegration loop

Basically a guy recording a piece of tape-recording music 'dying' as it was continually transferred from tape to digital. Added to it is the supposed story that he lived miles from ground zero, and played the tapes as the towers collapsed (which is a good fiction, if not necessarily true).
posted by codacorolla at 10:12 AM on June 2, 2010

I would suggest using a translation tool, changing the message into another language and back to the original, once for each layer of CC.

Something like this I imagine...
posted by TwoWordReview at 11:35 AM on June 2, 2010

Also I think this is pretty awesome!
posted by TwoWordReview at 11:36 AM on June 2, 2010

but you just know that the team on CSI could take iteration #1000, run it through their desktop computers, and before the next commercial break comes along they'd have a 1080p version of the original

put your fingers on the keyboard. now move them over one space and type something...;olr yjod.
search for it and google will find it, whatever it is...

my roommate was showing me how google goggles works...we took a picture of a barcode, but i moved the thing and he moved the camera, and the lighting was bad, and it was just a blurry mess. google found it.

your camera could probably still do face recognition on #1000...speech to text on your phone could probably tell you what he is saying.

this is how they see us. this is all they need to hunt.
posted by sexyrobot at 11:56 AM on June 2, 2010 [5 favorites]

I just did a couple of quick dirty Lucier experiments with my iPhone—iterating through my guitar amp in the basement, iterating through my stereo in the living room—and found what I expected to pretty much, which is that without doing some really careful calibration with a pretty dead room and a transparent speaker and a really good mic, you're down to shrill harmonics inside of 6-8 generations. It'd be interesting to try and do a more careful job of it some time.
posted by cortex at 12:00 PM on June 2, 2010 [1 favorite]

cortex: "without doing some really careful calibration with a pretty dead room and a transparent speaker and a really good mic, you're down to shrill harmonics inside of 6-8 generations"

I can't prove any of this, but here is my speculation:

I think the problem may be that your mic and your amp are too good. Old '70s era equipment had poor high frequency response, which is exactly what you want to get a result that is not shrill.

The shrill sounds are not a technical error any more than Lucier's original mellow hums - just as in Lucier's case, they are sounds that you recorded, iteratively convoluted with the resonance of the space and the response curve of your equipment. You don't like the sound, but that is not because the technology is malfunctioning.

Put the playback on each iteration through a lowpass with a gentle curving slope going very slowly downward from 0 dB, getting steeper after the -3 dB point at about 3 kHz or so, and continuing to curve lower all the way down to a zero at 10 kHz, and I speculate that will give a result much like Lucier got if you have a semi-decent mic, sound card, and speakers (I just pulled those numbers out of my ass, but they should be a decent starting point for finding a filter that would duplicate his results). Also mic the speaker in a way that best gets a good warm and clear low-mid.
posted by idiopath at 12:24 PM on June 2, 2010
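
A rough sketch of the "lowpass on every iteration" idea, with the room, mic, and speaker left out. It swaps in a plain one-pole lowpass rather than the curve idiopath describes (there is no zero at 10 kHz here), and the sample rate, test tones, and generation count are assumptions.

```python
import math

RATE = 44100      # samples per second (assumed)
CUTOFF = 3000.0   # approximate -3 dB point in Hz, idiopath's starting guess

def lowpass(signal, cutoff=CUTOFF, rate=RATE):
    """One-pole lowpass: each output moves a fixed fraction toward the input."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff / rate)
    out, state = [], 0.0
    for sample in signal:
        state += alpha * (sample - state)
        out.append(state)
    return out

def amplitude(signal, freq, rate=RATE):
    """Amplitude of one frequency, measured on the second half of the signal
    so the filter's start-up transient is excluded."""
    tail = signal[len(signal) // 2:]
    re = sum(v * math.cos(2 * math.pi * freq * i / rate) for i, v in enumerate(tail))
    im = sum(v * math.sin(2 * math.pi * freq * i / rate) for i, v in enumerate(tail))
    return 2.0 * math.hypot(re, im) / len(tail)

# Test tone: a 200 Hz "low-mid" plus an equally loud 8 kHz "shrill" partial.
signal = [math.sin(2 * math.pi * 200 * n / RATE) + math.sin(2 * math.pi * 8000 * n / RATE)
          for n in range(4410)]

history = [(amplitude(signal, 200), amplitude(signal, 8000))]
for generation in range(8):
    signal = lowpass(signal)   # one playback/record pass through the filter
    history.append((amplitude(signal, 200), amplitude(signal, 8000)))

for gen, (low, high) in enumerate(history):
    print(f"generation {gen}: 200 Hz = {low:.3f}, 8 kHz = {high:.5f}")
```

Because the same filter is applied every generation, its attenuation compounds: the 8 kHz partial shrinks by the same factor on each pass while the 200 Hz tone is barely touched, which is the opposite bias to the one that produces shrill harmonics.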

Oh, I don't mean to imply that the shrillness is a technical error per se, yeah. Just that it reflects a significant amount of frequency-response bias in the stuff I'm using, and that bias, without (as you suggest) some amount of adjustment to control the high end, leads to a quick reinforcement of those specific frequencies and everything else getting severely attenuated.

I'd put most of the blame for that in particular on the iPhone mic; I may try recreating the whole thing with my condenser instead and see if I can control the levels and the EQ a bit better. But these runs were very breakneck things. Mostly I'd like to have the resonance I capture be as much about the room and as little about the most biased piece of recording/playback equipment involved as I can.
posted by cortex at 12:34 PM on June 2, 2010

How strange that the audio degrades so much faster than the video. By 100 iterations, the audio is approaching unintelligibility, but at 500 iterations, the video is still clearly video of a man speaking to the camera. Maybe that's just a reflection of YouTube's encoding priorities.
posted by Western Infidels at 12:36 PM on June 2, 2010

You know what'd be kind of neat as a feedback experiment? Put a pan of liquid on top of a speaker, aim a video camera at the liquid, and play some audio while recording the image of the surface of the liquid over time. Use something opaque like milk, and throw a direct light on it at an oblique angle so you get significant light/dark patterning from the local crests and troughs in the liquid when the speaker vibrates it.

Then use that pattern in the recorded video to calculate some sort of dynamic EQ filter, and apply that to a second pass of the audio, and record the surface of the liquid in response to that. Repeat. Basically creating a two-step physical feedback loop, try and find source sounds that lead to interesting stasis points (or, even better, interesting points of instability).

It'd be really hard to reproduce any given result, of course. Whether that's a pro or a con is I guess a philosophical question. You could instead do something with, instead of a plate of liquid, some frequency-visualization process that chucks out a reliable image for any given input, but that seems a little less fun.
posted by cortex at 12:40 PM on June 2, 2010 [1 favorite]

The people at those video sites expend a whole bunch of CPU time to go out of the way to degrade the quality

Er, that's not really accurate. That _can_ be a side-effect, but it's not quite what we're going for. We transcode all uploads to YouTube for a couple reasons:

* To standardize resolutions and bitrates --- we have a set number of formats we transcode to (360p, 480p, 720p, 1080p, Source, etc) to make selection/organization easier. Most videos are close to and/or exactly one of these resolutions anyway, but we also standardize target bitrate for caching purposes.

* To standardize video formats. You can upload in pretty much any video format, but we only serve in h264 and VP8. So if it's not one of those, obviously we transcode. If it is, we still do, for reasons in point (1).

However, if you upload something >1080p we also present it in "Source" resolution (original). We want to preserve quality as much as possible, while still fulfilling the 2 points listed above.

But it's quite accurate that because we always transcode, there is loss in this process and you get the effects seen here.
posted by wildcrdj at 1:25 PM on June 2, 2010 [3 favorites]

cortex- you might use the video of the liquid to generate audio with this program.
posted by a snickering nuthatch at 1:43 PM on June 2, 2010

Someone did this audio-to-video-to-audio feedback loop at MUTEK Montreal in 2005. I'll see if I can dig up pics from my Flickr set.
posted by Blazecock Pileon at 2:39 PM on June 2, 2010

Actually, it was 2004, when Pure/Dekam did a set before Ilpo Väisänen.
posted by Blazecock Pileon at 2:56 PM on June 2, 2010

Or you could just drop acid and watch a Max Headroom marathon.

Yeah you should definitely do that, but you'd have a lot of time left over afterwards.

THERE WAS NO SECOND SEASON I CAN'T HEAR YOU LA LA LA
posted by fuq at 3:06 PM on June 2, 2010

I did a pure analog audio / video thing just pumping the line out of my amp into a used black and white security monitor from a convenience store and then connecting a photoresistor pointed at the monitor back into the amp (with a cheap distortion pedal, exploiting the minor voltage offset at the input and extreme voltage gain to make the changing resistance of the photoresistor become a changing voltage at the line in). Big flashy blobs of light on the screen followed or ran away from the sensor, and it produced the sorts of electronic sounds I usually work with.
posted by idiopath at 3:23 PM on June 2, 2010

Fun stuff. I love making feedback music. If you set it up right you'll always be surprised at what can come out.
posted by Blazecock Pileon at 4:07 PM on June 2, 2010

Hey bud, i see this on my DirecTV when it's cloudy. Could'a saved you some time.
posted by thisisdrew at 6:11 PM on June 2, 2010

Absolutely enjoyed it. Didn't expect audio to deteriorate so much faster than the video, but then YouTube is more a video sharing site.
posted by vidur at 6:54 PM on June 2, 2010

vidur: "Didn't expect audio to deteriorate so much faster than the video, but then YouTube is more a video sharing site."

Another factor here is I think that humans are much better at interpreting images than sounds. I have been doing experiments lately where I do analogous manipulations to audio and video sources, and the audio definitely leaves the realm of intelligibility faster (example -- warning: flashing lights).
posted by idiopath at 7:08 PM on June 2, 2010

I would like to see a 'fade' from video 1 through to video 1000 as the video plays. I think that would tell me more about how the degradation feels.
posted by niccolo at 7:40 PM on June 2, 2010

Why would you actually upload and download the video again and again? Why not just re-encode it over and over again on your own machine? Uploading over the network is lossless, so you wouldn't lose anything extra. You could have it done by a batch file if you wanted.
posted by delmoi at 7:04 PM on June 3, 2010
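
A sketch of what such a local loop might look like, assuming `ffmpeg` with libx264 and AAC support is installed. The codec settings and the `gen_NNNN.mp4` file names are hypothetical and make no attempt to match whatever YouTube's converter was doing, so the results would likely differ from the videos in the post; per zsazsa's comment above, identical settings on every pass may also degrade more slowly.

```python
import subprocess

def reencode_command(src, dst, crf=30, audio_bitrate="64k"):
    """Build an ffmpeg command that lossily re-encodes src into dst.
    The quality settings are illustrative guesses, not YouTube's."""
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "libx264", "-crf", str(crf),
            "-c:a", "aac", "-b:a", audio_bitrate,
            dst]

def run_generations(source, count, run=subprocess.run):
    """Re-encode `count` times, each pass reading the previous pass's output.
    `run` can be replaced with a stand-in to inspect the commands without ffmpeg."""
    current = source
    for gen in range(1, count + 1):
        target = f"gen_{gen:04d}.mp4"
        run(reencode_command(current, target), check=True)
        current = target
    return current

# Example (requires ffmpeg and an input file; both are assumptions):
# final = run_generations("original.mp4", 1000)
```

Keeping every intermediate file makes it easy to compare, say, generation 30 with generation 750 afterwards, at the cost of disk space.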

This thread has been archived and is closed to new comments

MetaFilter
