I want to ravish you like a Cacatua sulphurea
June 23, 2010 2:01 PM   Subscribe

Nine Inch Niles - The Seattleward Spiral -- MeFi's Own™ cortex recreates NIN's The Downward Spiral album using only soundclips from TV's Frasier. Includes bonus video. [via mefi projects]
posted by not_on_display (87 comments total) 43 users marked this as a favorite
 
What the hell is wrong with you?
posted by shakespeherian at 2:05 PM on June 23, 2010 [36 favorites]


Les Freres Horreur
posted by Babblesort at 2:08 PM on June 23, 2010


I'm seizing here. Someone put a pencil in my mouth.
posted by GuyZero at 2:10 PM on June 23, 2010


Can someone give me some background on afromb.py?

A quick googling doesn't turn up the sort of summary/retrospective I'm looking for. Maybe it's too new? Anyways, when the source file is more traditionally musical (maybe a chromatic scale, even) does the output turn out more... um... listenable?

I mean, I listened to a couple of the tracks and they're interesting. But it seems like the real story is the computer-generated music angle.
posted by 256 at 2:10 PM on June 23, 2010 [1 favorite]


Okay, considering the circumstances I can actually ask this question: how much of the show do you have to watch to land every one of these samples? I mean, I've seen Newsradio more times than a man should and I would never be able to put something together using audio samples from it. Is it a specific kind of musical/aural memory? Is it just very close attention? This kind of stuff has confounded me for years.
posted by griphus at 2:10 PM on June 23, 2010


this is fucking nuts
posted by DZack at 2:11 PM on June 23, 2010


Post-posting: The reconstitution was done using “afromb.py”, a python script that takes two files, the target track a that you’re trying to recreate, and the source track b that provides the audio content used for that recreation, and slices both up into very small pieces, assembling the bits of track b one at a time according to whichever piece best matches the current bit of track a.

I still want an answer about these sorts of things that don't use that.
posted by griphus at 2:12 PM on June 23, 2010


This is blinding genius.
posted by Adam_S at 2:12 PM on June 23, 2010


cortex, you are one scary motherfucker.
posted by Justinian at 2:12 PM on June 23, 2010 [1 favorite]


i have found my Christmas gift for all of my friends.
posted by Seamus at 2:13 PM on June 23, 2010 [1 favorite]


how much of the show do you have to watch to land every one of these samples?

Probably none. Being handy with either Amazon and a DVD ripper or a Torrent program is probably more important than hours viewed of the show.

Can someone give me some background on afromb.py?

Yeah, did you write this, or did it come from somewhere else?

Really interesting either way.
posted by weston at 2:14 PM on June 23, 2010


Niles Crane taken over by the ghost of Antonin Artaud ..... now that clip would be a blast!
posted by blucevalo at 2:14 PM on June 23, 2010 [1 favorite]


how much of the show do you have to watch to land every one of these samples?

According to the blog entry, it's mostly an automated process.

Can someone give me some background on afromb.py?

It looks like a Python script using this API. It's a bit over my head, but it looks like echo-nest-remix could make a good post in its own right.
posted by lekvar at 2:15 PM on June 23, 2010


I just clicked on a half-dozen of those tracks and I seriously found them completely unlistenable. Seriously, if this were done by anyone else but Cortex, I doubt we'd pay any attention to it.
posted by crunchland at 2:17 PM on June 23, 2010 [12 favorites]


Geez, didn't any of you people listen to the last podcast?
posted by Rhomboid at 2:17 PM on June 23, 2010


Brilliant. I want this software.
posted by xod at 2:19 PM on June 23, 2010


256: "Can someone give me some background on afromb.py?"

It's part of that project that did those things with the audio. Gave songs shuffle. You know.

ECHO NEST

ECHO NEST
posted by boo_radley at 2:19 PM on June 23, 2010 [1 favorite]


None of them sounded like music. I'm probably missing something here.
posted by Malice at 2:19 PM on June 23, 2010 [2 favorites]


IT IS WRONG.
posted by everichon at 2:20 PM on June 23, 2010


If you used an infinitely large sample set, would you be able to distinguish the final product from the original? Would it be the same thing or a derivative product?
posted by Seamus at 2:22 PM on June 23, 2010


Well that is nifty conceptually, and even occasionally mechanically, but I think I might be with GuyZero. Can you have aural epilepsy? I have serious concerns, guys. Not since I heard jimi hendrix's 'all along the watchtower' being played backwards on top of dylan's have I faced such cognitive dissonance.

On a related note, what would happen if you did it backwards?
posted by LD Feral at 2:24 PM on June 23, 2010


Seriously, if this were done by anyone else but Cortex, I doubt we'd pay any attention to it. posted by crunchland

I know I would. Did you watch any of the videos?
posted by xod at 2:24 PM on June 23, 2010


This is my brain. .....;......;.......;

This is my brain on cortex's dub. .:;!;;!!1;;.;.;1!1.1!;.
posted by cavalier at 2:26 PM on June 23, 2010


ALSO CAN YOU PLS RUN THE FAME MONSTER AGAINST EPISODES OF THE WIRE K THX
posted by everichon at 2:26 PM on June 23, 2010 [5 favorites]


I'll admit, the video was slightly more interesting than the audio, but I expect that had to do with the pretty flashing colors more than anything else.
posted by crunchland at 2:26 PM on June 23, 2010


The reconstitution was done using “afromb.py”, a python script that takes two files, the target track a that you’re trying to recreate, and the source track b that provides the audio content used for that recreation, and slices both up into very small pieces, assembling the bits of track b one at a time according to whichever piece best matches the current bit of track a.

I'm assuming you could slice them up into smaller pieces than the ones being used here? Theoretically, if the slices corresponded to the sample rate you could reproduce the source music nearly perfectly. For the sake of art, you'd want slices a little bigger than that, just to introduce a little weirdness into the mix. But cortex's slices are so big that it doesn't sound like music at all.

Am I right? I don't know much about this stuff and might just be talking out of my butt.
posted by The Winsome Parker Lewis at 2:26 PM on June 23, 2010 [1 favorite]


I would have liked to hear a project which had actual human intervention, perhaps more of a recreation of the progress of the album using actual sound samples rather than using software to create sound pulses which have some resemblance to the rhythms of the music without any actual musicality.

I mean, I *heart* cortex, but some of his music experiments leave me cold.
posted by hippybear at 2:26 PM on June 23, 2010


I think it's great that echo nest is putting a sheen of respectableness on youtube poops.
posted by boo_radley at 2:27 PM on June 23, 2010


I love this more than is healthy.
posted by spinifex23 at 2:29 PM on June 23, 2010


Another example.
posted by boo_radley at 2:30 PM on June 23, 2010


Is it wrong to seriously rather enjoy some of these? The Becraning is just on the edge of totally awesome. Or I need help.
posted by opsin at 2:31 PM on June 23, 2010


Coming Soon: "30 Indie Rock", "How I Met Your Ministry", "Two and a Daft Punk", "Law & Order: Weird Al Victims Unit", "Battlestar Gaga" and the inevitable House meets "House"...
posted by oneswellfoop at 2:34 PM on June 23, 2010 [16 favorites]


If you used an infinitely large sample set, would you be able to distinguish the final product from the original? Would it be the same thing or a derivative product?

That's an interesting question.
posted by lekvar at 2:36 PM on June 23, 2010 [2 favorites]


"30 Indie Rock"

This is actually the name of the next Girl Talk album.
posted by griphus at 2:45 PM on June 23, 2010


I don't think I could even identify Closer as the source track for Closer To Roz. This would be much more interesting, I think, if you split the original studio tracks down to 3 or 4 sub-mixes (e.g. vocals, drums, and background instruments), ran afromb.py on them, then re-combined the results.
posted by 0xFCAF at 3:03 PM on June 23, 2010


Hi!

how much of the show do you have to watch to land every one of these samples?

The source audio for the album is two five-minute clips from a couple of separate Frasier episodes, plus a clip of the theme song itself. I made three passes against each original song, using each of those three clips as source data (the b in the afromb process with the original song as the a track) and mixed them in a bonehead stereo mix at center, leftish, and rightish.
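
The mixing step, in rough sketch form (not the actual script; numpy stands in for the three rendered runs, and the pan positions are just for illustration):

    import numpy as np

    def pan(mono, pos):
        # Equal-power pan: pos runs from -1 (hard left) to +1 (hard right).
        theta = (pos + 1) * np.pi / 4
        return np.stack([mono * np.cos(theta), mono * np.sin(theta)], axis=1)

    # Stand-ins for the three mono afromb runs (one per source clip).
    run_center = np.zeros(44100)
    run_left = np.zeros(44100)
    run_right = np.zeros(44100)

    # Bonehead stereo: center, leftish, rightish.
    stereo = pan(run_center, 0.0) + pan(run_left, -0.4) + pan(run_right, 0.4)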

As noted, there was essentially no human intervention involved in the sample assignments themselves, which is fun from the perspective of keeping the idea pure but also means that there's no opportunity to nudge it toward musicality.

Yeah, did you write this, or did it come from somewhere else?

I did not write the afromb.py script, I'm just a happy experimenter. I became aware of Echo Nest Remix via Music Machinery, which I became aware of last year from this mefi post about click track detection. Great blog, lots of wonderful musical hackery going on. I was thrilled when they picked Seattleward up.

It got mentioned again in this recent post about swinger.py, which led to me finally really fiddling with Remix, and that and a couple other random chances led to this project.

It looks like a Python script using this API. It's a bit over my head, but it looks like echo-nest-remix could make a good post in its own right.

The Echo Nest Remix API is really great fun to play with, and the good news is that you don't really have to know all that much to use it. They've got installers for major OSes; that, an API key that'll take you thirty seconds to get, and a couple minutes getting acquainted with where stuff is, and you can start churning out your own weird experiments.
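
To give a sense of scale, a first experiment is only a few lines. This is from memory in the spirit of the bundled one.py example (which keeps just the first beat of every bar), so treat the exact module layout as approximate:

    import echonest.audio as audio

    # Analysis round-trips through the Echo Nest web service.
    audiofile = audio.LocalAudioFile("song.mp3")

    # Keep only the first beat of every bar.
    collect = audio.AudioQuantumList()
    for bar in audiofile.analysis.bars:
        collect.append(bar.children()[0])

    # Stitch the kept pieces together and write a new mp3.
    out = audio.getpieces(audiofile, collect)
    out.encode("song_one_beat.mp3")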

I made a whole bunch of other afromb.py experiments back in that other thread, many of which are frankly more listenable (if less ambitious or, uh, high concept) than the average track on this experiment.

Did you watch any of the videos?

Videos are more approachable in general, I think, if only because video is engaging (to the extent that it's engaging, mileage obviously varies significantly) on multiple sensory fronts. I think we're better at stringing a narrative out of crazy fractured images than out of just crazy fractured sounds, in any case. Our capacity to try and make sense even of the fairly explicitly senseless kind of delights me, and that's one of the main reasons I enjoy doing these experiments even if the results aren't always exactly musical in a strict sense.

Other video experiments I've done so far include The Beautiful Muppets and O Closerman. With those, as with the Closer to Roz video, I kept some of the original song in the mix too to help give it a feeling of coherence. It's "cheating" in a purist sense but does make for a more watchable experience.

I would have liked to hear a project which had actual human intervention, perhaps more of a recreation of the progress of the album using actual sound samples rather than using software to create sound pulses which have some resemblance to the rhythms of the music without any actual musicality.

Yeah, absolutely. I have about a dozen little ideas for how to improve the musicality of this basic sort of project, but at the moment it's trivial for me to just play with the existing afromb.py (and vafromb.py for the video stuff), and less-than-trivial to (a) get up to speed on Python and (b) start trying to actually code up those ideas. I hope to get to it; I'm finding this stuff frankly pretty exciting, but I've got a bit of a learning curve on the Python front and need to get a better understanding of how the guts of Remix's beat/slice manipulation stuff works at the low levels.
posted by cortex at 3:08 PM on June 23, 2010 [9 favorites]


It sounds like tossed salad and scrambled eggs.
posted by infinitywaltz at 3:11 PM on June 23, 2010 [3 favorites]


Along the same lines, Akufen uses random microsamples recorded off the radio to build up house tracks. For instance, Deck the House.
posted by smackfu at 3:12 PM on June 23, 2010


I don't think I could even identify Closer as the source track for Closer To Roz. This would be much more interesting, I think, if you split the original studio tracks down to 3 or 4 sub-mixes (e.g. vocals, drums, and background instruments), ran afromb.py on them, then re-combined the results.

Absolutely. I don't have stems of anything but my own music sitting around, but I'd like to give exactly that idea a shot—try to rebuild a song track by track and get something more like a full mix out of it at the end, which would almost certainly be more musical in general than a straight up (or as in this case multiplex but still full-mix each time) afromb run.

I'm considering doing a poor man's version of this just by doing a really aggressive series of like EQ stripes on an original track and running afromb against each stripe to try and rebuild it in chunks of frequency range, but I'm not really expecting that to work out all that great.
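
The stripe idea in sketch form (band edges arbitrary; scipy stands in for whatever EQ you'd actually use, and each stripe would get its own afromb run):

    import numpy as np
    from scipy.signal import butter, sosfilt

    BANDS = [(40, 200), (200, 800), (800, 3200), (3200, 12000)]  # Hz

    def eq_stripes(samples, sr):
        stripes = []
        for lo, hi in BANDS:
            sos = butter(4, [lo / (0.5 * sr), hi / (0.5 * sr)],
                         btype="bandpass", output="sos")
            stripes.append(sosfilt(sos, samples))
        return stripes  # summing the stripes roughly reconstructs the input

    stripes = eq_stripes(np.random.randn(44100), 44100)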

afromb is awesome but pretty limited at the same time; there's a lot that could be done to extend and improve the idea to make the musicality of the output and the flexibility of the processing better.
posted by cortex at 3:13 PM on June 23, 2010


This puts a real Dido in my Aeneas.
posted by Smedleyman at 3:14 PM on June 23, 2010 [2 favorites]


I don't have stems of anything but my own music sitting around, but I'd like to give exactly that idea a shot

I think that remix.nin.com (signup required) has a lot of stems from With Teeth, Year Zero, Ghosts, and The Slip available for download and manipulation. I also came across a website with stems from the Byrne/Eno "My Life In The Bush Of Ghosts", and I know there are some other sources for stems out there. I bet if you hunt around you can find good things to play with. More and more artists are realizing that crowdsourcing the remix thing is a good way to go.
posted by hippybear at 3:39 PM on June 23, 2010


If Richard D. James and Merzbow were ever to bukkake each other, I do believe this would be the result. It's repulsively beautiful!
posted by Cat Pie Hurts at 3:44 PM on June 23, 2010


I'd love to see how it handles a file separated into multitracks - I'm thinking there must be something good in the remix.nin.com files that you could play with, layer by layer.

Thinking this because it's apparent that in Closer to Roz, it's having difficulty where you have a strong vocal component and the bassline too, so it fights between them a bit.

Cortex: would you be prepared to share some of the code? I'm getting OK with Python, but would be interested and grateful to see how it all hangs together.
posted by BishopsLoveScifi at 3:46 PM on June 23, 2010


Doh, hit refresh before posting you fool.
posted by BishopsLoveScifi at 3:48 PM on June 23, 2010


If you used an infinitely large sample set, would you be able to distinguish the final product from the original? Would it be the same thing or a derivative product?

Well, sure, all you'll need is 65536 samples, each one one sample long, and you can recreate any music ever made. Or one sample of a sine wave, either way.

For a visual interpretation of recombining a song from its own samples, see paul's nickelback experiments (also made w/ EN remix)
posted by brianwhitman at 3:52 PM on June 23, 2010 [1 favorite]


If you're interested in weird sound sources, (Matthew) Herbert has an album of "bodily functions" (and that's the album name, too), Matmos sampled medical instruments interacting with people (and the plucked and bowed cage of our rat Felix on one track), and then there's Nymphomatriarch's self-titled album, sourced from Venetian Snares and Hecate's privately recorded debaucheries committed while on tour. For whatever reason, my mind got stuck in a "physical" sample loop, as there are many albums and tracks built from unusual sources.
posted by filthy light thief at 3:53 PM on June 23, 2010


If Richard D. James and Merzbow were ever to bukkake each other, I do believe this would be the result

Pretty sure that was the windowlicker video.
posted by mannequito at 3:58 PM on June 23, 2010


I'm definitely gonna listen to this when it's not so early in the morning. A perusal of the comments here would indicate that this is not early morning listening.

Note: I have never seen Frasier and had not heard of it til now. Will that diminish my appreciation for this, one wonders?
posted by flapjax at midnite at 3:59 PM on June 23, 2010


Wow, that made me hate Frasier even more than I already did.
posted by Jimmy Havok at 4:19 PM on June 23, 2010


I wish this had a warning that says you shouldn't listen to it while hung over!
posted by Bergamot at 4:42 PM on June 23, 2010 [1 favorite]


For what it's worth, there are lots of hip-hop songs that have both a cappellas and instrumentals readily available.
posted by box at 4:54 PM on June 23, 2010


Let's see...doesn't speak Urdu or Arabic, has no contacts, no clue where to start looking, and no real tactical idea on how to mount his attack.

*mathowie silently deletes his Ride the Moonlightning project*
posted by Evilspork at 5:03 PM on June 23, 2010


hip-hop songs!
posted by flapjax at midnite at 5:08 PM on June 23, 2010


Very odd stuff. Reminds me of a less melodic version of Pogo's remixes (Alice, UPular, etc.). It would definitely benefit from a more human touch.
posted by Rhaomi at 5:14 PM on June 23, 2010


I'm going to laugh for a while at the "Closer to Roz" song name.
posted by fleacircus at 5:17 PM on June 23, 2010 [1 favorite]


So if you fed it the same track on each end, would it merely give you the exact same thing? Or would it give something different? And if it's only a little different, what would happen if you fed that into itself and on and on?
posted by symbioid at 5:20 PM on June 23, 2010


If you use the same track for the a and the b in afromb.py, you get basically the same thing out, yeah, at least the couple times I tried. I may have gotten a couple small glitchy bits where it replaced a part of a song with a very similar different part of the same song, but it didn't produce anything interesting at least at a quick listen.

I like the idea of feeding in very similar things, though; this mix of studio and live Where Is My Mind takes is the closest I've come to exploring that territory, and it's kind of neat and weird in its own right, but I'd love to take like ten different live versions of the same song by the same band and mush them together like this, for example.
posted by cortex at 5:50 PM on June 23, 2010


In some of the examples you give of your other attempts, cortex, I'm hearing the steady background of the template song, with bits from the recycled material merely accenting it here and there -- as opposed to the songs you did for Seattleward Spiral. Is this intentional? I DON'T LIKE THAT!

What I did like was how the new material did (or usually didn't) sound like the template, and I appreciated not having the template as reference. So, for instance, I would have liked to have heard exclusively which parts of O Superman it used in building song x from it, without song x in the background to clue me in. Like it was on the original Seattleward Spiral, man! You sold out, dumbing it down for the masses, maaaaan.

Request: Can you take, say, a piece of classical music as the recycled material and sync it up over a bluegrass template, and then remove the bluegrass template? KTHX
posted by not_on_display at 6:35 PM on June 23, 2010


this makes me want to tear my eyes out, shove them in my ears then set myself ablaze after covering myself in crude from the gulf. dude should have his NIN fan card revoked.
posted by spish at 6:45 PM on June 23, 2010


amazing concept. the execution, on the other hand, is maybe not so great. i was happy to see that this was mostly an automated process given the generally unrecognizable/unlistenable result.

I think the same process would produce amazingly awesome results if it was run on individual tracks rather than the final mix.

but still, A++++ for concept.
posted by mexican at 6:48 PM on June 23, 2010


I'd love to take like ten different live versions of the same song by the same band

Might be a neat thing to do with cover versions, too.
posted by box at 6:54 PM on June 23, 2010


I just want to say that The Beautiful Muppets video that cortex linked to above is itself a thing of beauty.
posted by oozy rat in a sanitary zoo at 6:55 PM on June 23, 2010


Cortex, when I saw this headline, I assumed you had chosen NiN for this crazy experiment expressly because they've released the separate un-mixed audio channels for a bunch of their songs and that you'd used that stuff somehow.
posted by straight at 7:00 PM on June 23, 2010


Not to be all rulesy and stuff, but isn't this dangerously close to breaking the guidelines re: self-linking?

WHITHER BANHAMMER?
posted by Sys Rq at 7:04 PM on June 23, 2010


The video reminded me of this fan video of a very good Coil track. It's built out of Fire Walk With Me video clips, and has its own sort of weird, hypnotic narrative to it.
posted by Joakim Ziegler at 7:04 PM on June 23, 2010


I don't know if it's the song reprogramming my brain or what, but Closer to Roz starts to really rock toward the end.
posted by cmoj at 7:07 PM on June 23, 2010


Not to be all rulesy and stuff, but isn't this dangerously close to breaking the guidelines re: self-linking?

No, the whole point of Projects is self-linking, and if another completely separate MeFite finds something on Projects they think is worth posting to the Blue, there it goes. (Why do you think each entry on Projects has a convenient "post this to mefi" button?)
posted by DevilsAdvocate at 7:27 PM on June 23, 2010


In some of the examples you give of your other attempts, cortex, I'm hearing the steady background of the template song, with bits from the recycled material merely accenting it here and there -- as opposed to the songs you did for Seattleward Spiral. Is this intentional? I DON'T LIKE THAT!

It is intentional, yeah—I was playing around with all sorts of variations, and a lot of those experiments were intended to be more listenable at the cost of being less pure an execution of the idea. Seattleward goes in the other direction, obviously.

So, for instance, I would have liked to have heard exclusively which parts of O Superman it used in building song x from it, without song x in the background to clue me in.

And, again, the good news is that this is not hard to do on your own; installing Remix is really simple if you're on a system that has Python installed already (which means OSX and likely any modern non-weird Linux install), and setting it up on Windows is reportedly not hard either, though I haven't tried yet. Running afromb.py is just a command line call of a script with a couple of audio filenames as arguments; it takes a few minutes (depending on your computer and a few little details), et voila!
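
If memory serves, the whole run is a single command, something like this (the trailing mix argument, which blends some of the original track back into the output, may vary by version):

    python afromb.py original.mp3 source.mp3 rebuilt.mp3 0.5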

Request: Can you take, say, a piece of classical music as the recycled material and sync it up over a bluegrass template, and then remove the bluegrass template? KTHX

I'll give it a shot, though I've tried throwing some Brandenburg at a couple things previously and wasn't thrilled with the results. afromb seems to do a better job of recreating a rhythmic signature than it does a melodic one.

I think the same process would produce amazingly awesome results if it was run on individual tracks rather than the final mix.

Agreed. This is more interesting as a proof of concept than as listenable music, and I think a per-track go of this sort of thing would be far more interesting in general. I'll definitely give something a go along those lines.

The two biggest problems with more elaborate runs:

1. There's no shiny wrapper for this stuff, which means if you want to do a complicated multi-run job you either have to invoke each call manually or write your own wrapper script to manage the m*n afromb.py calls to do everything. I ended up writing up a little bash script for Seattleward to take the drudgery out of it.

2. Each run takes a few minutes; for a random one-off experiment that's fine, it's close to real-time if you're trying one thing after another and listening to experiment n while computing experiment n+1. For something that involves a lot of runs and assembly afterward, it's a bit more of a haul to get results; five minutes turns into an hour to find out if the idea worked.

Neither of these is a big problem, though. They just limit my enthusiasm for doing elaborate experiments. (1) seems pretty solvable even if only in a braindead purpose-specific way (and I'd love love love for someone to solve it more thoroughly than that), and (2) is just a matter of impatience.

I've been considering writing up something that will just fire off random afromb jobs while I sleep, grabbing two songs from my music library and throwing them at each other, do that twenty times in the middle of the night so that I can blind-test the results in the morning and look for interesting happy accidents. That'd be an even better idea for ambitious stuff like a per-track multi-track thing like we're discussing, though it'd require more setup too or a very clever script that could automagically raid a Garageband file for individual track bounces or something.
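
The overnight batcher would only be a few lines itself; a sketch (the library path is hypothetical, and the afromb.py argument list is from memory):

    import random
    import subprocess
    from pathlib import Path

    songs = list(Path("~/Music").expanduser().rglob("*.mp3"))

    for n in range(20):
        a, b = random.sample(songs, 2)
        # Same invocation you'd type by hand; 0.5 is the mix amount.
        subprocess.call(["python", "afromb.py", str(a), str(b),
                         "overnight_%02d.mp3" % n, "0.5"])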

I assumed you had chosen NiN for this crazy experiment expressly because they've released the separate un-mixed audio channels for a bunch of their songs and that you'd used that stuff somehow

No, I had no idea! It's pretty awesome, would be a good place to start. I actually chose NiN for this because I made a stupid "Nine Inch Niles" joke a day earlier and then realized that Remix would let me totally (if rather ham-handedly) accomplish said joke.
posted by cortex at 7:46 PM on June 23, 2010


No, the whole point of Projects is self-linking, and if another completely separate MeFite finds something on Projects they think is worth posting to the Blue, there it goes.

Yeah, I know. Doy. But that all depends upon one's definition of "completely separate." The BLT is awfully light on the Kevin, is all I'm saying.

(Also, I was being allcaps bold with an exclamation point facetious. I guess I could've underlined and blunk it, though.)
posted by Sys Rq at 7:58 PM on June 23, 2010


Do you see what happens when you don't give the mods enough to do in MeTa?

Call out random people, hell, call out me, and we can prevent this kind of thing!

This is cool as shit, Josh. I've no idea how you did it or had the time, but whatever, cool, brother.
posted by Ufez Jones at 8:07 PM on June 23, 2010 [1 favorite]


The BLT is awfully light on the Kevin

Roger that, Jimmy Dino. We will send it to the Kitchen for some work, I repeat, we will send it to the Kitchen for some work.
posted by weston at 8:23 PM on June 23, 2010


.
posted by blaneyphoto at 8:28 PM on June 23, 2010


Ten - Twenty-Four. The Foot has Loosened.
posted by Sys Rq at 8:31 PM on June 23, 2010


I wonder how the window size affects it. For example, wouldn't you be better off doing some kind of beat analysis first, and slicing the song according to the beats? I'd also expect different results depending on how you match the samples - if you weight correlation at the edge of the windows higher, for example.

Anyway, has anyone done this with porn yet? If not, I'll have to have a go.
posted by heathkit at 9:05 PM on June 23, 2010


Huh. It's like automated Plunderphonics? Awesome.
posted by unknowncommand at 11:00 PM on June 23, 2010


Better link, eh.
posted by unknowncommand at 11:01 PM on June 23, 2010


I wonder how the window size affects it. For example, wouldn't you be better off doing some kind of beat analysis first, and slicing the song according to the beats? I'd also expect different results depending on how you match the samples - if you weight correlation at the edge of the windows higher, for example.

These are both good questions worth exploring. I should be clear that beat detection is part of what Remix does as part of the afromb process—given an audio file, the API can be told to essentially go in and find the beats and I guess create an index of some sort for where they are in a song. That accomplished, afromb then apparently slices up the "a" song by beat/sub-beat, and then slices up the b song as well, and uses those respective latter slices to match the former slices.

My understanding of this is only gestural at this point; I really need to dig into the guts of how it works, but my impression is that part of what afromb could do better is to compensate for differences between the lengths of the a slices and the b slices—if the average slice of b is shorter than the average slice of a, for example, you're likely ending up with a reconstituted song that has a lot of little slices of dead air from when you get a short b slice filling the space of a longer original a slice. Sort of a stutter effect, which in fact seems kind of common in the output of afromb at times.

Anyway, that's the big obstacle for me: not knowing precisely how the built-in beat detection works, not knowing how tweakable that is. If I can get to know it better on that front and figure out how to manipulate it more elegantly, that'll be a big step toward making afromb's output a bit more musical.

I'd also love to work in some sort of pitch-shifting facility in the splice-matching—as is, I've done a couple rough experiments involving intentionally matching (or manually pitch-shifting to match) key and overall harmonic content between the a and the b track and gotten slightly more musical results out of that, but if the pitch shifting were done on-demand by the process itself to get otherwise sonically congruent slices into the same harmonic space, that might be really excellent.

Lots to learn.

Anyway, has anyone done this with porn yet? If not, I'll have to have a go.

Man, it really needs to be done. I don't quite want to be the one to do it, but it's such a gimme—if you look at The Beautiful Muppets and swap out muppets for meat puppets the weird glitchy surreality of it is clear from the word go.
posted by cortex at 11:42 PM on June 23, 2010


I'm guessing that this thing would allow you to do analysis of tracks and then you could take that and sorta make a "music dna" of a song.

And from there utilize it like audiosurf does for a game to dynamically generate something?

I have this idea of a 2 player turn based strategy game where each player has a playlist that's their "deck"/spellbook, and the world is generated from the two playlists. Sound effects and music can be mashups of the two decks/spellbooks.

OK, someone, please get cracking on that! :P (no, really I'd do it, but I can barely get an asteroidy type game coded up let alone a strategy game)
posted by symbioid at 9:13 AM on June 24, 2010


CAT PLAYING PIANO synced to Ben Folds Five.
my ruling: edit out the BF5 please. next case
posted by not_on_display at 10:57 AM on June 24, 2010


Anyway, that's the big obstacle for me: not knowing precisely how the built-in beat detection works, not knowing how tweakable that is.

not tweakable at all (sadly ATM). remix passes all audio to a webservice that we host to do beat tracking, segmentation, section detection, chroma, key, etc etc. it is all completely un-tweakable: mp3 in, data out.

one thing you can do is play with the different levels of beat tracking: we do tatums (lowest level beat structure), then "beats," then bars. also segments, which are less beats and more "sound events," for example every note in a guitar solo is a different segment whether it falls on a beat or not.
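
in remix those levels are just attributes on the analysis object (sketch only, module layout approximate):

    import echonest.audio as audio

    track = audio.LocalAudioFile("song.mp3")
    tatums = track.analysis.tatums      # lowest-level pulse
    beats = track.analysis.beats        # the beats proper
    bars = track.analysis.bars          # groups of beats
    segments = track.analysis.segments  # sound events, on the beat or not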

and modifying afromb to do better at time/duration matching is definitely on our list, but like all things remix we are hoping others in the community take it up as we're kind of busy doing... things... at the moment
posted by brianwhitman at 7:08 PM on June 25, 2010 [1 favorite]


So I spent the morning putting this audio together (and the afternoon wasting far too much time wrestling with weird silent failures from vafromb.py trying to put together the otherwise trivial video version), and so, hey:

The Crane That Feeds.

This is an attempt to take the multitrack idea and run with it. I grabbed the Garageband session for The Hand That Feeds from remix.nin.com (sadly, no Downward Spiral sessions available at the moment at least) and went at it with the same Frasier audio I used for Seattleward. I think it came out a lot better in terms of overall musicality—it's got a beat, for one thing—but there's still a football field's distance between the remake and the original. You'd have to be a pretty serious NIN fan to recognize it without prompting.

More details on the blog post at that link. This was fun and promising, and gives me some other ideas for what to try next.

not tweakable at all (sadly ATM). remix passes all audio to a webservice that we host to do beat tracking, segmentation, section detection, chroma, key, etc etc. it is all completely un-tweakable: mp3 in, data out.

Yeah, I'm finally cracking a bit more into the guts. While it'd be nice to be able to tweak the analysis, now that I really get how Analyze works as a web service I can see why that's a no go. I'll base my experiments around what I know I can do instead.

One of the things I might toy with sooner than later is the distance matrix that afromb uses to calculate similarity; depending on what I want to pull off, it may make sense to value things differently than how afromb currently does (e.g. put more focus on pitch and timbre and less on loudness, and deal with the unevenness in volume some other way while hopefully getting a bit more of a melodic/harmonic match from the heuristic).
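
In sketch form the re-weighting is just this, with feature names as in the Analyze segment data and weights pulled out of thin air (zeroing the timbre and loudness weights would give pure pitch-only matching):

    import numpy as np

    W_PITCH, W_TIMBRE, W_LOUD = 2.0, 1.0, 0.25  # arbitrary weights

    def segment_distance(seg_a, seg_b):
        d_pitch = np.linalg.norm(np.subtract(seg_a.pitches, seg_b.pitches))
        d_timbre = np.linalg.norm(np.subtract(seg_a.timbre, seg_b.timbre))
        d_loud = abs(seg_a.loudness_max - seg_b.loudness_max)
        return W_PITCH * d_pitch + W_TIMBRE * d_timbre + W_LOUD * d_loud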

Fun stuff. Brain-eating.
posted by cortex at 4:15 PM on June 28, 2010 [1 favorite]


HFS That's even better. Thanks, cortex.
posted by not_on_display at 10:17 PM on June 28, 2010


That last one kind of reminded me of Get in the back of the van!.
posted by Rhomboid at 10:50 PM on June 28, 2010 [1 favorite]


Another little tidbit, from some experiments yesterday evening:

Tori's Diner

In which I use a modified afromb.py to repurpose Tori Amos' a cappella vox from "Me and a Gun" to rebuild Suzanne Vega's a cappella "Tom's Diner" and mix the old and the new together about evenly. Because they have not-dissimilar voices and overlapping vocal ranges and both songs are sung in fairly soft, low tones, the result is a very weird sort of echo/chorus effect.

You can hear Tori's bit falling into rapid repetitions throughout the song; this is the change I made to afromb, as an experiment in space filling. I think the change is interesting and works at least as a very rough solution to afromb's "stuttering" problem: in a typical afromb run, you end up with a lot of short bits of audio followed by short bits of silence, rather than the script creating a non-stop flow of replacement samples all cozied up next to each other. The result is a sometimes really jarring audio stutter feel where you'll sort of be constantly slamming between sound and silence, sometimes a few times a second.

And so instead of letting silence fill up the gap before the next replacement sample, I altered the code to have it repeat the same sample until it had at least filled that gap up. So instead of the normal

"sample1, [silence], [silence], [silence], sample2, [silence], sample3, [silence], [silence]...",

this modified afromb script produces

"sample1, sample1, sample1, sample1, sample2, sample2, sample3, sample3, sample3..."

Here's a short (?) version of what's going on: the Echo Nest analysis chops every song it runs through their Analyze process into variously sized chunks according to different metrics; the smallest of these chunks are called "segments", and they basically start whenever something "new" happens in the song—a new note, a drum strike, any sudden change. So segments tend to be tiny, but more importantly for this point they tend to be all sorts of different sizes (whereas most "beat"-sized chunks in a song tend to be about the same size).

So you chop up your a song, and you've got a bunch of segments of varying size. Some are tenth of a second long, some are a half a second long, that kind of significant variation. How afromb does its work is it takes the pile of segments for your a song, and also the pile of segments for the b song you're using to do the rebuilding, and it goes like this:

1. Grab the next segment of the a song.
2. Do some math to find the most similar (in pitch, timbre, and loudness—notably here, not in length) segment of the b song.
3. Append that b segment to the new song we're building, with some caveats.
4. Move on to the next segment of song a at (1).

The part we're interested in here is those caveats on point 3. There are three possibilities we need to look at:

I: original segment a and replacement segment b are exactly the same length. Awesome, we can replace a with b unmodified and keep the running length of the song the same.

II: original segment a is shorter than replacement segment b. afromb deals with this by trimming the end of segment b off so that it's exactly the same length as segment a. (This works well and is very simple, but it'd be interesting to see what'd happen if instead the whole of segment b were squeezed down in running time to be the same length as a. I'll try this at some point.)

III: original segment a is longer than segment b. The stock afromb script deals with this by appending a necessary amount of silent padding to the end of segment b so that it's now exactly the same length as segment a.

As a result of how case III is handled, a fair amount of a typically reconstituted song is actually silence—there's less audio in the replacement track than there was in the original, but it retains the same overall shape and timing because the padding keeps things synced on a segment-by-segment basis.

So my change is to the handling to case III: instead of appending silence to fill out the running time, I divide the duration of segment a by the duration of segment b to find out roughly how many times longer a is than b; and then I just round up (because this needs to be a whole number, not a decimal) and replace segment a with that many repetitions of b. Bam, the space is filled not by a sample plus silence but by the same sample repeated however many times it needs to—a b-plex, if you will.

There's a problem with how my little hack works, though: the running time of the repetitions-of-b will almost always be a hair longer than the running time of the original segment a, since the duration of a will very rarely be an exact multiple of the duration of b. As a result, the new version of the song that afromb creates will be a bit *longer* than the original and would get out of sync with it if played side by side. The nice thing here is that if you use the "mix" option on afromb it'll keep the original audio as part of the output mix and keep it synced up on a segment by segment basis, so you don't end up hearing the original audio get out of time; instead you just end up with tiny bits of silence between the original segments instead of between the replacement segments.

(So e.g. if segment a is 0.30 seconds long and segment b is 0.12 seconds long, the script will jam three of b together to make sure to fill the space of a, since 0.12*3 = 0.36 seconds. The modified afromb then makes that 0.36 second b-plex the latest bit of audio in the mp3 it's building as it goes, and if you're mixing in the original audio it will also run the 0.30 seconds of segment a and 0.06 seconds of silent padding in parallel to the new b-plex.)
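
Stripped way down, the case III change amounts to this (illustrative names; numpy stands in for Remix's audio data):

    import math
    import numpy as np

    def bplex_fill(a_duration, b_samples, sr):
        """Case III: replacement segment b is shorter than original segment a."""
        b_duration = len(b_samples) / float(sr)
        reps = int(math.ceil(a_duration / b_duration))
        # Stock afromb pads the gap with silence; this tiles b until the
        # gap is covered (usually running a hair long, as described above).
        return np.tile(b_samples, reps)

    # The worked example: a 0.30 s slot, a 0.12 s segment -> 3 reps, 0.36 s.
    bplex = bplex_fill(0.30, np.zeros(int(0.12 * 44100)), 44100)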
posted by cortex at 8:38 AM on June 29, 2010


But so anyway, I could fix that oversized b-plex issue pretty easily I think by applying the case II check that already exists in afromb to the b-plex segment once it has been built, and I think I'll give that a shot just to see.

But the whole experiment happened because I was trying and failing to make the change I originally intended, which was to apply time-stretching to the b segment to make the single iteration fill the same space as a longer a. (And to go in the other direction for case II situations as well, because once I figure out how to do it one way the other way should be trivial.) And there's a nice little "modify" library in Remix that supports just that sort of move, but I'm still mucking around so blindly in Python and the Remix object hierarchy that I'm running into errors. I may need to start pestering the Remix google group.

I'll keep playing with the "b-plex" idea and see if I can find examples that really showcase the fun Headroomy nature of what it does. It occurs to me that it'd be pretty easy to build a purpose-built headroom.py script that would do nothing but take some audio and insert random strings of glitches in it without otherwise rearranging it, which when applied to e.g. audio or video of someone giving a speech would be pretty much a perfect Naive Random Headroom function. (A really good Headroomifier would probably need a library capable of doing speech recognition and language analysis in order to pick the best spots to glitch out. Someone else can work on that.)
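
A naive version of that is tiny; a sketch (headroomify is a made-up name and the glitch sizes are arbitrary):

    import random
    import numpy as np

    def headroomify(samples, sr, glitches=10):
        out = samples.copy()
        for _ in range(glitches):
            start = random.randrange(0, len(out) - sr)
            chunk = out[start:start + sr // 10]            # ~0.1 s grab
            stutter = np.tile(chunk, random.randint(3, 8))[:sr]
            out[start:start + len(stutter)] = stutter      # m-m-m-max headroom
        return out

    glitched = headroomify(np.random.randn(44100 * 5), 44100)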
posted by cortex at 8:45 AM on June 29, 2010


So the post today of old Empire Strikes Back phoneline promos was raw material on a platter for me; I took the recordings of the various cast members and afrombed them against Tom's Diner and did a little mix of that (including giving different voices solos/duets throughout, to liven things up a little) and produced this:

Tom's Cantina

I also did an alternate version using my modified "Headroom" b-plex code:

Headroomed Mix

Which is a decent example of the difference in feel from what I was describing in those earlier couple of comments.

I find it interesting though not totally surprising that I was able to get fairly melodic results out of these—the natural lilt and range-of-pitch of conversational speech is enough to generate a lot of "notes" on the scale when taken out of context in small chunks. Leia does the best job of actually "singing" the melody, which is not shocking since her range here is much closer to Vega's than any of the male cast members. 3PO is a close second, Luke's not bad either, but Han is an octave down most of the time and pitchy to boot. Vader's hardly even usable, there's barely any melodic content.

I should note that in both cases there I was also using a modified similarity calculation for afromb, basing the comparison between a and b segments entirely on pitch content without accounting for either timbre or loudness. I'm hoping that provided better melodic matching, if possibly at the cost of worse sound-alike matches in terms of any segment's phonetic content, if you will, compared with the original slice of Vega's lyric.
posted by cortex at 2:32 PM on June 29, 2010

