The Interdimensional Jukebox
April 13, 2024 3:17 PM

Dune the Broadway Musical [Showtunes] - Baby On Board [Barbershop] - Carolina-O [Indie Country] - Sabrosito Amor [Latin] - Rising Sun Gospel [Soul] - Allegro Consort in C [Classical] - You Spilt a Coffee on my Dog [R&B] - Potion Seller [60s Folk] - I'm Not Your Star [Screamo] - SNES Greensleeves [Chiptune] - Syncopated Rhythms [Jazz] - Tavern Serenades [Fiddle] - My Tamagotchi died in '98 [Country Pop] - Senna Tea Blues [Bluegrass] - Unexpected Item in Bagging Area (A Cowboy's Lament) [Americana] - Herb's Whisper [Hip-hop] - Metropolis Pt. 3 [Prog metal] - F**k You Elmo [Acoustic Guitar] - Lorem Ipsum Dolor Sit Amet [Orchestral] - ムーンライト【.】【3】【1】[Vaporwave] - Dreaming Miku [Vocaloid] - The Deku Tree’s Decree [Broadway] - Website on the Internet [50s A Capella] // Meet Udio — the most realistic AI music creation tool I’ve ever tried

While the site lets you submit your own lyrics (or generate them with LLMs), it's notable for synthesizing all the melodies, instrumentation, and realistic vocal stylings on its own. You can coax it with stage directions like [CHORUS] and (backing vocals), but it will ignore or rewrite requests for specific artists. It's free while in open beta, giving two 30-second clips per attempt and generous daily allowances if you want to test it out.

It's just the latest advance in a long line of similar vocal synthesis projects, from vocoders, voice changers, Autotune, and Vocaloids to machine-learning wonders that generate the whole waveform from scratch, like OpenAI's surreal Jukebox and the more recent synth-y Suno.ai (and who could forget the incredible Microsoft SongSmith?). Perhaps most prolific is the open-source So-VITS-SVC, whose lack of copyright DRM unleashed a torrent of bizarre and hilarious covers in the last year and change:
Obama + Biden - Boy's a liar Pt. 2 (and a whole mini-genre of AI Presidents videos)

Freddie Mercury - Thriller, Let It Go

Frank Sinatra - The Winner Takes It All, Smells Like Teen Spirit, Bad Romance, Gangsta's Paradise, You've Got a Friend in Me, Creep

Thom Yorke - Yellow, Crazy Little Thing Called Love, Mambo No. 5

David Bowie - No Surprises

Michael Jackson - Careless Whisper, I Feel it Coming, When I Was Your Man

Johnny Cash - Hotel California, Blank Space, Barbie World

Linkin Park - Somebody That I Used To Know

Kurt Cobain - Fortunate Son, Dream On

SpongeBob cast - Hallelujah, Thriller ft. Plankton, Billie Jean

Battle Droid - My Way

GLaDOS - Welcome to the Internet
And, perhaps aiming to make up for the real one's spiral into hate, a whole slew of faux Kanyes singing increasingly unlikely covers:
Someone Like You - Circles - Paparazzi - Get Lucky - Yesterday - Wonderwall - Can You Feel the Love Tonight - Mr. Brightside - You've Got a Friend in Me - Call Me Maybe - Isn't She Lovely - A Thousand Miles - American Pie - Hey Soul Sister - Complicated - Kiss Me - Teenage Dirtbag - Hallelujah - Viva La Vida - Hey There Delilah - Shake It Off
Ars Technica: MIT License text becomes viral “sad girl” piano ballad generated by AI

ABA Journal: AI-generated music is everywhere; is any of it legal?

Billboard: Billie Eilish, Pearl Jam, Nicki Minaj Among 200 Artists Calling for Responsible AI Music Practices

Under US law, at least, generated music (like the famous monkey selfie case) remains uncopyrightable and in the public domain absent human authorship, putting a large question mark on the business model of AI music companies beyond novelty.
posted by Rhaomi (32 comments total) 18 users marked this as a favorite
 
Amazing post--thank you. I see Suno.ai mentioned here. I have hesitations around this stuff, but when its v3 engine was released in March, I made [or rather had it generate] some simple folk songs based on poems written by Emily Brontë:

Written in Aspin Castle, v. 1 (v. 2)
The night is darkening round me, v. 1 (v. 2)
The lady to her guitar, v. 1 (v. 2)
High waving heather, v. 1 (v. 2)
Song, v. 1 (v. 2)

While I was messing around with it, I also looked at their trending public songs, and honestly the one I played more than a couple times was just Cheeky Little Mouse. Anyway, #TeamEmily.
posted by Wobbuffet at 3:54 PM on April 13


Entertaining and terrifying. Enterterrifying.
posted by saturday_morning at 5:02 PM on April 13 [3 favorites]


>Under US law, at least, generated music (like the famous monkey selfie case) remains uncopyrightable and in the public domain absent human authorship,

I'm sure music-generative AI will be every bit as respectful of human authorship and copyright as image-generative AI.

But while we all wait to be replaced by this kind of garbage, how cool to be able to non-consensually make the corpse of Kurt Cobain sing something.
posted by Sing Or Swim at 5:03 PM on April 13 [6 favorites]


Entertaining and garbage. Garbage.
posted by GenjiandProust at 5:57 PM on April 13 [1 favorite]


As someone who's been following this very, very closely, the Udio stuff just made my draw jop.
posted by daHIFI at 6:10 PM on April 13


I can't stop listening to I Glued My Balls to My Butthole Again, in incredible 60s soul style. The chorus is fully an ear worm!!

The account Obscurest Vinyl has a number of similarly crass and hilarious AI tracks up.
posted by wemayfreeze at 7:18 PM on April 13 [10 favorites]


There's quite a bit of a last-mile problem when it comes to generative tools, but Udio and Suno have finally broken through the "does it sound like a song" barrier. In the next couple years or so they'll probably solve the weird-sonic-artifacts problem that all of these have, and then it will truly be the music equivalent of Walmart coming to town.
posted by tclark at 7:34 PM on April 13


Sinatra smells like teen spirit is interesting.

Pat Boone singing crazy train is better.

It feels stupid, and contagious
enter prompt now, entertain us

posted by clavdivs at 7:56 PM on April 13 [1 favorite]


I am a convert to AI. My kids were around 4, 5, and 6 when Tickle Me Elmo was the craze. My childless (at the time) brother thought it a good idea to gift us a Tickle Me Elmo. Fuck You Elmo is a masterpiece. Tickle Me Elmo was the impetus for us banning toys with batteries for many years.
posted by JohnnyGunn at 8:47 PM on April 13


GenjiandProust: "Entertaining and garbage. Garbage."

If you're talking about the social or economic downsides, I won't argue with you (though I do tend to think it's more a problem with capitalism writ large -- the technology itself is fascinating and borderline-magical). But if you mean the quality, well, I just can't agree.

Jukebox was the first real foray into this space, and even at its best it's like tuning into a fuzzy AM radio band from a parallel universe. It also has the problem of only sounding truly authentic when you let it speak gibberish -- try to conform it to written lyrics and it's like the world's shittiest text-to-speech engine. It's also pretty structureless on timescales larger than 30 seconds or so and tends to devolve into silence or chaotic noise.

So-VITS-SVC is a big improvement, but relies on imperfectly transforming existing IRL vocals (the tracks in the post are based on either the original songs or top-tier covers). Suno is better, but to my ear the vocals seem to always have a conspicuous synthy quality that's a dead giveaway, like somebody put the vocal track through a subtle robot filter.

But these tracks from Udio are verging on flawless, imho. You can get dud gens, and the ChatGPT-provided lyrics are predictably bland and cliched, but with the right verses, prompt, and a bit of luck you can produce something that would blend in seamlessly with modern radio and streaming. Those barbershop numbers would be right at home on the vintage Fallout soundtrack, for ex, and with a proper completion you could easily slot the Carolina-O track into a modern pop chart with nobody noticing. It's not garbage, at least not on a melodic/vocal/quality level.

(It also bears repeating that this thing is not cut-and-pasting snippets of real music together or simulating discrete artificial instruments like a glorified MIDI, but rather generating the entire raw waveform from scratch. It just... conceives of the final-mix audio, including the vocals (including context-aware and emotive performances), instruments, layered production effects, and even period-accurate details like audio crackle for old recordings and stereo-panned instruments for 60s-era tracks. I will never not be flabbergasted that this is possible.)
posted by Rhaomi at 10:44 PM on April 13 [3 favorites]


A friend shared an autonomous car that needs its door closed.
posted by caphector at 11:02 PM on April 13 [1 favorite]


This is quite fun for making silly theme tunes about people you know.
posted by mokey at 1:59 AM on April 14


Metafilter (A Cowboy's Lament)

I must say, it is really impressive. It reminds me of that post the other day about the guy who writes 50 songs a day and uploads them to Spotify. And the ones about pooping have literally millions of streams. I could spend the rest of the day creating 1,000 songs about pooping and really corner the market!
posted by mokey at 2:58 AM on April 14 [7 favorites]


If you never watched The Congress (2013), now is a good time. It feels more prescient than ever as our world bristles with generative hallucinations devouring the past.
posted by neonamber at 2:59 AM on April 14 [1 favorite]


whoa, good call on The Congress, just watched the trailer and just wow... will watch shortly.

If I might recommend some reading: The Coming Wave by Mustafa Suleyman, Power and Progress, and the entire Culture series by Banks
posted by daHIFI at 7:22 AM on April 14 [2 favorites]


> ChatGPT-provided lyrics are predictably bland and cliched

Man, this thing is pretty weak at lyrics, and I'm saying that as someone who writes some pretty bad lyrics myself. I'm also saying that as someone who is really, really not fond of GPT-4's (and Claude's) writing style. But since I wanted to give the toy a go, I asked my (currently kinda private due to the dataset being a WIP collab) torturously fine-tuned/merged-to-fuck-and-back Mixtral to turn out something for me to play with.

And, okay, this Udio thing is converging on lifelike in a kind of surprising way. I'm inexpert—to say the least—in my ability to evaluate the output on its technical merits but as a generic-ass listener, it's impressive.
posted by majick at 7:53 AM on April 14


When the Metafilter TV ads go out I've got the music covered.
posted by mokey at 7:55 AM on April 14 [1 favorite]


Count me as one who thinks Udio crossed some kind of threshold. I fed it an old GPT-2 poem I had saved, and I got back an acapella rendition that had dynamics and ennui that was unexpected.

I'm still searching for the "use case" besides one-off songs for special occasions and jingles for local businesses that can't afford to buy ads. I think it could be great for idea generation, I mean type in "shoegaze J-pop about John Stamos's mullet becoming sentient" and you get pretty much that. And it could be the ultimate sample loop generator, should the legal stuff get sorted. But I was able to generate a perfect 30-second Bob Pollard song, which makes me think they're gonna go through some stuff first.
posted by credulous at 8:27 AM on April 14 [1 favorite]


> If you never watched The Congress (2013), now is a good time. It feels more prescient than ever as our world bristles with generative hallucinations devouring the past.

It's worth noting that the original novel, The Futurological Congress, by Stanislaw Lem, is not about AI. Instead, it is literally hallucinogenic, as everyone's reality comes from the drugs they consume.
posted by CheeseDigestsAll at 8:48 AM on April 14


This was a really bad idea.

And by that, I mean both creating this in the first place, and giving someone like me access to it.
posted by delfin at 9:17 AM on April 14 [2 favorites]


> Metafilter (A Cowboy's Lament)

You can turn off the website now, there is nothing that can be posted that is better than this.
posted by jferg at 10:19 AM on April 14 [2 favorites]


I knew that it would be my eventual destination anyway, but now I am definitely going straight to a deep circle of Hell for what I am doing to the genre of blues right now.

I'm visualizing male and female vocalists trading the mic back and forth belting out salty disses at each other for the enjoyment of all present, timeless classics like "I Hope Your Next Shit is Square" and "Your Love Ran Down My Leg And Now You're Gone."
posted by delfin at 10:59 AM on April 14 [1 favorite]


I broke the sound generator by simply typing "fartcore"
posted by credulous at 11:47 AM on April 14 [1 favorite]


> I'm sure music-generative AI will be every bit as respectful of human authorship and copyright as image-generative AI.

I have to assume that part of what has allowed these startups to leap ahead of the big companies in this domain is a willingness to risk getting sued. I was playing with Udio yesterday and it gave me a Slowdive song. It wasn’t actually what I asked for, and wasn’t one of the stronger musical results, either, but there is no way Souvlaki didn’t go into this thing because it had the guitar tones and effects dead to rights.
posted by atoxyl at 2:42 PM on April 14 [1 favorite]


> I'm sure music-generative AI will be every bit as respectful of human authorship and copyright as image-generative AI.

Without even reading a single thing about it (stealth startup, nothing to read but it appears to be the Google Lyria team starting their own thing in frustration after their work went unreleased for years), it is nearly certain no respect for IP was a consideration in the training. They are working with artists (will.i.am is an investor) so I expect there will be some watermarking (which IIRC aligns with the upcoming EU bill on AI/generative media) but the output is simply too good for it not to be quietly slipped into all kinds of places, regardless of what the US Copyright Office says.

It’s not the end of music but it might be the end of royalty free stock music. I typically play down job impact but if I were a purely commercial musician with no live music revenue stream I would be shitting bricks right now: unlike Sora, this can pass as human. The only saving grace, at all, is that in theory “transformative” changes by a human on the output would restore copyright eligibility… but that’s still going to cut the job market down to a tiny fraction of what it was.
posted by Ryvar at 2:44 PM on April 14


I need to try Udio. I tried Suno since it's built into Microsoft's Copilot which is built into Windows now and it did not do well. Or I provided the wrong inputs. But "create a traditional gospel song in the style of Boards of Canada" did not create a traditional gospel song and certainly not with any influence by Boards of Canada. It just sounded like vague CCM worship music.
posted by downtohisturtles at 2:54 PM on April 14


I played with Udio this weekend & was pretty impressed with some of the things it generated.

Here's the song Morning Muffin Melody with all music & lyrics AI-generated based on the following prompt: "a song about making muffins on a beautiful spring Saturday morning in the style of celtic folk"
posted by belladonna at 7:16 PM on April 14


Honestly, I feel like everyone else can pack it in at this point. Udio is on another level.
posted by emelenjr at 8:47 AM on April 15


This is shockingly good. The most complex lyrics it generated were on bluegrass songs, of the styles I tried. I did manage to get it to consistently glitch on one song by adding yet another section on the end. It gave vocal sounds that did not match the lyrics or any words known to me. I also fed it profanity, which led to moderation errors. Potty humour seems to be allowed, and as funny as ever. Otherwise, wow.
posted by sillyman at 11:29 AM on April 15


I did my first experiment building a song from a standard verse-chorus-verse-chorus-bridge-verse-chorus-outro format, and while the transitions need some tightening, I'm pretty happy with it for everything but the lyrics being AI-generated.

You're Getting Hepatitis for Christmas

How long until it becomes a standard?
posted by delfin at 6:44 PM on April 15 [1 favorite]


One more for the pile, some 50s doo-wop:
Mean Maureen (And My Burnt-Up Peen)

There is definitely a big difference between letting it go off on its own with a vague prompt, and providing your own lyrics and prompting only for genre and things like "female vocalist" or a specific decade. With it in its current state, you can throw out a verse and, if you like it, extend it forwards and backwards with new segments until you've got something actually pretty good.
posted by delfin at 11:19 AM on April 16


I tried this, and spent 3 hours in the rabbit hole. Luckily, though, somehow I am not addicted; I just stopped that evening, and picked it up again two days later. I may post something to MeFi Music, because it comes up with very unexpected stuff when placed on manual mode and given instructions to go experimental. Tonight, I'm having it recite passages (in what it thinks is an Irish brogue) from a manual on meat processing, against an airy ambient backdrop of melodicas, pianos, etc.

It's TOTALLY FUCKED UP AND I LOVE THE FUTURE
posted by not_on_display at 8:43 PM on April 25 [1 favorite]

