Ham4Algorithm
June 20, 2016 7:01 PM

We Wrote an Algorithm to Unravel the Rhymes of Hit Musical ‘Hamilton’ (WSJ): "Let's start with the first verse of the musical's opening number. Our algorithm breaks words into their component sounds and then groups similar-sounding syllables into rhyme families, which are color-coded."
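
As a rough illustration of the idea in that excerpt (break words into sounds, then group syllables that share a sound into families), here is a minimal Python sketch. It is not the WSJ's algorithm. The pronunciations are hand-written ARPAbet-style guesses, the syllable split is deliberately crude, and `syllable_keys` and `rhyme_families` are hypothetical names. It also only groups exact matches, whereas the article describes grouping similar-sounding syllables.

```python
from collections import defaultdict

# Hypothetical, hand-written pronunciations (ARPAbet-style symbols).
# A real system would look these up in a pronouncing dictionary.
TOY_PRONUNCIATIONS = {
    "master": ["M", "AE", "S", "T", "ER"],
    "disaster": ["D", "IH", "Z", "AE", "S", "T", "ER"],
    "fluster": ["F", "L", "AH", "S", "T", "ER"],
    "spray": ["S", "P", "R", "EY"],
    "bay": ["B", "EY"],
    "intent": ["IH", "N", "T", "EH", "N", "T"],
    "spent": ["S", "P", "EH", "N", "T"],
}

# Only the vowel symbols that appear in the toy data above.
VOWELS = {"AE", "AH", "EH", "ER", "EY", "IH"}


def syllable_keys(phonemes):
    """Split phonemes into crude (vowel, trailing consonants...) keys."""
    keys = []
    current = None
    for p in phonemes:
        if p in VOWELS:
            if current is not None:
                keys.append(tuple(current))
            current = [p]  # each vowel starts a new chunk
        elif current is not None:
            current.append(p)  # consonants attach to the preceding vowel
        # consonants before the first vowel are ignored
    if current is not None:
        keys.append(tuple(current))
    return keys


def rhyme_families(pronunciations):
    """Group words by shared syllable key; keep keys shared by 2+ words."""
    families = defaultdict(set)
    for word, phonemes in pronunciations.items():
        for key in syllable_keys(phonemes):
            families[key].add(word)
    return {k: sorted(v) for k, v in families.items() if len(v) > 1}


families = rhyme_families(TOY_PRONUNCIATIONS)
for key, words in sorted(families.items()):
    print("-".join(key), words)
```

Each family here would get one color in a display like the WSJ's. Loosening the match (for example, treating nearby vowels as equivalent) is where the interesting design choices would start.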

Interested in the deeper process of creating this visual display? - The Hamilton Algorithm Methodology (8 cites)

Bonus: OpenNews postmortem breaking down the two-month process of going from concept and Python prototype to polished JavaScript presentation.
posted by CrystalDave (28 comments total) 27 users marked this as a favorite
 
This is so cool. And I think it's a testament to the genius of Hamilton that it has prompted, and warrants, so many different kinds of rich analysis. I'm struggling to think of another piece of art from the last few decades that has prompted so much and multivaried analysis.
posted by lunasol at 7:06 PM on June 20, 2016 [5 favorites]


Can we use this algorithm to find a polynomial time solution for the Hamiltonian path problem....
posted by miyabo at 7:07 PM on June 20, 2016 [6 favorites]


Algorithmic Hamilton...its name is Algorithmic Hamilton...
posted by uosuaq at 7:08 PM on June 20, 2016 [32 favorites]


They just explained scansion.

*slow clap*

/snark

It's a great visualization of how it works, though, especially when imperfect, so props.



#yayhamlet

posted by mandolin conspiracy at 7:15 PM on June 20, 2016 [13 favorites]


And also...what I wouldn't have given for an algorithm like this as an English major...
posted by mandolin conspiracy at 7:18 PM on June 20, 2016 [3 favorites]


It's great to use algorithms to do things that humans are bad at. Like analyzing incremental changes in usage in thousands of texts over a timespan of decades.

Using an algorithm to do, in a modestly OK way, things that poets, rappers, and many other humans are already really good at, is missing the point.
posted by escabeche at 7:22 PM on June 20, 2016 [14 favorites]


Using an algorithm to do, in a modestly OK way, things that poets, rappers, and many other humans are already really good at, is missing the point.

Oh, absolutely. Was only thinking of late-night shortcuts for assignments when analyzing scansion was gonna be a thing.

OTOH, it's for the best I had to do it under my own steam.
posted by mandolin conspiracy at 7:26 PM on June 20, 2016


Algorithmic Hamilton...its name is Algorithmic Hamilton...

Your request was too synchronous...just you wait...just you waiiiiiit...
posted by middleclasstool at 7:30 PM on June 20, 2016 [2 favorites]


A good friend of mine and one of his grad students did a rap-music project sort of similar to this, and they were a little miffed that the Wall Street Journal acted like it had done something original, when it had been done earlier and better by them. One of my friend's colleagues bugged the WSJ a bit and they finally included an acknowledgement in their "methodology" section.

Finally, while we independently developed our approach to programatically detect rhymes, we are not the first to tackle this problem. Hussein Hirjee and Daniel G. Brown developed a different method of scoring rhymes in their paper “Using Automated Rhyme Detection to Characterize Rhyming Style in Rap Music.”

This is an effort I made sometime in the last year to try to explain dan's work to someone:

dan started out as a math major in college, and then a graduate student in computer science (probably something more specific than that). Various forms of happenstance got him hooked up with people working on the Human Genome Project, and he spent a post-doc year as one of the many, many people working on it, which led to him being what he is today, which is a bioinformatician. Bio, info: he does work on, like, extracting information from biological data, and also (if I recall correctly; he has told me about these things and I find it fascinating, but my brain is not super-retentive of the details) on modeling biological growth and change. He’s done some stuff having to do with evolution, and I think some stuff having to do with taxonomy.

His webpage at the University of Waterloo includes this description of one of his projects:

In 2010 and 2011, Jakub Truszkowski and I developed what we believe is the first error-tolerant O(n log n)-runtime algorithm for phylogenetic reconstruction. QTree uses a complex search structure and a random walk approach to incrementally add new taxa to a growing tree, with each addition happening in logarithmic runtime.

“Logarithmic runtime” would make a good name for a band.

OK, so all of that.

But dan ALSO does work on music lyrics. He and a former student do interesting things with rap music, looking for patterns in lyrics written by different rappers. They have a model that is able to correctly identify the writer of a rap song (from a limited set) about 50% of the time, which is about 10x more often than random. That’s pretty robust. Their model has even, to dan’s delight, identified a few songs that were ghostwritten. Ghostwriting is a no-no in rap culture, but dan and Hussein have confirmed that several songs credited to one rapper but identified by their model as a different rapper’s song, were in fact written by the other rapper. This is so cool!

When we visited two weeks ago, dan told me quite a bit about his work on rap music. We talked a lot about rhyme, internal rhyme, the relatively flexible rules of rhyme in rap music, people who like to play with language, etc. It’s all pretty fascinating.

But, you may ask, what does rap music have to do with dna? With evolution? With taxonomy? With biological data?

I don’t know. dan never said.

But I figured it out for myself the other day!

In many contexts, data is relatively orderly, extracted and created under controlled circumstances. In my former field, political polling and market research, surveys are designed to collect only the information we’re looking for, and in carefully organized ways: multiple choice questions that can be easily tabulated; open-ended questions designed to elicit certain kinds of answers, which can be coded fairly easily; all of it then tidily run through certain routine statistical analyses.

When you’re dealing with people, there is an almost infinite amount of information available. But not all of that information is data, and you can pull data out of the big mess of information by controlling your sample and the questions you ask.

Here’s my guess about dan’s work: when you’re working with biological information, the useful data is buried in a lot of extraneous information, and it’s very challenging to figure out which is which. DNA for instance, which, you will recall, turned out to have all kinds of junky genes that don’t do anything, and redundant genes, and genes that could do something but usually don’t, and genes whose effects are additive with the effects of other genes (but which other genes?). And you can’t ask the dna, “Which genes do you use to determine whether a given person develops Darwin’s tubercule on the helix of the ear or not?”

It’s a big mess of disorganized information, and the usual methods of isolating and extracting relevant data don’t apply. So people like dan work on ways to make sense of the big mess.

If I’m more or less right, the applicability to rap lyrics is pretty clear, because that corpus is also a bunch of information which was not organized with data collection in mind. What characteristics distinguish one rapper’s lyrics from another’s? How do you define and extract that information? Is meaningful information even present that can answer these questions? I think this is the kind of thing dan does.

If I'm wrong, I'm sure that what he does is much more interesting.

I was only sort of right. dan replied:

The reason why we can do the stuff I did with Hussein is because poetic lyrics and DNA are actually pretty tractable data sets: they’re sequences of sounds or of DNA letters (A,C,G,T). What we study in both cases is approximate matches between the sequences of phonemes or of DNA symbols. Then some machine learning does the kinds of extractions of signal from really large and complex data.
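
To make "approximate matches between the sequences" concrete, here is a minimal Python sketch using plain edit distance (Levenshtein distance), which counts the insertions, deletions, and substitutions needed to turn one sequence into another. This is a generic textbook technique swapped in for illustration, not the scoring method from the Hirjee and Brown paper. The function name `edit_distance` and the ARPAbet-style phoneme lists are assumptions. The same function works on DNA letters and on phonemes because both are just sequences of symbols.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of symbols."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# DNA: strings are sequences of the letters A, C, G, T.
print(edit_distance("GATTACA", "GATTTACA"))  # 1 (one inserted T)

# Phonemes: hand-written, hypothetical pronunciations.
master = ["M", "AE", "S", "T", "ER"]
disaster = ["D", "IH", "Z", "AE", "S", "T", "ER"]
print(edit_distance(master, disaster))  # 3 (the shared ending costs nothing)
```

A small distance relative to sequence length suggests a near match. Real systems in both fields tend to weight substitutions unequally, since some sounds (or bases) are closer to each other than others.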

Here is a talk I gave to a group of high schoolers, which I’m actually pretty pleased with, even 5 years later; a bunch of people have watched it and said it’s not too confusing.

The thing is, this kind of analysis is interesting from the perspective of poetics (like mandolin conspiracy, I'd have loved something like this as an English scholar) but also interesting from the perspective of how information can be extracted, organized, and made use of.

tl;dr: I have a friend who is really cool and super-smart, and I want everybody to know it.
posted by not that girl at 7:46 PM on June 20, 2016 [20 favorites]


This is so cool. And I think it's a testament to the genius of Hamilton that it has prompted, and warrants, so many different kinds of rich analysis. I'm struggling to think of another piece of art from the last few decades that has prompted so much and multivaried analysis.

Well, entire degree programs have been created around Buffy Studies, so there's that.

On the other hand, that was seven seasons of material, vs a single evening's entertainment. I don't think the above detracts from the marvelous complexity of Hamilton.
posted by Superplin at 7:53 PM on June 20, 2016 [1 favorite]


That said, I thought the WSJ article was a pretty good intro to scansion and the variety of rhymes. I'd point out that internal rhyme, near- and slant-rhyme, assonance and consonance, and so on, have always been used in poetry, though I do think it's worth paying attention to the virtuosity with which they're used, and how densely packed they can be, in rap music, which seems to sometimes push rhyme to its very limits but also pushes the verse to the limits of how much rhyme it can contain.
posted by not that girl at 7:56 PM on June 20, 2016 [2 favorites]


If anyone got interested by this, there was another recent post on a different set of (very introductory) illustrations of rap scansion.
posted by LobsterMitten at 7:57 PM on June 20, 2016 [2 favorites]


All right, according to the algorithm, the secret ingredient is... love? Who's been screwing with this thing?
posted by The Pluto Gangsta at 9:36 PM on June 20, 2016


A good friend of mine and one of his grad students did a rap-music project sort of similar to this, and they were a little miffed that the Wall Street Journal acted like it had done something original, when it had been done earlier and better by them.

It took me a minute to realize you weren't saying "my friend totally wrote Hamilton years ago, this is totes stealing, GIVE HIM CREDIT FOR THIS THING YOU NEVER HEARD" so my brain took this weird rattling swerve from "oh good lord, there are actual Hamilton truthers" to "oh, hey, effin' cool!" in the space of a paragraph and then the rest was really fun to read. So thanks!
posted by middleclasstool at 5:14 AM on June 21, 2016


Well, entire degree programs have been created around Buffy Studies, so there's that.

I mean, I have a degree in cultural studies, so I know that it's possible to get really analytical about basically any pop culture artifact, but the sheer diversity of different kinds of analysis at play with Hamilton is pretty striking to me.
posted by lunasol at 6:36 AM on June 21, 2016


Runs, of scripts.
And so the balance shifts.
posted by mandolin conspiracy at 6:40 AM on June 21, 2016 [6 favorites]


What's interesting is that so much of Hamilton's success reflects how rap and hip hop have incrementally developed a new form of poetic language where rappers went from really simple rhymes at the ends of bars (which can correlate to some very sing-song type lyrics) to very complex rhyming structures in which internal and imperfect rhymes can give skilled lyricists like Miranda the freedom to incorporate extremely powerful and dense narratives within the song in a way that many other musicals cannot. You don't have to have extended periods where dialogue and monologues have to interrupt the musical numbers so the effect is that there is more or less one big musical number that has only minimal dialogue interrupting them.

This is kind of cool because you can basically get the entire body of the musical just from listening to the cast album. Yes there are parts missing and sometimes it can be hard to follow the action without visuals but there is quite a bit of story just being related in the auditory realm in a way that I only rarely have seen in other musicals.

This density of narrative within the lyrics, supplemented by an arrangement that can really help the rhymes pop out in a way they might not otherwise, creates a very interesting and brilliant work, and it's really because there is a synthesis of two very distinct art forms in which each complements rather than clashes with the other.

Add the narrative of the musical and the meta-narrative created by the casting decisions, and Miranda and company have created an exceedingly dense piece of art that is shocking not just in its accessibility to the general public but also because it seems to create a document that is much stronger than the sum of its components.

Honestly this is kind of why I would love for Hamilton with the original cast to be filmed at least once before Miranda and others leave the production, because while it can't capture the feel of being in the audience it seems like it would be a great historical record for theatre majors to learn from decades from now.
posted by vuron at 6:51 AM on June 21, 2016 [11 favorites]


You don't have to have extended periods where dialogue and monologues have to interrupt the musical numbers so the effect is that there is more or less one big musical number that has only minimal dialogue interrupting them.

holy cow, I can't wait until this kind of innovation reaches opera
posted by beerperson at 7:02 AM on June 21, 2016 [4 favorites]




Shit spray/Kips Bay
I love that rhyme
posted by Biblio at 8:20 AM on June 21, 2016 [3 favorites]


Lin-Manuel Miranda confirmed via Twitter the original cast will indeed be recorded (!!) but is not sure yet how it'll be used.
posted by a good beginning at 9:15 AM on June 21, 2016


holy cow, I can't wait until this kind of innovation reaches opera

I get what you're saying here, but the effect is markedly different when people are rapping rather than singing. It comes off more as a modern take on e.g. Shakespeare than a "rap opera" (hip-hopera?), and to my ears and brain is easier to hook into than if everyone were singing all of their lines through the entire thing, particularly if they're singing in that room-filling operatic style.

It's okay for us to borrow ideas from the past and repurpose them for art forms that are still culturally relevant to younger generations. Indeed, it's how new ideas get made. Though I admit I am surprised to refer to a Broadway musical (especially one about the US's first Treasury Secretary) as "culturally relevant to younger generations". So who knows, opera may be the next comeback.
posted by middleclasstool at 9:35 AM on June 21, 2016


Though I admit I am surprised to refer to a Broadway musical (especially one about the US's first Treasury Secretary) as "culturally relevant to younger generations".

When we saw the show recently, I witnessed a girl who couldn't have been more than 12 or 13 badger her parents to buy her a copy of the Chernow Hamilton biography at the merch table.

That's pretty cool.
posted by mandolin conspiracy at 9:46 AM on June 21, 2016 [2 favorites]


Using an algorithm to do, in a modestly OK way, things that poets, rappers, and many other humans are already really good at, is missing the point.
posted by escabeche at 10:22 PM on June 20
Same thing with chess and facial recognition. AI is a science, so its benefits are not always obvious. But I think the really cool thing is taking something so basic and obvious to humans (such as rhythm and rhyming) and modeling it. In the process of modeling, you realize how amazing your mind really is and how much you do unconsciously.

Really, in a way, AI tackles neurology from another angle, by proposing specific ways our minds could work, while fMRI scans and other studies are limited in the details they can find.

Anyway, here's one of my favorite visualizations of a "useful" algorithm. Here we see GZIP shrink Poe's The Raven.
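
For anyone who wants to poke at the same effect without the visualization, here is a minimal Python sketch using the standard-library `gzip` module. The sample text is just the poem's refrain repeated as a stand-in for the full poem, and `compression_ratio` is a hypothetical helper name. The point it shows is only that text with recurring phrases shrinks a lot, because the compressor can refer back to earlier occurrences instead of storing them again.

```python
import gzip


def compression_ratio(text):
    """Return compressed size divided by original size (smaller is better)."""
    raw = text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Stand-in sample: the refrain repeated, not the full poem.
refrain = 'Quoth the Raven "Nevermore."\n'
sample = refrain * 20

raw = sample.encode("utf-8")
compressed = gzip.compress(raw)
restored = gzip.decompress(compressed)

print(len(raw), "bytes before,", len(compressed), "bytes after")
print("round trip intact:", restored == raw)
print("ratio:", round(compression_ratio(sample), 3))
```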
posted by mccarty.tim at 10:47 AM on June 21, 2016 [1 favorite]


> It's okay for us to borrow ideas from the past and repurpose them for art forms that are still culturally relevant to younger generations. Indeed, it's how new ideas get made. Though I admit I am surprised to refer to a Broadway musical (especially one about the US's first Treasury Secretary) as "culturally relevant to younger generations". So who knows, opera may be the next comeback.

To be fair, the relevance of the Broadway musical as an art form to younger generations wasn't exactly in crisis. Into every generation there are born several musicals which add to the canon of memorized-obsessive-singalong fodder for juvenile theater geeks. And repurposing styles of music into Broadway musicals doesn't just keep it fresh for the kids, it brings in adults, too.

(When I was a teenager, everyone was boggling at how teenagers were suddenly obsessed with 19th century France thanks to both Phantom of the Opera and Les Miserables. Oh, the scorn heaped upon those who mistook Les Miz as being about the French Revolution.)
posted by desuetude at 1:43 PM on June 21, 2016 [2 favorites]


I will send a Python algorithm to remind you of my love.

Ba da da da....
posted by mandolin conspiracy at 4:09 PM on June 21, 2016 [5 favorites]


Cool. A bit erratic though. I tried it on:
The art of losing isn’t hard to master;
so many things seem filled with the intent
to be lost that their loss is no disaster.

Lose something every day. Accept the fluster
of lost door keys, the hour badly spent.
The art of losing isn’t hard to master.
It got the first stanza pretty well, visualizing the internal rhyming density pretty nicely. But the second stanza it kind of falls apart on, missing many of the "o" sounds, and oddly scoring the last line differently from the first. It also seems very sensitive to how "loose" you set it, though their default setting seems to definitely be the best. Neat!
posted by chortly at 10:18 PM on June 21, 2016


I will send a Python algorithm to remind you of my love.

Nah the king would be mandating use of big iron and COBOL and submitted jobs. The colonists would be the ones demanding desktop jobs and autonomy, with Hamilton calling for strong typing and his enemies fans of unchecked pointers.

We all bring our own personal bugaboos to art, don't we?
posted by phearlez at 8:47 AM on June 22, 2016 [1 favorite]

