Yesterday, I had the misfortune of reading a blog post presenting an analysis of popular music lyrics using language-related metrics. Earlier today, I was cringing as the post and the Complex article about it popped up repeatedly as friends and acquaintances shared these links over social media. Then, I realized I’m a linguist with a blog, and I was suddenly reliving the same fantasies about media integrity, civic duty, and raising the bar of public discourse that I have when I watch The Newsroom. So allow me to tell you why I think this particular analysis is both methodologically sloppy and ideologically gross.
So, let’s start with the methodological stuff. The author, Andrew Powell-Morse, reports that he uses readability-score.com, which uses Flesch-Kincaid and other measures of readability to estimate a “grade level” reading score. Powell-Morse seems to be using the average returned by all of the metrics that readability-score.com reports (although he doesn’t say this explicitly).
The first thing you need to know is what Flesch-Kincaid (F-K) and other such metrics are. Generally, they’re measures of text difficulty. They attempt to estimate how many years of formal education someone would need to fully comprehend a given text. How do they do this? Well, that’s the thing that causes serious problems for Powell-Morse’s analysis. F-K uses the formula below, which I’ve borrowed from Wikipedia, and the other measures used by readability-score.com use similar formulas (Automated Readability Index, Coleman-Liau index, SMOG, Gunning-Fog index).
F-K grade level = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) − 15.59
Crucially, notice that you divide the total number of words by the total number of sentences. For a computer program, a “sentence” is just all the stuff between two periods (or other sentence-final punctuation), which means technically (1) through (4) below are “sentences”.
(1) No. (F-K grade level = -3.4)
(2) My name is Nic. (F-K grade level = -2.2)
(3) Sometimes we all need a reminder that numbers need to be contextualized. (F-K grade level = 9.7)
(4) Its vanished trees, the trees that had made way for Gatsby’s house, had once pandered in whispers to the last and greatest of all human dreams; for a transitory enchanted moment man must have held his breath in the presence of this continent, compelled into an aesthetic contemplation he neither understood nor desired, face to face for the last time in history with something commensurate to his capacity for wonder. (F-K grade level = 30)
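To make the arithmetic concrete, here is a minimal Python sketch of the standard Flesch-Kincaid grade-level formula applied to sentences (1) and (2). The function name fk_grade is my own, and the word and syllable totals are hand counts (an assumption). readability-score.com counts syllables automatically, so it may not agree with a hand count on longer sentences.

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hand counts (an assumption, not output from readability-score.com).
no_score = fk_grade(words=1, sentences=1, syllables=1)    # "No."
nic_score = fk_grade(words=4, sentences=1, syllables=4)   # "My name is Nic."

print(round(no_score, 1))   # -3.4
print(round(nic_score, 1))  # -2.2
```

Note that the only thing the formula knows about a “sentence” is how many of them there are, which depends entirely on where the punctuation falls.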
For the kinds of texts that F-K was designed to be applied to, written texts like instructional manuals or K-12 textbooks, this definition works fine. Looking at the number of words in sentences defined by where we place the punctuation usually provides a good proxy for a sentence’s grammatical complexity. For example, we can generally expect that as the number of subordinate clauses increases so too will the number of words in the sentence. F-K is obviously not perfect, but I’m not trying to outright reject it as a tool for analyzing texts (when used on appropriate texts).
The problem here is what Powell-Morse did to make “sentences” out of song lyrics. He writes, in a parenthetical, “punctuation added by me, since most songs lack it altogether” (credit to Jonathan Dresner for noting this). Well, most song lyrics lack punctuation altogether because they’re spoken (or sung) texts. Punctuation is a feature of writing, and sentences are units of written language. Thus, Powell-Morse had to artificially impose both, and this process is nowhere near as simple as it might sound. It also has profound effects on the results you get.
To illustrate this, I’ll just use one example, but feel free to try punctuating some on your own. In particular, Powell-Morse includes a couple of tables with individual songs ranked by average grade level. You can see if you can replicate his scores by searching for the lyrics and punctuating them yourself.
In the meantime, consider the song “Dani California” by the Red Hot Chili Peppers, which according to Powell-Morse is one of the “smartest songs over the last 10 years”. Here’s the beginning of the song as it appears on this website:
Getting born in the state of Mississippi
Papa was a copper
And Mama was a hippie
In Alabama she would swing a hammer
Price you gotta pay when you break the panorama
I’m honestly not entirely sure where to place periods because the lyrics are not structured into sentences in the same way written texts are. I tried two approaches:
Approach 1:
Getting born in the state of Mississippi,
Papa was a copper,
And Mama was a hippie.
In Alabama she would swing a hammer,
Price you gotta pay when you break the panorama.
Approach 2:
Getting born in the state of Mississippi.
Papa was a copper,
And Mama was a hippie.
In Alabama she would swing a hammer.
Price you gotta pay when you break the panorama.
It would be nice if there were some clear convention that could help us decide which of these approaches is “the correct one”. However, the lyrics illustrate nicely the difficulty of trying to fit spoken language into the structural units of written language. For example, what is the relationship between the last line (which starts “Price you gotta pay”) and the previous one? Should we assume it’s something like this: “In Alabama she would swing a hammer, which is the price you gotta pay when you break the panorama” (Approach 1)? Alternatively, it could be something like this: “In Alabama she would swing a hammer. That’s the price you gotta pay when you break the panorama” (Approach 2). Notice that the sentence structure we impose changes our punctuation choices.
This has drastic effects on the results. Approach 1’s punctuation results in an average grade level of 7.6 for this section of the song, while Approach 2’s results in an average grade level of 5.1.
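You can see where the sensitivity comes from by looking at the words-per-sentence term on its own. The sketch below applies only the Flesch-Kincaid formula to the two punctuations, not the multi-metric average from readability-score.com, so its numbers will not match the 7.6 and 5.1 above. The helper names and the syllable total of 48 are my own assumptions (the syllables are a hand count).

```python
import re

def fk_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level computed from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def count_sentences(text):
    # A "sentence" is any non-empty stretch of text ending in ., ! or ?
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()])

def count_words(text):
    return len(re.findall(r"[A-Za-z']+", text))

approach_1 = ("Getting born in the state of Mississippi, Papa was a copper, "
              "And Mama was a hippie. In Alabama she would swing a hammer, "
              "Price you gotta pay when you break the panorama.")
approach_2 = ("Getting born in the state of Mississippi. Papa was a copper, "
              "And Mama was a hippie. In Alabama she would swing a hammer. "
              "Price you gotta pay when you break the panorama.")

SYLLABLES = 48  # hand count (an assumption); identical for both versions

score_1 = fk_grade(count_words(approach_1), count_sentences(approach_1), SYLLABLES)
score_2 = fk_grade(count_words(approach_2), count_sentences(approach_2), SYLLABLES)

print(count_sentences(approach_1), round(score_1, 2))
print(count_sentences(approach_2), round(score_2, 2))
print(round(score_1 - score_2, 2))  # gap caused by punctuation alone
```

The words and syllables are identical in both versions. Only the number of periods changes (two sentences versus four), and that alone moves the F-K estimate by roughly three grade levels.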
This introduces a substantial threat to reliability into Powell-Morse’s analysis. Of course, there are always limitations to any attempt to quantify complex phenomena. However, there are other relevant linguistic measures that would not have required Powell-Morse to impose punctuation onto the texts (for example, lexical diversity, see this paper for an overview).
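As a rough illustration of a punctuation-free measure, here is a sketch of the type-token ratio, one of the simplest lexical diversity measures (unique words divided by total words). This is my own choice of example, not necessarily the measure that paper recommends, and the raw ratio is known to be sensitive to text length, so treat it as a toy. The function name and tokenizing pattern are assumptions.

```python
import re

def type_token_ratio(text: str) -> float:
    """Unique word forms divided by total words, ignoring case and punctuation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

lyrics = """Getting born in the state of Mississippi
Papa was a copper
And Mama was a hippie
In Alabama she would swing a hammer
Price you gotta pay when you break the panorama"""

# Put a period at the end of every line: the score does not move.
punctuated = lyrics.replace("\n", ". ") + "."

print(type_token_ratio(lyrics))
print(type_token_ratio(lyrics) == type_token_ratio(punctuated))
```

However you choose to punctuate the lyrics, the ratio stays the same, which is exactly the property the readability formulas lack.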
This leads me to the other and, in my mind, much more serious issue with this analysis. It concerns Powell-Morse’s interpretation of his results. Even if the analysis had employed appropriate methodological tools, this problem would not be solved.
The analysis conflates the complexity of the language or the readability with (a) the complexity of the ideas in the song and, at times, (b) the actual intelligence of the people writing the lyrics or the singers themselves. For example, he writes that his results show that “women seem to be a bit smarter than men, except for when they’re not”. Using this logic, Powell-Morse is a little smarter than a typical sixth grader, since his blog post has a readability of 6.7. (Before you go running to get the readability scores for my blog post, you clever person you, I’m about as smart as a high school sophomore: 10.0).
Powell-Morse would have us believe that this is all just “fun”, and we shouldn’t take it too seriously: “These numbers are fun and interesting, so just enjoy them.” My concern is about the kind of “fun” we’re supposed to be having. In presenting a methodologically flawed analysis distilled through an ideological conflation of readability scores and people’s intelligence, presented to us as “science”, Powell-Morse invites us to laugh at musicians and, by extension, the people who listen to their music. His message is clear: popular music may all be “dumb”, but it’s not all as “dumb” as R&B and hip-hop: “Sorry 1st graders, but you’ll have to settle for R&B and Hip Hop from 2007.”
It gets even worse when Powell-Morse’s work is picked up by Complex. Here’s the current title to their piece:
Notice that Complex has decided to pick out three Black hip-hop and R&B artists to foreground. Considering that Powell-Morse’s post clearly finds a number of artists across a few genres that would satisfy this criterion, it seems beyond coincidental that these three would be chosen. They’re not even the three lowest scoring.
Why then is a blog post that grossly overstates its poorly executed research resonating with people? There are probably a number of reasons. The one we’re probably most willing to admit is that it feeds our musical elitism. There are doubtless many music junkies boldly proclaiming the ‘scientifically-proven’ superiority of their preferred genre over their peers’ (even if they’re not entirely serious). However, more concerning is the reason we’re probably less willing to say out loud. It reinforces our stereotypes about the illiteracy of hip-hop artists, the young people who listen to hip-hop, and Black people in general. In other words, this is just racist clickbait.