  • Open access
  • Published: 28 March 2024

Song lyrics have become simpler and more repetitive over the last five decades

  • Emilia Parada-Cabaleiro,
  • Maximilian Mayerl,
  • Stefan Brandl,
  • Marcin Skowron,
  • Markus Schedl,
  • Elisabeth Lex &
  • Eva Zangerle

Scientific Reports volume 14, Article number: 5531 (2024)

  • Computational science
  • Computer science

Music is ubiquitous in our everyday lives, and lyrics play an integral role when we listen to music. The complex relationships between lyrical content, its temporal evolution over the last decades, and genre-specific variations, however, are yet to be fully understood. In this work, we investigate the dynamics of English lyrics of Western, popular music over five decades and five genres, using a wide set of lyrics descriptors, including lyrical complexity, structure, emotion, and popularity. We find that pop music lyrics have become simpler and easier to comprehend over time: not only does the lexical complexity of lyrics decrease (for instance, captured by vocabulary richness or readability of lyrics), but we also observe that the structural complexity (for instance, the repetitiveness of lyrics) has decreased. In addition, we confirm previous analyses showing that the emotion described by lyrics has become more negative and that lyrics have become more personal over the last five decades. Finally, a comparison of lyrics view counts and listening counts shows that when it comes to the listeners’ interest in lyrics, for instance, rock fans mostly enjoy lyrics from older songs; country fans are more interested in new songs’ lyrics.

Introduction

We are surrounded by music every day; it is pervasive in today’s society 1 and serves many functions. For instance, people listen to music to relieve boredom, fill uncomfortable silences, foster social cohesion and communication, or regulate their emotions 2 , 3 , 4 . When it comes to listeners liking or disliking a track, the most salient components of music, alongside the ability of a song to evoke emotion and the singing voice, are a song’s lyrics 5 . Likewise, the interplay between melody and lyrics is imperative as lyrics have been shown to influence the emotional valence of music; particularly, lyrics can enhance the negative emotion in angry and sad music 6 . Music containing lyrics has also been shown to activate different regions in the brain compared to music without lyrics 7 .

Seen from a different angle, lyrics can be considered a form of literary work 8 . Usually written in verse form, lyrics use poetic devices such as rhyme, repetition, metaphors, and imagery 9 , and hence can be considered similar to poems 8 . This is also showcased by Bob Dylan winning the Nobel Prize in literature in 2016 “for having created new poetic expressions within the great American song tradition” 10 . Just as literature can be considered a portrayal of society, lyrics also provide a reflection of a society’s shifting norms, emotions, and values over time 11 , 12 , 13 , 14 , 15 .

To this end, understanding trends in the lyrical content of music has gained importance in recent years: computational descriptors of lyrics have been leveraged to uncover and describe differences between songs with respect to genre 16 , 17 , or to analyze temporal changes of lyrics descriptors 11 , 17 , 18 . Lyrical differences between genres have been identified by Schedl in terms of repetitiveness (rhythm & blues (R&B) music having the most repetitive lyrics and heavy metal having the least repetitive lyrics) and readability (rap music being hardest to comprehend, punk and blues being easiest) 16 . In a study of 1879 unique songs over three years (2014–2016) across seven major genres (Christian, country, dance, pop, rap, rock, and R&B), Berger and Packard 19 find that songs with lyrics that are topically more differentiated from their genre are more popular in terms of their position in the Billboard digital download rankings. Kim et al. 20 use four sets of features extracted from song lyrics and one set of audio features extracted from the audio signal for the tasks of genre recognition, music recommendation, and music auto-tagging. They find that while the audio features show the largest and most consistent effect sizes, linguistic and psychology inventory features also show consistent contributions in the investigated tasks.

Studies investigating the temporal evolution of lyrics predominantly focus on tracing emotional cues over the years. For instance, Dodds et al. 17 identify a downward trend in the average valence of song lyrics from 1961 to 2007. Napier and Shamir 21 investigated the change in sentiment of the lyrics of 6150 Billboard 100 songs from 1951 through 2016. They find that positive sentiments (e.g., joy or confidence) have decreased, while negative sentiments (e.g., anger, disgust, or sadness) have increased. Brand et al. 13 use two datasets containing lyrics of 4913 and 159,015 pop songs, spanning from 1965 to 2015, to investigate the proliferation of negatively valenced emotional lyrical content. They find that the proliferation can partly be attributed to content bias (charts tend to favor negative lyrics), and partly to cultural transmission biases (e. g., success or prestige bias, where best-selling songs or artists are copied). Investigating the lyrics of the 10 most popular songs from the US Hot 100 year-end charts between 1980 and 2007, DeWall et al. 18 find that words related to oneself (e.g., me or mine) and words pointing to antisocial behavior (e.g., hate or kill) increased while words related to social interactions (e.g., talk or mate) and positive emotions (e.g., love or nice) decreased over time.

Alongside changes in emotional cues, Varnum et al. 11 find that the simplicity of lyrics in pop music increased over six decades (1958–2016). Similarly, Choi et al. 9 study the evolution of lyrical complexity. They particularly investigate the concreteness of lyrics (concreteness describes whether a word refers to a concrete or abstract concept) as it has been shown to correlate with readability and find that concreteness increased over the last four decades. Furthermore, there is also a body of research investigating the evolution of lyrical content (i.e., so-called themes). For instance, Christenson et al. 22 analyzed the evolution of themes in the U.S. Billboard top-40 singles from 1960 to 2010. They find that the fraction of lyrics describing relationships in romantic terms did not change. However, the fraction of sex-related aspects of relationships substantially increased.

Studies on the temporal evolution of music have also looked into temporal changes of acoustic descriptors, beyond lyrics. Interiano et al. 23 investigate acoustic descriptors of 500,000 songs from 1985 to 2015. They discover a downward trend in “happiness” and “brightness”, as well as a slight upward trend in “sadness”. They also correlate these descriptors with success and find that successful songs exhibit distinct dynamics. In particular, they tend to be “happier”, more “party-like”, and less “relaxed” than others.

Despite previous efforts to understand the functions, purposes, evolution, and predictive qualities of lyrics, there still exists a research gap in terms of uncovering the complete picture of the complex relationships between descriptors of lyrical content, their variations between genres, and their temporal evolution over the last decades. Earlier studies focused on specific descriptors, genres, or timeframes, and most commonly investigated smaller datasets. In this paper, we investigate the (joint) evolution of the complexity of lyrics, their emotion, and the corresponding song’s popularity based on a large dataset of English, Western, popular music spanning five decades, a wide variety of lyrics descriptors, and multiple musical genres. We measure the popularity of tracks and lyrics, where we distinguish between the listening count, i. e.,  the number of listening events since the start of the platform, and the lyrics view count, i. e.,  the number of views of lyrics on the Genius platform ( https://genius.com ). Thereby, we investigate the following research questions in this paper: (RQ1) Which trends can we observe concerning pop music lyrics across the last 50 years, drawing on multifaceted lyrics descriptors? We expect that descriptors that correlate more strongly with the release year lead to better-performing regression models. (RQ2) Which role does the popularity of songs and lyrics play in this scenario? We expect that lyric views vary across genres, and these variations can be attributed to changes in lyrics over time.

Our exploratory study differs from existing studies in several regards: (1) we provide the first joint analysis of the evolution of multiple lyrics descriptors and popularity, (2) we investigate a multitude of lyrics descriptors capturing lyrical complexity, structure, and emotion, (3) we provide an in-depth analysis of these descriptors’ evolution, not only over time but focusing on different genres , and (4) we leverage a substantially larger dataset than most existing works.

For our analyses, we create a dataset containing 353,320 English song lyrics from the Genius platform ( https://genius.com/ ), spanning five decades (1970–2020) in terms of the songs’ release years. Based on this collection of lyrics, we extract a wide variety of lyrics descriptors and popularity data for each song. In particular, we extract lexical, linguistic, structural, rhyme, emotion, and complexity descriptors and focus on five genres: rap, country, pop, R&B, and rock, as these are the most popular genres according to the widely used music streaming platform last.fm ( https://www.last.fm/ ) 24 , 25 , 26 , 27 , disregarding genres for which lyrics are less frequent (e.g., jazz and classical music). Our study is based on two complementary analyses, as shown in Fig. 1 . In our first analysis, we are interested in the evolution of pop music lyrics and the impact of descriptors in a regression task (i.e., we aim to find the predictors that are best suited to model a linear trend in a release year regression task). The second set of analyses investigates the relationship between lyrics view count, descriptors, and the corresponding songs’ release year in a multiple linear regression analysis. Assessing lyrics view count, besides the typically analyzed listening play count, enables us to take into account another perspective of music popularity. In particular, lyrics view count allows us to explicitly investigate the role played by lyrics in music consumption patterns over time (through the songs’ release year) as well as to relate such patterns to the lyrics characteristics (through the lyrics descriptors). Note that while listening play counts do not give any information about a listener’s interest in the lyrics, lyrics view count is a clear indicator of lyrics’ importance, which might not necessarily relate to a musical genre’s general popularity.

figure 1

Overview of data and analyses performed. Based on a wide variety of descriptors capturing the lyrical characteristics from listening data and lyrics content, we perform two analyses. Analysis 1 identifies descriptors that are characteristic of the release year and genre. Analysis 2 investigates the relationship between the identified lyrics descriptors, popularity (listening counts and lyrics view counts), and release year.

Basic dataset and lyrics acquisition

For gathering song lyrics, we rely on the LFM-2b dataset 28 ( http://www.cp.jku.at/datasets/LFM-2b ) of listening events by last.fm users, since it is one of the largest publicly available datasets of music listening data. Last.fm is an online social network and streaming platform for music, where users can opt-in to share their listening data. It provides various connectors to other services, including Spotify, Deezer, Pandora, iTunes, and YouTube, through which users can share on last.fm what they are listening to on other platforms ( https://www.last.fm/about/trackmymusic ). While last.fm services are globally available, their user base is unevenly distributed geographically, with a strong bias towards the United States, Russia, Germany, United Kingdom, Poland, and Brazil. In fact, an analysis of a representative subset of last.fm users found that the users from these six countries comprise more than half of last.fm’s total user base 29 .

The LFM-2b dataset contains more than two billion listening records and more than fifty million songs by more than five million artists. We enrich the dataset with information about songs’ release year, genre, lyrics, and popularity information. For quantifying the popularity of tracks and lyrics, we distinguish between the listening count, i. e.,  the number of listening events in the LFM-2b dataset , and lyrics view count, i. e.,  the number of views of lyrics on the Genius platform ( https://genius.com ). Release year, genre information, and lyrics are obtained from the Genius platform. Genres are expressed by one primary genre and arbitrarily many additional genre tags. Lyrics on the Genius platform can be added by registered members and undergo an editorial process for quality checks. We use the polyglot package ( https://polyglot.readthedocs.io/ ) to automatically infer the language of the lyrics and consider only English lyrics. Adopting this procedure, we ultimately obtain complete information for 582,759 songs.

Lyrics style and emotion descriptors

Following the lines of previous research in the field of lyrics analysis 30 , 31 , 32 , we characterize lyrics based on stylistic (including descriptors of lexical, complexity, structure, and rhyme characteristics) and emotion descriptors. Lexical descriptors include, for instance, unique token ratio, repeated token ratio, pronoun frequency, line count, or punctuation counts, but also measures of lexical diversity 33 , 34 as these have been shown to be well-suited markers for textual style 35 . To quantify the diversity of lyrics, we compute the compression rate 36 , effectively capturing the repetitiveness of lyrics and several diversity measures. For structural descriptors of lyrics, we utilize the descriptors identified by Malheiro et al. 37 , which, for instance, include the number of times the chorus is repeated, the number of verses and choruses, and the alternation pattern of verse and chorus. For descriptors that capture rhymes contained in lyrics, we extract, for instance, the number of subsequent pairs of rhyming lines (i. e., clerihews), alternating rhymes, nested rhymes, or alliterations, but also rely on general descriptors such as the fraction of rhyming lyrics 30 as these have been shown to be characteristic for the style of lyrics 30 . For measuring readability, we use standard measures like the number of difficult words, or Flesch’s Reading Ease formula 38 . Furthermore, we extract emotional descriptors from lyrics by applying the widely used Linguistic Inquiry and Word Count (LIWC) dictionary 39 , which has also been applied to lyrics analyses 18 , 37 , 40 . We provide a complete list of all descriptors, including a short description and further information about how the descriptors are extracted in Table 1 .
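A few of the lexical and diversity descriptors named above can be sketched in a handful of lines. The following is a minimal illustration using only the Python standard library; the function names, the whitespace tokenization, and the exact definitions (for instance, which lines count as "repeated", and compressed size divided by original size for the compression descriptor) are assumptions made for this sketch, not the paper's extraction code.

```python
import zlib
from collections import Counter


def unique_token_ratio(tokens):
    """Share of distinct tokens among all tokens (assumed definition)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def repeated_line_ratio(lines):
    """Share of non-empty lines whose text occurs more than once (assumed definition)."""
    cleaned = [line.strip().lower() for line in lines if line.strip()]
    if not cleaned:
        return 0.0
    counts = Counter(cleaned)
    repeated = sum(1 for line in cleaned if counts[line] > 1)
    return repeated / len(cleaned)


def compression_ratio(text):
    """zlib-compressed size divided by original size (assumed orientation).

    With this orientation, lower values mean more repetitive text; the
    paper's descriptor may be oriented the other way.
    """
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw) if raw else 0.0


lyrics = "la la la\nla la la\nhey now\n"
print(unique_token_ratio(lyrics.split()))       # 3 distinct tokens out of 8
print(repeated_line_ratio(lyrics.splitlines())) # 2 of 3 lines are repeated
print(compression_ratio(lyrics * 20))
```

Because the orientation of the compression descriptor is not pinned down here, only the relative ordering (repetitive text compresses better than varied text) should be read from this sketch.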

Statistical analyses

Figure 1 provides an overview of the methodological framework used for the analyses presented. The two analyses conducted aim to (1)  investigate the evolution of descriptors over five decades by performing a release year regression task to identify the importance of descriptors, and (2)  investigate the interplay of lyrics descriptors, release year, and lyrics view count by performing a regression analysis on a dataset containing 12,000 songs, balanced for both genres and release years. The combination of these two analyses provides us with complementary findings; while the first analysis uses the entirety of our collected dataset and therefore derives general findings on descriptor importance, the second analysis, performed on a carefully balanced, reduced dataset, provides us with a more in-depth analysis on the strength of relationships of the individual lyrics and popularity descriptors and temporal aspects.

Analysis 1: evolution of descriptors and descriptor importance

In this analysis, we investigate which descriptors are most strongly correlated with the release year of a song. We expect that descriptors that correlate more strongly with the release year lead to better-performing regression models. Therefore, we train a release year regressor for each of the five genres. We are mostly interested in determining each descriptor’s importance, thereby identifying the descriptors that are most effective at predicting a linear trend across release years.

First, we perform z-score normalization of the descriptors. Subsequently, we remove multicollinear descriptors using the variance inflation factor (VIF). Here, we iteratively remove descriptors that exhibit a VIF higher than 5 until all of the remaining descriptors have a VIF lower than 5 (as also performed in Analysis 2). Preliminary analyses showed that the data associated with individual descriptors are heteroscedastic (i.e., the variance is not homogeneous, but dependent on the release year). To account for this, we use Huber’s M regressor 41 , a widely used robust linear regressor.

Notably, we perform these analyses on all available songs for the five genres for which we can successfully extract all descriptors (totaling to 353,320 songs). For each genre, we train the regression model and analyze the model’s performance and the computed regression coefficients to identify the most important descriptors for determining the release year of the songs. The models for this analysis are built in Python, using the statsmodels package 42 (via the Robust Linear Models ( RLM ) class; statsmodels version 0.14). The plots in Fig. 2 are generated using the Matplotlib library 43 (version 3.7.0) and used gaussian_kde of scipy.stats for the density computation 44 .
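The density shading mentioned for Fig. 2 can be approximated by evaluating a two-dimensional kernel density estimate at every data point. The sketch below shows only that step; the helper name `point_density` is hypothetical, and how the authors mapped densities to colors is not stated in the text.

```python
import numpy as np
from scipy.stats import gaussian_kde


def point_density(x, y):
    """Estimated 2-D density at each (x, y) point, e.g. for coloring a scatter plot."""
    xy = np.vstack([x, y])
    return gaussian_kde(xy)(xy)
```

The returned array can be passed as the `c` argument of Matplotlib's `scatter` so that dense regions of the year-versus-descriptor plane appear darker.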

Analysis 2: interplay of lyrics descriptors, lyrics view count, and release year

In this analysis, we first assess whether lyrics’ view count is related to the underlying musical genre and to which extent such a connection might vary over time. Subsequently, we evaluate the evolution of pop music lyrics over time within each musical genre. We assume that lyrics’ view count varies amongst musical genres, and that these variations can be related to changes in the lyrics over time. Thus, to further deepen our understanding of the relationship between lyrics’ view count and genre, as well as whether the release date has a role in this relationship, we start by performing a multinomial logistic regression analysis considering genre as the dependent variable and the interaction between popularity and release year as predictors, where the number of views of the lyrics of a song is used to capture lyrics popularity. Subsequently, since the lyrics from different musical genres might develop differently over time, to investigate the relationship between release date and lyrics descriptors, a different linear model (considering release year as the dependent variable) is fitted for each genre individually. To find the model that best fits the data for each genre, we apply the backward stepwise method, which is appropriate for our exploratory study. Among the stepwise methods, we prefer backward elimination over forward selection to minimize the risk of excluding predictors involved in suppressor effects 45 .
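Backward elimination starts from the full model and repeatedly drops the predictor whose removal improves a model-selection criterion the most, stopping when no removal helps. The sketch below uses the Gaussian AIC (up to an additive constant) as that criterion; the criterion, the function names, and the NumPy implementation are all assumptions for illustration, since the text does not state which criterion the R analysis used.

```python
import numpy as np


def aic(X, y, cols):
    """Gaussian AIC (up to a constant) of an OLS fit on the given columns plus intercept."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    n, k = design.shape
    return n * np.log(rss / n) + 2 * k


def backward_eliminate(X, y, names):
    """Drop one predictor at a time while doing so lowers the AIC."""
    cols = list(range(X.shape[1]))
    best = aic(X, y, cols)
    while cols:
        trials = [(aic(X, y, [c for c in cols if c != d]), d) for d in cols]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single removal improves the criterion
        best = trial_aic
        cols.remove(drop)
    return [names[c] for c in cols]
```

Starting from the full model is what keeps predictors that only matter jointly with others in play, which is the suppressor-effect argument made above.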

To carry out a fair comparison, before starting the analyses, the collected dataset is randomly downsampled to guarantee a balanced distribution of songs across musical genres and years. To enable this, due to the highly skewed distribution of data over time, only the last three decades (1990–2020) could be considered for this analysis. The means and standard deviations of both the whole and the downsampled datasets are mostly comparable across the musical genres. There is, however, a larger difference between the standard deviations for pop and country. Unstandardized means and standard deviations for the whole dataset vs. the downsampled for both the lyrics views and the play count are shown in Table  2 .
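Balanced downsampling of this kind amounts to drawing the same number of songs at random from every genre and year cell. A minimal sketch with the standard library follows; the record layout (dictionaries with `genre` and `year` keys), the `per_cell` parameter, and the choice to fail when a cell is too small are assumptions for illustration.

```python
import random
from collections import defaultdict


def balanced_downsample(songs, per_cell, seed=0):
    """Randomly keep exactly `per_cell` songs for each (genre, year) combination."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for song in songs:
        cells[(song["genre"], song["year"])].append(song)

    sample = []
    for key in sorted(cells):
        group = cells[key]
        if len(group) < per_cell:
            raise ValueError(f"cell {key} has only {len(group)} songs")
        sample.extend(rng.sample(group, per_cell))
    return sample
```

The size of the smallest cell bounds `per_cell`, which is consistent with the skewed distribution over time restricting the analysis to the more recent decades.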

A total of 2400 items, i.e., songs, are considered for each musical genre. Due to the high diversity across the measurement units of the predictors, i.e., popularity scores and lyrics descriptors, these are z-score normalized; multivariate outliers are identified by computing the Mahalanobis distance 46 and subsequently removed. Highly correlated descriptors are also discarded until all remaining descriptors present a variance inflation factor below 5. The results from the multinomial logistic regression show that lyrics view count differs across decades for the evaluated genres. Therefore, we investigate the relation between lyrics view count and particular lyrics descriptors by also fitting a multiple linear regression model containing the interaction between the lyrics view count and the other predictors. However, the model with the interaction is not significantly better than the baseline model (for all the musical genres, analysis of variance yields \(p>.01\) ); thus, only the model without interaction is considered in the evaluation of the multiple linear regression results for each genre. The statistical models of Analysis 2 are built in the statistical software R 47 version 4.1.2 (2021-11-01). Multinomial logistic regression is carried out using the mlogit package 48 (version 1.1-1), while the linear models for each genre are fitted with the nlme package 49 (version 3.1-155) and multiple comparisons across genres are performed with the multcomp package 50 (version 1.4-25). The graphic shown in Fig. 3 is generated with the ggplot2 package 51 (version 3.4.3).
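Outlier screening by Mahalanobis distance can be sketched as below. The text does not give the cutoff that was used, so the chi-square quantile at \(\alpha = 0.001\) is an assumption (a common convention), as are the function names; the analysis itself was run in R, and Python is used here only for illustration.

```python
import numpy as np
from scipy.stats import chi2


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample mean."""
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


def remove_multivariate_outliers(X, alpha=0.001):
    """Keep rows whose squared distance is below the chi-square cutoff (assumed rule)."""
    cutoff = chi2.ppf(1 - alpha, df=X.shape[1])
    keep = mahalanobis_sq(X) <= cutoff
    return X[keep], keep
```

Unlike per-variable thresholds, this flags songs whose combination of descriptor values is unusual even when each value on its own looks ordinary.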

Results and discussion

Analysis 1: evolution of lyrics and descriptor importance.

In this analysis, we are particularly interested in the most important and hence, most characteristic features for the task of per-genre release year regression. The top ten descriptors (i.e., the descriptors with the highest regression coefficients) for each of the five genres in our dataset are given in Table 3 . The R \(^2\) values obtained per genre are 0.0830 for pop, 0.0717 for rock, 0.3374 for rap, 0.2600 for R&B, and 0.1267 for country.

We can identify several descriptors that are among the most important for multiple genres. The number of unique rhyme words is among the top ten descriptors across all five genres analyzed. The number of dots and repeated line ratio descriptors are among the top descriptors in pop, rock, rap, and R&B. Four descriptors are featured in the top descriptors of three genres each ( ratio verses to sections , ratio choruses to sections , average token length , number of difficult words (words with three or more syllables)). Interestingly, when considering a higher abstraction level (i.e., descriptor categories such as lexical, emotion, structure, rhyme, readability, or diversity), we observe that lexical descriptors emerge as the predominant category across all five genres. Furthermore, at least one rhyme descriptor is also among the top descriptors for each genre. Four out of the five genres also feature at least one structural descriptor, while R&B does not. While for pop, the top-10 descriptors contain lexical, structural, and rhyme descriptors, rock additionally features a readability descriptor. For rap, country, and R&B, five categories of descriptors are within the top 10. Interestingly, descriptors measuring the lexical diversity of lyrics are among the top descriptors for rap ( Summer’s S that essentially captures the ratio of token types and token count; and Measure of Textual Lexical Diversity MTLD that captures the average length of sequential token strings that fulfill a type-token-ratio threshold), country ( compression ratio , i.e., ratio of the size of zlib compressed lyrics compared to the original, uncompressed lyrics; and Summer’s S ), and R&B ( Summer’s S ). Emotion descriptors only occur among the most important descriptors for country ( positive emotion ) and R&B ( positive emotions and anger ). Readability descriptors are among the top descriptors for rock, rap, and notably, the second most important descriptor for R&B (number of difficult words ).

figure 2

Evolution over time for the descriptors repeated line ratio and ratio chorus to sections for each genre. The linear regression lines (in red) show the evolution of descriptor values over time for each descriptor and genre (Huber’s M robust regression models are trained individually for each descriptor and genre combination). Blue colors denote the density of data points in a given region. R \(^2\) and p-values are provided in the yellow boxes.

Figure 2 shows the distribution of descriptor values for repeated line ratio and ratio of choruses to sections over time, separately for each of the five genres. Each genre is analyzed separately, with a robust regression model trained for each descriptor-genre combination; the resulting regression lines are depicted in red. The repeated line ratio increases over time for all five genres, indicating that lyrics are becoming more repetitive. This further substantiates previous findings that lyrics are increasingly becoming simpler 11 and that more repetitive music is perceived as more fluent and may drive market success 52 . The strongest such increase can be observed for rap (slope \(m = 0.002516\) ), whereas the weakest increase is displayed by country ( \(m = 0.000640\) ). The ratio of chorus to sections descriptor behaves similarly across different genres. The values for this descriptor have increased for all five genres. This implies that the structure of lyrics is shifting towards containing more choruses than in the past, in turn contributing to higher repetitiveness of lyrics. We see the strongest growth in the values of this descriptor for rap ( \(m = 0.008703\) ) and the weakest growth for R&B ( \(m = 0.000325\) ). The fact that the compression ratio descriptor (not shown in the figure) also shows an increase in all genres except R&B further substantiates the trend toward more repetitive lyrics. Another observation is that the lyrics seem to become more personal overall. The pronoun frequency is increasing for all genres except one ( country with \(m = -0.000145\) ). The strongest increase can be observed for rap ( \(m = 0.000926\) ), followed by pop ( \(m = 0.000831\) ), while rock ( \(m = 0.000372\) ) and R&B ( \(m = 0.000369\) ) show a moderate increase. Furthermore, our analysis shows that lyrics have become angrier across all genres, with rap showing the most profound increase in anger ( \(m = 0.015996\) ). 
Similarly, the amount of negative emotions conveyed also increases across all genres. Again rap shows the highest increase ( \(m = 0.021701\) ), followed by R&B ( \(m = 0.018663\) ), while country shows the lowest increase ( \(m = 0.000606\) ). At the same time, we witness a decrease in positive emotions for pop ( \(m = -0.020041\) ), rock ( \(m = -0.012124\) ), country ( \(m = -0.021662\) ), and R&B ( \(m = -0.048552\) ), while rap shows a moderate increase ( \(m = 0.000129\) ).

Analysis 2: interplay of lyrics descriptors, view counts, and release year

The second set of analyses first aims at investigating the interplay between lyrics descriptors, release year, and listening as well as lyrics view count. The employed multinomial logistic regression fits the data significantly better than the baseline model, i.e., a null model without predictors, indicating an increase in the explained variability (likelihood ratio chi-square of 314.56 with \(p<.0001\) and McFadden pseudo R \(^2\) of 0.01).

To assess the effect of the predictors, the genre class rap (i.e., the one with the highest average lyrics view count) is considered as the reference class of the dependent variable. Our results show that the probability of a song being from country or rock instead of rap, according to its lyrics view count, varies across decades. As lyrics view count increases, the effect of the year slightly increases (by a factor of 1.07 in the odds) the probability of a song being from country instead of from rap: \(\beta (SE)=0.07(0.02)\), \(z=3.29\), \(p=.0009\). In contrast, as lyrics view count increases, the effect of a rising year decreases (by a factor of 0.94 in the odds) the probability of a song being from rock instead of from rap: \(\beta (SE)=-0.06(0.01)\), \(z=-5.89\), \(p<.0001\). In other words, the lyrics of older rock songs are generally more popular than new ones in comparison to rap, and vice versa for country. This is visualized in Fig. 3 , showing the estimated effects of the multinomial logistic regression model. The interaction between lyrics view count and release year did not show a significant effect for rap with respect to pop and R&B (cf. date * popularity for pop and R&B). In contrast, for country and rock, the estimated regression coefficients are positive and negative, respectively (cf. date * popularity for country and rock). This shows that, compared to rap, lyrics’ popularity increases over time for country, while decreasing for rock. The same analysis is performed considering song listening count instead of lyrics view count, i.e., we perform multinomial logistic regression considering genre as the dependent variable and the interaction between listening count and release date as predictors. This analysis shows that the release date does not affect the relationship between listening count and genre, as no significant effects are shown.
While track listening counts do not show any effects, lyrics view counts do indeed show effects, suggesting that for some musical genres, fans’ interest in lyrics goes beyond their listening consumption. In other words, while the play counts for a given genre might not significantly differ, when it comes to the listeners’ interest in lyrics, clear patterns are displayed: rock fans mostly enjoy lyrics from older songs; country fans are more interested in new songs’ lyrics. However, the small determination coefficient of the multinomial logistic regression shows that the explanatory power of the model is limited; thus, the documented significant p-values might partially result from random noise, given the huge sample size.
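The odds figures quoted above follow from exponentiating the logistic regression coefficients: a coefficient \(\beta\) multiplies the odds (relative to the reference class) by \(e^{\beta}\) per unit of the predictor, and the Wald statistic is \(z = \beta / SE\). A small worked sketch, with hypothetical helper names:

```python
import math


def odds_ratio(beta):
    """Multiplicative change in the odds per unit increase of the predictor."""
    return math.exp(beta)


def wald_z(beta, se):
    """Wald z statistic for a coefficient and its standard error."""
    return beta / se


print(round(odds_ratio(0.07), 2))  # country vs. rap coefficient from the text
print(round(math.log(0.94), 3))    # coefficient implied by an odds factor of 0.94
print(wald_z(0.07, 0.02))          # from rounded inputs, so only approximate
```

Because the published \(\beta\) and SE are rounded to two decimals, a z value recomputed this way will not exactly match the reported one.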

figure 3

Forest plot displaying the estimated multinomial logistic regression coefficients (standardized beta) for the prediction of musical genre. Rap, i.e., the genre with the highest average lyrics view count, is used as the reference class.

In Table 4, the results of the individual multiple linear regressions performed for each genre are given. We find that rap is the genre for which the predictors explain the most variance in the release year (the dependent variable): \(32\%\) (i.e., \(R^{2}=0.32\)) of the variance in the release year for rap can be explained by the descriptors extracted from the lyrics. This is not surprising as rap, characterized by the use of semi-spoken rhymes, is a musical style that has grown in the context of practices marked by high-level linguistic competencies, such as competitive verbal games 53. Indeed, among the evaluated musical genres, rap is the one in which lyrics play the most prominent role. The genre with the second-highest amount of variance in the release year explained by the lyrics descriptors is R&B (\(R^{2}=0.20\)). This might be explained, from a musicological perspective, by taking into account the relationship between R&B and other musical genres. For instance, R&B was simplified concerning the lyrics (besides the music) by eliminating adult-related themes and topics 54. As such, it was a precursor to the development of rock-and-roll, which explains the higher importance of lyrics in modeling the evolution of R&B with respect to rock. Note that although we utilize R&B as a general musical genre, other terms subsequently introduced relating to R&B, such as soul, are also considered under the umbrella of R&B. This is particularly important as we investigate music released in more recent decades (1990–2020) with respect to the time when the term R&B was originally coined. At the same time, it highlights the importance of historically contextualizing the musical genres assessed in the comparison, since beyond their intrinsic characteristics (e.g., lyrics having a central role in rap), their heterogeneous nature might also have an impact on the role played by lyrics. In this case, R&B is more heterogeneous than rock (e.g., blues and funk are highly repetitive, while soul has undergone substantial changes and is now lyrically in pop song form). For pop, rock, and country, the amount of variance in release year explained by the predictors is lower than for rap and R&B, with an \(R^{2}\) value of 0.09 for pop and rock, and 0.11 for country. This indicates that, unlike for rap, and to some extent R&B, lyrics might not be a very meaningful indicator of the development of other musical genres.
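
To make the meaning of the reported \(R^{2}\) values concrete, the following minimal sketch fits an ordinary least-squares model that predicts a release year from a few descriptors and computes the share of variance explained. It is an illustration only: the data are synthetic, the helper name `r_squared` is hypothetical, and plain least squares is used here rather than the exact modeling pipeline of the study.

```python
import numpy as np

def r_squared(X, y):
    """Fit least squares with an intercept and return the share of variance explained."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return 1.0 - ss_res / ss_tot

# Synthetic example (assumption): three standardized descriptors, of which
# only the first one is related to the release year.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 3))
release_year = 2000 + 5 * descriptors[:, 0] + rng.normal(scale=10, size=500)

r2 = r_squared(descriptors, release_year)
print(f"R^2 = {r2:.2f}")
```

An \(R^{2}\) of 0.32, as reported for rap, would mean that roughly a third of the variation in release year is accounted for by the lyrics descriptors, with the remainder left unexplained.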

The results show that lyrics view count has a relevant effect in predicting the release year of songs only for R&B and rock. For R&B, there is a positive relationship between release year and lyrics view count: \(\beta = 0.30\), \(p=.003\); cf. lyrics view count for R&B in Table 4. This indicates that new songs are more popular than old ones in terms of lyrics views. In contrast, for rock, as expected from the outcomes of the multinomial logistic regression, a strong negative relationship between release year and lyrics view count is shown: \(\beta = -1.47\), \(p<.001\); cf. lyrics view count for rock in Table 4. This indicates that old rock songs are more popular than recent ones, which can be interpreted from a sociological perspective. Unlike pop, which can be seen as more “commercial” and “ephemeral”, targeting a young audience and whose value is typically measured by record sales, rock has commonly targeted a middle-class audience more interested in tradition and often (ideologically) opposed to commercialism 55.

Properties of the lyrics related to complexity and readability, i.e., indicators of the repetitiveness and the difficulty associated with the understanding of a written text, respectively, seem to exhibit meaningful changes over time for rap, and to a lesser extent for pop, rock, and R&B. Confirming previous work 11, the complexity and difficulty of the lyrics seem to decrease with time for some musical genres. Concerning complexity, this is displayed by the positive \(\beta\) for compression ratio (essentially capturing the repetitiveness of lyrics) shown by both rap and rock (cf. \(\beta = 1.15\) and \(\beta = 0.82\), respectively, in Table 4). This indicates that the lyrics of these two genres become easier to comprehend over time, something that can be interpreted as a sign of increasing repetitiveness and, therefore, simplicity. However, the opposite trend is shown for R&B (cf. \(\beta = -0.73\) in Table 4), which suggests that the simplification over time might depend on the musical genre; indeed, this descriptor is relevant neither for pop nor for country. The decline in lyrics’ difficulty observed over time for rap is confirmed by the negative \(\beta\) for the Simple Measure of Gobbledygook (SMOG) readability measure (in a sample of 30 sentences, words with three or more syllables are counted and used to compute the final SMOG score). This indicates a decrease in complexity concerning the lyrics’ readability (cf. \(\beta = -0.64\) in Table 4). This contradicts, to some extent, the increasing use of difficult words over time shown for rap (\(\beta = 0.57\)), while supporting the increase in complexity shown for R&B (\(\beta = 2.08\)); cf. \(\beta\) for difficult words in Table 4. This contradiction supports, however, the conclusion drawn from the compression ratio, which shows that the lyrics become more repetitive. Thus, it seems that the increase (in absolute terms) of difficult words is only due to repetitions in the lyrics. 
In contrast, when weighting the number of difficult words according to the number of sentences (which is done when computing SMOG), the effect is negative, meaning that in proportion to the number of sentences, the complexity of the text actually decreases with time. The increase in readability over time is also confirmed for rock, as shown by the positive slope for the Dale-Chall readability score (cf. \(\beta = 1.07\) in Table 4). Dale-Chall is computed based on a list of 3000 words that fourth-graders should be familiar with: the number of words contained in this list of easy words is counted and used as input to the score computation.
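
The two measures discussed above can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the implementation used in the study: the compression ratio is assumed to be the original size divided by the zlib-compressed size (so that more repetitive text yields a higher value, in line with the interpretation above), and the SMOG grade uses the commonly published formula, taking precomputed counts of polysyllabic words and sentences as input. Both function names are hypothetical.

```python
import math
import zlib

def compression_ratio(lyrics: str) -> float:
    """Original size over compressed size; higher means more repetitive (assumed definition)."""
    raw = lyrics.encode("utf-8")
    if not raw:
        return 1.0
    return len(raw) / len(zlib.compress(raw, 9))

def smog_grade(polysyllable_count: int, sentence_count: int) -> float:
    """Commonly published SMOG formula; polysyllables are words with 3+ syllables."""
    # The count is scaled to a sample of 30 sentences before taking the root.
    return 1.0430 * math.sqrt(polysyllable_count * 30 / sentence_count) + 3.1291

# Made-up inputs: a highly repetitive text and a text with little repetition.
repetitive = "la la la la\n" * 50
varied = " ".join(str(i * i) for i in range(150))

print(compression_ratio(repetitive) > compression_ratio(varied))
print(round(smog_grade(polysyllable_count=30, sentence_count=30), 2))
```

Because SMOG normalizes the polysyllable count by the number of sentences, a text can contain more difficult words in absolute terms and still receive a lower grade if the number of sentences grows faster, which matches the explanation given above.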

As expected, the results also show that lexical descriptors have a more prominent role in rap, i.e., the musical genre for which lyrics are most relevant. Indeed, when calculating the predictors block-wise across feature types, this is the type of feature showing the highest adjusted R\(^{2}\): 0.22 for rap, followed by 0.13 for R&B. Block-wise adjusted R\(^{2}\) values per genre for each feature type are as follows. For rap: Complexity (0.04), Readability (0.06), Lexical (0.22), Structure (0.10), Rhyme (0.13), Emotion (0.02); for pop: Readability (0.004), Lexical (0.06), Structure (0.02), Rhyme (0.01), Emotion (0.01); for rock: Complexity (0.01), Readability (0.01), Lexical (0.04), Structure (0.03), Rhyme (0.01); for R&B: Complexity (0.01), Readability (0.01), Lexical (0.13), Structure (0.02), Rhyme (0.02), Emotion (0.04); for country: Readability (0.014), Lexical (0.09), Structure (0.02), Rhyme (0.02), Emotion (0.01). Repeated line ratio is the only descriptor showing a meaningful impact for all the genres, confirming the results of Analysis 1. The relationship between this descriptor and the release year is positive for all of them (cf. positive \(\beta\) in Table 4), which indicates that lyrics become more repetitive over time in all the evaluated genres. This trend is confirmed by the negative relationship between release year and the Maas score, a measure of lexical diversity proposed by Maas 56 (the score models the type-token ratio, i.e., the ratio of the number of unique terms to the total number of words, on a log scale). This relationship is shown for all the genres except country (for which this descriptor is not included in the model as it did not show a significant contribution) and indicates that vocabulary richness decreases with time (cf. negative \(\beta\) for Maas in Table 4). As already mentioned, step-wise backward elimination is used to find the best-fitting model for each musical genre. 
The trend toward simplicity over time can also be observed in the structure, which shows a decrease in the number of sections, most prominently for R&B and rock (cf. \(\beta = -0.75\) and \(\beta = -0.70\), respectively, in Table 4), as well as a general increase (except for country) in the ratios of verses to choruses and of verses to sections (cf. positive \(\beta\) for ratio verses to sections and ratio chorus to sections in Table 4). Similarly, the results for rhyme-related descriptors further confirm the tendency towards simpler lyrics over time for all musical genres. This is particularly shown by the increase of the rhyme percent in rap and R&B (cf. \(\beta = 1.34\) and \(\beta = 0.68\), respectively) and by a decrease in the number of rhyme words (cf. negative \(\beta\) for all the genres), which shows a decline in the variety of rhymes over time. However, for block-wise predictors, slightly higher adjusted R\(^{2}\) values for structure and rhyme are only shown for rap (0.10 and 0.13, respectively), but not for the other musical genres.
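
As a rough illustration of two of the descriptors mentioned above, the sketch below computes a repeated line ratio and a plain type-token ratio for a short text. The exact definitions used in the study are not restated here, so both functions are assumptions: the repeated line ratio is taken to be the share of non-empty lines that duplicate another line, and tokens are obtained by simple whitespace splitting. The function names and the example text are hypothetical.

```python
def repeated_line_ratio(lyrics: str) -> float:
    """Share of non-empty lines that are duplicates (assumed definition)."""
    lines = [ln.strip().lower() for ln in lyrics.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return 1.0 - len(set(lines)) / len(lines)

def type_token_ratio(lyrics: str) -> float:
    """Number of unique tokens divided by the total number of tokens."""
    tokens = lyrics.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Made-up example with a heavily repeated line.
chorus_heavy = "hold on\nhold on\nhold on\nlet go"

print(repeated_line_ratio(chorus_heavy))  # 4 lines, 2 unique
print(type_token_ratio(chorus_heavy))     # 8 tokens, 4 unique
```

A higher repeated line ratio and a lower type-token ratio both point in the same direction: more repetition and a smaller vocabulary relative to the length of the text.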

Concerning emotion descriptors, the musical genre in which these play the most important role is rap, followed by R&B. For R&B, the results show that the content of the lyrics becomes more negative with time, as shown by the increase in concepts related to anger and a decrease in positive emotions (cf. \(\beta = 1.75\) and \(\beta =-0.86\), respectively, in Table 4). In contrast, for rap, there is a general increase in the use of emotion-related words with time, both negative and positive (cf. positive \(\beta\) for all the emotion descriptors), which indicates a tendency towards the use of more emotional words. Confirming outcomes from previous work 13, a tendency toward more negative lyrics over time is displayed not only for R&B but also for pop and country; for rock, emotion seems to play a negligible role in the evolution of lyrics. As a final note, we would like to emphasize that since both the overall and block-wise adjusted R\(^{2}\) values are very low, these results should be interpreted cautiously and taken as tendencies rather than strong differences; they could also partly result from non-randomness in subsampling.
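
For readers less familiar with the adjusted R\(^{2}\) referred to above, the following sketch shows the standard adjustment, which penalizes the plain \(R^{2}\) for the number of predictors in the model. The sample size and number of predictors in the example are made up for illustration and are not taken from the study.

```python
def adjusted_r_squared(r2: float, n_samples: int, n_predictors: int) -> float:
    """Standard adjusted R^2: corrects R^2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical numbers: R^2 of 0.32 with 1000 songs and 10 predictors.
print(round(adjusted_r_squared(0.32, n_samples=1000, n_predictors=10), 4))
```

The adjusted value is always at most the plain \(R^{2}\), and the gap grows as more predictors are added relative to the number of observations.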

Result summary for both studies

Regarding RQ1 (Which trends can we observe when correlating multifaceted lyrics descriptors with temporal aspects in an evolution analysis?), we come to the following conclusion: Despite minor contradictory outcomes concerning complexity and readability for rap and rock in comparison to pop and R&B, the interpretation of the lyrics’ lexical component, structure, and rhyme, for all investigated genres, generally shows that lyrics are becoming simpler over time 11, as shown by both analyses. This is shown by a decline in vocabulary richness for some specific genres, i.e., rap and rock, and by a general increase in repetitiveness for all the evaluated musical styles. Besides this, lyrics seem to become more emotional with time for rap, and less positive for R&B, pop, and country. Also, we observe a trend towards angrier lyrics across all genres. Potential reasons for the trend towards simpler lyrics are discussed by Varnum et al. 11. They speculate that this might also be related to how music is consumed, technological innovation, or the fact that music is mostly listened to in the background. As for RQ2 (Which role does the popularity of songs and lyrics play in this scenario?), we conclude that while song listening counts do not show any effects, lyrics view counts do indeed show effects. This suggests that for rap, rock, and country, lyrics play a more pronounced role than for other genres and that listeners’ interest in lyrics goes beyond musical consumption itself.

Limitations

While our analyses resulted in interesting insights, they have certain limitations, which we would like to discuss in the following. Most of these relate to the various challenges pertaining to data acquisition, and the resulting biases in the data we investigated.

The two main data sources for our investigation are last.fm and Genius. Given the nature and history of these platforms, in particular last.fm, the studied LFM-2b dataset is affected by community bias and popularity bias. As for community bias, while last.fm does not release official statistics of their users, research studies conducted on large amounts of publicly available demographic and listening data have shown that last.fm’s user base is not representative of the global (or even Western) population of music listeners 28, 29. In particular, the last.fm community represents music aficionados who rely on music streaming for everyday music consumption. They are predominantly male and between 20 and 30 years old. The community is also strongly dominated by users from the US, Russia, Germany, and the UK 29, whose music taste does not generalize to the population at large 57. The findings of our analyses, in particular Analysis 2, which investigates user-generated music consumption data, are therefore valid only for this particular subset of music listeners. Also directly related to our data sources, particularly the Genius platform, is the genre information used in our analyses. Annotators and editors on the Genius platform may assign one of six high-level genres and an arbitrary number of so-called secondary tags (i.e., subgenres) to each song. While the alignment of genre and subgenre assignments is quality-checked by the community, no genre hierarchy is used to check the validity of these assignments, which can introduce malformed genre assignments.

In addition, the last.fm data on listening counts and the Genius data on lyrics view counts are prone to popularity bias. More precisely, these counts for songs released before the emergence of the platforms (2002 and 2009, respectively, for last.fm and Genius) underestimate the true frequencies of listening and lyrics viewing. This is due partly to the demographic structure of the platforms’ users (see above) and partly to the fact that a majority of vinyl and cassette (and even some CD) releases have never found their way onto these digital online platforms. This kind of popularity bias in our investigated data might have significantly influenced the trends identified for the 1970s to 2000s. However, only Analysis 2 might have been affected by this since popularity estimates are not used in Analysis 1. And even for Analysis 2, popularity values are Z-score normalized, which to some extent accounts for this kind of bias. Still, it should also be pointed out that the randomization strategy might have led to a sampling bias in terms of popularity. This might have partly affected the results, eventually introducing a bias for the genres pop and country, whose results should, due to the mentioned limitation, be interpreted with particular caution.
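
The Z-score normalization mentioned above can be sketched as follows. This is a generic illustration with made-up view counts; how the values were grouped before normalization in the study (for instance, per genre or per platform) is not specified here, and the function name `z_score` is hypothetical.

```python
import numpy as np

def z_score(values):
    """Center values at zero and scale them to unit standard deviation."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # All values identical: no spread to normalize.
        return np.zeros_like(values)
    return (values - values.mean()) / std

# Made-up lyrics view counts for five songs.
view_counts = [120, 4500, 980, 15000, 310]
normalized = z_score(view_counts)
print(normalized.round(2))
```

After normalization, each value expresses how far a song lies from the mean in units of standard deviation, so that counts measured on very different scales become comparable.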

Both limitations, related to demographic bias and popularity bias, could be overcome by resorting to other data sources, notably the often-used Billboard Charts. However, using this data would introduce other distortions, among others, a highly US-centric view of the world, a much more limited sample size, and a lower granularity of the popularity figures (only ranks instead of play counts). In addition, Billboard Charts are only indicative of music consumption, not of lyrics viewing, which we particularly study in this paper.

Another limitation of the work at hand is the restriction to English lyrics. This choice had to be made to ensure a language-coherent sample of songs and, consequently, the comparability of results. While some of the descriptors could have been computed for other languages as well, due to the different characteristics of languages (e.g., different lexical structures), a cross-language comparison of the descriptors would not be meaningful. Also, most of the resources required to compute the readability scores and emotional descriptors are only available for English. Nevertheless, in future work, we could include more languages and conduct analyses on songs within each language class using a limited set of suitable descriptors.

Furthermore, we acknowledge the changing record distribution landscape as a further limitation of this work. These changes are shown, for instance, by the IFPI’s Global Music Report 2023 58, which provides evidence of the decline of physical sales revenue versus the steady increase in streaming revenue in the last two decades. This has changed not only, for instance, the number of songs on an album, as this was physically restricted on vinyl or CDs, but also the way songs are sequenced 59. On streaming platforms, a song is considered consumed if it is played for at least 30 seconds. Hence, artists aim to start their songs with easily identifiable melodies and lyrics.

Regarding the models employed, we note that these models assume that the changes in individual features across the analyzed genres are linear. While the change of some of the descriptors has been shown to be linear (for instance, lyrics simplicity (compressibility) 11 or brightness, happiness, or danceability 23), this might not be the case for all of the descriptors we employ in our studies. For instance, concreteness has been shown to decline until the 1990s and then increase 9.
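
To illustrate why the linearity assumption matters, the sketch below fits a straight line and a quadratic curve to a synthetic descriptor that first falls and then rises. The data are entirely made up (a symmetric curve with an arbitrary turning point in 1995) and only loosely mimic the kind of decline-then-increase pattern described above; the function name `fit_r2` is hypothetical.

```python
import numpy as np

# Synthetic descriptor (assumption): falls until 1995, then rises again.
years = np.arange(1970, 2021)
t = (years - 1995) / 25.0   # rescaled time, from -1 to 1
descriptor = t ** 2

def fit_r2(x, y, degree):
    """Fit a polynomial of the given degree and return the variance explained."""
    coeffs = np.polyfit(x, y, degree)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

linear_r2 = fit_r2(t, descriptor, 1)
quadratic_r2 = fit_r2(t, descriptor, 2)
print(f"linear: {linear_r2:.3f}, quadratic: {quadratic_r2:.3f}")
```

In this constructed case the linear model explains essentially none of the variation although the descriptor changes substantially over time, which shows how a purely linear model can miss a trend that reverses direction.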

Our study examines the evolution of song lyrics over five decades and across five genres. From a dataset of 353,320 songs, we extracted lexical, linguistic, structural, rhyme, emotion, and complexity descriptors and conducted two complementary analyses. In essence, we find that lyrics have become simpler over time regarding multiple aspects of lyrics: vocabulary richness, readability, complexity, and the number of repeated lines. Our results also confirm previous research that found that lyrics have become more negative on the one hand, and more personal on the other. In addition, our experimental outcomes show that listeners’ interest in lyrics varies across musical genres and is related to the songs’ release year. Notably, rock listeners enjoy lyrics from older songs, while country fans prefer lyrics from new songs.

We believe that the role of lyrics has been understudied and that our results can be used to further study and monitor cultural artifacts and shifts in society. For instance, we could combine and compare the studies on the changing sentiment in societies and shifts in the use of emotionally loaded words and the sentiment expressed in the lyrics consumed by the different audiences (age, gender, country/state/region, educational background, economic status, etc.). From a computational perspective, establishing a deeper understanding of lyrics and their evolution can inform further tasks in music information retrieval and recommender systems. For instance, existing user models could be extended to also include the lyric preferences of users, allowing for better capturing of user preferences and intent, and ultimately improving retrieval tasks such as personalized music access and recommender systems.

Data availability

The datasets generated and analyzed during the current study are available on Zenodo: https://doi.org/10.5281/zenodo.7740045 . The source code utilized for our analyses is available at https://github.com/MaximilianMayerl/CorrelatesOfSongLyrics .

References

Donald, R., Kreutz, G., Mitchell, L. & MacDonald, R. What is music health and wellbeing and why is it important? In Music, Health, and Wellbeing 3–11 (Oxford University Press, 2012).

Schäfer, T., Sedlmeier, P., Städtler, C. & Huron, D. The psychological functions of music listening. Front. Psychol. 4 , 1–34. https://doi.org/10.3389/fpsyg.2013.00511 (2013).

Lonsdale, A. J. & North, A. C. Why do we listen to music? A uses and gratifications analysis. Br. J. Psychol. 102 , 108–134. https://doi.org/10.1348/000712610x506831 (2011).

Gross, J. Emotion regulation: Conceptual and empirical foundations. In Handbook of Emotion Regulation 2nd edn (ed. Gross, J.) 1–19 (The Guilford Press, 2007).

Demetriou, A. M., Jansson, A., Kumar, A. & Bittner, R. M. Vocals in music matter: The relevance of vocals in the minds of listeners. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 514–520 (ISMIR, 2018).

Ali, S. O. & Peynircioğlu, Z. F. Songs and emotions: Are lyrics and melodies equal partners?. Psychol. Music 34 , 511–534 (2006).

Brattico, E. et al. A functional MRI study of happy and sad emotions in music with and without lyrics. Front. Psychol. 2 , 308 (2011).

Moi, C. M. F. Rock poetry: The literature our students listen to. J. Imagin. Lang. Learn. 2 , 56–59 (1994).

Choi, K. & Stephen Downie, J. A. Trend analysis on concreteness of popular song lyrics. In 6th International Conference on Digital Libraries for Musicology 43–52 (ACM, 2019). https://doi.org/10.1145/3358664.3358673 .

The nobel prize in literature 2016—Bob Dylan. https://www.nobelprize.org/prizes/literature/2016/dylan/facts/ (2016).

Varnum, M. E. W., Krems, J. A., Morris, C., Wormley, A. & Grossmann, I. Why are song lyrics becoming simpler? A time series analysis of lyrical complexity in six decades of American popular music. PLoS ONE 16 , e0244576. https://doi.org/10.1371/journal.pone.0244576 (2021).

Lambert, B. et al. The pace of modern culture. Nat. Hum. Behav. 4 , 352–360. https://doi.org/10.1038/s41562-019-0802-4 (2020).

Brand, C. O., Acerbi, A. & Mesoudi, A. Cultural evolution of emotional expression in 50 years of song lyrics. Evolut. Hum. Sci. 1 , e11. https://doi.org/10.1017/ehs.2019.11 (2019).

Eastman, J. T. & Pettijohn, T. F. Gone country: An investigation of Billboard country songs of the year across social and economic conditions in the United States. Psychol. Pop. Media Cult. 4 , 155–171. https://doi.org/10.1037/ppm0000019 (2015).

Pettijohn, T. F. & Sacco, D. F. The language of lyrics. J. Lang. Soc. Psychol. 28 , 297–311. https://doi.org/10.1177/0261927x09335259 (2009).

Schedl, M. Genre differences of song lyrics and artist wikis: An analysis of popularity, length, repetitiveness, and readability. In The World Wide Web Conference on—WWW ’19 3201–3207 (ACM, 2019). https://doi.org/10.1145/3308558.3313604 .

Dodds, P. S. & Danforth, C. M. Measuring the happiness of large-scale written expression: Songs, blogs, and presidents. J. Happiness Stud. 11 , 441–456. https://doi.org/10.1007/s10902-009-9150-9 (2009).

DeWall, C. N., Pond, R. S., Campbell, W. K. & Twenge, J. M. Tuning in to psychological change: Linguistic markers of psychological traits and emotions over time in popular U.S. song lyrics. Psychol. Aesthet. Creativity Arts 5 , 200–207. https://doi.org/10.1037/a0023195 (2011).

Berger, J. & Packard, G. Are atypical things more popular?. Psychol. Sci. 29 , 1178–1184. https://doi.org/10.1177/0956797618759465 (2018).

Kim, J., Demetriou, A. M., Manolios, S., Tavella, M. S. & Liem, C. C. S. Butter lyrics over hominy grit: Comparing audio and psychology-based text features in MIR tasks. In Proceedings of the 21th International Society for Music Information Retrieval Conference, ISMIR 861–868 (2020).

Napier, K. & Shamir, L. Quantitative sentiment analysis of lyrics in popular music. J. Pop. Music Stud. 30 , 161–176 (2018).

Christenson, P. G., de Haan-Rietdijk, S., Roberts, D. F. & ter Bogt, T. F. What has America been singing about? Trends in themes in the us top-40 songs: 1960–2010. Psychol. Music 47 , 194–212 (2019).

Interiano, M. et al. Musical trends and predictability of success in contemporary songs in and out of the top charts. R. Soc. Open Sci. 5 , 171274. https://doi.org/10.1098/rsos.171274 (2018).

Schedl, M. The LFM-1b dataset for music retrieval and recommendation. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (ACM, 2016). https://doi.org/10.1145/2911996.2912004 .

Vigliensoni, G. & Fujinaga, I. The music listening histories dataset. In Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 96–102 (2017).

Huang, J. et al. Taxonomy-aware multi-hop reasoning networks for sequential recommendation. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM ’19 573–581 (ACM, 2019). https://doi.org/10.1145/3289600.3290972 .

Huang, J., Zhao, W. X., Dou, H., Wen, J. -R. & Chang, E. Y. Improving sequential recommendation with knowledge-enhanced memory networks. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (2018).

Schedl, M. et al. LFM-2b: A dataset of enriched music listening events for recommender systems research and fairness analysis. In ACM SIGIR Conference on Human Information Interaction and Retrieval 337–341 (2022).

Schedl, M. Investigating country-specific music preferences and music recommendation algorithms with the lfm-1b dataset. Int. J. Multim. Inf. Retr. 6 , 71–84. https://doi.org/10.1007/s13735-017-0118-y (2017).

Mayer, R., Neumayer, R. & Rauber, A. Combination of audio and lyrics features for genre classification in digital audio collections. In Proceeding of the 16th ACM International Conference on Multimedia 337–342 (ACM Press, 2008). https://doi.org/10.1145/1459359.1459382 .

Fell, M. Lyrics classification. Master’s Thesis, Saarland University (2014).

Hu, X. & Downie, J. S. Improving mood classification in music digital libraries by combining lyrics and audio. In Proceedings of the 10th annual joint conference on Digital libraries—JCDL ’10 159–168 (ACM Press, 2010). https://doi.org/10.1145/1816123.1816146 .

Tweedie, F. J. & Baayen, R. H. How variable may a constant be? Measures of lexical richness in perspective. Comput. Hum. 32 , 323–352. https://doi.org/10.1023/a:1001749303137 (1998).

McCarthy, P. M. & Jarvis, S. MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behav. Res. Methods 42 , 381–392. https://doi.org/10.3758/brm.42.2.381 (2010).

Stamatatos, E. A survey of modern authorship attribution methods. J. Am. Soc. Inf. Sci. Technol. 60 , 538–556 (2009).

Morris, C. Are pop lyrics getting more repetitive? https://pudding.cool/2017/05/song-repetition (2017).

Malheiro, R., Panda, R., Gomes, P. & Paiva, R. P. Emotionally-relevant features for classification and regression of music lyrics. IEEE Trans. Affective Comput. 9 , 240–254. https://doi.org/10.1109/taffc.2016.2598569 (2018).

DuBay, W. H. The principles of readability. Online Submission, Institute of Education Sciences (2004).

Tausczik, Y. R. & Pennebaker, J. W. The psychological meaning of words: LIWC and computerized text analysis methods. J. Lang. Soc. Psychol. 29 , 24–54. https://doi.org/10.1177/0261927x09351676 (2009).

Mihalcea, R. & Strapparava, C. Lyrics, music, and emotions. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning 590–599 (2012).

Huber, P. J. Robust Statistics Vol. 523 (John Wiley & Sons, 2004).

Seabold, S. & Perktold, J. Statsmodels: Econometric and statistical modeling with python. In Proceedings of the 9th Python in Science Conference Vol. 57, 10–25080 (Austin, 2010).

Hunter, J. D. Matplotlib: A 2d graphics environment. Comput. Sci. Eng. 9 , 90–95 (2007).

Virtanen, P. et al. Scipy 1.0: Fundamental algorithms for scientific computing in python. Nat. Methods 17 , 261–272 (2020).

Field, A., Miles, J. & Field, Z. Discovering Statistics Using R (Sage Publications, 2012).

Bagheri, A., Midi, H. & Imon, A. R. Two-step robust diagnostic method for identification of multiple high leverage points. J. Math. Stat. 5 , 97–106. https://doi.org/10.3844/jms2.2009.97.106 (2009).

R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2021).

Croissant, Y. Estimation of random utility models in R: The mlogit package. J. Stat. Softw. 95 , 1–41 (2020).

Pinheiro, J. et al. Package ‘nlme’. Linear Nonlinear Mixed Effects Models Version 3 , 274 (2017).

Hothorn, T. et al. Package ‘multcomp’. Simultaneous inference in general parametric models. Project for Statistical Computing, Vienna, Austria (2016).

Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).

Nunes, J. C., Ordanini, A. & Valsesia, F. The power of repetition: Repetitive lyrics in a song increase processing fluency and drive market success. J. Consum. Psychol. 25 , 187–199 (2015).

Toop, D., Cheney, C. & Kajikawa, L. Rap. In New Grove Music online (Oxford music online).

Rye, H. Rhythm-and-blues. In New Grove Music online (Oxford music online).

Middleton, R. Rock. In New Grove Music online (Oxford music online).

Maas, H.-D. Über den Zusammenhang zwischen Wortschatzumfang und Länge eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik 2 , 73 (1972).

Lamere, P. Social tagging and music information retrieval. J. New Music Res. 37 , 101–114 (2008).

International Federation of the Phonographic Industry. Global music report 2023. https://globalmusicreport.ifpi.org/ (2023).

Rolling Stone. Think you have a hit? Make sure it’s the first song on your album. https://www.rollingstone.com/music/music-features/why-the-first-song-on-the-album-is-the-best-803283/ (2019).

Miller, G. A. WordNet: A lexical database for English. Commun. ACM 38 , 39–41 (1995).

Nielsen, F. Å. A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903 (2011).

Hu, M. & Liu, B. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 168–177 (2004).

Wilson, T., Wiebe, J. & Hoffmann, P. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing 347–354 (2005).

Go, A., Bhayani, R. & Huang, L. Twitter sentiment classification using distant supervision. CS224N project report Stanford 1 , 2009 (2009).

Sebastiani, F. & Esuli, A. Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th International Conference on Language Resources and Evaluation 417–422 (European Language Resources Association (ELRA), 2006).

Mohammad, S. M. & Turney, P. D. NRC emotion lexicon. Natl. Res. Counc. Can. 2 , 234 (2013).

Dodds, P. S., Harris, K. D., Kloumann, I. M., Bliss, C. A. & Danforth, C. M. Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter. PLoS ONE 6 , e26752 (2011).

Acknowledgements

This research was funded in whole, or in part, by the Austrian Science Fund (FWF): 10.55776/P33526 ( https://doi.org/10.55776/P33526 ) and 10.55776/DFH23 ( https://doi.org/10.55776/DFH23 ). For open access purposes, the author has applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission.

Author information

These authors contributed equally: Emilia Parada-Cabaleiro and Maximilian Mayerl.

Authors and Affiliations

Multimedia Mining and Search Group, Institute of Computational Perception, Johannes Kepler University Linz, Linz, Austria

Markus Schedl

Department of Music Pedagogy, Nuremberg University of Music, Nuremberg, Germany

Emilia Parada-Cabaleiro

AI Lab, Human-centered AI Group, Linz Institute of Technology, Linz, Austria

Stefan Brandl & Markus Schedl

Department of Computer Science, University of Innsbruck, Innsbruck, Austria

Maximilian Mayerl & Eva Zangerle

Austrian Research Institute for Artificial Intelligence, Vienna, Austria

Marcin Skowron

Graz University of Technology, Graz, Austria

Elisabeth Lex

Contributions

E.P.-C.: conceptualization, methodology, software, investigation, writing; M.M.: conceptualization, methodology, software, investigation, writing; S.B.: data curation; M.S.: conceptualization, methodology, software, investigation, writing; M.S.: conceptualization, methodology, writing; E.L.: conceptualization, methodology, writing; E.Z.: conceptualization, methodology, software, investigation, writing.

Corresponding author

Correspondence to Eva Zangerle .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Parada-Cabaleiro, E., Mayerl, M., Brandl, S. et al. Song lyrics have become simpler and more repetitive over the last five decades. Sci Rep 14 , 5531 (2024). https://doi.org/10.1038/s41598-024-55742-x


Received : 30 March 2023

Accepted : 27 February 2024

Published : 28 March 2024

DOI : https://doi.org/10.1038/s41598-024-55742-x


  • Song lyrics


Open Access

Peer-reviewed

Research Article

Why are song lyrics becoming simpler? a time series analysis of lyrical complexity in six decades of American popular music

Roles Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

* E-mail: [email protected] (MEWV); [email protected] (JAK); [email protected] (CM); [email protected] (IG)

Affiliation Department of Psychology, Arizona State University, Tempe, Arizona, United States of America


Roles Conceptualization, Writing – original draft, Writing – review & editing

Affiliation Department of Psychology, Oklahoma State University, Stillwater, OK, United States of America

Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Writing – original draft, Writing – review & editing

Affiliation Toronto, Canada

Roles Data curation, Validation, Visualization, Writing – review & editing

Roles Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

Affiliation Department of Psychology, University of Waterloo, Waterloo, ON, Canada

  • Michael E. W. Varnum, 
  • Jaimie Arona Krems, 
  • Colin Morris, 
  • Alexandra Wormley, 
  • Igor Grossmann

PLOS

  • Published: January 13, 2021
  • https://doi.org/10.1371/journal.pone.0244576

Table 1

Song lyrics are rich in meaning. In recent years, the lyrical content of popular songs has been used as an index of culture’s shifting norms, affect, and values. One particular, newly uncovered, trend is that lyrics of popular songs have become increasingly simple over time. Why might this be? Here, we test the idea that increasing lyrical simplicity is accompanied by a widening array of novel song choices. We do so by using six decades (1958–2016) of popular music in the United States (N = 14,661 songs), controlling for multiple well-studied ecological and cultural factors plausibly linked to shifts in lyrical simplicity (e.g., resource availability, pathogen prevalence, rising individualism). In years when more novel song choices were produced, the average lyrical simplicity of the songs entering the U.S. Billboard charts was greater. This cross-temporal relationship was robust when controlling for a range of cultural and ecological factors and employing multiverse analyses to control for the potentially confounding influence of temporal autocorrelation. Finally, simpler songs entering the charts were more successful, reaching higher chart positions, especially in years when more novel songs were produced. The present results suggest that cultural transmission depends on the amount of novel choices in the information landscape.

Citation: Varnum MEW, Krems JA, Morris C, Wormley A, Grossmann I (2021) Why are song lyrics becoming simpler? a time series analysis of lyrical complexity in six decades of American popular music. PLoS ONE 16(1): e0244576. https://doi.org/10.1371/journal.pone.0244576

Editor: Ronald Fischer, Victoria Univ Wellington, NEW ZEALAND

Received: July 3, 2020; Accepted: December 11, 2020; Published: January 13, 2021

Copyright: © 2021 Varnum et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data and reproducible code for analyses reported in the manuscript are available on the Open Science Framework ( https://osf.io/qnsmj/ ).

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared no competing interests exist.

Introduction

Music is a human universal [ 1 , 2 ], and it is known to influence cognition, affect, and behavior [ 3 – 5 ]. Because songs—and particularly popular song lyrics—can be so rich in meaning [ 6 , 7 ], social scientists have long explored the ways that such lyrics intersect with some fundamental social processes, including identity formation and person perception [ 8 – 13 ].

More recently, social psychologists have begun to view music as a cultural product and to examine the ways that popular music lyrics reflect important aspects of psychology at the cultural level; the content in popular lyrics indexes changing norms, affect, and/or values [ 5 , 14 – 19 ]. For example, DeWall and colleagues explored popular song lyrics as a “window into understanding U.S. cultural changes in psychological states” [5, p. 200], finding that popular song lyrics from 1980–2007 reflected an increase in self-focus and a decrease in other-focus.

Here, we demonstrate that popular music lyrics have become increasingly simple over time, and we test one possible explanation for this surprising trend, namely that the amount of novel song choices has increased.

Novel song choices and lyrical simplicity

Several lines of evidence suggest that people may have baseline preferences for songs with simpler lyrics. One of the most widely known phenomena in psychology is the mere exposure effect, a phenomenon where repeated exposure to a non-aversive stimulus increases preference for it [ 20 – 22 ]. One implication of this principle for the present question is that simpler, more repetitive lyrics essentially have this effect baked into them and thus may tend to be preferred, all other things being equal. Further, songs with more repetitive lyrics may enjoy certain advantages in terms of information transmission, as they are easier to remember [ 23 ] and likely easier to transmit with fidelity [ 24 – 26 ]. Finally, recent work has shown that naïve listeners find simpler, more repetitive pieces of music to be more enjoyable, engaging, and memorable [ 27 , 23 ].

Why might pop songs become lyrically simpler in times when more new songs are produced? Theory and research from diverse literatures suggest that songs with simpler lyrics might be especially successful when there are more new songs to choose from. First, humans are cognitive misers. People have limited information-processing capacities [ 28 ], and are known to conserve mental resources [ 29 ]. Consequently, humans often use shortcuts in decision-making [ 30 , 31 ]. For example, when confronted with the task of evaluating persuasive messages and/or complex decision environments, people are more likely to use heuristics, peripheral cues, and other automatic cognitive processes to evaluate these messages if cognitive resources are limited in some fashion [ 32 , 33 ]. Thus, when there are more products to be evaluated, people may increasingly prefer simpler products, as these may require less mental effort to engage with. The mere exposure effect might also have a greater influence on decision making in such contexts, given that it too can be thought of as a heuristic or even instinctive evaluation. Further, across real-world studies and in-laboratory experiments, when people are confronted with a greater number of options to choose from, they are more likely to choose simpler, less cognitively demanding products [ 34 ]. Taken together, this work suggests that pop songs on average might become lyrically simpler in times when people are exposed to greater amounts of new songs and that the success of such songs might be more strongly linked to lyrical simplicity in such times.

Here, we test the hypothesis that the trend toward increasingly simple popular music lyrics might be accompanied by the increasing number of songs released each year, using six decades’ worth of song data. We also do so while including a number of cultural and ecological control variables, as prior work demonstrates that well-studied ecological features, such as resource levels, pathogen threat, and sources of external threat (e.g., climatic stress, armed conflict) can impact markers of cognition and behavior at the cultural level [ 35 – 38 ], and might plausibly affect preferences for simplicity in aesthetic products. For example, both resource scarcity and pathogen prevalence have been associated with conformity, innovation, and creativity in prior work [ 35 , 39 , 40 ].

We gathered cross-temporal data covering a period of six decades (1958–2016) on lyrical compressibility (as an index of simplicity/complexity of song lyrics), amount of novel songs produced (as an index of available novel song choices), and ecological, socioecological, and cultural variables linked to patterns of cultural change in previous research or plausibly related to trends in aesthetic content.

Lyrical compressibility of successful music

We gathered data from 14,661 songs that entered the Billboard Hot 100 charts spanning the period from 1958 (the chart’s inception) to 2016. The Billboard Hot 100 tracks the 100 most popular songs each week based on music sales, radio airplay, and internet streaming. To operationalize lyrical complexity (vs. simplicity), we estimated text compressibility. By operationalizing complexity via a compressibility index, we avoided some of the conceptual ambiguity associated with operationalization of complexity in prior research [ 40 – 42 ]: Whereas multi-purpose use of a single product may reflect a product’s complexity from the operational standpoint, it may also represent greater simplicity from the standpoint of consumer psychology. Further, song lyrics are tractable to work with when using an automated compression algorithm.


Higher compression scores signify more repetition and therefore higher simplicity. A score of 0 means no compression was possible (e.g. if the input were random noise), a score of 1 means a 50% reduction in size, a score of 2 means a 75% reduction, and so on. For example, Daft Punk’s 1997 song “Around the World” repeats the title 144 times and has a compressibility score of 5.42 (the maximum in this sample). Nat King Cole’s “The Christmas Song” (1961) has a low compression score of 0.11.
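The scoring rule described above (0 means no compression, 1 a 50% reduction, 2 a 75% reduction) is consistent with taking the base-2 logarithm of the ratio of original size to compressed size. The Python sketch below illustrates a score of that form under stated assumptions: it uses zlib's DEFLATE as a stand-in compressor, which may differ from the algorithm the authors used, and the function names are illustrative rather than taken from the paper's code.

```python
import math
import zlib


def compressibility_score(lyrics: str) -> float:
    """Return log2(original size / compressed size) for a lyrics string.

    Assumption: zlib (DEFLATE) stands in for the compressor; the paper's
    exact algorithm and preprocessing are not reproduced here.
    """
    raw = lyrics.encode("utf-8")
    if not raw:
        raise ValueError("lyrics must not be empty")
    compressed = zlib.compress(raw, 9)
    return math.log2(len(raw) / len(compressed))


def size_reduction(score: float) -> float:
    """Convert a score into the fraction of the original size that was removed."""
    return 1.0 - 2.0 ** (-score)


if __name__ == "__main__":
    repetitive = "around the world " * 144
    varied = ("The quick brown fox jumps over the lazy dog while seven "
              "wizards quietly box five jumping zebras")
    print("repetitive:", round(compressibility_score(repetitive), 2))
    print("varied:", round(compressibility_score(varied), 2))
```

Because zlib adds a small fixed header, very short or non-repetitive inputs can score slightly below 0 in this sketch; scores from a different compressor would not be directly comparable with the values reported in the text.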

We computed mean compressibility for each year based on all songs that entered the Hot 100 charts in a given year for which we were able to scrape lyrics (1958–2016). Because we used an automated procedure for song scraping, which depends on the readability of the song lyrics, the percentage of songs scraped varied between 27% of top 100 songs in 1958 and 91% of songs in 2015 ( M = 57%, Md = 57%, SD = 19%). Because the percentage of scraped songs increased over time and was correlated with the compressibility index, τ = .73, p < .001, we controlled for this trend in additional analyses.

Song success

Some of the theoretical positions we draw on to evaluate possible reasons for changes in lyrical complexity suggest that more compressible songs may be more likely to be successful. To evaluate this proposition, we additionally gathered data on the highest position that each song in the sample achieved on the Billboard charts.

Novel music production

In the spirit of multiverse analyses [ 43 ], we used three separate indicators to assess the amount of new music to which people are likely exposed in a given year. For each year (1958–2016) we computed the total number of songs that made the Hot 100 chart, the number of musical releases per year according to Discogs (Discogs.com), and the number of Wikipedia entries about songs first published or performed each year (Wikipedia.org).

Possible ecological drivers of cultural change in aesthetic preferences and music production

We assessed a range of well-studied socioecological factors (e.g., resource levels, pathogen threat, sources of external threat), which could plausibly bear on aesthetic preferences or might affect lyrical simplicity (and whether the predicted association between novel music production and simplicity holds even controlling for these or other ecological and cultural variables discussed below). Resource scarcity has been linked to greater conformity [ 39 ] and cross-temporal work has found that greater resource levels are linked to more innovation and creative output [ 40 ] and less conformity [ 44 , 45 ]. Higher levels of infectious disease have also been linked to more conformity [ 46 , 47 ], traditionalism [ 48 ], and tight social norms [ 35 , 49 ]. External threats, due to climate or war, have also been linked to more traditional outlooks and tight social norms [ 49 ], which might similarly bear on trends in lyrical simplicity. We thus included publicly accessible data indexing these factors: GDP per capita, GDP growth, unemployment, pathogen prevalence, climatic stress, and participation of the US in major armed conflicts. The data used in our analyses covered the years 1958–2016. Data on GDP per capita and GDP growth were gathered from macrotrends.net, and data on the other markers came from Varnum & Grossmann [ 50 ] and updates from the original data sources used in that publication.

We also explore the possible impact of other socioecological factors that might plausibly affect lyrical simplicity. One might speculate that immigration could drive increases in lyrical simplicity. For example, simpler lyrics in American pop songs might be linked to shifts in the number of people for whom English may not be a first language. In a similar way, it might be that ethnic fractionalization, so far linked to changes in individualism and uniqueness over time [ 51 ], may also increase preferences for, memory of, and/or dispersal of simpler, more repetitive lyrics, as such content would be easier to convey to, and be understood by, a wide range of audiences. To assess the possibility that a rise in simpler English lyrics might be linked to shifts in the number of people for whom English may not be a first language, we used data on the number of green cards issued from the Department of Homeland Security as a marker of immigration. To assess possibilities linked to ethnic fractionalization, we used data on ethnic fractionalization from the US Census Bureau.

Research on the consequences of residential mobility suggests that this variable might also affect lyrical trends. Previous studies have linked residential mobility to greater susceptibility to the mere exposure effect and greater preference for familiar cultural products [ 52 ]; thus, it may be that mobility is also linked to temporal variations in lyrical complexity of pop songs. To assess residential mobility, we gathered data on the percentage of the US population that changed residence within the US from the US Census Bureau.

At the same time, a simpler variable might also be driving this effect. Perhaps products that succeed with a larger audience are merely simpler, akin to a lowest common denominator effect. Because the U.S. population grew substantially in recent decades, we also test whether population trends might be associated with lyrical simplicity. Thus, we also gathered data on the total size of the US population from macrotrends.net to explore population size.

Cultural factors

Prior work has found conservatives show a preference for simple and unambiguous art, speech patterns, and literature [ 53 – 57 ] (though see also Conway et al., 2016 [ 58 ]). Thus, one might suspect that possible changes in conservatism could be driving lyrical simplicity. Somewhat similarly, other evidence suggests that cross-cultural differences in aesthetic preferences and expression are linked to orientations toward collectivism [ 59 , 60 ]. Thus, we also gathered data on indicators of conservative ideology, operationalizing conservatism as the average percent of annual survey respondents in Gallup polls identifying as conservative, and we included an index of cultural-level collectivism based on the frequency of collectivism-related words in the Google Ngrams American English corpus [ 45 ].

Analytic procedure

Where possible, we use non-parametric ordinal-level measures of correlation or partial correlation (Kendall’s rank correlation coefficient τ), which provide an estimate of the similarity of the orderings of the data when ranked by each of the quantities. Since Fechner’s initial work on time series analyses, Kendall’s τ has been a preferred metric for examining cross-temporal relationships [ 61 ]. It provides a conservative estimate, which is preferred because time series data is rarely normally distributed. Results were comparable when we used Pearson’s r or partial Pearson correlations. In the initial step, we examined zero-order relationships between each of the three indices of available novel song choices and average lyrical compressibility of popular songs. Next, we created a composite index of novel song choices and assessed the robustness of the hypothesized link between amount of novel song choices and average lyrical compressibility of popular songs by controlling for a host of ecological, socioecological, and cultural factors that might plausibly influence cultural level success for simplicity vs. complexity. Our chief analyses focused on a set of corrective analyses, in which we controlled for the possibly spurious nature of the relationship between our key time series due to temporal autocorrelation.
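To make the rank-based measure concrete, the following Python sketch computes Kendall's τ by counting concordant and discordant pairs. It implements the tie-adjusted τ-b variant, which is an assumption, since the text does not state which variant the authors used; the function name is illustrative.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length numeric sequences.

    A pair of observations is concordant when both variables order it the
    same way and discordant when they order it in opposite ways. Pairs tied
    on only one variable enter the denominator; pairs tied on both are
    ignored.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


if __name__ == "__main__":
    years = [1958, 1959, 1960, 1961, 1962]
    values = [0.91, 0.88, 0.95, 1.02, 0.99]  # made-up numbers for illustration
    print(kendall_tau_b(years, values))
```

This pairwise version runs in quadratic time, which is adequate for a yearly series of about sixty observations.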

Given the range of possibilities of correcting for temporal autocorrelation, we opted to perform three different types of analyses that correct or account for the possibility that observed relationships might be spurious as a function of autocorrelation in the time series. First, we computed adjusted significance thresholds based on the Tiokhin-Hruschka procedure [ 62 ]. Second, we detrended our novel song production and lyrical compressibility time series by residualizing for year and assessed the correlation between our detrended variables. Finally, for central univariate and multivariate analyses, we used an automated auto-regressive integrated moving average forecasting model (auto.ARIMA) to assess the relationship between novel song choices and lyrical compressibility [ 63 ]. This technique involves a machine learning algorithm that tests a number of different possible models which vary in autoregressive components, differencing, and moving average components, as well as whether they include an exogenous predictor. Additionally, we used auto.ARIMA to generate a forecast for future patterns of lyrical compressibility (2017–2046).
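The second approach above, detrending by residualizing for year, can be sketched in Python as follows. This is a minimal illustration that assumes a simple linear regression of each yearly series on calendar year; the helper name and the example numbers are hypothetical and do not come from the authors' code or data.

```python
def residualize_on_year(values, years):
    """Return residuals from an ordinary least squares fit of values on year.

    Removing the fitted line strips out the shared linear time trend, so any
    remaining association between two residualized series cannot be
    attributed to both simply rising (or falling) over time.
    """
    if len(values) != len(years) or len(values) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(values)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((t - mean_year) ** 2 for t in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((t - mean_year) * (v - mean_value) for t, v in zip(years, values))
    slope = sxy / sxx
    intercept = mean_value - slope * mean_year
    return [v - (intercept + slope * t) for t, v in zip(years, values)]


if __name__ == "__main__":
    years = list(range(1958, 1964))
    compressibility = [0.90, 0.93, 0.92, 0.97, 0.99, 1.04]  # made-up values
    print(residualize_on_year(compressibility, years))
```

In the procedure described in the text, both the novel song production series and the compressibility series would be residualized in this way before the two sets of residuals are correlated.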

For multivariate analyses we entered multiple predictors of lyrical compressibility over time. To avoid multicollinearity and overfitting (and due to limited number of units at the yearly level of analysis), we first aggregated covariance scores attributed to additional socioecological and cultural factors (see Table 1 ) by performing a principal component analysis on these covariates and saving component scores for further multivariate time series analyses. The first principal component explained 50% of the variance in the covariates, with strong loadings (absolute value >.85) for Population Size, GDP/capita, Residential Mobility, Pathogen Prevalence, Ethnic Heterogeneity and Immigration, moderate loadings for Armed Conflicts (.49) and weak loading of GDP growth (.44). Other covariates (Climatic Stress, Unemployment, Conservatism, Collectivism) showed very weak loadings (.21 < absolute value ≤ .27). Next, we entered both yearly music production scores and covariate-PCA scores as independent predictors of lyrical compressibility, simultaneously accounting for the time series structure in the data.


https://doi.org/10.1371/journal.pone.0244576.t001


Indicators of novel song choices and average lyrical compressibility

As Fig 1 indicates, mean lyrical compressibility (i.e., simplicity) of songs increased over time, Kendall’s τ = .726, p < .001, as did number of songs making the Hot 100 charts per year, Kendall’s τ = .425, p < .001, number of music releases according to Discogs per year, Kendall’s τ = .973, p < .001, and number of Wikipedia entries for songs by year of publication, Kendall’s τ = .871, p < .001.


Light purple indicates 95% confidence bands, dark purple indicates 80% confidence bands.

https://doi.org/10.1371/journal.pone.0244576.g001

Analyses of the composite index of novel song choices

Hot 100 songs, Discogs music releases, and Wikipedia song entries were highly correlated, .41 < Kendall’s τ’s ≤ .87, and formed a single principal component with the highest loading for the Wikipedia song entries (.98) and the weakest loading for the Hot 100 songs (.88). To avoid multicollinearity, we used component scores for further analyses. Overall, this index of novel music production was strongly positively related to compressibility, Kendall’s τ = .714, p < .001. Consistent with our predictions, mean lyrical compressibility per year was positively correlated with amount of novel music produced per year as operationalized by three distinct indicators, Kendall’s τ ( n songs in Hot 100 charts/year) = .429, p < .001, Kendall’s τ ( n Discogs music releases / year) = .721, p < .001, Kendall’s τ ( n Wikipedia entries about songs/year) = .680, p < .001.

Relationships between socioecological factors and compressibility

Although several ecological dimensions were associated with changes in average lyrical compressibility over time (see Table 1 ), these relationships were often in the opposite direction from what prior research or theorizing would suggest. For example, there were significant negative correlations between GDP per capita and pathogen prevalence and average lyrical compressibility. Further, our two cultural variables were either unrelated to lyrical compressibility (conservatism) or correlated in the opposite of the predicted direction (collectivism). We did observe theoretically sensible relationships between compressibility and residential mobility, immigration, ethnic fractionalization, and population size. However, when controlling for the potentially confounding effect of temporal autocorrelation by residualizing out the effect of year, only three of these relationships are statistically significant, and only the relationship between pathogen prevalence and average lyrical complexity remains in a theoretically sensible direction (see Table 1 ).

Robustness analyses: Control variables

This PCA-based composite index of music production remained significantly related to lyrical compressibility when including percentage of scraped songs/year as a covariate, partial Kendall’s τ = .261, p = .003. Further, it remained significant when controlling separately for each of the 12 specified control variables, .220 < partial Kendall’s τ’s < .770, p’s < .02 (see Table 2 for details). Full correlations between these variables are presented in S1 Fig .
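For readers unfamiliar with partial rank correlations, the Python sketch below shows the standard first-order formula for Kendall's partial τ, which combines the three pairwise τ values among a predictor, an outcome, and a single control variable. It is offered as an illustration only: whether the authors' software uses exactly this computation is an assumption, and the input values in the example are hypothetical.

```python
import math


def partial_kendall_tau(tau_xy, tau_xz, tau_yz):
    """First-order partial Kendall's tau between x and y, controlling for z.

    tau_xy, tau_xz and tau_yz are the three pairwise Kendall correlations.
    """
    denominator = math.sqrt((1.0 - tau_xz ** 2) * (1.0 - tau_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (tau_xy - tau_xz * tau_yz) / denominator


if __name__ == "__main__":
    # Hypothetical pairwise correlations, not values from the paper.
    print(partial_kendall_tau(tau_xy=0.6, tau_xz=0.5, tau_yz=0.5))
```

When the control variable is unrelated to both series (both of its τ values are 0), the partial coefficient equals the zero-order τ; the more strongly the control tracks both series, the more the coefficient shrinks.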


https://doi.org/10.1371/journal.pone.0244576.t002

Robustness analyses: Auto-correlation

Importantly, the correlation between this composite index of novel song choices and average lyrical compressibility remained significant when adjusting significance thresholds using the Tiokhin-Hruschka method to account for observed autocorrelation in the two time series, r = .877, corrected p < .001. As an alternative method for dealing with autocorrelation, we also detrended the time series by residualizing out the linear impact of year. The correlation for our detrended variables remained significant, Kendall’s τ = .222, p = .010.

Given the time series nature of our data, another way to test the hypothesized link between amount of new songs available and average compressibility of these songs while also addressing the issue of autocorrelation can involve an automated ARIMA algorithm (auto.ARIMA) within the forecast package [ 64 ] in R 4.0.0 [ 65 ]. This machine-learning algorithm inspects the time-series data to fit the optimal forecasting function. The auto-regressive ( AR(p) ) component refers to the use of past values in the regression equation for the series Y. The auto-regressive parameter p specifies the number of lags used in the model. A moving average ( MA(q) ) component represents the error of the model as a combination of previous error terms e_t. The order q determines the number of terms to include in the model. ARIMA models are well-suited for long-term time series, such as the historic patterns in the present data. The automated algorithm within the forecast package searches through combinations of order parameters and picks the set that optimizes model fit criteria, comparing Akaike information criteria (AIC) or Bayesian information criteria (BIC) of respective models. Notably, the automated forecasting approach allows us to specify an exogenous predictor such as novel song choices, such that the automated function can evaluate the extent to which this exogenous predictor improves the fit above and beyond the decomposition of the time-series of the dependent variable. In other words, the automated function provides a conservative way to see whether an exogenous predictor such as the novel song choices index improves accuracy in forecasts of the lyrical compressibility. If the final model selected by auto.ARIMA includes our putative exogenous variable (in this case amount of novel song choices), then this suggests that this variable helps the model to achieve optimal fit to the data.
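The model comparison step described above can be illustrated with a short Python sketch. It uses the textbook definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood, and picks the candidate with the lowest value. The candidate names and log-likelihoods are hypothetical, and this is not the search procedure of the forecast package, only the selection rule it relies on.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalizes parameters more as n grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def select_model(candidates, n_obs, criterion="aic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates maps a model name to (log_likelihood, n_params).
    """
    if criterion not in ("aic", "bic"):
        raise ValueError("criterion must be 'aic' or 'bic'")

    def score(item):
        _, (log_likelihood, n_params) = item
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_obs)

    return min(candidates.items(), key=score)[0]


if __name__ == "__main__":
    # Hypothetical fits to a 59-point yearly series (1958-2016).
    candidates = {
        "without_exogenous_predictor": (120.0, 2),
        "with_exogenous_predictor": (125.0, 3),
    }
    print(select_model(candidates, n_obs=59, criterion="aic"))
    print(select_model(candidates, n_obs=59, criterion="bic"))
```

Because each extra parameter is penalized, a model that includes the exogenous predictor is selected only if the predictor improves the likelihood by more than the penalty, which is the sense in which the text calls the approach conservative.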


Comparison of the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values for our primary and alternative models suggest that our primary model with novel song choices as an exogenous predictor and lyrical compressibility as the dependent variable, AIC = -235.84, BIC = -227.53, is superior to the alternate model with lyrical compressibility as an exogenous predictor and novel song choices as the dependent variable, AIC = 58.36, BIC = 68.75.

Robustness analyses: Controlling for percentage of scraped songs

Because of a positive association between lyrical compressibility and percentage of scraped songs per year, we performed a separate set of analyses in which we first regressed out the effect of sampling (% of scraped songs/year) on lyrical compressibility and performed an auto.ARIMA analysis on the residuals. Results of a model on the residuals with music production as a predictor indicated a significant effect of music production, B = .799, SE = 0.046, z = 17.32, p < .001, suggesting that the effect holds even when accounting for the possible change in sampling.

Multivariate analyses


Exploratory song-level analyses

In exploratory analyses we evaluated how lyrical compressibility is associated with song success, and whether this relationship was stronger in time periods when more novel music was produced. Given that we shifted focus to song-specific data, we utilized a multi-level framework via the lme4 package in R, with songs’ chart position and lyrical compressibility scores nested within years. Preliminary auto.ARIMA analyses on the yearly aggregate data indicated that a model with no auto-regressive components but a linear trend would show the best model fit. Therefore, in the first multi-level model we included year as a proxy for a linear trend as well as a compressibility × year interaction as predictors of song success. Both year and lyrical compressibility were mean-centered prior to analyses. This multi-level model showed a good overall model fit, R² = .05, with 3.9% of the variance explained by fixed effects. Results indicated a significant effect of year, B = 0.318, SE = 0.031, t ( df = 57.29) = 10.23, p < .001, suggesting that over time songs included in the sample on average had a lower chart rank, a typical regression to the mean effect. Importantly, more compressible songs showed significantly higher rank in the charts, B = -9.321, SE = 0.661, t ( df = 14640.88) = 14.10, p < .001, and this effect was particularly pronounced for more recent years, compressibility × year interaction, B = -0.105, SE = 0.039, t ( df = 14581.41) = 2.71, p = .007.

In the second step, we added the mean-centered yearly music production index as a second covariate, along with a music production × compressibility interaction. Based on prior auto.ARIMA results, we also included the linear effect of year to account for the trend in the chart position. This multi-level model also showed a good overall model fit, R² = .06, with 4.7% of the variance explained by fixed effects. More compressible songs showed significantly higher rank in the charts, B = -9.353, SE = 0.657, t ( df = 14819.95) = 14.23, p < .001. Also, average chart position of songs was higher in years with a greater volume of songs produced, B = 6.141, SE = 1.280, t ( df = 53.76) = 4.80, p < .001. Moreover, as Fig 2 indicates, lyrical compressibility was more strongly associated with song success in years with greater volume of produced songs, compressibility × music production interaction, B = -2.170, SE = 0.648, t ( df = 14781.15) = 3.35, p = .001. These analyses yield results consistent with the proposition that lyrically simpler songs enjoy greater success in time periods in which more novel song choices are available.


Confidence bands indicate 95% around the estimate.

https://doi.org/10.1371/journal.pone.0244576.g002

Forecasting

As a final step, we generated a forecast for average lyrical compressibility for four decades after the last data point in our time series. This is in keeping with recommendations by Varnum & Grossmann [ 38 ] that papers analyzing past patterns of cultural change provide forecasts for the future. These forecasts enable a test of this theoretical model against concrete future cultural trends. Using the automated ARIMA algorithm, we also identified the best function for the novel song choices data, which we used to estimate the subsequent 40 data points. In turn, we used this estimated data in conjunction with the compressibility function to forecast the further development of lyrical compressibility. Results of this model suggest that lyrical compressibility will continue to increase over the next several decades (see Fig 1 ).

Popular music lyrics have recently been used to inform work on the cultural transmission of emotional expression [ 14 , 66 ], as an index of culture-level changes in self- versus other-focus [ 5 ], and as a reflection of cultural mood in response to economic and social threats [ 18 , 19 ]. But one major trend in popular music lyrics remained underexplored and unexplained: popular music lyrics are becoming increasingly simple over time. We reasoned and found support for the hypothesis that increasing lyrical simplicity is associated with increasing amounts of novel music production. That is, in times when more novel music is produced, popular songs become increasingly lyrically simple.

The relationship between mean lyrical compressibility and the amount of novel music produced each year was robust. We observed significant positive associations across three operationalizations of the amount of novel song choices and the average lyrical compressibility of popular songs. Further, the relationship between amount of novel song choices and average compressibility of popular songs remained significant when including a host of ecological, socioecological, and cultural factors linked to other types of cultural change both in univariate and multivariate analyses. By and large these other variables were not significantly associated with changes in lyrical simplicity after controlling for the potentially confounding influence of temporal autocorrelation. Of note, we also observed a significant negative association between changes in pathogen prevalence and lyrical simplicity. This observation suggests a potentially new consequence of infectious disease threat, one that should be explored in more detail in future work.

Importantly, the linkage between amount of new music produced and average compressibility of popular songs also held when accounting for temporal autocorrelation using three distinct methods. Thus, results suggest that the amount of novel music produced contributes to changes in average lyrical compressibility above and beyond other plausible causes and autoregressive trends in the data.

In exploratory analyses, we also found evidence suggesting that success, as indexed by position in the Billboard charts, among popular songs was associated with greater lyrical compressibility. This is broadly consistent with the notion that simpler content enjoys an advantage in memorability and/or transmission. Importantly, this effect appeared to be stronger in years when the amount of novel songs produced was higher, providing conceptual confirmation of our key finding. More novel song choices appear linked to greater average lyrical compressibility of the body of songs that succeeds (i.e., those entering the Billboard chart in a given year). In addition, among songs entering the charts in a given year, compressibility was more strongly associated with better performance on the chart in years when more novel songs were produced.

This finding might parallel ongoing research taking information-theoretic approaches in exploring communicative efficiency in human language [ 67 , 68 ]. For example, in both language and music, something akin to Zipf’s law seems to be at play [ 2 ]; that is, the frequency rank of a phenomenon is inversely proportional to its probability, such that, in the case of language, many words are quite rare, but a few words (e.g., pronouns) appear with great frequency. Moreover, these more successful (i.e., frequently used) words tend to be shorter in length (but see also Piantadosi et al., 2011 [ 69 ]). This observation dovetails with our finding regarding the success of simpler lyrics. Indeed, the increasing success of simple lyrics may reflect increasing communicative efficacy.
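A rank-frequency table is the usual way to inspect a Zipf-like pattern. The following minimal sketch uses only the Python standard library and a made-up toy sentence to rank words by frequency. Under an idealized Zipf's law the product of rank and frequency would be roughly constant. The helper name is hypothetical, and nothing here reproduces the analyses cited above.

```python
from collections import Counter
import re


def rank_frequency(text: str) -> list[tuple[int, str, int]]:
    """Return (rank, word, count) tuples, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    toy = "the cat and the dog and the bird saw the cat"
    for rank, word, n in rank_frequency(toy):
        # Last column is rank * count, the quantity Zipf's law holds steady.
        print(rank, word, n, rank * n)
```

A toy sentence is far too short to show the law itself; the point is only the shape of the computation.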

A preference for simpler information in increasingly information-saturated environments might also be consistent with some propositions from cultural evolutionary theory. One tenet of cumulative cultural evolutionary theory is that human innovation, transmission, and learning increase the amount and quality of cultural information, while also increasing the learnability of this information [ 70 , 25 ]. One way to increase information learnability is via simplicity [ 71 , 72 ], thereby yielding increasingly efficient communication.

The present report adds to two growing bodies of empirical research—work emphasizing the examination of cultural products as a window into cultural-level psychological processes [ 14 , 5 ] and work using time-series methods to test hypotheses regarding the causes of particular patterns of cultural change (for a review see Varnum & Grossmann, 2017 [ 38 ]). Here, we use big data and time series methods to show that increases in the amount of novel songs over time appear to be linked to the increasing simplicity of popular songs’ lyrics, as well as greater success of songs with simpler lyrics. What does this tell us more broadly about how American culture has changed? It suggests potentially that success of aesthetic complexity at the cultural level may be something that shifts over time. Although this is not the first such demonstration of this phenomenon, to our knowledge this is the first attempt to formally evaluate why such cultural-level preferences may change.

Alternative and complementary explanations

Although we found that our key effect was highly robust, alternative or complementary explanations for the growing success of lyrically simpler songs are still possible. For example, changes in the ways that people consume popular music could perhaps affect lyrical simplicity. Technological innovation (e.g., various portable music devices) could play a role, as could other variation in the ways that people interact with music. Relatedly, one might speculate that the success of increasingly simple lyrics might owe to technologically mediated increases in listening to music primarily in the background (e.g., on commutes, in gyms). However, one might easily argue that for generations music has been consumed in this fashion albeit with slightly different technologies—portable radios, car stereos, and portable music players have existed and been widely used for decades. It would be interesting to attempt to assess this question empirically, although we are not currently aware of high-quality time series data relating to how and why people listen to popular music. Moreover, operationalization of these indicators of technological innovations over time would be a potentially thorny problem. For instance, what does it mean to own a Walkman in 1982 as compared to a similar device in 2002? Nonetheless, it would be intriguing to assess these questions in future work.

Another possibility is that the length of songs may have changed over time affecting average lyrical complexity. Thus, perhaps song lyrics are more compressible by virtue of songs becoming shorter. However, a recent analysis of songs entering the Billboard charts over the course of its history suggests, in fact, that the average song on the charts in the late 2010’s was somewhat longer than those in the 1950’s and 1960’s, and similar in recent years to levels observed in the 1970’s [ 73 ]. Thus, this alternative explanation cannot account for the trends observed in the present analyses.

One might alternatively speculate that the rise in lyrical simplicity observed in the present data might be related to trends in the popularity of different musical genres. Indeed, although this is beyond the scope of the present work, it would be interesting to empirically assess how lyrical complexity varies across popular music genres and whether trends within these genres over time have been similar. Further, future work might assess whether the linkage between lyrical simplicity and song success observed in our exploratory analyses varies within genres of popular music or if genres that are on average simpler enjoy greater success in times of more music production.

Limitations

It is worth noting that our analysis was restricted to a single type of cultural product. It might be the case that empirical analysis of other domains might show similar trends and a similar relationship between amount of novel content and success of simpler content, or it may be that different dynamics are observed when considering television shows, videogames, or other types of cultural products. For example, many have argued that television shows have become more complex and intellectually stimulating in the past few decades, entering the so-called “Golden Age of Television.” However, empirical work examining complexity over time in other types of cultural products, including movies, news broadcasts, print newspapers, novels, and political speech suggests that there is in fact a broad trend toward simpler content being increasingly preferred, at least when it comes to the language used in these products [ 74 ]. It is noteworthy that Jordan and colleagues (2019) used a different measure of complexity, in this case the use of a specific set of words that indicate cognitive complexity, and that they find that the strength of the decline in complexity varies across different types of cultural products. Hence, future research may attempt to conceptually replicate our work by assessing compressibility of other types of cultural products over time and whether the success of such products is linked to the number of options or alternatives within that domain.

It is also worth noting that, in the present work, we assessed the simplicity of lyrics. Songs might be complex or simple in other ways as well, in terms of rhythm, melody, number of instruments played, and so on. Analysis of these features is beyond the scope of the present work, but it would be interesting to see the extent to which similar or divergent patterns are observed in these facets of successful popular music over time.

Our analysis was also limited to songs that were relatively successful over time—i.e., those that made the Billboard Hot 100 chart. This sample is quite large ( N > 14,000), but it may not be representative of all songs produced during this period. Further, we were able to successfully scrape a greater proportion of more recent rather than older songs, which we included in control analyses. Our sample captures a large chunk of popular music produced during more than half a century and enables tests regarding linkages between novel music choices, lyrical simplicity, and song success. A slightly different conceptual question may be worthwhile addressing in future work: Does average complexity of all music produced change along with shifts in the amount of music produced?

Our work is also limited by the fact that song success was operationalized by commercial success in the US market. Although some cultural shifts in the past several decades appear to be global in nature, such as rising individualism [ 36 ], this need not be the case for all dimensions of culture. Different dynamics may potentially be observed in terms of song success in parts of the world with different values, practices, and ecological conditions. Although such an endeavor is beyond the scope of the present manuscript largely due to the lack of equally rich time series data from other countries, it would be worthwhile to try to address this question in the future.

Finally, the present work is limited by its correlational nature. Although our findings appeared quite robust across different operationalizations of the independent variable—when accounting for autocorrelation in various ways, and when controlling for a host of plausible ecological, socioecological factors, and cultural values which have shifted over time—we cannot completely rule out all alternative explanations for increasing success of songs with simpler lyrics. Future work might attempt to quantify society level time series trends in conformity or other biases linked to lyrical affect and music sampling [ 14 , 75 ], and assess whether the present findings hold when controlling for these variables as well. Future work may also use in-lab methods to explore and disentangle the possible causal mechanisms underlying the link between amount of novel song choices and success of songs with simpler lyrics. For example, transmission chain methods [ 76 ] could be employed to explore whether participants might find simpler lyrics more pleasing and memorable when there is a greater number of other song-snippets competing for attention versus when there is not.

Why have the lyrics of pop songs become simpler over time? Our findings suggest that the answer may have to do with the proliferation of new songs available to consumers. The present work represents one of the first attempts to use big data and time series methods to quantify temporal shifts in information transmission dynamics at the societal level. Future work may attempt to replicate and extend these findings into other types of complexity and other types of cultural products.

Supporting information

S1 Fig. Zero-order Kendall’s tau correlations between variables.

https://doi.org/10.1371/journal.pone.0244576.s001

https://doi.org/10.1371/journal.pone.0244576.s002

  • 7. Hayakawa S. I. Popular songs vs. the facts of life. In Rosenberg B. & White D. M. (Eds.), Mass culture: The popular arts in America. Glencoe, Ill.: Free Press, 1957. https://doi.org/10.1038/179537a0 pmid:13407766
  • 10. Reisman D. (1957), "Listening to Popular Music", pp. 408–417, in Rosenberg B. and White D.M. (eds.) Mass Culture, Glencoe: Free Press
  • 24. Bartlett F. C. (1932). Remembering: An experimental and social study. Cambridge: Cambridge University Press.
  • 25. Henrich J. (2015). The secret of our success: how culture is driving human evolution, domesticating our species, and making us smarter. Princeton University Press.
  • 26. Rubin D. C. (1997). Memory in oral traditions: The cognitive psychology of epic, ballads, and counting-out rhymes. Oxford: Oxford University Press.
  • 29. Fiske S. T., & Taylor S. E. (2013). Social cognition: From brains to culture. Sage.
  • 32. Eagly A. H., & Chaiken S. (1993). The psychology of attitudes. Harcourt Brace Jovanovich College Publishers.
  • 33. Petty R. E., & Cacioppo J. T. (1986). The elaboration likelihood model of persuasion. In Communication and Persuasion (pp. 1–24). Springer New York.
  • 63. Khandakar Y., & Hyndman R. J. (2008). Automatic time series forecasting: the forecast package for R. J Stat Soft.
  • 64. Hyndman R, Athanasopoulos G, Bergmeir C, Caceres G, Chhay L, O'Hara-Wild M, et al. (2020). _forecast: Forecasting functions for time series and linear models_. R package version 8.12, <URL: http://pkg.robjhyndman.com/forecast >.
  • 65. R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. <URL: https://www.R-project.org/ >.
  • 68. Tooby J., & Cosmides L. (2020). Natural Selection and the Nature of Communication. In Floyd K. & Weber R. (Eds.), The Handbook of Communication Science and Biology. Routledge. https://doi.org/10.1016/j.cognition.2020.104284 pmid:32276235
  • 70. Boyd R., & Richerson P. J. (1988). Culture and the evolutionary process. University of Chicago Press.
  • 73. Bannister, M. (2020). The Billboard Hot 100: Exploring Six Decades of Number One Singles, <URL: https://github.com/mspbannister/dand-p4-billboard/blob/master/Billboard_analysis__100417_.md >.


Why are song lyrics becoming simpler? A time series analysis of lyrical complexity in six decades of American popular music

Michael E. W. Varnum

1 Department of Psychology, Arizona State University, Tempe, Arizona, United States of America

Jaimie Arona Krems

2 Department of Psychology, Oklahoma State University, Stillwater, OK, United States of America

Colin Morris

3 Toronto, Canada

Alexandra Wormley

Igor Grossmann

4 Department of Psychology, University of Waterloo, Waterloo, ON, Canada

Associated Data

All data and reproducible code for analyses reported in the manuscript are available on the Open Science Framework ( https://osf.io/qnsmj/ ).

Song lyrics are rich in meaning. In recent years, the lyrical content of popular songs has been used as an index of culture’s shifting norms, affect, and values. One particular, newly uncovered, trend is that lyrics of popular songs have become increasingly simple over time. Why might this be? Here, we test the idea that increasing lyrical simplicity is accompanied by a widening array of novel song choices. We do so by using six decades (1958–2016) of popular music in the United States ( N = 14,661 songs), controlling for multiple well-studied ecological and cultural factors plausibly linked to shifts in lyrical simplicity (e.g., resource availability, pathogen prevalence, rising individualism). In years when more novel song choices were produced, the average lyrical simplicity of the songs entering the U.S. Billboard charts was greater. This cross-temporal relationship was robust when controlling for a range of cultural and ecological factors and employing multiverse analyses to control for the potentially confounding influence of temporal autocorrelation. Finally, simpler songs entering the charts were more successful, reaching higher chart positions, especially in years when more novel songs were produced. The present results suggest that cultural transmission depends on the amount of novel choices in the information landscape.

Introduction

Music is a human universal [ 1 , 2 ], and it is known to influence cognition, affect, and behavior [ 3 – 5 ]. Because songs—and particularly popular song lyrics—can be so rich in meaning [ 6 , 7 ], social scientists have long explored the ways that such lyrics intersect with some fundamental social processes, including identity formation and person perception [ 8 – 13 ].

More recently, social psychologists have begun to view music as a cultural product and to examine the ways that popular music lyrics reflect important aspects of psychology at the cultural level; the content in popular lyrics indexes changing norms, affect, and/or values [ 5 , 14 – 19 ]. For example, DeWall and colleagues explored popular song lyrics as a “window into understanding U.S. cultural changes in psychological states” [5, p. 200], finding that popular song lyrics from 1980–2007 reflected an increase in self-focus and a decrease in other-focus.

Here, we demonstrate that popular music lyrics have become increasingly simple over time, and we test one possible explanation for this surprising trend, namely that the amount of novel song choices has increased.

Novel song choices and lyrical simplicity

Several lines of evidence suggest that people may have baseline preferences for songs with simpler lyrics. One of the most widely known phenomena in psychology is the mere exposure effect, a phenomenon where repeated exposure to a non-aversive stimulus increases preference for it [ 20 – 22 ]. One implication of this principle for the present question is that songs with simpler, more repetitive lyrics essentially have this effect baked into them and thus may tend to be preferred, all other things being equal. Further, songs with more repetitive lyrics may enjoy certain advantages in terms of information transmission as they are easier to remember [ 23 ] and likely easier to transmit with fidelity [ 24 – 26 ]. Moreover, recent work has shown that naïve listeners find simpler, more repetitive pieces of music to be more enjoyable, engaging, and memorable [ 27 , 23 ].

Why might pop songs become lyrically simpler in times when more new songs are produced? Theory and research from diverse literatures suggest that songs with simpler lyrics might be especially successful when there are more new songs to choose from. First, humans are cognitive misers. People have limited information-processing capacities [ 28 ], and are known to conserve mental resources [ 29 ]. Consequently, humans often use shortcuts in decision-making [ 30 , 31 ]. For example, when confronted with the task of evaluating persuasive messages and/or complex decision environments, people are more likely to use heuristics, peripheral cues, and other automatic cognitive processes to evaluate these messages if cognitive resources are limited in some fashion [ 32 , 33 ]. Thus, when there are more products to be evaluated, people may increasingly prefer simpler products as they may require less mental effort to engage with. The mere exposure effect might also have a greater influence on decision making in such contexts, given that it too can be thought of as a heuristic or even instinctive evaluation. Further, across real-world studies and in-laboratory experiments, when people are confronted with a greater number of options to choose from, they are more likely to choose simpler, less cognitively demanding products [ 34 ]. Taken together, this work suggests that pop songs on average might become lyrically simpler in times when people are exposed to greater amounts of new songs and that success of such songs might be more strongly linked to lyrical simplicity in such times.

Here, we test the hypothesis that the trend toward increasingly simple popular music lyrics might be accompanied by the increasing number of songs released each year, using six decades’ worth of song data. We also do so while including a number of cultural and ecological control variables, as prior work demonstrates that well-studied ecological features, such as resource levels, pathogen threat, and sources of external threat (e.g., climatic stress, armed conflict) can impact markers of cognition and behavior at the cultural level [ 35 – 38 ], and might plausibly affect preferences for simplicity in aesthetic products. For example, both resource scarcity and pathogen prevalence have been associated with conformity, innovation, and creativity in prior work [ 35 , 39 , 40 ].

We gathered cross-temporal data covering a period of six decades (1958–2016) on lyrical compressibility (as an index of simplicity/complexity of song lyrics), amount of novel songs produced (as an index of available novel song choices), and ecological, socioecological, and cultural variables linked to patterns of cultural change in previous research or plausibly related to trends in aesthetic content.

Lyrical compressibility of successful music

We gathered data from 14,661 songs that entered the Billboard Hot 100 charts spanning the period from 1958 (the chart's inception) to 2016. The Billboard Hot 100 tracks the 100 most popular songs each week based on music sales, radio airplay, and internet streaming. To operationalize lyrical complexity (vs. simplicity), we estimated text compressibility. By operationalizing complexity via a compressibility index, we avoided some of the conceptual ambiguity associated with operationalization of complexity in prior research [ 40 – 42 ]: Whereas multi-purpose use of a single product may reflect a product’s complexity from the operational standpoint, it may also represent greater simplicity from the standpoint of consumer psychology. Further, song lyrics are tractable to work with when using an automated compression algorithm.

Compressibility indexes the degree to which a song’s lyrics have more repetitive and less information-dense, and thus simpler, content. We used a variant of the established LZ77 compression algorithm. In brief, the LZ77 algorithm works by finding repeated substrings and replacing them with 'match' objects pointing back to the string's previous occurrence. A match is encoded as a tuple (D, L), with D being the distance to the substring's previous occurrence, and L being its length. We treated these matches as costing 3 bytes. This way, a repeated string only leads to space savings if it is of at least length 4, and longer repetitions lead to greater relative savings. Given a song S, and the set of matches M produced by the LZ77 algorithm when applied to that song, its compressed size is therefore:

\[ \mathrm{compsize}(S) = |S| - \sum_{(D, L) \in M} (L - 3) \]

Where |S| is the original size of the song's lyrics, measured in characters/bytes. The compression ratios of songs in our dataset (i.e., |S| / compsize(S)) followed an approximately log-normal distribution, so we operationalized compressibility as the logarithm of this ratio:

\[ \mathrm{compressibility}(S) = \log_2 \frac{|S|}{\mathrm{compsize}(S)} \]

We used the LZ77 compression algorithm because of its intimate connection to textual repetition. Most of the byte savings when compressing song lyrics arise from large, multi-line sections (most importantly the chorus, and chorus-like hooks). Multi-word phrases are another significant contributor; these may be repeated in variations across different lines for poetic effect (e.g., the anaphoric verses in Lady Gaga's Bad Romance: "I want your ugly / I want your disease / I want your everything …"). The compression may make use of repeated individual words, or even sub-word units that repeat (perhaps incidentally), but their contribution to the overall compressibility is low.
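The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the exact LZ77 variant they used is not specified here, so the sketch assumes a greedy longest-match parse with an unbounded look-back window, a 3-byte cost per match, and a base-2 logarithm (consistent with a score of 1 corresponding to a 50% size reduction). All function names are hypothetical, and the naive search is only suitable for short inputs.

```python
import math

MATCH_COST = 3   # bytes charged per match, as stated in the text
MIN_MATCH = 4    # shortest match that saves space at that cost


def lz77_matches(s: str) -> list[tuple[int, int]]:
    """Greedy LZ77 parse; returns (distance, length) for each match."""
    matches = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_dist = 0, 0
        for j in range(i):                      # candidate earlier start
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1                          # match may overlap position i
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= MIN_MATCH:
            matches.append((best_dist, best_len))
            i += best_len
        else:
            i += 1                              # emit one literal character
    return matches


def compsize(s: str) -> int:
    """Original size minus the savings (L - 3) of every match."""
    return len(s) - sum(length - MATCH_COST for _, length in lz77_matches(s))


def compressibility(s: str) -> float:
    """Base-2 log of the compression ratio; expects a non-empty string."""
    return math.log2(len(s) / compsize(s))


if __name__ == "__main__":
    chorus = "around the world " * 8
    print(lz77_matches(chorus))
    print(round(compressibility(chorus), 2))
```

Because a literal costs one byte and a match costs three, a highly repetitive chorus collapses to a handful of bytes, whereas a line with no repeated substring of length four or more keeps its original size and scores 0.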

Higher compression scores signify more repetition and therefore higher simplicity. A score of 0 means no compression was possible (e.g. if the input were random noise), a score of 1 means a 50% reduction in size, a score of 2 means a 75% reduction, and so on. For example, Daft Punk’s 1997 song “Around the World” repeats the title 144 times and has a compressibility score of 5.42 (the maximum in this sample). Nat King Cole’s “The Christmas Song” (1961) has a low compression score of 0.11.
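The mapping from score to size reduction follows directly from the base-2 logarithm implied by these examples: a score s corresponds to a compressed size of 2^(−s) times the original. A small sketch (the function name is hypothetical) converts the scores quoted above into percentage reductions:

```python
def size_reduction(score: float) -> float:
    """Fraction of the original size removed, given a log2 compression ratio."""
    return 1.0 - 2.0 ** (-score)


if __name__ == "__main__":
    # Scores quoted in the text: the two reference points and the two songs.
    for score in (0.0, 1.0, 2.0, 5.42, 0.11):
        print(f"score {score:.2f}: {100 * size_reduction(score):.1f}% smaller")
```

On this reading, the maximum score of 5.42 corresponds to a reduction of roughly 98%, and the score of 0.11 to a reduction of roughly 7%.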

We computed mean compressibility for each year based on all songs that entered the Hot 100 charts in a given year for which we were able to scrape lyrics (1958–2016). Because we used an automated procedure for song scraping, which depends on the readability of the song lyrics, the percentage of songs scraped varied between 27% of top 100 songs in 1958 and 91% of songs in 2015 ( M = 57%, Md = 57%, SD = 19%). Because the percentage of scraped songs increased over time and was correlated with the compressibility index, τ = .73, p < .001, we controlled for this trend in additional analyses.

Song success

Some of the theoretical positions we draw on to evaluate possible reasons for changes in lyrical complexity suggest that more compressible songs may be more likely to be successful. To evaluate this proposition, we additionally gathered data on the highest position that each song in the sample achieved on the Billboard charts.

Novel music production

In the spirit of the multiverse analyses [ 43 ], we used three separate indicators to assess the amount of new music to which people are likely exposed in a given year. For each year (1958–2016) we computed the total number of songs which made the Hot 100 chart, the number of musical releases per year according to Discogs (Discogs.com), and the number of Wikipedia entries about songs first published or performed each year (Wikipedia.org).

Possible ecological drivers of cultural change in aesthetic preferences and music production

We assessed a range of well-studied socioecological factors (e.g., resource levels, pathogen threat, sources of external threat), which could plausibly bear on aesthetic preferences or might affect lyrical simplicity (and whether the predicted association between novel music production and simplicity holds even controlling for these or other ecological and cultural variables discussed below). Resource scarcity has been linked to greater conformity [ 39 ] and cross-temporal work has found that greater resource levels are linked to more innovation and creative output [ 40 ] and less conformity [ 44 , 45 ]. Higher levels of infectious disease have also been linked to more conformity [ 46 , 47 ], traditionalism [ 48 ], and tight social norms [ 35 , 49 ]. External threats, due to climate or war, have also been linked to more traditional outlooks and tight social norms [ 49 ], which might similarly bear on trends in lyrical simplicity. We thus included publicly accessible data indexing these factors: GDP per capita, GDP growth, unemployment, pathogen prevalence, climatic stress, and participation of the US in major armed conflicts. The data used in our analyses covered the years 1958–2016. Data on GDP per capita and GDP growth were gathered from macrotrends.net, and data on the other markers came from Varnum & Grossmann [ 50 ] and updates from the original data sources used in that publication.

We also explore the possible impact of other socioecological factors that might plausibly affect lyrical simplicity. One might speculate that immigration could drive increases in lyrical simplicity. For example, simpler lyrics in American pop songs might be linked to shifts in the amount of people for whom English may not be a first language. In a similar way, it might be that ethnic fractionalization, so far linked to changes in individualism and uniqueness over time [ 51 ], may also increase preferences for, memory of, and/or dispersal of simpler, more repetitive lyrics, as such content would be easier to convey to and be understood by a wide range of audiences. To assess the possibility that a rise in simpler English lyrics might be linked to shifts in the amount of people for whom English may not be a first language, we used data on the number of green cards issued from the Department of Homeland Security as a marker of immigration. To assess possibilities linked to ethnic fractionalization, we used data on ethnic fractionalization from the US Census Bureau.

Research on the consequences of residential mobility also suggests that perhaps this variable might also affect lyrical trends. Previous studies have linked residential mobility to greater susceptibility to the mere exposure effect and greater preference for familiar cultural products [ 52 ]; thus, it may be that mobility is also linked to temporal variations in lyrical complexity of pop songs. To assess residential mobility, we gathered data on percentage of the US population that changed residence within the US from the US Census Bureau.

At the same time, a simpler variable might also be driving this effect. Perhaps products that succeed with a larger audience are merely simpler, akin to a lowest common denominator effect. Because the U.S. population grew substantially in recent decades, we also test whether population trends might be associated with lyrical simplicity. Thus, we also gathered data on the total size of the US population from macrotrends.net to explore population size.

Cultural factors

Prior work has found conservatives show a preference for simple and unambiguous art, speech patterns, and literature [ 53 – 57 ] (though see also Conway et al., 2016 [ 58 ]). Thus, one might suspect that possible changes in conservatism could be driving lyrical simplicity. Somewhat similarly, other evidence suggests that cross-cultural differences in aesthetic preferences and expression are linked to orientations toward collectivism [ 59 , 60 ]. Thus, we also gathered data on indicators of conservative ideology, operationalizing conservatism as the average percentage of annual survey respondents in Gallup polls identifying as conservative, and we included an index of cultural-level collectivism based on the frequency of collectivism-related words in the Google Ngrams American English corpus [ 45 ].

Analytic procedure

Where possible, we use non-parametric ordinal-level measures of correlation or partial correlation (Kendall’s rank correlation coefficient τ), which provide an estimate of the similarity of the orderings of the data when ranked by each of the quantities. Since Fechner’s initial work on time series analyses, Kendall’s τ has been a preferred metric for examining cross-temporal relationships [ 61 ]. It provides a conservative estimate, which is preferred because time series data is rarely normally distributed. Results were comparable when we used Pearson’s r or partial Pearson correlations. In the initial step, we examined zero-order relationships between each of the three indices of available novel song choices and average lyrical compressibility of popular songs. Next, we created a composite index of novel song choices and assessed the robustness of the hypothesized link between amount of novel song choices and average lyrical compressibility of popular songs by controlling for a host of ecological, socioecological, and cultural factors that might plausibly influence cultural-level success for simplicity vs. complexity. Our chief analyses focused on a set of corrective analyses, in which we controlled for the possibly spurious nature of the relationship between our key time series due to temporal autocorrelation.
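For readers unfamiliar with Kendall’s τ, the statistic compares every pair of observations and counts how often the two variables order the pair the same way (concordant) versus the opposite way (discordant). The sketch below is a plain-Python illustration of the tie-corrected variant (τ-b). Whether the reported analyses used τ-a or τ-b is not stated here, so that choice is an assumption, as are the function name and the toy yearly values.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b: concordant minus discordant pairs, corrected for ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both variables
        elif dx == 0:
            ties_x += 1           # tied on x only
        elif dy == 0:
            ties_y += 1           # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    # Undefined (division by zero) if either variable is constant.
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # Hypothetical yearly values, for illustration only.
    new_songs = [310, 325, 330, 360, 400, 395]
    mean_compressibility = [0.80, 0.82, 0.81, 0.90, 0.95, 0.97]
    print(round(kendall_tau_b(new_songs, mean_compressibility), 3))
```

Because only the ordering of values matters, the statistic is unaffected by monotone transformations of either series, which is one reason it suits non-normal time series.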

Given the range of possibilities of correcting for temporal autocorrelation, we opted to perform three different types of analyses that correct or account for the possibility that observed relationships might be spurious as a function of autocorrelation in the time series. First, we computed adjusted significance thresholds based on the Tiokhin-Hruschka procedure [ 62 ]. Second, we detrended our novel song production and lyrical compressibility time series by residualizing for year and assessed the correlation between our detrended variables. Finally, for central univariate and multivariate analyses, we used an automated auto-regressive integrated moving average forecasting model (auto.ARIMA) to assess the relationship between novel song choices and lyrical compressibility [ 63 ]. This technique involves a machine learning algorithm that tests a number of different possible models which vary in autoregressive components, differencing, and moving average components, as well as whether they include an exogenous predictor. Additionally, we used auto.ARIMA to generate a forecast for future patterns of lyrical compressibility (2017–2046).
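The detrending step (residualizing each series for year and then correlating the residuals) can be sketched as follows. This is an illustration under stated assumptions: a simple linear regression on year, a tie-free Kendall’s tau-a, and invented toy values. All names here are hypothetical.

```python
from itertools import combinations

def residualize_on_year(values, years):
    """Residuals from an ordinary least squares regression of values on year."""
    n = len(values)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((t - mean_year) ** 2 for t in years)
    sxy = sum((t - mean_year) * (v - mean_value) for t, v in zip(years, values))
    slope = sxy / sxx
    intercept = mean_value - slope * mean_year
    return [v - (intercept + slope * t) for t, v in zip(years, values)]

def sign(v):
    return (v > 0) - (v < 0)

def kendall_tau_a(x, y):
    """Kendall's tau-a; adequate when there are no tied values."""
    pairs = list(combinations(range(len(x)), 2))
    return sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) for i, j in pairs) / len(pairs)

# Hypothetical yearly series that share an upward trend.
years = list(range(2000, 2008))
production = [1.0, 2.4, 2.9, 4.3, 4.8, 6.5, 6.9, 8.2]
compressibility = [0.50, 0.53, 0.53, 0.56, 0.56, 0.60, 0.59, 0.62]

raw_tau = kendall_tau_a(production, compressibility)
detrended_tau = kendall_tau_a(
    residualize_on_year(production, years),
    residualize_on_year(compressibility, years),
)
print(raw_tau, detrended_tau)
```

The detrended coefficient reflects only year-to-year deviations from each series’ linear trend, which is why it is typically smaller than the raw coefficient.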

For multivariate analyses we entered multiple predictors of lyrical compressibility over time. To avoid multicollinearity and overfitting (and due to the limited number of units at the yearly level of analysis), we first aggregated the additional socioecological and cultural covariates (see Table 1 ) by performing a principal component analysis on these covariates and saving component scores for further multivariate time series analyses. The first principal component explained 50% of the variance in the covariates, with strong loadings (absolute value > .85) for Population Size, GDP/capita, Residential Mobility, Pathogen Prevalence, Ethnic Heterogeneity and Immigration, a moderate loading for Armed Conflicts (.49), and a weak loading for GDP growth (.44). Other covariates (Climatic Stress, Unemployment, Conservatism, Collectivism) showed very weak loadings (.21 < absolute value ≤ .27). Next, we entered both yearly music production scores and covariate-PCA scores as independent predictors of lyrical compressibility, simultaneously accounting for the time series structure in the data.
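As an illustration of how a first principal component, its loadings, and its component scores can be obtained, here is a minimal sketch using power iteration on the correlation matrix. It assumes standardized covariates and is not the authors’ implementation; the function names and toy columns are hypothetical.

```python
from math import sqrt

def standardize(column):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def first_principal_component(columns, iterations=500):
    """Return (share of variance explained, loadings, component scores)."""
    z = [standardize(c) for c in columns]
    p, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]
    # Non-uniform start vector, so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    loadings = [x * sqrt(eigenvalue) for x in v]  # correlations with the component
    scores = [sum(v[i] * z[i][k] for i in range(p)) for k in range(n)]
    return eigenvalue / p, loadings, scores

# Hypothetical yearly covariates (three columns, six years).
population = [180.0, 190.0, 203.0, 215.0, 227.0, 240.0]
gdp_per_capita = [3.0, 3.6, 5.2, 7.8, 12.6, 17.1]
unemployment = [5.5, 4.5, 4.9, 8.5, 7.1, 7.2]
explained, loadings, scores = first_principal_component(
    [population, gdp_per_capita, unemployment]
)
print(round(explained, 3), [round(l, 3) for l in loadings])
```

The component scores, one per year, are what would then enter a subsequent time series model in place of the individual covariates.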

* p < .05, ** p ≤ .01, *** p ≤ .001.

Data availability

Indicators of novel song choices and average lyrical compressibility.

As Fig 1 indicates, mean lyrical compressibility (i.e., simplicity) of songs increased over time, Kendall’s τ = .726, p < .001, as did number of songs making the Hot 100 charts per year, Kendall’s τ = .425, p < .001, number of music releases according to Discogs per year, Kendall’s τ = .973, p < .001, and number of Wikipedia entries for songs by year of publication, Kendall’s τ = .871, p < .001.
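For readers unfamiliar with compressibility as a simplicity measure, the sketch below shows one way such a score could be computed. It assumes compressibility is the proportional size reduction achieved by a general-purpose compressor; zlib’s DEFLATE is used purely for illustration, since this section does not specify the algorithm. The function name and both toy texts are invented.

```python
import zlib

def compressibility(lyrics: str) -> float:
    """Share of bytes saved by compression; higher means more repetitive text."""
    raw = lyrics.encode("utf-8")
    if not raw:
        raise ValueError("lyrics must be non-empty")
    compressed = zlib.compress(raw, 9)
    return 1.0 - len(compressed) / len(raw)

# Invented placeholder texts, not real song lyrics.
repetitive = "la la la oh yeah " * 30
varied = (
    "Morning fog drifts over quiet harbors while distant engines hum, "
    "and every traveler carries a different map toward an unfamiliar shore."
)
print(round(compressibility(repetitive), 3), round(compressibility(varied), 3))
```

Very short texts can yield negative scores because the compressed format carries fixed overhead, so scores are most meaningful for full-length lyrics.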

Fig 1. Light purple indicates 95% confidence bands, dark purple indicates 80% confidence bands.

Analyses of the composite index of novel song choices

Hot 100 songs, Discogs music releases, and Wikipedia song entries were highly correlated, .41 < Kendall’s τ’s ≤ .87, and formed a single principal component with the highest loading for the Wikipedia song entries (.98) and the weakest loading for the Hot 100 songs (.88). To avoid multicollinearity, we used component scores for further analyses. Overall, this index of novel music production was strongly positively related to compressibility, Kendall’s τ = .714, p < .001. Consistent with our predictions, mean lyrical compressibility per year was positively correlated with amount of novel music produced per year as operationalized by three distinct indicators, Kendall’s τ ( n songs in Hot 100 charts/year) = .429, p < .001, Kendall’s τ ( n Discogs music releases/year) = .721, p < .001, Kendall’s τ ( n Wikipedia entries about songs/year) = .680, p < .001.

Relationships between socioecological factors and compressibility

Although several ecological dimensions were associated with changes in average lyrical compressibility over time (see Table 1 ), these relationships were often in the opposite direction from what prior research or theorizing would suggest. For example, GDP per capita and pathogen prevalence were each significantly negatively correlated with average lyrical compressibility. Further, our two cultural variables were either unrelated to lyrical compressibility (conservatism) or correlated in the opposite of the predicted direction (collectivism). We did observe theoretically sensible relationships between compressibility and residential mobility, immigration, ethnic fractionalization, and population size. However, when controlling for the potentially confounding effect of temporal auto-correlation by residualizing out the effect of year, only three of these relationships are statistically significant, and only the relationship between pathogen prevalence and average lyrical complexity remains in a theoretically sensible direction (see Table 1 ).

Robustness analyses: Control variables

This PCA-based composite index of music production remained significantly related to lyrical compressibility when including percentage of scraped songs/year as a covariate, partial Kendall’s τ = .261, p = .003. Further, it remained significant when controlling separately for each of the 12 specified control variables, .220 < partial Kendall’s τ’s < .770, p’s < .02 (see Table 2 for details). Full correlations between these variables are presented in S1 Fig.

Robustness analyses: Auto-correlation

Importantly, the correlation between this composite index of novel song choices and average lyrical compressibility remained significant when adjusting significance thresholds using the Tiokhin-Hruschka method to account for observed auto-correlation in the two time series, r = .877, corrected p < .001. As an alternative method for dealing with autocorrelation, we also detrended the time series by residualizing out the linear impact of year. The correlation for our detrended variables remained significant, Kendall’s τ = .222, p = .010.

Given the time series nature of our data, another way to test the hypothesized link between the amount of new songs available and the average compressibility of these songs, while also addressing the issue of autocorrelation, is to use an automated ARIMA algorithm (auto.ARIMA) within the forecast package [ 64 ] in R 4.0.0 [ 65 ]. This machine-learning algorithm inspects the time-series data to fit the optimal forecasting function. The auto-regressive (AR(p)) component refers to the use of past values in the regression equation for the series Y. The auto-regressive parameter p specifies the number of lags used in the model. A moving average (MA(q)) component represents the error of the model as a combination of previous error terms e_t. The order q determines the number of terms to include in the model. ARIMA models are well-suited for long-term time series, such as the historic patterns in the present data. The automated algorithm within the forecast package searches through combinations of order parameters and picks the set that optimizes model fit criteria, comparing the Akaike information criterion (AIC) or Bayesian information criterion (BIC) of the respective models. Notably, the automated forecasting approach allows us to specify an exogenous predictor such as novel song choices, such that the automated function can evaluate the extent to which this exogenous predictor improves the fit above and beyond the decomposition of the time series of the dependent variable. In other words, the automated function provides a conservative way to see whether an exogenous predictor such as the novel song choices index improves accuracy in forecasts of lyrical compressibility. If the final model selected by auto.ARIMA includes our putative exogenous variable (in this case amount of novel song choices), then this suggests that this variable helps the model to achieve optimal fit to the data.
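The idea of testing whether an exogenous predictor improves fit beyond the series’ own past can be illustrated with a much simpler stand-in: fitting a first-order autoregression by least squares with and without the predictor and comparing information criteria. This is only a sketch of the model-comparison logic, not the auto.ARIMA search itself. The data are synthetic and all names are hypothetical.

```python
from math import log, pi

def solve(a, b):
    """Solve the linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, targets):
    """Least squares fit; returns coefficients plus Gaussian AIC and BIC."""
    k, n = len(rows[0]), len(targets)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, t in zip(rows, targets))
    loglik = -0.5 * n * (log(2 * pi * rss / n) + 1)
    params = k + 1  # regression coefficients plus the error variance
    return {"beta": beta,
            "aic": 2 * params - 2 * loglik,
            "bic": params * log(n) - 2 * loglik}

# Synthetic series: y depends on its own past and on an exogenous series x.
x = [0.0, 1.0, 0.5, 2.0, 1.5, 3.0, 2.0, 3.5, 3.0, 4.5, 4.0, 5.0]
noise = [0.01, -0.01, 0.005, -0.005, 0.01, -0.01, 0.005, -0.005, 0.01, -0.01, 0.005]
y = [1.0]
for t in range(1, len(x)):
    y.append(0.5 * y[-1] + 0.3 * x[t] + noise[t - 1])

targets = y[1:]
fit_ar = fit_ols([[1.0, y[t - 1]] for t in range(1, len(y))], targets)
fit_arx = fit_ols([[1.0, y[t - 1], x[t]] for t in range(1, len(y))], targets)
print("AR(1) only:     AIC", round(fit_ar["aic"], 2), "BIC", round(fit_ar["bic"], 2))
print("AR(1) plus x:   AIC", round(fit_arx["aic"], 2), "BIC", round(fit_arx["bic"], 2))
```

Lower AIC or BIC indicates the preferred model; the penalty terms mean an added predictor is retained only if it improves fit by more than its cost in parameters.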

The results of this automated forecasting procedure indicated that a model with a positive autoregressive component, B = .527, SE = .124, and a positive contribution of the novel music production index, B = .059, SE = .008, provides the best fit to the data:
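As a hedged sketch of the general form such a model takes, assume a first-order autoregressive structure with no differencing; the differencing order and intercept are not stated in this section. The forecast package fits models with an external regressor as a regression with ARIMA errors. The symbols C_t (mean lyrical compressibility in year t), N_t (novel song choices index), and c (an unreported intercept) are introduced here for illustration only.

```latex
% Illustrative form only: AR(1) errors, no differencing assumed.
\begin{aligned}
C_t    &= c + \beta\, N_t + \eta_t, \\
\eta_t &= \phi\, \eta_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{white noise},
\end{aligned}
% with the reported estimates \phi \approx .527 (SE = .124) and \beta \approx .059 (SE = .008).
```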

This model estimation suggests that the index of novel song choices contributes to average lyrical compressibility above and beyond the temporal autocorrelation observed for average lyrical compressibility. Further, the coefficient for the index of novel song choices was statistically significant, z = 6.95, p < .001.

We also ran an alternative set of auto.ARIMA analyses where we set novel song choices as the dependent variable and average lyrical compressibility as an exogenous predictor. The results of this automated forecasting procedure indicated that a model with two positive moving average components, B = 1.176, SE = .242, and B = .487, SE = .164, and a positive contribution of average lyrical compressibility, B = 5.067, SE = 2.207, provides the best fit to the data:

The coefficient for lyrical compressibility was statistically significant, z = 2.30, p = .02.
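The reported z statistic and p value can be checked from the coefficient and its standard error. The sketch below assumes a Wald test with a two-sided p value from the standard normal distribution; the function name is hypothetical.

```python
from math import erfc, sqrt

def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p value."""
    z = estimate / std_error
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided

# Coefficient and standard error reported for lyrical compressibility.
z, p = wald_test(5.067, 2.207)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Small discrepancies between a recomputed z and a reported one can arise when the published coefficient and standard error have been rounded.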

Comparison of the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values for our primary and alternative models suggests that our primary model with novel song choices as an exogenous predictor and lyrical compressibility as the dependent variable, AIC = -235.84, BIC = -227.53, is superior to the alternate model with lyrical compressibility as an exogenous predictor and novel song choices as the dependent variable, AIC = 58.36, BIC = 68.75.

Robustness analyses: Controlling for percentage of scraped songs

Because of a positive association between lyrical compressibility and percentage of scraped songs per year, we performed a separate set of analyses in which we first regressed out the effect of sampling (% of scraped songs/year) on lyrical compressibility and performed an auto.ARIMA analysis on the residuals. Results of a model on the residuals with music production as a predictor indicated a significant effect of music production, B = .799, SE = 0.046, z = 17.32, p < .001, suggesting that the effect of music production holds even when accounting for the possible change in sampling.

Multivariate analyses

In another set of control analyses, we performed an auto.ARIMA analysis in which we included the PCA factor formed by all socio-ecological covariates as a second covariate. By comparing the magnitude of the effect from this first principal component (which was chiefly driven by ecological variables) and the music production index, we can assess the relative contribution of the music production index vis-à-vis other socio-ecological covariates. The results of this automated forecasting procedure indicated that a model with a positive autoregressive component, B = .513, SE = .118, a significant positive contribution of the novel music production index, B = .038, SE = .016, z = 2.37, p = .018, and a non-significant positive trend formed by ecological covariates (and chiefly reflecting economic and population growth), B = .026, SE = .016, z = 1.61, p = .108, provides the best fit to the data:

This model estimation suggests that the index of novel song choices contributes to average lyrical compressibility above and beyond the temporal autocorrelation as well as other ecological covariates observed for average lyrical compressibility. Moreover, the effect of music production on lyrical compressibility was stronger than other feasible covariates explored in the present dataset.

Exploratory song-level analyses

In exploratory analyses we evaluated how lyrical compressibility is associated with song success, and whether this relationship was stronger in time periods when more novel music was produced. Given that we shifted focus to song-specific data, we utilized a multi-level framework via the lme4 package in R, with songs’ chart position and lyrical compressibility scores nested within years. Preliminary auto.ARIMA analyses on the yearly aggregate data indicated that a model with no auto-regressive components but a linear trend would show the best model fit. Therefore, in the first multi-level model we included year as a proxy for a linear trend as well as a compressibility × year interaction as predictors of song success. Both year and lyrical compressibility were mean-centered prior to analyses. This multi-level model showed a good overall model fit, R² = .05, with 3.9% of the variance explained by fixed effects. Results indicated a significant effect of year, B = 0.318, SE = 0.031, t(df = 57.29) = 10.23, p < .001, suggesting that over time songs included in the sample on average had a lower chart rank, a typical regression to the mean effect. Importantly, more compressible songs showed significantly higher rank in the charts, B = -9.321, SE = 0.661, t(df = 14640.88) = 14.10, p < .001, and this effect was particularly pronounced for more recent years, compressibility × year interaction, B = -0.105, SE = 0.039, t(df = 14581.41) = 2.71, p = .007.
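One way to write the first multi-level model is given below. The random-effects structure is an assumption (a random intercept for year only), since the text does not spell it out, and the symbols are introduced here for illustration: i indexes songs, j indexes years, and both predictors are mean-centered.

```latex
% Illustrative specification; a random intercept by year is assumed.
\text{rank}_{ij} = \gamma_{0}
  + \gamma_{1}\,\text{year}_{j}
  + \gamma_{2}\,\text{comp}_{ij}
  + \gamma_{3}\,(\text{comp}_{ij} \times \text{year}_{j})
  + u_{j} + e_{ij},
\qquad u_{j} \sim \mathcal{N}(0, \tau^{2}),\quad e_{ij} \sim \mathcal{N}(0, \sigma^{2})
% reported fixed effects: \gamma_1 = 0.318, \gamma_2 = -9.321, \gamma_3 = -0.105
```

Because a numerically lower chart position means a better rank, the negative coefficients for compressibility and for the interaction correspond to greater chart success.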

In the second step, we added the mean-centered yearly music production index as a second covariate, along with a music production × compressibility interaction. Based on prior auto.ARIMA results, we also included the linear effect of year to account for the trend in the chart position. This multi-level model also showed a good overall model fit, R² = .06, with 4.7% of the variance explained by fixed effects. More compressible songs showed significantly higher rank in the charts, B = -9.353, SE = 0.657, t(df = 14819.95) = 14.23, p < .001. Also, average chart position of songs was higher in years with a greater volume of songs produced, B = 6.141, SE = 1.280, t(df = 53.76) = 4.80, p < .001. Moreover, as Fig 2 indicates, lyrical compressibility was more strongly associated with song success in years with a greater volume of produced songs, compressibility × music production interaction, B = -2.170, SE = 0.648, t(df = 14781.15) = 3.35, p = .001. These analyses yield results consistent with the proposition that lyrically simpler songs enjoy greater success in time periods in which more novel song choices are available.

Fig 2. Confidence bands indicate the 95% interval around the estimate.

Forecasting

As a final step, we generated a forecast for average lyrical compressibility for four decades after the last data point in our time series. This is in keeping with recommendations by Varnum & Grossmann [ 38 ] that papers analyzing past patterns of cultural change provide forecasts for the future. These forecasts enable a test of this theoretical model against concrete future cultural trends. Using the automated ARIMA algorithm, we also identified the best function for the novel song choices data, which we used to estimate the subsequent 40 data points. In turn, we used this estimated data in conjunction with the compressibility function to forecast the further development of lyrical compressibility. Results of this model suggest that lyrical compressibility will continue to increase over the next several decades (see Fig 1 ).
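The two-stage logic described above (first project the predictor forward, then feed those projections into the outcome model) can be sketched as follows. The sketch assumes simple first-order autoregressive recursions with future shocks set to zero. The coefficients and starting values are hypothetical and are not the fitted values from the study.

```python
def forecast_ar1(last_value, phi, intercept, horizon):
    """Iterate y[t] = intercept + phi * y[t-1] forward for `horizon` steps."""
    path, current = [], last_value
    for _ in range(horizon):
        current = intercept + phi * current
        path.append(current)
    return path

def forecast_with_exogenous(last_y, phi, beta, intercept, future_x):
    """Iterate y[t] = intercept + phi * y[t-1] + beta * x[t] over projected x."""
    path, current = [], last_y
    for x in future_x:
        current = intercept + phi * current + beta * x
        path.append(current)
    return path

# Stage 1: project the (hypothetical) novel song choices index forward.
future_production = forecast_ar1(last_value=2.0, phi=0.9, intercept=0.3, horizon=5)
# Stage 2: use those projections to forecast compressibility.
future_compressibility = forecast_with_exogenous(
    last_y=0.60, phi=0.5, beta=0.05, intercept=0.25, future_x=future_production
)
print([round(v, 3) for v in future_compressibility])
```

Uncertainty in the first-stage projection carries over into the second stage, which is one reason forecast bands widen with the horizon.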

Popular music lyrics have recently been used to inform work on the cultural transmission of emotional expression [ 14 , 66 ], as an index of culture-level changes in self- versus other-focus [ 5 ], and as a reflection of cultural mood in response to economic and social threats [ 18 , 19 ]. But one major trend in popular music lyrics remained underexplored and unexplained: popular music lyrics are becoming increasingly simple over time. We reasoned and found support for the hypothesis that increasing lyrical simplicity is associated with increasing amounts of novel music production. That is, in times when more novel music is produced, popular songs become increasingly lyrically simple.

The relationship between mean lyrical compressibility and the amount of novel music produced each year was robust. We observed significant positive associations across three operationalizations of the amount of novel song choices and the average lyrical compressibility of popular songs. Further, the relationship between amount of novel song choices and average compressibility of popular songs remained significant when including a host of ecological, socioecological, and cultural factors linked to other types of cultural change both in univariate and multivariate analyses. By and large these other variables were not significantly associated with changes in lyrical simplicity after controlling for the potentially confounding influence of temporal autocorrelation. Of note, we also observed a significant negative association between changes in pathogen prevalence and lyrical simplicity. This observation suggests a potentially new consequence of infectious disease threat, one that should be explored in more detail in future work.

Importantly, the linkage between amount of new music produced and average compressibility of popular songs also held when accounting for temporal autocorrelation using three distinct methods. Thus, results suggest that the amount of novel music produced contributes to changes in average lyrical compressibility above and beyond other plausible causes and autoregressive trends in the data.

In exploratory analyses, we also found evidence suggesting that success, as indexed by position in the Billboard charts, among popular songs was associated with greater lyrical compressibility. This is broadly consistent with the notion that simpler content enjoys an advantage in memorability and/or transmission. Importantly, this effect appeared to be stronger in years when the amount of novel songs produced was higher, providing conceptual confirmation of our key finding. More novel song choices appear linked to greater average lyrical compressibility of the body of songs that succeeds (i.e., those entering the Billboard chart in a given year), and, among songs entering the charts in a given year, compressibility was more strongly associated with better performance on the chart in years when more novel songs were produced.

This finding might parallel ongoing research taking information-theoretic approaches in exploring communicative efficiency in human language [ 67 , 68 ]. For example, in both language and music, something akin to Zipf’s law seems to be at play [ 2 ], i.e., the frequency of an item is roughly inversely proportional to its frequency rank, such that, in the case of language, many words are quite rare, but a few words (e.g., pronouns) appear with great frequency. Moreover, these more successful (i.e., frequently used) words tend to be shorter in length (but see also Piantadosi et al., 2011 [ 69 ]). This observation dovetails with our finding regarding the success of simpler lyrics. Indeed, the increasing success of simple lyrics may reflect increasing communicative efficacy.
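The rank-frequency relationship behind Zipf’s law can be inspected with a simple word count. The following sketch uses an invented toy sentence; the function name and tokenization rule are illustrative assumptions.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts.most_common(), start=1)]

# Invented toy text; a real check would use a large corpus.
toy_text = "the cat and the dog and the bird"
for rank, word, count in rank_frequency(toy_text):
    print(rank, word, count, "letters:", len(word))
```

Under Zipf’s law, the product of rank and count stays roughly constant across the most frequent words, a pattern that only emerges clearly in corpora far larger than this toy example.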

A preference for simpler information in increasingly information-saturated environments might also be consistent with some propositions from cultural evolutionary theory. One tenet of cumulative cultural evolutionary theory is that human innovation, transmission, and learning increase the amount and quality of cultural information, while also increasing the learnability of this information [ 70 , 25 ]. One way to increase information learnability is via simplicity [ 71 , 72 ], thereby yielding increasingly efficient communication.

The present report adds to two growing bodies of empirical research—work emphasizing the examination of cultural products as a window into cultural-level psychological processes [ 14 , 5 ] and work using time-series methods to test hypotheses regarding the causes of particular patterns of cultural change (for a review see Varnum & Grossmann, 2017 [ 38 ]). Here, we use big data and time series methods to show that increases in the amount of novel songs over time appear to be linked to the increasing simplicity of popular songs’ lyrics, as well as greater success of songs with simpler lyrics. What does this tell us more broadly about how American culture has changed? It suggests potentially that success of aesthetic complexity at the cultural level may be something that shifts over time. Although this is not the first such demonstration of this phenomenon, to our knowledge this is the first attempt to formally evaluate why such cultural-level preferences may change.

Alternative and complementary explanations

Although we found that our key effect was highly robust, alternative or complementary explanations for the growing success of lyrically simpler songs are still possible. For example, changes in the ways that people consume popular music could perhaps affect lyrical simplicity. Technological innovation (e.g., various portable music devices) could play a role, as could other variation in the ways that people interact with music. Relatedly, one might speculate that the success of increasingly simple lyrics might owe to technologically mediated increases in listening to music primarily in the background (e.g., on commutes, in gyms). However, one might easily argue that for generations music has been consumed in this fashion albeit with slightly different technologies—portable radios, car stereos, and portable music players have existed and been widely used for decades. It would be interesting to attempt to assess this question empirically, although we are not currently aware of high-quality time series data relating to how and why people listen to popular music. Moreover, operationalization of these indicators of technological innovations over time would be a potentially thorny problem. For instance, what does it mean to own a Walkman in 1982 as compared to a similar device in 2002? Nonetheless, it would be intriguing to assess these questions in future work.

Another possibility is that the length of songs may have changed over time affecting average lyrical complexity. Thus, perhaps song lyrics are more compressible by virtue of songs becoming shorter. However, a recent analysis of songs entering the Billboard charts over the course of its history suggests, in fact, that the average song on the charts in the late 2010’s was somewhat longer than those in the 1950’s and 1960’s, and similar in recent years to levels observed in the 1970’s [ 73 ]. Thus, this alternative explanation cannot account for the trends observed in the present analyses.

One might alternatively speculate that the rise in lyrical simplicity observed in the present data might be related to trends in the popularity of different musical genres. Indeed, although this is beyond the scope of the present work, it would be interesting to empirically assess how lyrical complexity varies across popular music genres and whether trends within these genres over time have been similar. Further, future work might assess whether the linkage between lyrical simplicity and song success observed in our exploratory analyses varies within genres of popular music or if genres that are on average simpler enjoy greater success in times of more music production.

Limitations

It is worth noting that our analysis was restricted to a single type of cultural product. It might be the case that empirical analysis of other domains might show similar trends and a similar relationship between amount of novel content and success of simpler content, or it may be that different dynamics are observed when considering television shows, videogames, or other types of cultural products. For example, many have argued that television shows have become more complex and intellectually stimulating in the past few decades, entering the so-called “Golden Age of Television.” However, empirical work examining complexity over time in other types of cultural products, including movies, news broadcasts, print newspapers, novels, and political speech, suggests that there is in fact a broad trend toward simpler content being increasingly preferred, at least when it comes to the language used in these products [ 74 ]. It is noteworthy that Jordan and colleagues (2019) used a different measure of complexity, in this case the use of a specific set of words indicating cognitive complexity, and that they find that the strength of the decline in complexity varies across different types of cultural products. Hence, future research may attempt to conceptually replicate our work by assessing compressibility of other types of cultural products over time and whether the success of such products is linked to the number of options or alternatives within that domain.

It is also worth noting that, in the present work, we assessed the simplicity of lyrics. Songs might be complex or simple in other ways as well, in terms of rhythm, melody, number of instruments played, and so on. Analysis of these features is beyond the scope of the present work, but it would be interesting to see the extent to which similar or divergent patterns are observed in these facets of successful popular music over time.

Our analysis was also limited to songs that were relatively successful over time—i.e., those that made the Billboard Hot 100 chart. This sample is quite large ( N > 14,000), but it may not be representative of all songs produced during this period. Further, we were able to successfully scrape a greater proportion of more recent rather than older songs, which we included in control analyses. Our sample captures a large chunk of popular music produced during more than half a century and enables tests regarding linkages between novel music choices, lyrical simplicity, and song success. A slightly different conceptual question may be worthwhile addressing in future work: Does average complexity of all music produced change along with shifts in the amount of music produced?

Our work is also limited by the fact that song success was operationalized by commercial success in the US market. Although some cultural shifts in the past several decades appear to be global in nature, such as rising individualism [ 36 ], this need not be the case for all dimensions of culture. Different dynamics may potentially be observed in terms of song success in parts of the world with different values, practices, and ecological conditions. Although such an endeavor is beyond the scope of the present manuscript largely due to the lack of equally rich time series data from other countries, it would be worthwhile to try to address this question in the future.

Finally, the present work is limited by its correlational nature. Although our findings appeared quite robust across different operationalizations of the independent variable—when accounting for autocorrelation in various ways, and when controlling for a host of plausible ecological, socioecological factors, and cultural values which have shifted over time—we cannot completely rule out all alternative explanations for increasing success of songs with simpler lyrics. Future work might attempt to quantify society level time series trends in conformity or other biases linked to lyrical affect and music sampling [ 14 , 75 ], and assess whether the present findings hold when controlling for these variables as well. Future work may also use in-lab methods to explore and disentangle the possible causal mechanisms underlying the link between amount of novel song choices and success of songs with simpler lyrics. For example, transmission chain methods [ 76 ] could be employed to explore whether participants might find simpler lyrics more pleasing and memorable when there is a greater number of other song-snippets competing for attention versus when there is not.

Why have the lyrics of pop songs become simpler over time? Our findings suggest that the answer may have to do with the proliferation of new songs available to consumers. The present work represents one of the first attempts to use big data and time series methods to quantify temporal shifts in information transmission dynamics at the societal level. Future work may attempt to replicate and extend these findings into other types of complexity and other types of cultural products.

Supporting information

Funding statement.

The author(s) received no specific funding for this work.

Data Availability

  • PLoS One. 2021; 16(1): e0244576.

Decision Letter 0

17 Sep 2020

PONE-D-20-20631

People prefer simpler content when there are more choices: A time series analysis of lyrical complexity in six decades of American popular music

Dear Dr. Varnum,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. 

The two reviewers provide constructive and partially overlapping comments on your framing and the analyses. I strongly encourage you to consider the additional analyses and validity checks proposed by reviewer 1 as well as addressing the conceptual questions raised by both reviewers 1 and 2.

I am also wondering whether genre and the proliferation and diversification of genres over the last century may partially be responsible for some of these effects. To what extent do these trends occur within genres or over the careers of artists/groups? Do novel genres have an advantage over more established genres? Greater attention to genres of music as well as trends for the same agent (singer/songwriter, performer) may help to address some of the conceptual issues identified by the reviewers.

Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 01 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see:  http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Ronald Fischer

Academic Editor

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information .

3. Please remove your figures from within your manuscript file, leaving only the individual TIFF/EPS image files, uploaded separately.  These will be automatically included in the reviewers’ PDF.

4. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary).

Additional Editor Comments (if provided):

This is an innovative and thought-provoking article. The two reviewers provide constructive and partially overlapping comments on your framing and the analyses. I strongly encourage you to consider the additional analyses and validity checks proposed by reviewer 1, and to address the conceptual questions raised by both reviewers.

I am also wondering whether genre and the proliferation and diversification of genres over the last century may partially be responsible for some of these effects. To what extent do these trends occur within genres or over the careers of artists/groups? Do novel genres have an advantage over more established genres? Greater attention to genres of music as well as trends for the same agent (singer/songwriter, performer) may help to address some of the conceptual issues raised.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This study explores a trend towards greater compressibility of US song lyrics, which became more repetitive over the last 6 decades. The authors test the claim that this trend is due to an increase in the variety of songs on offer. The results show that novelty in music production (henceforth "musical novelty") is a significant predictor of lyrics compressibility, even when controlling, separately, for temporal autocorrelation on the one hand, and for a host of potential confounds on the other hand.

This is an exciting and innovating study, correctly done overall, and demonstrating an intriguing and non-trivial phenomenon: song lyrics become more repetitive over time. The use of a future-oriented predictive model is particularly appreciated. If the paper merely demonstrated and explored this trend I would have no reservations about it. My main concern comes from the causal hypothesis that the study puts forward to explain the trend.

The results only partially support the authors' claims. First, because the study fails to test a set of competing explanations that seem more plausible to me than the one put forward. They are detailed below. Second, because the claim that novel music production predicts lyric compressibility above other predictors (p. 18, "the amount of novel music produced contributes to changes in average lyrical compressibility above and beyond other plausible causes") is not demonstrated or even suggested by the data. Third, no evidence is given for the contention that more compressible songs are more likely to be successful, when there is more choice (in the authors' own data or elsewhere).

1. Alternative explanations

An explanation that is alluded to in one paragraph of the discussion (p. 21) but not followed through is that song lyrics became simpler and more repetitive because listening to music became something that people did while doing other things and often without paying any particular attention (in supermarkets, elevators, bars, etc., no longer just concert halls or standing on street corners). This would readily explain why lyrics become simpler: because songs no longer have the listeners' undivided attention. This explanation is entirely distinct from the hypothesised effect of musical novelty: it is about changes in music consumption, not about changes in music production. Even so, it is coherent with the pattern of results presented here. Arguably the musical industry produced increasingly many songs because demand grew, and demand grew because people took to listening to music in circumstances where they did not use to. Changes in media of diffusion (e.g. from sheet music to radio) are an obvious and related explanation. Unless we assume that these two hypotheses are somehow equivalent or interchangeable, one cannot claim that growing musical novelty caused the observed trend without ruling out this alternative account.

One may also worry about a possible selection bias. As explained in the supplementary materials, the study selected roughly half the songs that appeared in the charts for textual analysis, due to difficulties in finding good textual data for other songs. This raises the possibility that a selection bias might explain the observed trend. It is possible that text data is better for later songs: that our documentation for 2000s hits is better than it is for 1960s hits. It is possible that songs with less compressible lyrics are more likely to be documented, because they are more interesting, lyrics-wise, and more worthy of attention. If these two conditions obtained they would suffice to produce an apparent decrease in complexity that would be entirely due to a preservation bias. Lyric complexity would not actually decrease through time for unrecorded song lyrics. I am not saying that this is what happened, but this explanation is easy to rule out (just show that the proportion of hit songs with undocumented lyrics does not change through time, or that such changes, if they occur, do not explain away the trend you observe). Relatedly, more detail on the selection of song lyrics to be analysed would be welcome: what the criteria for inclusion were, whether there was any stopping rule for data collection, etc.
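The check suggested above can be sketched as follows. This is a minimal illustration only: the records, the year values, and the function name `coverage_by_year` are hypothetical and do not come from the manuscript or its supplement.

```python
from collections import defaultdict

# Hypothetical records: (chart_year, lyrics_found). Illustrative values only,
# not data from the manuscript.
chart_entries = [
    (1965, True), (1965, False), (1965, False), (1965, True),
    (1985, True), (1985, True), (1985, False), (1985, True),
    (2005, True), (2005, True), (2005, True), (2005, False),
]


def coverage_by_year(entries):
    """Return {year: share of charting songs whose lyrics were retrieved}."""
    found = defaultdict(int)
    total = defaultdict(int)
    for year, has_lyrics in entries:
        total[year] += 1
        found[year] += int(has_lyrics)
    return {year: found[year] / total[year] for year in sorted(total)}


coverage = coverage_by_year(chart_entries)
for year, share in coverage.items():
    print(year, f"{share:.2f}")
```

If the share of documented songs drifts over time, it could then be entered as a control variable in the main analysis.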

2. Is novel music production a better predictor of lyric compressibility than other predictors?

The results do not establish that musical novelty is a better predictor of lyrics compressibility compared to other possible predictors studied here. Several indicators show a higher correlation with lyrics compressibility, among them (judging by Fig. 1) GDP per capita, population size, and (with an inverse correlation) residential mobility. (Although I don't know what would happen to these correlations after autocorrelation is taken into account.) To sustain the claim that musical novelty is a better predictor of lyric compressibility than other candidates, running partial correlations is not sufficient. Partial correlations merely show that the correlation between lyrics compressibility and musical novelty is robust when variable X is taken into account, but it could still be the case that variable X does better, as a predictor of lyrics compressibility, than musical novelty does.

Relatedly, it is not clear whether the correlation between lyrics complexity and musical novelty would still hold once all important confounds are controlled for *together*, and not just separately as done here. The choice of analysis that was made for this study (taking years as data points) does not allow this to be shown (too few data points), but a nested regression taking songs as data points instead of years might allow the authors to demonstrate this (with due attention being paid to multicollinearity). Alternatively, the authors could reduce all the potential confounds (all factors listed in Fig. 1 except Lyric compressibility, Music production, and Year) to one super-factor, with a PCA. Showing that the correlation between lyrics complexity and musical novelty holds when doing a partial correlation controlling for this super-factor would help make the authors' point.
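The distinction drawn in this section can be made concrete with a small sketch. It assumes the standard first-order partial correlation formula and uses Pearson's r purely to keep the arithmetic short; the three series and their names are hypothetical, and nothing here reproduces the authors' analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical yearly values, for illustration only.
confound = [1, 2, 3, 4, 5, 6]                     # e.g. a competing predictor
novelty = [2, 1, 4, 3, 6, 5]                      # e.g. a music production index
compressibility = [1.4, 1.8, 3.0, 3.6, 5.2, 6.0]  # outcome

r_novelty = pearson_r(novelty, compressibility)
r_confound = pearson_r(confound, compressibility)
partial = partial_r(novelty, compressibility, confound)

print(f"novelty vs outcome:        {r_novelty:.3f}")
print(f"confound vs outcome:       {r_confound:.3f}")
print(f"novelty | confound (part): {partial:.3f}")
```

In this toy data the partial correlation for the novelty series stays clearly positive even though the competing series has the higher zero-order correlation with the outcome, which is the situation the reviewer describes.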

3. Missing evidence of greater success for simpler songs

On p. 3–4, the study justifies the hypothesis to be tested on the grounds that people generally prefer simpler content to more complex content, especially when the choice is broad. This debatable claim is made by analogy with results in social psychology and experimental economics which in my view are not clearly relevant to the material being studied here. The similarity between a simple economic decision (e.g. a financial product that is easy to understand, as in Iyengar & Kamenica 2010) and a repetitive song seems quite remote to me. Still, this view makes one clear prediction: more compressible songs should be more commercially successful than less compressible ones, at least when there is a lot of choice. The paper seems to endorse this point but does not cite any evidence for it. It would be easy to answer this question by comparing Billboard hit songs with non-hits and controlling for various other factors.

Minor comments:

One possible confound that is (in my view) unlikely to explain the study's correlations but is easy to control for and should be ruled out, is song length: given the measurement of compressibility, I suspect song length will strongly impact compressibility, and if there is any trend in time towards shorter or longer song this might confound the observed trends.

The legend for figure 1 says that the correlations between variables are given as Kendall's tau, but I doubt it for two reasons. 1: The value given in the figure for the correlation between the Music Production index and Lyric Compressibility is .88, which does not correspond to the value reported in the main text (Kendall’s τ = .714), but does correspond to the Pearson's r correlation given in the markdown file (Pearson's r = .87723). 2: In the source code for the figure the method for the correlation is not specified (the command is cor(years, use="pairwise.complete.obs")). I suspect R defaults to method = "pearson" when method isn't specified. Please clarify and correct if needed.
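For context, R's cor() does use method = "pearson" unless another method is passed. The sketch below, written in plain Python with hand-rolled helpers rather than the authors' R code, shows how far the two coefficients can diverge on the same data. The toy values are hypothetical and the Kendall helper assumes no tied values.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Assumes there are no tied values in x or y.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy series, not values from the manuscript.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

r = pearson_r(x, y)
tau = kendall_tau(x, y)
print(f"Pearson r   = {r:.3f}")
print(f"Kendall tau = {tau:.3f}")
```

A figure labelled as Kendall's tau but computed with the default method would therefore report noticeably different values from the text.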

Correlations are occasionally (exceptionally) given using Pearson's r (p. 10, also p. 14 when reporting the results for Tiokhin-Hruschka method). The authors note that this parametric correlation is inappropriate since time-series data are not normally distributed. Please remove mentions of Pearson's r or uses of it in reporting results. I recommend paying special attention to results on the Tiokhin-Hruschka method when doing so. See also the above comment regarding Fig. 1.

p. 16 AIC stands for Akaike's Information criterion (not Aikeke).

p. 20 This passage of the discussion alludes to a section of the supplementary materials that I could not find: "the aim of the present work was to understand what shapes the success of cultural products over time, rather than to use the broadest possible set of cultural products as a way to gain insight into other phenomena at the population level (see supplement for an extended discussion of this issue)."

Reviewer #2: This paper presents an analysis of why pop music in the US has become lyrically simpler over time, testing the hypothesis that the trend is driven by an expansion in the number of available song choices. This is tested by quantifying lyrical simplicity using a metric of information compressibility (LZ77 compression algorithm) over thousands of songs, and correlating this measure with estimates of the number of new songs in each year. The results support the hypothesis: large correlations between the measures.
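As a rough illustration of a compression-based simplicity measure, the sketch below uses Python's zlib module. This is a stand-in, not the authors' measure: zlib implements DEFLATE, which combines LZ77 with Huffman coding, and the exact compressibility index used in the manuscript is not defined in this letter. The text is invented, and `compression_ratio` is a hypothetical helper name.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Original size divided by compressed size; higher means more compressible."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw, 9))


# Invented text for illustration, not real lyrics.
chorus = "la la la, oh oh oh, la la la\n"
repetitive = chorus * 12
varied = (
    "Grey harbours wake beneath a copper moon,\n"
    "while quiet engines hum through frozen fields,\n"
    "and distant villages exchange bright signals\n"
    "across the jagged, unfamiliar ridge.\n"
)

print(f"repetitive: {compression_ratio(repetitive):.2f}")
print(f"varied:     {compression_ratio(varied):.2f}")
# Same chorus with twice as many repeats: the ratio rises with length alone.
print(f"longer:     {compression_ratio(chorus * 24):.2f}")
```

The last line also shows why Reviewer #1's point about song length matters: with a ratio of this kind, a longer repetitive text scores as more compressible even when its content is unchanged.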

The paper is well written and the analyses are sound and generally appropriately interpreted. The ‘multiverse’-style analysis approach is also helpful in that it provides converging different approaches. The results will be of interest to people in the psychology of music, cultural evolution, and the general public as well.

Here are a few suggestions for a revision:

(1) What songs are most popular and make it to Billboard is not unrelated to preferences, but also not that tight of a measure of people’s self-directed listening behaviours and preference for music, as is implied by the use of "preferences" throughout the paper. For instance, radio plays are influenced by advertisers, independently of people's preferences for songs. A tighter claim to make is that, as more music becomes available, simpler songs are more memorable and/or dispersible than more complicated ones. Whether and how this is related to claims in the manuscript about people’s music preferences changing based on Kahneman-esque heuristics being deployed due to increased cognitive load (Intro, pages 4 and 5) and/or interpreting these changes in lyrical trends as indicating changes in emotional expression (if this is what the abstract framing + discussion is implying? E.g., in “What does this tell us more broadly about how American culture has changed?”) is more up for debate, I think. This is an easy fix: just need to clarify the interpretation in the paper a bit more.

(2) The manuscript is clear that the correlational data doesn’t justify claims about causality, but it would be helpful to tighten up the areas where an interpretative claim is being made. Might the direction of causality be backwards? Songs that are simple could be easier to produce, so as artists realize they can produce simpler styles, maybe they produce more of them? There are plenty of other explanations here that would be good to discuss. For instance, maybe memorability is a big driver in what songs get a lot of radio plays, where memorability is a different aspect of music perception than preference.

(3) There may be some interesting parallels to be drawn between these results and ongoing research in how languages more generally are shaped by communicative efficiency (see for review: Gibson et al., 2019, TICS). Namely, the primary measure of simplicity of lyrics is sensitive to word length. Zipf’s law describes the frequency structure of words in a language as being related to word length (e.g., Piantadosi, 2014, Psychonomic Bulletin & Review), although more recent work shows that information content of words is a better predictor of word length than frequency-rank (Piantadosi et al., 2011, PNAS): in other words, more predictable words tend to be shorter. Something like Zipf's law is at work in music (see Levitin et al., 2012, PNAS; Mehr et al., 2019, Science) and so this connection with information-theoretic notions of communication would be productive. (It also fits neatly with how lyrical simplicity is quantified with LZ77).

(4) To what extent is variance in lyrical compressibility in these data mediated by the distribution of genres within the presented dataset? Electronic/dance music often has highly simple repetitive lyrics as a defining feature, for example, more so than, e.g., jazz lyrics. Perhaps one of the reasons for the popularity of electronic/dance genres within the broader popular music space may relate to this claimed attraction toward simplicity of lyrics. But the deeper point is then to ask how much of the variance in lyrical compressibility is stemming from a general trend across popular music genres and how much is contributed by relative shifts in other stylistic factors (that may be correlated with greater lyrical compressibility for additional reasons). Disentangling this is probably difficult, but I feel like it could be discussed.

For the predictions about the lyrical compressibility of future popular music, some comments about the bounds in which such extrapolation is valid/meaningful would be helpful. What does it mean for music to have an average compressibility index of ~1.225 by 2050 (as compared to the current average of ~1.1)? What are reasonable bounds of compressibility that things might plateau at?

Please check references, as at least one in-text citation was not in the end references (Steegen et al., 2016).

Mehr & Krasnow 2017 is a bit of a funny citation for "music is a human universal". I think better might be Mehr et al., 2019, Science and/or the new BBS theoretical treatment (https://doi.org/10.1017/S0140525X20000345).

A reference about how lyrics play an important part in people’s listening habits may be helpful. For instance, this paper based on Spotify listening data would be a helpful citation: http://archives.ismir.net/ismir2018/paper/000098.pdf

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Author response to Decision Letter 0

15 Oct 2020

Dear Dr. Fischer,

We appreciate your inviting the revision of our manuscript, now entitled “Why are song lyrics becoming simpler? A time series analysis of lyrical complexity in six decades of American popular music.”

To remind you of the contribution, briefly, we explore the surprising trend that popular songs are becoming increasingly simple. We reason that the increasing production of novel songs may drive this phenomenon and test this association, finding a robust link. We situate this finding in the growing bodies of work using song lyrics to assess culture-level phenomena and work using time series analysis to understand drivers of cultural change.

We see this work as being of interest not only to those interested in social or cultural psychology, but also to those studying communication, cognitive science, and music, as well as to the lay public.

Below, we detail the changes made to this revision in line with the reviews, point-by-point, including a significant number of additional analyses. You will find critiques in plain text, with our replies italicized below. We have also highlighted major changes in the revised manuscript file in yellow for your convenience.

This is an exciting and innovating study, correctly done overall, and demonstrating an intriguing and non-trivial phenomenon.

We thank the reviewer for their enthusiasm for the work.

We have run a significant number of new analyses to address this. In particular, we comprehensively address the reviewer’s second and third points, finding that novel song production is a robust predictor of lyrical simplicity even over and above a host of other ecological and cultural predictors (see Tables 1 and 2), including in new multivariate analyses (see page 19 “Multivariate analyses”), and showing new evidence that this relationship between song success per se (as indexed by a song’s position on the Billboard chart) and novel song production is strongest in years when there are more novel songs produced (see pages 19-21 “Exploratory song-level analyses”).

The reviewer also raised two competing hypotheses:

An explanation that is alluded to in one paragraph of the discussion (p. 21) but not followed through is that song lyrics became simpler and more repetitive because listening to music became something that people did while doing other things and often without paying any particular attention (in supermarkets, elevators, bars, etc., no longer just concert halls or standing on street corners). This would readily explain why lyrics become simpler: because songs no longer have the listeners' undivided attention. This explanation is entirely distinct from the hypothesised effect of musical novelty: it is about changes in music consumption, not about changes in music production.

We now address this point at even greater length in the discussion section (see page 25, first full paragraph), noting that, in particular, technology-mediated changes may influence music consumption practices. However, we respectfully disagree that changes in listener attention are likely to cause the shift in lyrical simplicity seen here; for example, people have listened to music in their cars for decades, portable music players have been available for decades, and music has been featured as the background noise in various entertainment establishments for decades. Further, although this is an interesting avenue for future research, we feel it is beyond the scope of the present work to assess music listening patterns, for reasons described on page 25 (first full paragraph).

One may also worry about a possible selection bias. As explained in the supplementary materials, the study selected roughly half the songs that appeared in the charts for textual analysis, due to difficulties in finding good textual data for other songs. This raises the possibility that a selection bias might explain the observed trend. It is possible that text data is better for later songs: that our documentation for 2000s hits is better than it is for 1960s hits. It is possible that songs with less compressible lyrics are more likely to be documented, because they are more interesting, lyrics-wise, and more worthy of attention.

In order to address this point, we conducted analyses that controlled for the percentage of charting songs for which lyrics could be successfully scraped (see pages 18-19, “Robustness analyses: Controlling for percentage of scraped songs”). Our key relationship held when controlling for this possibility.

Relatedly, more detail on the selection of song lyrics to be analysed would be welcome: what the criteria for inclusion were, whether there was any stopping rule for data collection, etc.

Additional details regarding the processing of song lyrics can be found on pages 2-3 of the Supporting Information.

Is novel music production a better predictor of lyric compressibility than other predictors?... The results do not establish that musical novelty is a better predictor of lyrics compressibility compared to other possible predictors studied here. Several indicators show a higher correlation with lyrics compressibility, among them (judging by Fig. 1) GDP per capita, population size, and (with an inverse correlation) residential mobility. (Although I don't know what would happen to these correlations after autocorrelation is taken into account.) To sustain the claim that musical novelty is a better predictor of lyric compressibility than other candidates, running partial correlations is not sufficient. Partial correlations merely show that the correlation between lyrics compressibility and musical novelty is robust when variable X is taken into account, but it could still be the case that variable X does better, as a predictor of lyrics compressibility, than musical novelty does.

We understand the reviewer’s concern; however, we note that we do not claim that musical novelty is the best predictor of average lyrical compressibility. That said, we believe that new analyses in which we look at detrended relationships between all putative predictors and average lyrical compressibility suggest that it is one of only three significant predictors, and the only one for which we frankly had an a priori hypothesis when we began the work. We attempt no interpretation of the negative relationship between conservatism and compressibility, and we do talk briefly about the negative relationship between pathogens and compressibility, which we suggest should be followed up on in the future. Again, our focus was on testing our a priori hypotheses about ONE possible driver of growing lyrical simplicity, hence we focus on this in the present manuscript.
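To illustrate why detrending matters for yearly series, here is a minimal sketch. It assumes first-differencing as the detrending step and uses Pearson's r only for brevity; the detrending procedure actually used in the manuscript is not described in this letter, and both series are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_difference(series):
    """Year-to-year changes, one simple way to remove a shared trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical yearly series that both drift upward over time.
series_a = [1, 3, 2, 5, 4, 7, 6, 9]
series_b = [2, 1, 4, 3, 6, 5, 8, 7]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(first_difference(series_a), first_difference(series_b))

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

In this toy case the raw series correlate positively because both rise over time, while their year-to-year changes move in opposite directions, so a raw correlation between trending series says little on its own.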

We are grateful to the reviewer for this suggestion. Our new multivariate analyses follow these suggestions (p.19) and find that our key effect holds. Taken together we believe we have a great deal of evidence for the robustness of our key finding and we are grateful to the reviewer for helping strengthen the rigor of the manuscript.

Missing evidence of greater success for simpler songs: On p. 3–4, the study justifies the hypothesis to be tested on the grounds that people generally prefer simpler content to more complex content, especially when the choice is broad. This debatable claim is made by analogy with results in social psychology and experimental economics which in my view are not clearly relevant to the material being studied here. The similarity between a simple economic decision (e.g. a financial product that is easy to understand, as in Iyengar & Kamenica 2010) and a repetitive song seems quite remote to me. Still, this view makes one clear prediction: more compressible songs should be more commercially successful than less compressible ones, at least when there is a lot of choice. The paper seems to endorse this point but does not cite any evidence for it. It would be easy to answer this question by comparing Billboard hit songs with non-hits and controlling for various other factors.

Great point! We’ve taken this advice to heart (see pages 19-20, “Exploratory Song-level analyses”) and we do find empirical support for this claim. Namely, among Billboard charting songs, those that are more compressible are more successful. Further, this relationship is stronger in years in which more novel songs are produced. We thank the reviewer for suggesting this, and we believe that the rigor of the manuscript and the fit between the evidence and the rationale in the introduction have been enhanced as a result.

Minor comments

- One possible confound that is (in my view) unlikely to explain the study's correlations but is easy to control for and should be ruled out, is song length: given the measurement of compressibility, I suspect song length will strongly impact compressibility, and if there is any trend in time towards shorter or longer song this might confound the observed trends.

We now address this possibility in the discussion section. Based on empirical findings regarding song length of Billboard charting songs, we do not feel that this alternative explanation can explain our observations. See below (from page 26):

“Another possibility is that the length of songs may have changed over time affecting average lyrical complexity. Thus, perhaps song lyrics are more compressible by virtue of songs becoming shorter. However, a recent analysis of songs entering the Billboard charts over the course of its history suggests, in fact, that the average song on the charts in the late 2010’s was somewhat longer than those in the 1950’s and 1960’s, and similar in recent years to levels observed in the 1970’s (Bannister, 2017). Thus, this alternative explanation cannot account for the trends observed in the present analyses.”

- The legend for figure 1 says that the correlations between variables are given as Kendall's tau, but I doubt it for two reasons. 1: The value given in the figure for the correlation between the Music Production index and Lyric Compressibility is .88, which does not correspond to the value reported in the main text (Kendall’s τ = .714), but does correspond to the Pearson's r correlation given in the markdown file (Pearson's r = .87723). 2: In the source code for the figure the method for the correlation is not specified (the command is cor(years, use="pairwise.complete.obs")). I suspect R defaults to method = "pearson" when method isn't specified. Please clarify and correct if needed.

We are grateful to the reviewer for catching this error. This has now been corrected in Table S1, which reports Kendall’s tau values instead of Pearson’s r values.

We understand the reviewer’s concern here. However, we note that the Tiokhin-Hruschka procedure can only produce corrected significance thresholds for Pearson’s r at present. We have opted to leave these results in, in the spirit of a multiverse approach. Importantly, this is only one of the approaches used to account for autocorrelation, and we get converging inferences using these different approaches. However, if the editor wishes, we are happy to move this section to the supplement or to OSF as a supporting file.

Again, we are grateful to the reviewer for catching the error. It is now corrected.

We discuss this issue on pages 27-28 of the revised manuscript and on pages 3-4 of the revised supplement. We hope this discussion is sufficient.

What songs are most popular and make it to Billboard is not unrelated to preferences, but also not that tight of a measure of people’s self-directed listening behaviours and preference for music, as is implied by the use of "preferences" throughout the paper. For instance, radio plays are influenced by advertisers, independently of people's preferences for songs. A tighter claim to make is that, as more music becomes available, simpler songs are more memorable and/or dispersible than more complicated ones. Whether and how this is related to claims in the manuscript about people’s music preferences changing based on Kahneman-esque heuristics being deployed due to increased cognitive load (Intro, pages 4 and 5) and/or interpreting these changes in lyrical trends as indicating changes in emotional expression (if this is what the abstract framing + discussion is implying? E.g., in “What does this tell us more broadly about how American culture has changed?”) is more up for debate, I think. This is an easy fix: just need to clarify the interpretation in the paper a bit more.

We have addressed this issue in line with the reviewer’s helpful comment; namely we clarify the interpretation in the present revision.

The manuscript is clear that the correlational data doesn’t justify claims about causality, but it would be helpful to tighten up the areas where an interpretative claim is being made. Might the direction of causality be backwards? Songs that are simple could be easier to produce, so as artists realize they can produce simpler styles, maybe they produce more of them? There are plenty of other explanations here that would be good to discuss. For instance, maybe memorability is a big driver in what songs get a lot of radio plays, where memorability is a different aspect of music perception than preference.

We agree that causal inference is inherently limited when analyzing this type of data. We have tried throughout the revised manuscript to be cautious in terms of causal and mechanistic claims, especially in the revised discussion section. We have also added several new analyses (see replies to reviewer 1 for details) that we hope do strengthen the inferences made, although again stopping short of claiming to show causality.

There may be some interesting parallels to be drawn between these results and ongoing research in how languages more generally are shaped by communicative efficiency (see for review: Gibson et al., 2019, TICS). Namely, the primary measure of simplicity of lyrics is sensitive to word length.

Zipf’s law describes the frequency structure of words in a language as being related to word length (e.g., Piantadosi, 2014, Psychonomic Bulletin & Review), although more recent work shows that information content of words is a better predictor of word length than frequency-rank (Piantadosi et al., 2011, PNAS): in other words, more predictable words tend to be shorter. Something like Zipf's law is at work in music (see Levitin et al., 2012, PNAS; Mehr et al., 2019, Science) and so this connection with information-theoretic notions of communication would be productive. (It also fits neatly with how lyrical simplicity is quantified with LZ77.)

We thank the reviewer for pointing out this interesting parallel, which we now treat at some length in the Discussion (pages 23-24). We additionally link the present data and this work to another area of literature dealing with cultural evolution and communicative efficiency:

Minor comments: For the predictions about the lyrical compressibility of future popular music, some comments about the bounds in which such extrapolation is valid/meaningful would be helpful. What does it mean for music to have an average compressibility index of ~1.225 by 2050 (as compared to the current average of ~1.1)? What are reasonable bounds of compressibility that things might plateau at?

We are grateful for this insightful set of suggestions. We have now added the following description, which we hope helps guide the reader’s intuitions (pg. 7): “A score of 0 means no compression was possible (e.g. if the input were random noise), a score of 1 means a 50% reduction in size, a score of 2 means a 75% reduction, and so on.”

Further, there is a theoretical upper limit on compressibility score for any given length. The most repetitive possible song of length n would be a single letter repeated n times, and it would have a score of (log n) - 2. But this is so far from the reality of the data as to not be very interesting.
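The quoted mapping (0 means no reduction, 1 means 50%, 2 means 75%) is what one would get if the score were the base-2 logarithm of the ratio of original size to compressed size. The sketch below illustrates that reading only. It is not the authors' implementation: the function names are hypothetical, and Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) stands in for the LZ77 compressor mentioned in the review, so the scores will not match the paper's values or its (log n) - 2 bound.

```python
import math
import zlib


def compressibility_score(text: str) -> float:
    """Return log2(original size / compressed size) for the given text.

    Assumption: zlib is used as a convenient stand-in compressor; the
    paper's own LZ77-based procedure may differ in detail.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    compressed = zlib.compress(raw, 9)
    return math.log2(len(raw) / len(compressed))


def size_reduction(score: float) -> float:
    """Convert a score into the fraction of the original size removed."""
    return 1.0 - 2.0 ** (-score)


if __name__ == "__main__":
    # Reproduces the mapping quoted in the response: 0 -> 0%, 1 -> 50%, 2 -> 75%.
    for s in (0, 1, 2):
        print(f"score {s}: {size_reduction(s):.0%} reduction")

    # A highly repetitive string compresses far better than a short varied one.
    print(compressibility_score("la " * 300))
    print(compressibility_score("The quick brown fox jumps over the lazy dog."))
```

Under this reading, the extrapolated index of ~1.225 raised by the reviewer would correspond to `size_reduction(1.225)`, a reduction of roughly 57%, compared with about 53% at the current ~1.1.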

Please check references, as at least one in-text citation was not in the end references (Steegen et al., 2016).

We have now double checked the reference list and it should now match all in text citations. Thanks to the reviewer for catching this!

We agree and have switched the citation to Mehr et al., 2019.

AE Decision Letter

These are good points. We agree that genre would be an interesting avenue for future exploration, and we now include an extended discussion of this issue in the revised discussion section (page 26). Tracking the course of an individual artist’s output would also be an intriguing possibility; however, we would be dealing with small N’s for most artists and with potential confounds having to do with the aging process (i.e. executive function decline with age) that would be difficult to disentangle from broader cultural forces. We hope that with the additional analyses, revisions, and explication now provided, the reviewers’ points are largely addressed, even though we did not opt to attempt analyses by genre or within artist. We hope that you will agree that the new analyses reported in the revision are in fact sufficient to address most major concerns.

In sum, we believe that we have addressed all major points raised by reviewers, and that the present revision is suitable for publication in PLOS ONE. We are grateful to the two reviewers and to yourself for the insightful feedback and critique. We believe the manuscript has improved tremendously as a result. We look forward to your reply.

Decision Letter 1

14 Dec 2020

Why are song lyrics becoming simpler? A time series analysis of lyrical complexity in six decades of American popular music

PONE-D-20-20631R1

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Additional Editor Comments (optional):

Congratulations, I recommend your article for publication to the Editor in Chief.

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?

3. Has the statistical analysis been performed appropriately and rigorously?

4. Have the authors made all data underlying the findings in their manuscript fully available?

5. Is the manuscript presented in an intelligible fashion and written in standard English?

6. Review Comments to the Author

Reviewer #1: All my comments were addressed more than satisfactorily. The authors are to be congratulated for this excellent contribution!

Reviewer #2: (No Response)

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Reviewer #1:  Yes:  Olivier Morin

Acceptance letter

18 Dec 2020

Dear Dr. Varnum:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ronald Fischer

March 28, 2024

Song Lyrics Really Are Getting Simpler and More Repetitive, Study Finds

An assessment of hundreds of thousands of songs confirms that choruses and hooks have taken over—but simpler isn’t necessarily worse

By Lauren Leffer

When comparing today’s hit tunes with the top 40 of past decades, strong opinions are never in short supply. Every generation seems to lament its successor’s musical tastes and listening habits. Though science can’t necessarily account for such subjective preferences or generational divides, new research suggests popular music has indeed undergone some measurable and significant shifts over the past 50 years, with popular song lyrics becoming simpler and more repetitive, according to a study published on Thursday in Scientific Reports.

“There’s more rhyming lines and also more chorus,” says the study’s senior author Eva Zangerle, a computer scientist at Austria’s University of Innsbruck, who has developed music recommendation algorithms. “We basically found that lyrics [have gotten] easier to comprehend.” This trend, observed across five of the most popular English-language music genres (pop, rock, rap, R&B and country) since 1970, hints at how shifts in music listening habits, platforms and production may be shaping pop culture.

Zangerle and her co-authors compiled lyrics from 353,320 well-known songs released between 1970 and 2020. They used machine learning to single out these songs’ key linguistic features, such as the ratio of repeated words, the types of emotional cues, a readability score and the richness of vocabulary. Then they developed and trained additional models to sort and analyze those features across years in a representative subset of 12,000 songs.

The researchers discovered that the ratio of repeated to nonrepeated lines has ticked up across genres over the years, nearly doubling for pop songs and rising even more rapidly for rap music. The ratio of choruses to other song sections rose as well. The study also found that contemporary music lyrics convey more negative emotions and fewer positive ones than in the past and that songs have become more personal, with a higher frequency of pronouns such as “I” and “me”, echoing past research findings about shifts in lyrical content.

“I thought this study was really cool,” says Michael Varnum, a cultural psychologist at Arizona State University, who was not involved in the new research. Varnum has previously studied popular music and detected similar declines in lyrical complexity over time. This new work replicates those findings with a more inclusive sample, he says, by surveying an order of magnitude more songs, comparing across genres and looking beyond the biggest chart-toppers to assess the larger landscape of what people listen to.

Yet even this study’s expansive dataset isn’t completely comprehensive. Its focus on English-language songs—compiled from the online platform Genius—means it is inherently biased toward the Western cultural canon, Zangerle says.

Emotional intention and meaning, too, are hard to parse from lyrics alone, says Robin James, an independent popular music and philosophy scholar. James points out that slang terms such as “slay” might appear angry or violent in an automated assessment but actually convey a positive emotional meaning. Even complexity is challenging to quantify, she adds. Lyrics that seem like simple gibberish at first can actually be clever wordplay, James points out, noting Missy Elliott’s backward lines in the 2002 hit song “Work It.”

And lyrics are just one small part of what makes up a song, says Kaleb Goldschmitt, an ethnomusicologist at Wellesley College and co-editor of the Journal of Popular Music Studies. Even if lyrics are getting simpler, musical factors such as texture and rhythm seem to be getting more complex, Goldschmitt says.

Still, shifts in lyrical structure—particularly a boom in choruses—are apparent enough that music scholars have already been noting the trend for a few years, James says. It isn’t clear why these repeated sections are taking up more time in songs. But based on his past research, Varnum hypothesizes that the sheer amount of new music being produced could partially explain the phenomenon. “When people are faced with lots and lots of choices, they tend to prefer things that are easier to process and more straightforward,” he says.

The way people interact with songs has also shifted, Varnum adds, with more and more listeners tuning in to music while multitasking. “It would make sense that if it’s in the background, you don’t necessarily want things that are going to be challenging or attention-grabbing,” he says.

Today’s prevalence of streaming, Zangerle suggests, may also play a role by changing the ways artists compose for their market. Instead of investing in a whole album based on a single familiar song, listeners can now easily—and freely—preview every song an artist makes. This could ramp up pressure on artists to ensure all their songs are as catchy as possible.

But catchiness isn’t inherently negative, and music fans shouldn’t overthink what they like to hear. “Whether [the trend is] good or bad—that’s something I try to stay out of when thinking about science,” Varnum says.

“Complex music isn’t necessarily better music,” Goldschmitt notes. “If that were the case, we’d all be listening to prog rock.”

New Study of Song Lyrics Reveals They’ve Become Simpler, More Repetitive and ‘Me’ Focused

Researchers analyzed song lyrics across genres from 1980 to 2020 and have determined they’ve become simpler, more repetitive, and self-obsessed.

European researchers analyzed the words of more than 12,000 English language songs across multiple genres including rap, country, pop, R&B, and rock from a period of 40 years starting in 1980. Senior study author Eva Zangerle says how the music business has changed over the last forty years is impacting artist creativity.

“What we have also been witnessing in the last 40 years is a drastic change in the music landscape—from how music is sold to how music is produced,” Zangerle adds. From 1980 onward, there are several upheavals in music listening trends as vinyl records gave way to CDs and eventually .mp3s, all the way to the algorithmically-driven streaming habits of today.

Researchers looked at emotions expressed in the lyrics, how many different words were used, how complicated those words are, and how often a word is repeated.

“Across all genres, lyrics had a tendency to become more simple and more repetitive,” Zangerle says of the study. The study also found a decrease in positive, uplifting lyrics with a rise in lyrics that feature sadness, anger, or disgust. Another notable trend? Songs tend to be ‘me’ focused.

Zangerle notes that rap had the highest number of repeated lines across the 40-year period, while also becoming angrier. She adds that the first ten to fifteen seconds of a song have become very important in whether it gets the boot in algorithmic listening.

The researchers also looked at the rates at which fans of certain genres looked up lyrics for their music. The study found that rock listeners were mostly interested in researching older songs’ lyrics, suggesting that rock fans are nostalgic for the heyday of the genre. Meanwhile, country fans are more likely to look up a newer song’s lyrics, driven by newer country artists’ music they’ve heard on the radio.

More than 582,759 songs were analyzed as part of this study, while researchers utilized the Genius website for its lyric analysis.

Kathleen Napier, Lior Shamir; Quantitative Sentiment Analysis of Lyrics in Popular Music. Journal of Popular Music Studies 4 December 2018; 30 (4): 161–176. doi: https://doi.org/10.1525/jpms.2018.300411

Popular music has been changing significantly over the years, revealing clear, audible differences when compared with songs written in other eras. A pop music composition is normally made of two parts—the tune and the lyrics. Here we use a digital humanities and data science approach to examine how lyrics changed between the 1950’s and the more recent years, and apply quantitative analysis to measure these changes. To identify possible differences, we analyzed the sentiments expressed in the songs of the Billboard Hot 100, which reflects the preferences of popular music listeners and fans in each year. Automatic sentiment analysis of 6,150 Billboard 100 songs covering all the years from 1951 through 2016 shows clear and statistically significant changes in sentiments expressed through the lyrics of popular music, generally towards a more negative tone. The results show that anger, disgust, fear, sadness, and conscientiousness have increased significantly, while joy, confidence, and openness expressed in pop song lyrics have declined.

Popular music has changed significantly since its early days in the 1950’s, as shown in clear musical differences between songs written then and those produced in later periods. 1 These differences are reflected in the numerous genres that developed since the early days of pop music, each with its own unique musical characteristics. A pop music fan can normally discriminate easily between the audio of a song recorded in the 1960’s and one made in the 2000’s.

In most cases, pop songs have lyrics in addition to a tune, and the lyrics provide another layer to the songs’ emotion, attitude, and narrative. 2 Although love and romance have always been the most dominant topic of popular music, lyrics and lyrical styles have changed significantly across the decades, reflecting social and political changes.

Several studies focused on the changes in the topics discussed in popular music lyrics, and their psychological, social, and economic implications. For instance, the expression of political ideas through pop music lyrics has changed over time, 3 and is also influenced by events outside of the world of music, 4 linking musical preferences to political orientation. 5 Popular music lyrics have also changed the way sexual content is expressed. 6 Another example is the explicit mentioning of drugs during the mid-1990s, and in general, drugs and other substances are mentioned more frequently in pop music during more recent years. 7 LGBTQ identity and gay rights have also appeared more frequently in pop music lyrics in tandem with social changes. 8 Flynn et al. looked for gender differences within cross-genre lyrics from the Billboard Top 20 songs between 2009 and 2013. Pettijohn and Sacco Jr assessed the content of Billboard Hot 100 number one songs between 1995 and 2003, and found that during difficult economic or social instances in history, there was a greater presence of comforting, romantic, and meaningful lyrics. Yoo et al. applied statistical analysis and text mining to identify changes in the patterns of words used in Korean pop music, and revealed, for example, that English words are more common in more recent Korean pop music lyrics. Ballard et al. identified differences between lyrics in four different genres: heavy metal, rap, country, and pop. Their study shows that lyrics of different genres have a different impact on the listener’s behavior. Namely, the findings show that rap and heavy metal lyrics are less likely to inspire prosocial behavior.

While little work has focused on large-scale sentiment analysis and text mining in pop music, several previous studies isolated specific emotions and measured them in each song to profile songs or listeners’ preference patterns. Yeh et al. classified popular songs using Thayer’s model of four emotions: angry, happy, relaxed, and depressed. 9 Their automatic classifier used the emotions of extracted choruses to categorize the emotion of the entire song, and achieved an average precision rate of 92% in the prediction of songs’ emotions. Batcho used survey participants to rate songs for anger, happiness, sadness, relevance, liking, and nostalgia. The study showed that participants who favored personal nostalgia also favored happy lyrics, while sad lyrics were more associated with historical nostalgia. Knobloch and Zillmann tested mood-management theory by having subjects in bad or good moods choose to listen to songs from the top 30 chart. These songs were pre-categorized according to their joyfulness and energy. The results showed that subjects in a bad mood were more likely than subjects in a good mood to listen to joyful, energetic songs for a greater length of time.

Here we use automatic sentiment analysis to measure the sentiments of the songs in the Billboard Hot 100 songs between 1951 and 2016. The different sentiments are quantified to measure the changing sentiments expressed through pop music lyrics across the years, and their correlation with the year is used to identify trends in pop music lyrics and music fan preferences.

Automatic sentiment analysis, a subset of text mining, is the examination of sentiments (opinions, emotions, attitudes, and feelings) found within text. Computational methods are implemented to locate, isolate, and categorize these sentiments. 10 Sentiment analysis is often paired with social media mining, as a means of discerning the sentiments found in text extracted from on-line product reviews, discussion forums, blogs, and other social networking services. 11 The implementation of sentiment analysis is a beneficial method of comprehending the opinion of a certain person or a certain population, and is often used in the context of a specific subject, such as politics, current news, entertainment, etc. The insights it provides from both product reviews and customer service reviews are vital information for businesses, enabling them to discover customer needs, satisfaction, and concerns, as well as helping them to provide quality products and services and achieve the intended brand image.

Sentiment analysis can be done at three levels: the document level, the sentence level, or the aspect level. Document-level analysis determines the average sentiment for an entire document. Sentence-level sentiment analysis views each sentence as a separate document, hence determining the individual average sentiment for each sentence or line of a given document. Aspect/feature-level analysis locates specific features of an entity (the main subject of the document), and uses the writer’s opinion of those features to classify the sentiment regarding the entity in question. An example of an entity-feature pair would be: “I had a nice stay at the Cupertino Hotel. The daily breakfast was delicious.” In this document, the entity is Cupertino Hotel and the feature is breakfast. 12 When studying the sentiment found in text, it is important to note that sentiment can be explicit or implicit. An explicit sentiment would be: “This is my new favorite restaurant.” An implicit sentiment, by contrast, would be: “The restaurant’s apple pie had the right balance between sweetness and tartness.” 13
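To make the difference between document-level and sentence-level analysis concrete, the following is a minimal sketch that scores the hotel example with a tiny polarity lexicon. The lexicon, its values, and the function names are hypothetical and chosen only for illustration; they are not taken from any of the tools or studies cited here, and real systems use far larger dictionaries or trained models.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); illustrative values only.
TOY_LEXICON = {
    "nice": 0.6,
    "delicious": 0.9,
    "favorite": 0.8,
    "noisy": -0.5,
    "dirty": -0.7,
}


def score_text(text: str) -> float:
    """Average polarity of the lexicon words found in the text (0.0 if none)."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def document_level(document: str) -> float:
    """One average score for the whole document."""
    return score_text(document)


def sentence_level(document: str) -> list[tuple[str, float]]:
    """Treat every sentence as its own small document and score it separately."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [(s, score_text(s)) for s in sentences]


if __name__ == "__main__":
    review = "I had a nice stay at the Cupertino Hotel. The daily breakfast was delicious."
    print(document_level(review))
    for sentence, score in sentence_level(review):
        print(f"{score:+.2f}  {sentence}")
```

Aspect-level analysis would additionally need to link each opinion word to the feature it describes (here, "delicious" to "breakfast"), which this sketch does not attempt.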

Classifying the sentiment of a document or sentence can be separated into several common approaches. Text can be classified according to its polarity/valence (positive, negative, or neutral). 14 Cho et al. used multiple sentiment dictionaries to classify Amazon.com reviews on books, smartphones, and movies as either positive or negative. 15 Taking polarity one step further, text can be classified according to the strength of the sentiment (very good, very bad, etc). 16 Villarroel Ordenes et al. used the pre-existing five-star scale ratings of customer reviews from BN.com, Amazon.com, and TripAdvisor.com to classify the strength of both explicit and implicit sentiments. An alternative approach to polarity classification is appraisal theory, where emotions brought by situations found within the text are used to classify the sentiment of the text. 17 The number of emotion categories varies, depending on the data and purpose of the analysis. Twelve emotion classes were suggested by Storm and Storm: anxiety, fear, anger, hostility, disgust, shame, sadness, contentment, liking, happiness, pride, and love. Shaver et al. suggested the use of just five emotions: anger, sadness, fear, happiness, and affection. 18  

Data: The Billboard Hot 100

Pop music is a very prevalent form of art, and a large number of artists record and release a very large number of songs each year. Songs can become popular and have impact through the traditional channels of communication such as radio stations, but also through peer-to-peer communication and specific groups of interest. In the post-information era songs can also be communicated through social media or the web. Measuring all of these songs, however, can lead to biased results because the vast majority of these tunes do not become popular among pop music listeners, and therefore do not have a significant impact or indication on the preferences of pop music consumers. Therefore, the history of pop music can be profiled by the most popular songs in each year that had the highest impact and were consumed by the highest number of pop music listeners.

The annual Billboard Hot 100 is a common tool to characterize the most popular songs in a certain year, and has often been used in pop music studies to identify trends and typical preferences of pop music artists and fans. 19

Since 1958, Billboard, an entertainment media magazine, has been a prevalent source in relaying the most popular music hits with its introduction of the Billboard Hot 100 chart. This chart dictates the 100 most favored songs for a given year through the combined use of three measurements. Originally, these measurements were the following: 20  

The number of sales per single.

The number of times the single was played on a jukebox.

The number of times the single was played on the radio. (The Top 40 airtime chart was used as a reference.)

The grouping of these measurements was intended to capture the initial popularity of a single and test the longevity of its popularity over time. Over the years, the music industry has evolved, making the original Billboard Hot 100 measurements irrelevant. Jukebox sales have declined sharply and practically disappeared, and with the introduction of the web, Billboard magazine has made additions to its core measurements to properly reflect changes in technology. In addition to in-store and concert sales, Billboard magazine now tracks both physical and digital sales made on-line. 21 In 2011, streaming ratings were introduced as a measurement for the Billboard Hot 100. These include streaming services such as Spotify, Rhapsody, and Muve Music. In 2013, video plays were incorporated from YouTube and Vevo. Billboard also takes into account information collected from social media. 21

Before the Billboard Hot 100, Billboard generated three separate charts, each based on one of the original measurements. Of these charts, the ranking used by this analysis for songs earlier than 1958 is the Best Sellers in Stores (Best Sellers) chart as a reflection of music fan song preferences.

Certain Billboard songs were sold as part of a pair. The second song was either a B-side, an A-side, or a double A-side single that was sold together with one of the charting songs. The Billboard Hot 100 has changed its rules over time regarding two-song charting, but has remained consistent in treating the first song listed in the pair as having the stronger performance. Since it is not clear whether the second song was as popular as the first or simply tolerated because of the popularity of the first, this analysis did not include the lyrics of any second songs from two-song chartings.

The primary sources for the lyrics used in the analysis were www.azlyrics.com and www.oldielyrics.com, and the lyrics were collected and downloaded automatically. For songs not present in AZLyrics or OldieLyrics, secondary sources included genius.com, www.lyricsfreak.com, www.metrolyrics.com, www.lyricsmode.com, www.lyricsondemand.com, lyrics.com, www.songlyrics.com, and www.flashlyrics.com. Before the lyrics were saved, any labeling brackets were removed. Examples of these include: [Chorus], [Verse], [x2], [instrumental], etc. This minimizes the risk of non-lyrical, but descriptive, text distorting the tone analysis results.
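The bracket-removal step can be sketched with a regular expression. This is an illustration under stated assumptions rather than the authors' script: the function name is hypothetical, the pattern assumes labels never nest, and dropping the lines left empty after removal is a choice made here that the paper does not describe.

```python
import re

# Matches one bracketed label such as [Chorus], [Verse], [x2], or [instrumental].
# Assumption: labels are not nested, so "anything except ]" is enough.
BRACKET_LABEL = re.compile(r"\[[^\]]*\]")


def strip_labels(lyrics: str) -> str:
    """Remove bracketed section labels and any lines left empty by the removal."""
    without_labels = BRACKET_LABEL.sub("", lyrics)
    lines = [line.strip() for line in without_labels.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    raw = "[Verse]\nYoung man, there's no need to feel down [x2]\n[instrumental]"
    print(strip_labels(raw))
```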

Sentiment analysis: Tone Analyzer

The tool used for the sentiment analysis is Tone Analyzer, which is a mature computational linguistic tool provided as part of the IBM Watson developer cloud. 22 The IBM Watson Tone Analyzer Application Programming Interface (API) is a tone analysis service that processes text input for tone and outputs a JavaScript Object Notation (JSON) file containing document level and/or sentence-level scores of 13 types of tones. Scores range from 0 (no presence of given tone) to 1.0, which means 100% presence of tone. A score of at least 0.5 indicates the presence of a given tone, whereas any score greater than 0.75 indicates a strong presence of the given tone. Tone Analyzer separates these 13 tones into three categories:

Emotional Tone: anger, disgust, fear, joy, sadness.

Language Tone: analytical, confident, tentative.

Social Tone: openness, conscientiousness, extraversion, agreeableness, emotional range.

Tone analyzer uses a combination of psycholinguistics and machine learning to determine the type of tone found in text. 23 Psycholinguistics is the assessment of the association between how our minds work and how we learn, use, and comprehend language. 24 Tone Analyzer links words with different tones associated with them. 25 These words are balanced by the observation that negative emotions are normally expressed in a more intensive fashion. 26 The combination of different words and tones are handled by Tone Analyzer by using a Support Vector Machine (SVM), with a one-vs-rest method that expands SVM to more than two classes. The choice of words provides substantial information used by the computer to determine the tone and sentiments expressed in the text, and the personality of the writer. 27  

For example, the first sentence of the song “Total eclipse of the heart” (Bonnie Tyler) is “Turnaround, every now and then I get a little bit lonely and you’re never coming round”. The most dominant sentiment in that sentence according to Tone Analyzer is sadness, with a value of 0.786. Sadness is also a dominant tone for the entire song, with a score of 0.52. Joy, on the other hand, is scored low for that song, with a score of 0.09. For the entire song, fear has a value of 0.53, conscientiousness 0.08, extraversion 0.02, and openness 0.48.

The first line of the Village People’s “Y.M.C.A.” is “Young man, there’s no need to feel down”, and the dominant tone in that line as analyzed by Tone Analyzer is tentative, with a score of 0.61. The extraversion score for the entire song is 0.55, and the joy score is 0.65. Anger, disgust, and fear are scored much lower, with 0.11, 0.07, and 0.09, respectively.

The first lines of Queen’s “We will rock you” are “Buddy you’re a boy make a big noise, playin’ in the street gonna be a big man some day”, analyzed by Tone Analyzer as agreeableness (0.64), extraversion (0.85), and fear (0.39). Tones with lower scores are disgust and sadness, each with a score of ~0.07.

The Bee Gees song “Too much heaven” opens with “Nobody gets too much heaven no more, it’s much harder to come by, I’m waiting in line”. For that part of the song, sadness is the most dominant tone as deduced by Tone Analyzer, with a score of 0.64. For the entire song, Tone Analyzer computed low scores for anger (0.01), disgust (0.003), and fear (0.01), while joy (0.62), agreeableness (0.95), and extraversion (0.77) are scored high.

A total of 6,150 Billboard Hot 100 songs were collected, representing 66 years of popular music, from 1951 through 2016. Of these 6,150 songs, 65 songs were instrumental-only pieces, and due to the absence of lyrics these songs were omitted from the tone analysis in order to avoid skewing the data with zero values. The analysis therefore represents only the 6,085 songs that had lyrics. Figure 1 shows the number of songs used in each year, excluding instrumental pieces that do not have lyrics.

FIGURE 1. Number of songs in each year used in the analysis. Instrumental songs with no lyrics are excluded from the analysis.

The songs were grouped together by year, and their tone scores were averaged for each year. For example, the average anger score from songs in 1951 was 0.0751. Averaging the data produced 66 data points (the total number of years) with 13 unique averaged tones. The standard deviation and standard error were calculated for each averaged tone of each year. Considering the calculations from all 13 tones, the range in standard error was 0.61% to 7.11%. Ninety percent of the standard error values were less than 3.51%, and just 10% of the standard error ranged from 3.51% to 7.11%. Focusing on the standard error (SE) of each individual tone category, 90% of emotional tone SE was less than 2.84%, 90% of language tone SE was less than 3.74%, and 90% of social tone SE was less than 3.64%.
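
The per-year averaging described above can be sketched with the Python standard library. The function name `yearly_stats` and the toy records are assumptions made for illustration; the data shown is invented and is not taken from the study.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def yearly_stats(records):
    """Group (year, score) pairs by year and return {year: (mean, sd, se)}."""
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)

    stats = {}
    for year, scores in sorted(by_year.items()):
        m = mean(scores)
        # The sample standard deviation needs at least two songs in a year.
        sd = stdev(scores) if len(scores) > 1 else 0.0
        se = sd / sqrt(len(scores))  # standard error of the mean
        stats[year] = (m, sd, se)
    return stats


if __name__ == "__main__":
    # Invented anger scores for three hypothetical songs.
    toy = [(1951, 0.1), (1951, 0.3), (1952, 0.5)]
    for year, (m, sd, se) in yearly_stats(toy).items():
        print(year, round(m, 4), round(sd, 4), round(se, 4))
```

Running the same function once per tone would give the 13 averaged series, one value per year.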

Pearson Correlation and Linear Regression

From the averaged tone scores, two tests were conducted to identify the presence of a linear correlation between the year and a specific tone in the lyrics of that year. The first test was the Pearson correlation coefficient, computed using Equation 1, where x is the year, x̄ is the sample mean of the years, y is the tone score, and ȳ is the sample mean of the tone scores.

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{1} $$

The sign of the resulting coefficient indicates the polarity of the relationship, and its magnitude indicates the strength. The second test is a simple linear regression. It examines the relationship between two continuous variables, where the value of one variable (x) is used to predict the value of the second variable (y). A linear regression uses the equation y = mx + b to test the linearity of the dependence between x and y. The regression line can be used in trend analysis as the best-fit line of the relationship between x and y. In this study the regression line shows whether sentiments changed consistently over the observed period of time. To generate this equation, linear regression uses the Least Squares Method as shown in Equation 2, given here in its standard form.

$$ m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad b = \bar{y} - m\,\bar{x} \tag{2} $$

The Least Squares Method minimizes the sum of the squared vertical differences between each actual data point and the best-fit line. If the resulting slope is significantly different from zero, it indicates the existence of a linear dependency between the year (x) and the tone (y) of its songs. The sign of the slope indicates whether the trend is positive or negative.
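
The two tests described above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the authors' code; the function names `pearson_r` and `least_squares_line` and the toy data are assumptions.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (sxy, sxx, syy, x_bar, y_bar) for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy, sxx, syy, x_bar, y_bar


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs (years) and ys (scores)."""
    sxy, sxx, syy, _, _ = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)  # undefined if either series is constant


def least_squares_line(xs, ys):
    """Slope m and intercept b of the least-squares line y = m*x + b."""
    sxy, sxx, _, x_bar, y_bar = _centered_sums(xs, ys)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


if __name__ == "__main__":
    # Invented yearly averages that rise by exactly 0.1 per year.
    years = [1951, 1952, 1953, 1954]
    scores = [0.1, 0.2, 0.3, 0.4]
    print(pearson_r(years, scores))
    print(least_squares_line(years, scores))
```

A positive slope together with a coefficient close to +1 corresponds to a tone that increases steadily over the years; a negative slope corresponds to a decline.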

Based on the results of the Pearson correlation coefficients, anger, disgust, fear, and conscientiousness are sentiments that have strong positive correlations with the year. Their respective coefficients were 0.8897, 0.7817, 0.7790, and 0.7629, and the correlations are all statistically significant (P < 0.0001 in all cases). Figures 2, 3, 4, and 6 show the average measured anger, disgust, fear, and conscientiousness, respectively.

FIGURE 2. Average tone of anger measured for every year using Billboard Hot 100 song lyrics from 1951-2016. The error bars are the standard error of the mean.

FIGURE 3. Average tone of disgust measured for every year using Billboard Hot 100 song lyrics from 1951-2016.

FIGURE 4. Average tone of fear measured for every year using Billboard Hot 100 song lyrics from 1951-2016.

FIGURE 5. Average tone of joy in Billboard song lyrics from 1951-2016.

FIGURE 6. Average tone of conscientiousness measured for every year using Billboard Hot 100 song lyrics from 1951-2016.

The Pearson correlation shows that the emotional tones of anger, disgust, and fear, as well as the social tone of conscientiousness, have become more prevalent in the Billboard Hot 100 and Best Sellers lyrics. Anger started to increase more rapidly during the second half of the 1980s, and continued to increase through 2016. Disgust was lower in the early 1950s and early 1980s, but increased from the 1980s through 2016. Fear remained relatively steady until the early 1980s, when it started to increase, and it continued to increase through 2016.

Joy has a negative correlation with time, with a Pearson correlation coefficient of -0.7293 (P < 0.0001). That demonstrates that the Billboard Hot 100 and Best Sellers lyrics have become less joyful over the years. Figure 5 shows the change in joy expressed in pop song lyrics. The graph shows that joy was generally higher during the 1950s through the 1970s.

Unlike these five sentiments, not all sentiments that were measured showed a significant long-term trend over the years. Figures 7 and 8 show the change in measured extroversion and agreeableness, respectively. The graphs show that these sentiments have not changed significantly. The Pearson correlation of extroversion with the year is ~0.0687, and is not statistically significant (P ≈ 0.584). The Pearson correlation of agreeableness with the year is ~-0.126, and is also not statistically significant (P ≈ 0.313). The graph also shows that extroversion decreased during the 1980s.
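
One common way to test whether a Pearson coefficient differs from zero is a t-test with n − 2 degrees of freedom. The source does not state which test the authors used, so the following is an assumption. The sketch computes the t statistic for the two coefficients above with n = 66 yearly data points; the function name `t_statistic` is hypothetical. Converting t to a P value requires the CDF of the t-distribution, which is omitted here to keep the example free of third-party dependencies.

```python
from math import sqrt


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson coefficient r from n points against zero."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


N_YEARS = 66  # one averaged data point per year, 1951 through 2016

t_extroversion = t_statistic(0.0687, N_YEARS)
t_agreeableness = t_statistic(-0.126, N_YEARS)

if __name__ == "__main__":
    print(f"extroversion:  t = {t_extroversion:.3f}")
    print(f"agreeableness: t = {t_agreeableness:.3f}")
```

Both statistics are well below about 2.0 in absolute value, the approximate two-sided 5% threshold for 64 degrees of freedom, which is consistent with the non-significant results reported above.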

Looking at the simple linear regression trend line, the emotional tone of anger had the strongest linear relationship out of all the tones analyzed, with an R² value of 0.7915.

FIGURE 7. Average tone of extroversion in Billboard Popular Lyrics from 1951-2016.

FIGURE 8. Average tone of agreeableness in Billboard Popular Lyrics from 1951-2016.

The tones of disgust, fear, joy, and conscientiousness were fitted only moderately well by a linear regression line. Their R² values ranged between 0.5319 and 0.6111. Figures 2-6 show visualizations of anger, disgust, fear, joy, and conscientiousness, respectively, within Billboard Hot 100 and Best Sellers lyrics over the decades. All graphs show the mean of the sentiment scores of all songs in each year, as computed by Tone Analyzer.
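
For a simple linear regression with a single predictor, R² equals the square of the Pearson coefficient. The short check below squares the coefficients reported earlier in the text. The R² values for fear and conscientiousness are not stated in the source; they are derived here under that identity.

```python
# Pearson coefficients as reported in the text.
pearson = {
    "anger": 0.8897,
    "disgust": 0.7817,
    "fear": 0.7790,
    "joy": -0.7293,
    "conscientiousness": 0.7629,
}

# R^2 of a one-predictor linear regression is the squared correlation.
r_squared = {tone: round(r * r, 4) for tone, r in pearson.items()}

if __name__ == "__main__":
    for tone, value in r_squared.items():
        print(f"{tone}: {value:.4f}")
```

The squared values for joy and disgust match the reported range endpoints of 0.5319 and 0.6111. For anger the squared coefficient rounds to 0.7916 rather than the reported 0.7915, a difference that is consistent with the coefficient itself having been rounded to four decimals.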

In addition to the seven sentiments mentioned above, the other tones measured were sadness, analytical, confidence, tentativeness, and openness. Sadness and tentativeness had positive Pearson correlations of ~0.655 and ~0.58, respectively (P < 0.0001), showing that popular music lyrics became sadder and more tentative over time.

Figure 9 shows how sadness changed over time. The graph shows that sadness started to increase during the late 1980s, and was higher during the first decade of the 21st century than in any other decade.

FIGURE 9. Average Tone of sadness in Billboard Popular Lyrics from 1951-2016.

On the other hand, the analytical tone, confidence, and openness have significant negative Pearson correlations of ~−0.661, ~−0.612, and ~−0.523, respectively (P < 0.0001), showing that the expression of these sentiments in the Billboard Hot 100 pop music lyrics decreased over time.

We also analyzed the tones expressed in the lyrics of different genres. Three common musical genres were tested: country, rap, and pop. The analysis was based on 148 country songs, 170 rap songs, and 281 pop songs. All songs were taken from the Billboard Hot 100 dataset used in the previous experiments. Figure 10 shows the mean and standard error of each of the 13 tones.

FIGURE 10. Mean and standard error of the 13 different tones analyzed by Tone Analyzer for three different musical genres.

As the analysis shows, the tone in rap song lyrics is substantially different from the tone expressed in the lyrics of pop and country songs. Namely, anger and disgust are much more common in rap songs than in pop and country, while joy is expressed less often in rap lyrics than in pop and country. These observations are aligned with the work of Ballard et al., which showed that rap music lyrics are less likely to inspire prosocial behavior. The graph also shows that rap lyrics tend to be less analytical and less confident, while being more extroverted, than pop and country lyrics.

Analytical tools and automatic text mining allow quantitative analysis of large collections of text, providing a new perspective on the analysis of art and human creations. These tools have become more common through the emerging field of the digital humanities, a relatively new academic discipline that combines computer and information technologies with fields within the humanities such as literature, poetry, music, and visual art. That combination makes it possible to archive, catalog, distribute, and process amounts of information that could not be handled in the pre-information era. The digitization of the data also makes it possible to analyze and profile human creations quantitatively. Here we performed automatic analysis of the tone of the Billboard Hot 100 songs from 1951 through 2016 to identify trends in the sentiments expressed through pop music lyrics.

This analysis shows that the tone in popular music lyrics has shifted significantly over the years. Anger, disgust, fear, sadness, tentativeness, and conscientiousness have increased over time, while joy, analytical tone, confidence, and openness have declined. Extroversion and agreeableness did not show a clear long-term trend, although extroversion showed a decline during the 1980s. In general, the results show a clear trend toward a more negative tone in pop music lyrics, with a more significant change around the early 1990s. That trend can also be explained by changes in social values, reflected through changes in mainstream popular music. The Billboard Hot 100 was used so that the analysis is based on the songs that were the most popular in each year, as a reflection of music fan preferences during that time.

The study is limited by the tones that can be measured by Tone Analyzer. Clearly, lyrics can be subjective, and can be interpreted differently by different listeners, making it more difficult to make a precise measurement of the tone. The study also depends on the ability of Tone Analyzer to capture the sentiments, and the ability of the Billboard Hot 100 chart to reflect the true preferences of music fans in different years, as the measuring scheme of Billboard Hot 100 changed over time to adjust to the technological and social changes of popular music consumption. However, such quantitative approaches to popular music studies provide new insights from the historical popular music data, and can provide new information that can be difficult to quantify and profile without using text analysis algorithms. It is expected that with the continuous growth of the digital humanities, the application of text mining methods to popular music will become more prevalent in popular music studies, and will enable new discoveries and observations into the history of popular music.

Brazilian Technology Symposium

BTSym 2020: Proceedings of the 6th Brazilian Technology Symposium (BTSym’20) pp 342–350

Sentiment Analysis of Song Lyrics Using Clustering

  • Miguel Vásquez-Leon   ORCID: orcid.org/0000-0002-6314-3901 8 &
  • Willy Ugarte   ORCID: orcid.org/0000-0002-7510-618X 8  
  • Conference paper
  • First Online: 15 June 2021

1375 Accesses

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 233)

Music often helps people to relax and have fun. However, the main search engines do not offer a personalized way to search for music based on the feelings of the user. The closest alternative is the playlists created arbitrarily by other users. For this reason, we propose a method that allows us to label artists as representing different feelings according to their lyrics.

  • Machine learning

Author information

Authors and affiliations.

Universidad Peruana de Ciencias Aplicadas (UPC), Prolongación Primavera 2390, Lima, 15023, Peru

Miguel Vásquez-Leon & Willy Ugarte

Corresponding author

Correspondence to Willy Ugarte .

Editor information

Editors and affiliations.

Faculty of Electrical and Computer Engineering, UNICAMP, Campinas, São Paulo, Brazil

Divisão de Engenharia Eletrônica, Instituto Tecnológico de Aeronáutica, São José dos Campos, São Paulo, Brazil

Osamu Saotome

Univers. Peruana de Ciencias Aplicadas, Lima, Peru

Guillermo Kemper

PUC Campinas, Pontifícia Universidade Católica de Ca, Campinas, Brazil

Ana Claudia Mendes de Seixas

Universidade Estadual de Campinas, Campinas/SP, Brazil

Gabriel Gomes de Oliveira

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper.

Vásquez-Leon, M., Ugarte, W. (2021). Sentiment Analysis of Song Lyrics Using Clustering. In: Iano, Y., Saotome, O., Kemper, G., Mendes de Seixas, A.C., Gomes de Oliveira, G. (eds) Proceedings of the 6th Brazilian Technology Symposium (BTSym’20). BTSym 2020. Smart Innovation, Systems and Technologies, vol 233. Springer, Cham. https://doi.org/10.1007/978-3-030-75680-2_38

DOI : https://doi.org/10.1007/978-3-030-75680-2_38

Published : 15 June 2021

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-75679-6

Online ISBN : 978-3-030-75680-2

eBook Packages : Engineering Engineering (R0)

March 31, 2024

Song lyrics are getting simpler, more repetitive: Study

You're not just getting older. Song lyrics really are becoming simpler and more repetitive, according to a study published on Thursday.

Lyrics have also become angrier and more self-obsessed over the last 40 years, the study found, reinforcing the opinions of cranky aging music fans everywhere.

A team of European researchers analyzed the words in more than 12,000 English-language songs across the genres of rap, country, pop, R&B and rock from 1980 to 2020.

Before detailing how lyrics have become more basic, the study pointed out that US singer-songwriting legend Bob Dylan—who rose to fame in the 1960s—has won a Nobel Prize in literature.

Senior study author Eva Zangerle, an expert on recommendation systems at Austria's University of Innsbruck, declined to single out an individual newer artist for having simple lyrics.

But she emphasized that lyrics can be a "mirror of society" which reflect how a culture's values, emotions and preoccupations change over time.

"What we have also been witnessing in the last 40 years is a drastic change in the music landscape—from how music is sold to how music is produced," Zangerle told AFP.

Over the 40 years studied, there was repeated upheaval in how people listened to music. The vinyl records and cassette tapes of the 1980s gave way to the CDs of the 90s, then the arrival of the internet led to the algorithm-driven streaming platforms of today.

For the study in the journal Scientific Reports , the researchers looked at the emotions expressed in lyrics, how many different and complicated words were used, and how often they were repeated.

"Across all genres, lyrics had a tendency to become more simple and more repetitive," Zangerle summarized.

The results also confirmed previous research which had shown a decrease in positive, joyful lyrics over time and a rise in those that express anger, disgust or sadness.

Lyrics have also become much more self-obsessed, with words such as "me" or "mine" becoming much more popular.

'Easier to memorize'

The number of repeated lines rose most in rap over the decades, Zangerle said—adding that it obviously had the most lines to begin with.

"Rap music has become more angry than the other genres," she added.

The researchers also investigated which songs the fans of different genres looked up on the lyric website Genius.

Unlike other genres, rock fans most often looked up lyrics from older songs, rather than new ones.

Rock has tumbled down the charts in recent decades, and this could suggest fans are increasingly looking back to the genre's heyday, rather than its present.

Another way that music has changed is that "the first 10-15 seconds are highly decisive for whether we skip the song or not," Zangerle said.

Previous research has also suggested that people tend to listen to music more in the background these days, she added.

Put simply, songs with more choruses that repeat basic lyrics appear to be more popular.

"Lyrics should stick easier nowadays, simply because they are easier to memorize," Zangerle said.

"This is also something that I experience when I listen to the radio."

Journal information: Scientific Reports

Song lyrics getting simpler, more repetitive, angry and self-obsessed – study

Researchers analysed the words in more than 12,000 English-language songs across several genres from 1980 to 2020

You’re not just getting older. Song lyrics really are becoming simpler and more repetitive, according to a study published on Thursday.

Lyrics have also become angrier and more self-obsessed over the last 40 years, the study found, reinforcing the opinions of cranky ageing music fans everywhere.

A team of European researchers analysed the words in more than 12,000 English-language songs across the genres of rap, country, pop, R&B and rock from 1980 to 2020.

Before detailing how lyrics have become more basic, the study pointed out that US singer-songwriting legend Bob Dylan – who rose to fame in the 1960s – has won a Nobel prize in literature.

Senior study author Eva Zangerle, an expert on recommendation systems at Austria’s University of Innsbruck, declined to single out an individual newer artist for having simple lyrics.

But she emphasised that lyrics can be a “mirror of society” which reflect how a culture’s values, emotions and preoccupations change over time.

“What we have also been witnessing in the last 40 years is a drastic change in the music landscape – from how music is sold to how music is produced,” Zangerle said.

Over the 40 years studied, there was repeated upheaval in how people listened to music. The vinyl records and cassette tapes of the 1980s gave way to the CDs of the 90s, then the arrival of the internet led to the algorithm-driven streaming platforms of today.

For the study in the journal Scientific Reports, the researchers looked at the emotions expressed in lyrics, how many different and complicated words were used, and how often they were repeated.

“Across all genres, lyrics had a tendency to become more simple and more repetitive,” Zangerle summarised.

The results also confirmed previous research which had shown a decrease in positive, joyful lyrics over time and a rise in those that express anger, disgust or sadness.

Lyrics have also become much more self-obsessed, with words such as “me” or “mine” becoming much more popular.

The number of repeated lines rose most in rap over the decades, Zangerle said – adding that it obviously had the most lines to begin with.

“Rap music has become more angry than the other genres,” she added.

The researchers also investigated which songs the fans of different genres looked up on the lyric website Genius.

Unlike other genres, rock fans most often looked up lyrics from older songs, rather than new ones.

Rock has tumbled down the charts in recent decades, and this could suggest fans are increasingly looking back to the genre’s heyday, rather than its present.

Another way that music has changed is that “the first 10-15 seconds are highly decisive for whether we skip the song or not,” Zangerle said.

Previous research has also suggested that people tend to listen to music more in the background these days, she added.

Put simply, songs with more choruses that repeat basic lyrics appear to be more popular.

“Lyrics should stick easier nowadays, simply because they are easier to memorise,” Zangerle said.

“This is also something that I experience when I listen to the radio.”

Song lyrics are getting simpler, more repetitive: study

You're not just getting older. Song lyrics really are becoming simpler and more repetitive, according to a study published on Thursday.

Lyrics have also become angrier and more self-obsessed over the last 40 years, the study found, reinforcing the opinions of cranky ageing music fans everywhere.

A team of European researchers analysed the words in more than 12,000 English-language songs across the genres of rap, country, pop, R&B and rock from 1980 to 2020.

Before detailing how lyrics have become more basic, the study pointed out that US singer-songwriting legend Bob Dylan -- who rose to fame in the 1960s -- has won a Nobel Prize in literature.

Senior study author Eva Zangerle, an expert on recommendation systems at Austria's University of Innsbruck, declined to single out an individual newer artist for having simple lyrics.

But she emphasised that lyrics can be a "mirror of society" which reflect how a culture's values, emotions and preoccupations change over time.

"What we have also been witnessing in the last 40 years is a drastic change in the music landscape -- from how music is sold to how music is produced," Zangerle told AFP.

Over the 40 years studied, there was repeated upheaval in how people listened to music. The vinyl records and cassette tapes of the 1980s gave way to the CDs of the 90s, then the arrival of the internet led to the algorithm-driven streaming platforms of today. 

For the study in the journal Scientific Reports, the researchers looked at the emotions expressed in lyrics, how many different and complicated words were used, and how often they were repeated. 

"Across all genres, lyrics had a tendency to become more simple and more repetitive," Zangerle summarised.

The results also confirmed previous research which had shown a decrease in positive, joyful lyrics over time and a rise in those that express anger, disgust or sadness.

Lyrics have also become much more self-obsessed, with words such as "me" or "mine" becoming much more popular.

- 'Easier to memorise' -

The number of repeated lines rose most in rap over the decades, Zangerle said -- adding that it obviously had the most lines to begin with.

"Rap music has become more angry than the other genres," she added.  

The researchers also investigated which songs the fans of different genres looked up on the lyric website Genius. 

Unlike other genres, rock fans most often looked up lyrics from older songs, rather than new ones.

Rock has tumbled down the charts in recent decades, and this could suggest fans are increasingly looking back to the genre's heyday, rather than its present.

Another way that music has changed is that "the first 10-15 seconds are highly decisive for whether we skip the song or not," Zangerle said.

Previous research has also suggested that people tend to listen to music more in the background these days, she added.

Put simply, songs with more choruses that repeat basic lyrics appear to be more popular.

"Lyrics should stick easier nowadays, simply because they are easier to memorise," Zangerle said.

"This is also something that I experience when I listen to the radio."

Idiot wind: In contrast to Bob Dylan's literature Nobel, lyrics have become simpler and more repetitive over the last 40 years, a study says

A business journal from the Wharton School of the University of Pennsylvania

Knowledge at Wharton Podcast

Words Matter: How Lyrics Help Songs Top the Charts

June 1, 2018 • 8 min listen

Wharton research uses natural language processing to study why some songs catch on while others fail to climb the charts.

Wharton's Jonah Berger discusses his research on predicting song popularity.

Summer is still a few weeks away, but people are already talking about what will be 2018’s “song of summer.” New research from Wharton marketing professor Jonah Berger might help out the oddsmakers. Along with co-author Grant Packard, a marketing professor at Wilfrid Laurier University, Berger used natural language processing to study hundreds of songs and their lyrics to see what made some catch on while others failed to climb the charts. Berger recently spoke to Knowledge at Wharton about his findings, which are outlined in the paper, “Are Atypical Songs More Popular?”

An edited transcript of the conversation follows.

Knowledge at Wharton: A lot of your past research focuses on virality or why certain things catch on while others don’t. What inspired you to focus on songs this time?

Jonah Berger: We hear songs all the time. We’re in our car on the way to work, or we’re at home listening to YouTube. We like some songs and don’t like others. Some climb the Billboard charts and some fail. One question I wondered was why. Obviously, songs are tough to study. There are a lot of different factors that shape song success, from who sings the songs to what sort of melodies they have. We thought it would be interesting to look at a slightly more unexplored place, and that is the lyrics. Just by looking at the lyrics alone, could we pick up some traction on why songs succeed and fail?

Knowledge at Wharton: You examined songs within their particular genres — country songs with other country songs, R&B songs with other R&B songs. Why did you decide on this format?

Berger: We had a simple, and I think interesting, hypothesis, which is that the success of songs doesn’t just depend on their lyrics in general, it depends on how similar that song is to other songs that have been popular recently. Take a country song, for example. A country song could be very like most country songs that are out there already. The lyrics could be very similar to what people usually sing about in country music, or the lyrics could be more different, more novel, more new for a country song. We wondered whether songs that sound more new, because their lyrics are different than most songs in the space, could be more popular with listeners.

Knowledge at Wharton: You found some interesting patterns among these song lyrics. For example, if somebody thinks that all country songs are about girls and trucks, they are not too far off, based on this research.

Berger: We used natural language processing, which is a way to analyze text using software. We did something called LDA (latent Dirichlet allocation), which helps us figure out the underlying themes or topics in the songs. It can take a whole bunch of different songs with all of their different lyrics and find general themes. We found that there were 10 themes across all songs that tended to pop. Some songs talk about love, others talk about girls and cars, some talk about dancing. We asked, what part of each song is in each of these themes? And we found, like you mentioned, that country songs don’t only sing about girls and cars, but they sing a good bit about that theme.
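
For readers curious what LDA does mechanically, the sketch below is a toy collapsed Gibbs sampler written with only the Python standard library. It is an illustrative assumption, not the software Berger and Packard used: the function name `lda_topic_mix`, the hyperparameters `alpha` and `beta`, and the tiny example songs are all made up for this example. It returns, for each song, the share of its words assigned to each topic, which is the "what part of each song is in each of these themes" idea described above.

```python
import random
from collections import Counter


def lda_topic_mix(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns one topic mix per document."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # words per topic in each doc
    topic_word = [Counter() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                        # total words per topic
    assignments = []

    # Start from a random topic assignment for every word.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    # Repeatedly resample each word's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed share of each topic in each document (each row sums to 1).
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


songs = [["love", "heart", "love"], ["truck", "road", "truck"], ["love", "road"]]
for mix in lda_topic_mix(songs, num_topics=2, iterations=50, seed=1):
    print([round(p, 2) for p in mix])
```

A real analysis would run an established topic-modeling library over thousands of full lyrics; the point here is only that each song ends up described as a mixture of themes rather than a single label.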

Knowledge at Wharton: What were your key findings about how lyrical differentiation impacted song popularity?

Berger: We found that even though many things affect song popularity, like who sings it and the melody, we could understand what songs become successful just by looking at the lyrics alone and just by looking at how different a song is from its genre. We found that, on average, the more different a song was from its genre, the more atypical a country song was for country music, the more successful it was on the Billboard chart, the higher ranked it was. Even controlling for things like who sang the song, when it was released, etc., the mere fact that it was different from most other songs in its genre was connected to it being more successful.
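
One way to picture "how different a song is from its genre" is as a distance between the song's topic mix and the average topic mix of its genre. The sketch below uses that idea with total variation distance. The distance choice, the function names, and the example numbers are assumptions for illustration, not the measure reported in the paper.

```python
def genre_average(topic_mixes):
    """Average topic mix across all songs in a genre (each mix sums to 1)."""
    n = len(topic_mixes)
    return [sum(column) / n for column in zip(*topic_mixes)]


def atypicality(song_mix, genre_mix):
    """Total variation distance: 0 means identical mixes, 1 means no overlap."""
    return 0.5 * sum(abs(s - g) for s, g in zip(song_mix, genre_mix))


# Hypothetical topic mixes over three themes, e.g. (girls and cars, love, dancing).
country_songs = [[0.8, 0.1, 0.1], [0.6, 0.2, 0.2]]
country_mix = genre_average(country_songs)

print(atypicality([0.7, 0.15, 0.15], country_mix))  # a typical country song, near 0
print(atypicality([0.1, 0.2, 0.7], country_mix))    # a much less typical one
```

Under this framing, the finding described above would amount to higher scores going along with better chart positions in most genres, after controlling for factors such as artist and release date.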

Knowledge at Wharton: Is there a particular song that is an especially good example of these overall findings?

Berger: What is interesting is it is hard to tell the difference between songs just by listening to them. I bet if you were listening to a radio and I asked you how different this country song is from country songs that usually come on, you wouldn’t be very good at telling me, and I wouldn’t be very good at telling you. We often don’t consciously pay attention to all of the lyrics or even recognize all of the lyrics, but natural language processing allows us to find these implicit or underlying themes that drive success. So, it wasn’t driven by one song in particular; it was looking at it across all of these songs. Lyrics shaped whether they were successful or not.

Knowledge at Wharton: There were a couple of outliers here, and one of those was pop songs. Can you talk to us about that?

Berger: One thing we wanted to try to do is say whether this effect is causal. It is neat that successful songs tend to have more different lyrics, but are the different lyrics causing the songs to be successful, or might it be something else? We did a lot of work to show a causal effect, and one thing we tried to do was see whether it varies by genre. There is this notion out there, if you think about songs, that certain genres care more about lyrics than others.

You might imagine, for example, that lyrics don’t matter so much in dance songs because there aren’t many lyrics, if there are any at all. If we think about pop songs compared with country or rock, success in pop is often more about being the same rather than being different. We wondered whether in those two genres — pop and dance — we might see different effects, and indeed we did.

We found that lyrics didn’t matter much at all in dance songs, and similarity was better than difference in pop songs. It suggested it is not just about the lyrics themselves, it is about how the lyrics relate to difference and how that matters in the specific genre being examined.

Knowledge at Wharton: Could this technique also be used to predict a summer blockbuster or most popular beach read, for example?

Berger: That is exactly right, and that is what we are trying to do now. We are looking at thousands of movies to see whether we can predict how successful movies are going to be, in terms of box office sales as well as ratings online, based on their scripts. We’re looking at emotional trajectories, for example, in the scripts. We’re doing more work with music lyrics, we’re doing some work with content and text of books, and we’re also doing work with customer service calls.

Imagine you call an airline or an online retailer. How do the words the customer service representative uses, as well as how they use those words, affect how satisfied the customers are? Across a bunch of different domains, we are interested in words even though we don’t always pay attention to them. How might those words affect success or failure?

Knowledge at Wharton: How does this research demonstrate the value of natural language processing, and how do you see that helping to develop this area of research?

Berger: There is a lot of attention these days around artificial intelligence and machine learning, though most people in the general population don’t necessarily understand what that means. But one way that these tools are being used is to pull behavioral insight from text. There are all sorts of textual data out there from online reviews to things like song lyrics and movie scripts. Textual analysis or natural language processing allows us to pull behavioral insights from those reams of data, not just to predict what is going to succeed and fail, but also to use it to understand human and customer behavior. That is really the power of these tools — using them to understand things we might not have been able to understand before.

People may tell you they liked a song or a movie, but they may not know why. What this allows us to do is actually quantify what makes a song or a book or a movie successful, even if it is hard to study otherwise.

Knowledge at Wharton: Do you have a prediction for what is going to be the song of this summer?

Berger: I don’t have a specific prediction, but I bet it will be atypical. I bet it will be something unusual rather than normal.

song lyrics Recently Published Documents

The Essence of Poetry Application as Indonesian Pop Songs Lyrics in the Perspective of Utilitarianism Ethical Axiology

Background: The existence of literature among the younger generation is increasingly difficult to detect. Meanwhile, literature plays an important role in character education as well as in the existence of bahasa Indonesia as the national language. One form of literature whose popularity is declining is poetry. Using poems as song lyrics is a new way to deliver poetry. Aim: This study aims to determine the essence of poetry's application as Indonesian pop song lyrics on the spread of literature. Methods: The research was conducted using a qualitative approach with an open questionnaire data collection method and interviews. The population is Faculty of Humanities (FIB) of Universitas Airlangga students from the cohort of 2019-2021. The sample was taken using a random sampling technique with 40 respondents for the questionnaire and 14 participants for the interview. For the questionnaire, the respondents were asked to interpret three lyric snippets of popular pop songs in Indonesia. The purpose of the questionnaire was to measure the respondents' level of understanding of the song lyrics so that conclusions could be drawn. The data were analysed using a classification technique, grouping the answers from the questionnaire into four categories: 1) completely understand, 2) somewhat understand, 3) do not quite understand, and 4) do not understand at all. Results: The results show that the use of poetry as song lyrics has an effect on increasing respondents' literary understanding. Recommendation: The younger generation should use songs as a medium for disseminating and learning literature, especially poetry. Limitation: However, further in-depth research is needed due to the limitations of this study. The respondents and participants involved are still restricted to FIB students from Universitas Airlangga, hence the resulting data does not represent the young generation from all over Indonesia.

Evolution of the English-speaking song discourse in the XX–XXI centuries.

The article analyses various tendencies in the development of the English-speaking song discourse, that have revealed themselves in the course of the previous 50 years. The aim of the research is to identify these tendencies and to determine the reasons of their emergence. The goals of this research were to trace the evolution of this type of discourse by way of analysing a vast corpus of song lyrics and to identify the factors that had influenced the changes that followed. The focus of our attention was mainly on rock song lyrics due to their obvious non-commercial nature. The results of the research allow us to state that in the course of evolution of the English-speaking song discourse such tendency as a tendency to simplification/sophistication manifested itself persistently. Moreover, both vectors either replaced one another or developed simultaneously. Besides, a tendency to melodicism, related to the abovementioned one, also revealed itself. We should also mention a tendency to commercialisation, typical of mass culture in general. Another one is a tendency to technological dependency, mainly of synthesizers and computers. Among the most influential tendencies we can name a tendency to visualisation, which actually brings the song discourse close to cinematographic discourse.

Arransemen Lagu Bertemakan Keselamatan Lalu Lintas Untuk Guru PAUD

The kindergarten is located in Bandar Seikijang District, on the edge of the eastern causeway, which is an accident-prone road. In addition, there is no curriculum related to traffic safety in the learning process. To address the partner's problems, the service team provides solutions in the form of assisting the arrangement of a song with the theme of traffic safety, which is applied to the curriculum in schools. The solutions are implemented in four stages: (1) preparation, (2) the activity, (3) simulation and microteaching, and (4) evaluation. From the results obtained there is an increase in the result of 50% of the pretest and posttest questions. In addition, 85% of participants agreed to increase learning creativity with song media, 75% of participants felt this activity was useful, and 85% of teachers followed the course of the activity closely; teachers could create song lyrics with the theme of traffic safety and directly practise them by means of microteaching.

Feminativi v luči sodobne slovenske glasbe

Feminitives in Contemporary Slovene Music

This article focuses on the use of feminine forms derived from masculine nouns in contemporary Slovene music. Feminine nouns denoting persons were extracted from selected song lyrics and analysed in terms of their word formation. Based on the analysis of feminine forms derived from masculine nouns it is suggested that in music women are often sexualised – their overall image implies sexuality and is clearly based on male fantasies; linguistic reality often reflects social reality.

Figure of Speech Stylistic Analisys on Song Liryc in Sami Yusuf’s Albums

Abstract: This research aimed to analyze figure of speech Stylistic in Sami Yusuf’s song lyrics. The author is intrigued by the songwriter's lyrics that may occur in the life we live. The purpose of this study is to determine whether or not there are any figure of speech features in Sami Yusuf's song lyrics, as well as to examine the meaning of the song lyrics found in Sami Yusuf's song lyrics. Content analysis was a part of this study. The method used in this study is descriptive qualitative research, and the design is qualitative research. The data was gathered through library research. The researcher is the research instrument since he or she is the main instrument in this study who collects and evaluates the data. After finishing the investigation, the researcher discovered that the most common figure of speech is personification, which has 14 data, and the least common figure of speech is paradox, which has one data. Metaphor (4 data), simile (7 data), synecdoche (10 data), metonymy (5 data), symbol (4 data), and hyperbole (5 data) are some of the other data collected. In other cases, the researcher found no figures of speech like allegory, apostrophe, understatement, or linguistic irony. The connotations of figure of speech utilized in Sami Yusuf's song lyrics are varied. Keywords: figure of speech, song lyric, stylistic

The Use of Ateji as Figurative Speech in Japanese Song Lyrics

This study aims to analyze the types of ateji used in Japanese song lyrics, and to analyze ateji as a form of figurative speech by the semantic-semiotic connection between the words used in ateji’s writing and reading. Analysis is done qualitatively based on Shirose’s theory of ateji classification and Japan’s hiyu hyougen (figurative speech) theory. This research resulted in the finding of 4 ateji types in Japanese song lyrics, which are ateji for foreign words pronunciation, ateji for pronouns, ateji for replacing words, and ateji for words used in specific titles/works. The use of metaphor (in’yu), metonymy (kan’yu), and synecdoche (teiyu) are also found between the uses of ateji, based on the relation of the words in said ateji. The words can be linked through contextual relation, conceptual relation, or semantical relation. The connection of the words can also result in similar uses as other figurative speeches not included in Japanese’s hiyu hyougen, which indicated that ateji can be handled and understood as a general form of figurative speech in written Japanese language.

Proses Awal dalam Penghasilan Komposisi “Kelampan Bajang” dengan Mengadaptasi Struktur Musik Pop Suku Sasak Lombok

The musical composition, entitled Kelampan Bajang, is the main narrative of the composition theme about the story of the journey of the young Sasak tribe which is semiotically depicted in three different times, namely morning, afternoon, and night. This composition emerged due to several phenomena found, including 1) there are no complete Sasak songs in transcripts and arrangements, 2) many musicians arrange songs in Sasak but often cause controversy in lyrics and music, 3) many talented young musicians cover repeating noname songs on several songs performed by many pop singers in Lombok, 4) many Sasak pop songs but most of them do not represent local culture in terms of lyrics and music. This paper aims to explain the initial process of musical composition with the theme Kelampan Bajang including 1) Bekuliq, 2) Bekayaq, 3) Matur Tampiasih, 4) Sepi Alam Dese, 5) Lalo Midang, and 6) Merariq. All song lyrics are written in the Sasak language. The research uses the autoethnographic method to discuss the problem, how the idea of creating musical compositions can be aligned with the diversity of musical culture in Lombok. After conducting an in-depth study, the author concluded that in order to create a compositional work that could represent regional culture and not cause controversy among the public, namely 1) the author inserted an exploration of the pelog and slendro melodic motifs in the vocal melody section as regional identity, 2) the author inserted the melodic motifs of pelog and slendro in the musical arrangement section, and 3) the author uses good and correct local language lyrics that are in accordance with the regional culture in Lombok.

Can a machine win a Grammy? An evaluation of AI-generated song lyrics

Representasi perempuan dalam video klip girlband korea (analisis semiotika video klip dalla dalla dari girlband itzy).

Feminism is a social movement that demands justice and equal rights for women with men. Gender equality is still a matter of debate, including in South Korea. Women who support feminism are considered to hate men, not to need them, and to feel their position is higher than men's. This demand for equality covers the fields of economy, politics, social life, lifestyle, culture and so on. This research was conducted with the aim of knowing how women are represented by the Korean girl band Itzy in their video clip entitled Dalla Dalla. The Dalla Dalla video clip, with its theme of women's freedom, depicts the lives of women who want freedom from the criticism of others who consider them weak and discriminate against them. The video clip also shows them as different from others. However, the video clip does not clearly state its relevance to feminism. Therefore, in order to get a detailed description of the issue of feminism and women in the video clip, the researcher uses John Fiske's semiotic theory, with The Codes of Television as the main theory, to see the connotative and denotative representations behind the video clip. According to John Fiske's semiotics, there are three levels, namely the levels of reality, representation and ideology. It is hoped that the representation of women in this video clip is thereby clearly described. The results of the study show that the meaning of signs at the levels of reality and representation indicates freedom, self-confidence, and personal existence. At the ideological level, it shows the existence of postmodern feminism. At the level of signs and symbols related to the representation of postmodern feminism, this means creating new discourses or messages that are meaningful and provide positive energy to other women, so that they dare to fight oppression and discrimination. The song lyrics and narrative code in the video clip show several scenes that lead to postmodern feminist symbols.

RELASI MAKNA LEKSIKAL LIRIK LAGU KESENIAN RODAD SEKARWANGI BOYOLALI

Abstract: This study discusses the relations of lexical meaning in the song lyrics of the Sekarwangi rodad art, located in Kendelban village, Kemusu District, Boyolali Regency. The research is descriptive and qualitative. The theory used is that of I Dewa Putu Wijana and Josh Daniel Parera. Data were collected from documentation and supported by interviews and observations. Data analysis was performed by (1) collecting data from the lyrics of the rodad songs, (2) classifying the data that fall under lexical meaning relations, (3) presenting the data in tables with analysis, and (4) summarizing the findings. The results show that the lyrics of the rodad art songs contain lexical meaning relations in the form of antonymy (12 data), synonymy (32 data), denotation (72 data), and connotation (13 data). It can be concluded that denotative meaning is dominant, conveying meaning concretely so that it represents the community's culture and religion, the condition of the language, the process of word formation, and different ways of speaking. Keywords: semantics, lexical relation, rodad

Song lyrics have become angrier since the 80s, study shows

Monday, 01 Apr 2024

Analysis also showed the number of different words used within songs has decreased, particularly among rap and rock songs. Photo: AP

Song lyrics have become angrier over the past four decades, while the words in rap, country, pop, R&B, and rock songs have also become simpler and more repetitive, according to research published in the journal Scientific Reports.

Rap songs showed the most profound increase in anger and negative emotions while country songs had the lowest rise, the researchers said.

Eva Zangerle, an assistant professor at the University of Innsbruck’s department of computer science in Austria, said anger-related words may have become more common because music – as a mirror of society – “reflects more general changes in society and culture”.

Prof Zangerle and her colleagues analysed the lyrics of 12,000 English-language songs released between 1980 and 2020.

They also looked at the page views of these lyrics on the online song lyric platform Genius.

The team found that, in general, lyrics have become simpler and easier to understand over time.

Analysis also showed the number of different words used within songs has decreased, particularly among rap and rock songs.

The team speculate this may be because of changes in how music is played, such as songs increasingly being played as background music.

Prof Zangerle added: “In the last 40 years, we have witnessed a change from buying records in the store to being able to choose from hundreds of millions of songs on streaming platforms on the phone.

“This has also changed the way music is consumed.”

The researchers also found that lyrics have become more emotional and personal over time, with an increase in both positive and negative words in rap songs.

R&B, pop and country songs have also seen a rise in the use of emotionally negative lyrics.

When analysing the page views on Genius, the team found that older rock songs were viewed more than those of newer ones, but the reverse was true for country songs.

This could indicate that rock listeners prefer lyrics from older songs, while country listeners may prefer lyrics from newer songs.

Commenting on the overall findings, the authors wrote: “This suggests that for rap, rock, and country, lyrics play a more pronounced role than for other genres and that listeners’ interest in lyrics goes beyond musical consumption itself.” – dpa

Lindsay Ann Learning English Teacher Blog

Analyzing a Song – So Simple Every Student Can Do It

December 13, 2022 // by Lindsay Ann

English teachers, teaching your students how to analyze song lyrics needs to be a “go-to” strategy, a step toward deeper analysis of more complex texts.

Whether you’re teaching poetry, persuasive essays, or some other writing unit, analyzing song lyrics will give your students an opportunity to look at the different ways that language can be used to capture emotions and tell stories.

This close reading process will also help improve their vocabulary and grammar skills while they are having fun!

Here are some tips on how to teach students to analyze song lyrics so that they can gain valuable writing knowledge through a familiar medium they love!

Analysis of Song Lyrics

Taylor Swift makes analyzing song lyrics in the classroom easy peasy. Like her or not, you can count on her to write songs that tell a story, are layered in deep meaning, and littered with Easter eggs that are fun to try and collect (even for the non-Swifties). 

Taylor Swift’s “Anti-Hero” is a fun, student-friendly song to bring into the classroom to practice analysis skills.

With callbacks to songs on other albums in lines like “I have this thing where I get older but just never wiser,” you can challenge students to analyze the development of a theme across multiple texts (helloooo higher level DOK and those really tricky to meet standards!).

Lyrics like “I’m the problem; it’s me,” coupled with the title, set up an opportunity to teach the concept of the anti-hero (I especially like the idea of teaching about anti-heroes after teaching about the hero’s journey) and to challenge students to analyze how Swift herself could be seen as this archetype by analyzing other songs and conducting online research.

“Anti-Hero” also has what appear to be two references to pop culture (30 Rock and Knives Out) that had even the swiftest of Swifties stumped online. These references are an accessible way to introduce the idea of allusion.

Taylor has really teed up the song analysis practice in English classrooms to be endless with so many rabbit holes to go down at every turn! 

Song Meaning “Hallelujah”

Leonard Cohen’s “Hallelujah” has a deep meaning, which makes it a popular choice for teaching song analysis. The song is about someone who was deeply in love and is now mourning the loss of that love and the guilt that comes with it.

The song can teach students how to analyze lyrics by pointing out that even though it doesn’t say so explicitly, this is a song about a break-up.

They can also learn other aspects of reading literature, like examining tone and form. Analyzing song lyrics enables students to apply what they’ve learned as they read other texts or songs.

After reading a poem or listening to a song’s lyrics, students should be able to answer questions like: 

  • Who is speaking? 
  • How do you know? 
  • What do you think the speaker’s feelings are?
  • What does this tell you about their personality?
  • Do these feelings make sense for the situation?

Good Songs to Analyze

When choosing good songs to analyze remember these three things:

  • Choose a song that tells a story
  • A song with a deep meaning or theme that challenges students’ inferential thinking skills works best
  • Pick songs that students will know and be excited to listen to (that means that while “We Didn’t Start the Fire” is technically a great song for analysis, it might not be the most engaging for your students)

Here are some songs for teaching song analysis that will not only help you teach important analysis skills but also engage and delight your students:

  • “Pray for Me” by the Weeknd ft. Kendrick Lamar
  • “Thunder” by Imagine Dragons
  • “Bohemian Rhapsody” by Queen (this one is suitable for older students)
  • “Born This Way” by Lady Gaga
  • “Getting Older” by Billie Eilish
  • “Drivers License” by Olivia Rodrigo
  • “This is America” by Childish Gambino/Donald Glover
  • “Matilda” by Harry Styles
  • “Victoria’s Secret” by Jax (does have some profanity – I’ve linked the “clean” version)
  • “Vacation” by The Dirty Heads (does say “shit”)

How to Analyze a Song

Teaching students how to analyze a song is similar to teaching poetry or literary analysis, but using songs disguises the learning as a fun activity, which makes it really engaging and accessible for all learners.

Start by having students listen to their song twice.

  • Instruct them to listen through for the first time just for enjoyment and to follow along with the printed lyrics (or digital if you have a way for students to access the lyrics online).
  • Then have them listen a second time but this time have them highlight and circle words and phrases that they think are important and interesting. 

Challenge students to consider the following questions during their second time listening and to annotate the lyrics as they go:

  • Who’s telling the story? What’s their perspective? How do they feel? What’s making them feel that way?
  • What’s the mood of the song? Do the lyrics or the music contribute more to the mood?
  • What figurative language do you notice in the lyrics? Why might the songwriter have chosen to include that figurative language?
  • What could the songwriter be saying about human nature or society through their lyrics? How could you write a theme statement about these lyrics?

Once you’ve gotten your students started with the analysis process, make sure to involve your students. Ask them what they notice and use their insights to build discussion. Have them write a summary of the song or write a detailed analysis or work on a more creative, visual response.

Song & Poem Analysis Paired Text Lesson Plans

Make close reading, textual analysis, and literary analysis of songs (and poems) less intimidating with these detailed, CCSS-aligned close reading song analysis lesson plans for paired texts. Integrated close reading, text-based writing, speaking, listening, and inquiry skills make these lessons both engaging and worthwhile.

To help you save prep time, I’ve put together some awesome lessons for you, including:

  • Carrie Underwood’s song “Cry Pretty” & Macklemore & Ryan Lewis’ song “Growing Up”
  • William Ernest Henley’s poem “Invictus” & Imagine Dragons’ song “Whatever it Takes”
  • Maya Angelou’s poem “Still I Rise” and Tupac’s song “Still I Rise”
  • Stephen Dobyns’ poem “Loud Music” and Incubus’ song “Dig”
  • “Anti-Hero” by Taylor Swift
  • “Boulevard of Broken Dreams” by Green Day and “Brick by Boring Brick” by Paramore
  • “Hotel California” by the Eagles and “Stairway to Heaven” by Led Zeppelin
  • Protest Songs
  • “Mad World” by Tears for Fears and “A Million Dreams” sung by Pink / The Greatest Showman

Wrapping Up

When students analyze songs, they think about each song’s overall impact.

What makes this song great, and why do you like it? What is it about this song that makes it stand out?

Thinking through these ideas with easily-accessible texts makes transferring their skills and knowledge to literature (ya know, the kind with the capital L) easier.

They’ll have practice analyzing craft moves like figurative language and allegory, but they’ll also have practice with those more complex reading strategies like making inferences and connections.

Have a song you think would be perfect to analyze in the classroom? I’d love to hear about it! Drop me a comment below to share! 

About Lindsay Ann

Lindsay has been teaching high school English in the burbs of Chicago for 18 years. She is passionate about helping English teachers find balance in their lives and teaching practice through practical feedback strategies and student-led learning strategies. She also geeks out about literary analysis, inquiry-based learning, and classroom technology integration. When Lindsay is not teaching, she enjoys playing with her two kids, running, and getting lost in a good book.

Music lyrics have become more repetitive since the 1980s, study says

(CNN) – Music lyrics have gotten simpler, angrier and more repetitive, according to a new study.

Researchers compared English-language rap, pop, country, rock and R&B songs from the 1980s to 2020.

They found the more recently released songs are more repetitive and emotional.

Songs use more rhyming lines, more chorus and more anger-related words.

The study also looked at which music genres people liked to look up lyrics for.

Fans of rap and R&B showed the most interest in song lyrics.

The study was published Thursday in Scientific Reports.

Copyright 2024 CNN Newsource. All rights reserved.

How to Cite a Song in MLA | Format & Examples

Published on December 3, 2020 by Jack Caulfield. Revised on March 5, 2024.

The format for citing a song in MLA depends on the medium in which you listened to it.

To cite a song accessed through an online streaming service, list the performer (or group) as author, the song title in quotation marks, the name of the site in italics, and the URL where the song can be found. Omit “the” from a band name, e.g. “Beatles,” not “the Beatles.”

If relevant, use a timestamp to indicate a specific part of the song in the in-text citation.
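The streaming format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, asterisks stand in for italics in plain text, the URL is a placeholder, and the sketch handles group names only (it does not invert a solo performer's name).

```python
def strip_leading_the(name: str) -> str:
    """Drop a leading 'The ' from a band name ('The Beatles' -> 'Beatles')."""
    if name.lower().startswith("the "):
        return name[4:]
    return name


def mla_streaming_entry(performer: str, song: str, site: str, url: str) -> str:
    """Works Cited entry: performer, song title in quotes, site in italics, URL.

    Asterisks mark italics here; that is a plain-text convention chosen for
    this sketch, not part of MLA style.
    """
    return f'{strip_leading_the(performer)}. "{song}." *{site}*, {url}.'


def mla_in_text(performer: str, timestamp: str = "") -> str:
    """Parenthetical citation, with a timestamp when one is relevant."""
    author = strip_leading_the(performer)
    return f"({author} {timestamp})" if timestamp else f"({author})"


# Placeholder data for illustration only.
print(mla_streaming_entry("The Beatles", "Hey Jude", "Spotify",
                          "open.spotify.com/track/EXAMPLE"))
print(mla_in_text("The Beatles", "1:15"))
```

Keeping the "the"-stripping in one helper means the Works Cited entry and the in-text citation always agree on the author name.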

Table of contents

  • Citing songs in physical formats
  • Quoting lyrics
  • Optional details to include in an MLA song citation
  • Frequently asked questions about MLA citations

To cite a song you accessed in a physical format (e.g. CD, vinyl), include details about the album (or single, EP, etc.) it appears on: the title, distributor, year, and, optionally, the format (e.g. “CD”).

This format also applies to songs you downloaded and accessed through a media player (e.g. iTunes).

Quote lyrics from a song in a similar format to poetry: separate lines with a slash symbol, and format four or more lines quoted at once as a block quote.

If you’re quoting these lyrics from a transcript included alongside the song (e.g. in the description on YouTube, in the accompanying booklet to a CD), then you should clarify this at the end of the Works Cited entry.

MLA’s citation style is flexible; some details are not mandatory in a song citation but can be included when relevant.

For example, if your discussion focuses on the work of a particular musician, you can list them in the author position instead of the main artist, along with a description of their role. The main artist should still be listed later in the contributor slot.

Especially with classical music, it may be more relevant to cite the composer rather than the performer in the author position. You can then list the particular performer(s) and/or conductor after the title. If the particular performance is irrelevant, you might omit this information entirely.

When citing a song in MLA style, the author is usually the main artist or group that released the song.

However, if your discussion focuses on the contributions of a specific performer, e.g. a guitarist or singer, you may list them as author, even if they are not the main artist. If you’re discussing the lyrics or composition, you may cite the songwriter or composer rather than a performer.

In an MLA song citation, you need to give some sort of container to indicate how you accessed the song. If this is a physical or downloaded album, the Works Cited entry should list the album name, distributor, year, and format.

However, if you listened to the song on a streaming service, you can just list the site as a container, including a URL. In this case, including the album details is optional; you may add this information if it is relevant to your discussion or if it will help the reader access the song.

If a source has no author, start the MLA Works Cited entry with the source title. Use a shortened version of the title in your MLA in-text citation.

If a source has no page numbers, you can use an alternative locator (e.g. a chapter number, or a timestamp for a video or audio source) to identify the relevant passage in your in-text citation. If the source has no numbered divisions, cite only the author’s name (or the title).

If you already named the author or title in your sentence, and there is no locator available, you don’t need a parenthetical citation:

  • Rajaram argues that representations of migration are shaped by “cultural, political, and ideological interests.”
  • The homepage of The Correspondent describes it as “a movement for radically different news.”

Cite this Scribbr article

Caulfield, J. (2024, March 05). How to Cite a Song in MLA | Format & Examples. Scribbr. Retrieved April 1, 2024, from https://www.scribbr.com/mla/song-citation/

How to Cite Song Lyrics in Different Formats With Examples

Last updated: 11 December 2023

Songs contain lyrics that can support arguments made in scholarly papers. The MLA 9, APA 7, Chicago/Turabian, and Harvard referencing styles rely on different rules for in-text citations and bibliographic entries. Significant differences include the order of names, the use of title case or sentence case, and the sequence of bibliographic details. For example, bibliographic entries in the MLA 9 and Chicago/Turabian formats should give a song’s title in title case, whereas song titles must appear in sentence case in the APA 7 and Harvard styles. In all referencing styles, writers may use parenthetical or narrative in-text citations, and parenthetical in-text citations should appear inside round brackets. Only the MLA 9 style does not require an author-date format for in-text citations. Instead, MLA requires students to give the artist’s name and the part of the song cited, expressed as the span of time that contains the quoted lyrics. A prudent writer also includes the production date in the essay’s body. Finally, one should include URL links for songs located in online databases. Hence, students need to learn how to cite song lyrics in the MLA 9, APA 7, Chicago/Turabian, and Harvard formats.

General Guidelines for Using Music Lyrics in MLA 9, APA 7, Chicago/Turabian, and Harvard Styles

Songs are suitable sources of information for scholarly papers. For example, students of literature write papers analyzing specific song lyrics. In most cases, instructions require them to use a particular referencing style to cite those lyrics. The MLA 9, APA 7, Chicago/Turabian, and Harvard referencing styles rely on different rules for in-text citations and bibliographic entries. The most significant differences include the order of names, the use of title case or sentence case, and the sequence of bibliographic details. Credible papers must follow the necessary guidelines to avoid possible cases of plagiarism. Hence, this guide provides key steps that a writer must follow when citing song lyrics in the MLA 9, APA 7, Chicago/Turabian, and Harvard styles.

How to Cite Song Lyrics in MLA 9

1. In-Text Citation Rules for Citing Music Lyrics in MLA 9

MLA citation requires writers to use the artist’s name and the section cited. The section of a song refers to the span of time that contains the quoted lyrics. One can use both parenthetical and narrative in-text citations when writing scholarly papers. In practice, parenthetical in-text citations should include the artist’s name and the time containing the cited lyric. These details should appear inside round brackets at the end of a sentence, and writers should separate the name and time using a blank space.

Citation scheme for song lyrics in MLA 9:

  • (Name Time)

In-text citation example for citing music lyrics in MLA 9:

  • (Knowles-Carter 2:00-3:01)

Narrative in-text citations contain similar details as parenthetical ones. However, only the time should appear in rounded brackets at the end of the sentence.

Example of a narrative in-text citation for song lyrics in MLA 9:

  • According to Knowles-Carter, … (2:00-3:01).

2. Full Bibliographic Entry on Works Cited for Song Lyrics in MLA 9

Music exists in multiple formats, and the citation depends on the container through which a writer accesses it. Music citations in MLA 9 begin with the artist’s last name. Writers may list a composer or performer in the author position when no main artist’s surname is available; otherwise, one must note composers and performers after the album’s title. The song’s title must appear inside double quotation marks and in title case, while the album’s title should be italicized. Other essential details include the recording manufacturer and the publication date where applicable.

Example of a bibliographic entry on Works Cited for song lyrics in MLA 9:

  • Knowles-Carter, Beyoncé Giselle. “Irreplaceable.” Lemonade, Parkwood Entertainment, 2009, www.youtube.com/watch?v=2EwViQxSJJQ.  

The bibliographic entry should begin with the artist’s surname, followed by the first and second names. Periods should follow the artist’s names, the song’s title, and the URL link for music available online. However, a comma should follow the album, producer, and publication year.
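The punctuation rules above can be sketched as a small Python helper. It is a rough illustration that assumes the surname is the last whitespace-separated word of the artist’s name; the function names are hypothetical, and the data simply reuses the guide’s own sample entry.

```python
def invert_name(full_name: str) -> str:
    """Put the surname first: 'First Second Surname' -> 'Surname, First Second'.

    Assumes the surname is the final word, which will not hold for every name.
    """
    parts = full_name.split()
    if len(parts) < 2:
        return full_name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def mla_album_entry(artist: str, song: str, album: str, producer: str,
                    year: int, url: str = "") -> str:
    """Periods follow the name and the song title; commas follow the
    album, producer, and year; a final period closes the entry."""
    entry = f'{invert_name(artist)}. "{song}." {album}, {producer}, {year}'
    if url:
        entry += f", {url}"
    return entry + "."


# Reuses the sample data from the example entry above.
print(mla_album_entry("Beyoncé Giselle Knowles-Carter", "Irreplaceable",
                      "Lemonade", "Parkwood Entertainment", 2009,
                      "www.youtube.com/watch?v=2EwViQxSJJQ"))
```

The album title would be italicized in a formatted document; plain strings cannot show that, so the sketch leaves it unmarked.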

3. MLA Footnote for Song Lyrics

The MLA referencing style lets writers use footnotes when providing long explanations. MLA footnotes hold details that would distract readers if placed in the body text. As a rule, one should use numerical footnote callouts inside the body text, and the callouts should appear as superscripts. The actual footnotes must appear at the bottom of the page containing the callout. Hence, the following is an example of a possible note:

Footnote example for song lyrics in MLA 9:

  • 1 For more relevant lyrics, listen to the section between the second and third minutes.

How to Cite Song Lyrics in APA 7

1. In-Text Citation Rules for Citing Music Lyrics in APA 7

APA style requires writers to use parenthetical or narrative in-text citations. Writers should use the artist’s surname and production date. In some cases, one may include the track number in the in-text citation where applicable.

Citation schemes for song lyrics in APA 7:

  • (Surname, Date)
  • According to Surname (date), … .

Parenthetical citation sample for song lyrics in APA 7:

  • (Knowles-Carter, 2009)

Narrative in-text citation example for citing music lyrics in APA 7:

  • Knowles-Carter (2009) sang about the value of a romantic partner… .

2. Reference Entry for Song Lyrics in APA 7

Bibliographic entries in APA 7 should appear on the last page of any essay or research paper. As a rule, reference entries should contain the artist’s surname, song’s title, medium, and producer and URL link where applicable. Also, bibliographic entries must appear in the sentence case. Hence, the following is an example of the reference entry.

Example of a reference entry for song lyrics in APA 7:

  • Knowles-Carter, B. G. (2009). Irreplaceable [CD]. Parkwood Entertainment. https://www.youtube.com/watch?v=2EwViQxSJJQ
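As a rough sketch, the APA 7 pattern shown in the example can be assembled in Python as follows. The function names are hypothetical, the title is assumed to be in sentence case already, and the data is the guide’s own sample.

```python
def apa_initials(given_names: str) -> str:
    """Reduce given names to spaced initials: 'Beyoncé Giselle' -> 'B. G.'"""
    return " ".join(f"{name[0]}." for name in given_names.split())


def apa_song_reference(surname: str, given_names: str, year: int, title: str,
                       medium: str, producer: str, url: str = "") -> str:
    """Surname, Initials (Year). Title [Medium]. Producer. URL"""
    ref = (f"{surname}, {apa_initials(given_names)} ({year}). "
           f"{title} [{medium}]. {producer}.")
    if url:
        ref += f" {url}"  # no trailing period after the URL
    return ref


def apa_in_text(surname: str, year: int) -> str:
    """Parenthetical author-date citation: (Surname, Year)."""
    return f"({surname}, {year})"


print(apa_song_reference("Knowles-Carter", "Beyoncé Giselle", 2009,
                         "Irreplaceable", "CD", "Parkwood Entertainment",
                         "https://www.youtube.com/watch?v=2EwViQxSJJQ"))
print(apa_in_text("Knowles-Carter", 2009))
```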

3. APA Footnote for Song Lyrics

The following is an example of a possible footnote.

Footnote example for song lyrics in APA 7:

  • 1 For more details about the importance of love, listen to the section between the second and third minutes.

How to Cite Song Lyrics in Chicago/Turabian

1. In-Text Citation Rules for Citing Music Lyrics in Chicago/Turabian

The Chicago/Turabian citation style allows writers to use music lyrics to support their central arguments. As a rule, all citations must contain an accurate date, and students should use the recording date as opposed to the release year. The Chicago/Turabian style offers two in-text citation types: parenthetical citations and footnotes.

Parenthetical in-text citation for song lyrics in Chicago/Turabian:

Parenthetical citations in Chicago/Turabian should include the artist’s surname and production date. Writers should separate these details with a space and place them inside round brackets at the end of a sentence. Hence, the following is an example that one should use:

In-text citation scheme:

  • (Artist Date)

Actual in-text citation:

  • (Knowles-Carter 2009)

2. Bibliography Entry for Song Lyrics in Chicago/Turabian

Chicago/Turabian bibliographic entries should appear on the essay’s last page. A credible bibliographic entry should contain the artist, the song’s title, recording date, producers, and medium. One may also include the URL link of the song where necessary. The song’s title must be italicized in the bibliographic entry. Hence, the following is the required scheme and an actual example.

Bibliography scheme for song lyrics in Chicago/Turabian:

  • Name of group or composer or performer. Title. Contributing personnel. Recording date. Recording Company or Publisher, medium.

Example of a bibliography entry for song lyrics in Chicago/Turabian:

  • Knowles-Carter, Beyoncé Giselle. Irreplaceable. 2009. Parkwood Entertainment, CD. https://www.youtube.com/watch?v=2EwViQxSJJQ.

3. Chicago/Turabian Footnote for Song Lyrics

Writers should use footnote callouts in the text body. Numerical callouts should follow ending punctuation marks, such as periods, question marks, and exclamation marks. The Chicago/Turabian footnote containing the bibliographic details should appear at the bottom of the page. Each note should begin with a number that corresponds to the one used in the callout, followed by a period. In the first citation, the artist’s name should appear in the standard form, which begins with the first name. Hence, the following is the scheme and an actual footnote.

Footnote scheme for song lyrics in Chicago/Turabian:

  • 1. Name of group/composer/performer, “Title,” contributing personnel, recording date, Recording Company or Publisher, track number on Name of Album, year of release, medium.

Example of a Chicago/Turabian footnote for song lyrics:

  • 1. Beyoncé Giselle Knowles-Carter, “Irreplaceable,” recorded February 2009, on Lemonade, Parkwood Entertainment, CD, https://www.youtube.com/watch?v=2EwViQxSJJQ.
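The footnote example can be reproduced with a short Python sketch. It covers only the fields that appear in the sample note (it omits contributing personnel, track number, and year of release from the full scheme), and the function name is hypothetical.

```python
def chicago_footnote(number: int, artist: str, song: str, recorded: str,
                     album: str, company: str, medium: str,
                     url: str = "") -> str:
    """Numbered note with the artist's name in standard (first-name-first)
    order and comma-separated details, closed by a period."""
    note = (f'{number}. {artist}, "{song}," recorded {recorded}, '
            f"on {album}, {company}, {medium}")
    if url:
        note += f", {url}"
    return note + "."


# Reuses the sample data from the footnote example above.
print(chicago_footnote(1, "Beyoncé Giselle Knowles-Carter", "Irreplaceable",
                       "February 2009", "Lemonade",
                       "Parkwood Entertainment", "CD",
                       "https://www.youtube.com/watch?v=2EwViQxSJJQ"))
```

Unlike the bibliography entry, the note does not invert the artist’s name, so no name-handling helper is needed here.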

How to Cite Song Lyrics in Harvard

1. In-Text Citation Rules for Citing Music Lyrics in Harvard

The Harvard referencing style requires writers to use parenthetical or narrative in-text citations. As a rule, in-text citations must include the artist’s surname and production date. For parenthetical in-text citations, students should separate the artist’s surname and production date with a space.

Citation schemes for song lyrics in Harvard:

  • (Surname Date)
  • Surname (Date) sang … .

Parenthetical citation sample for song lyrics in Harvard:

  • (Knowles-Carter 2009)

Narrative in-text citation example for citing music lyrics in Harvard:

  • Knowles-Carter (2009) sang about the importance of valuing romantic love.

2. Reference Entry for Song Lyrics in Harvard

Bibliographic entries in Harvard should appear on the last page of essays and research papers. As a rule, Harvard references should contain the artist’s surname, song’s title, medium, and producer. Moreover, the song’s title must appear in sentence case and be italicized. In this case, a prudent writer must use commas to separate all the details in the entry. Besides, reference entries should begin with the author’s surname followed by the initials of other names. Hence, the following is an example of a reference entry for song lyrics in Harvard.

  • Knowles-Carter, BG 2009, Irreplaceable, CD, Parkwood Entertainment.
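A minimal Python sketch of the Harvard pattern follows, assuming the variant shown in this guide (unpunctuated initials, commas between all details). The function names are hypothetical, and Harvard conventions vary between institutions.

```python
def harvard_song_reference(surname: str, given_names: str, year: int,
                           title: str, medium: str, producer: str) -> str:
    """Surname, Initials Year, Title, Medium, Producer."""
    initials = "".join(name[0] for name in given_names.split())
    return f"{surname}, {initials} {year}, {title}, {medium}, {producer}."


def harvard_in_text(surname: str, year: int) -> str:
    """Parenthetical citation with a space between surname and date."""
    return f"({surname} {year})"


print(harvard_song_reference("Knowles-Carter", "Beyoncé Giselle", 2009,
                             "Irreplaceable", "CD", "Parkwood Entertainment"))
print(harvard_in_text("Knowles-Carter", 2009))
```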

3. Harvard Footnote for Song Lyrics

The following is an example of a possible note.

  • 1 For more details about valuing one’s romantic partner, listen to the section between the first and third minutes.

Songs contain lyrics that can support the main arguments presented in scholarly articles. In this case, MLA 9, APA 7, Chicago/Turabian, and Harvard referencing styles rely on different rules. Also, one should consider the following guidelines on how to cite song lyrics in MLA 9, APA 7, Chicago/Turabian, and Harvard when using such sources in essays and research papers.

  • Bibliographic entries in MLA 9 and Chicago/Turabian should have song titles in a title case.
  • Song titles must appear in sentence case for APA 7 and Harvard styles.
  • In all referencing styles, writers may use parenthetical or narrative in-text citations.
  • Only the MLA style does not require an author-date format for the in-text citation.
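The difference between title case and sentence case in the guidelines above can be illustrated with a naive Python sketch. The list of minor words is an assumption made for this example, and neither function recognizes proper nouns or the pronoun "I", so real titles still need a manual check.

```python
# Assumed list of words left lowercase in title case; style guides differ.
MINOR_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "or", "but"}


def title_case(title: str) -> str:
    """Capitalize every word except minor words in the middle of the title."""
    words = title.lower().split()
    result = []
    for index, word in enumerate(words):
        is_edge = index == 0 or index == len(words) - 1
        if is_edge or word not in MINOR_WORDS:
            result.append(word[0].upper() + word[1:])
        else:
            result.append(word)
    return " ".join(result)


def sentence_case(title: str) -> str:
    """Capitalize only the first letter of the title."""
    lowered = title.lower()
    return lowered[0].upper() + lowered[1:] if lowered else lowered


print(title_case("blowin' in the wind"))     # MLA 9, Chicago/Turabian
print(sentence_case("Blowin' in the Wind"))  # APA 7, Harvard
```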

No, you’re not just getting old: song lyrics really are getting worse, according to new research

He may live at the top of the charts, but nobody’s putting The Weeknd up for a Nobel Prize for literature.

A new study showing a steep decline in songwriting skills over the last 40 years used folk crooner Bob Dylan’s award from the sainted Swedish institution as an example of how much music has changed.

To draw their conclusions, a team of researchers from Europe pored over the words to roughly 12,000 songs in English across a variety of genres from rap to rock to R&B, written between 1980 and 2020, The Guardian reported.

Their findings were exactly as anyone over the age of 40 might have suspected — lyrics have become simpler, more repetitive, angrier and more self-obsessed.

“What we have also been witnessing in the last 40 years is a drastic change in the music landscape — from how music is sold to how music is produced,” said senior study author Eva Zangerle, recommendations systems expert at the University of Innsbruck.

The study declined to name and shame any newer artists, but instead spoke of how lyrics can be a “mirror of society,” reflecting a culture’s shifting values and preoccupations.

The study, published in Scientific Reports, concentrated on emotions expressed, repetitiveness and word choice, particularly to see how often difficult or unusual words were used.

“Across all genres, lyrics had a tendency to become more simple and more repetitive,” Zangerle said.

The period studied saw great change in the way we relate and listen to music, the experts noted, ranging from vinyl records at the beginning of the 1980s to today’s streaming platforms and their algorithms.

Study results reinforced previous research that suggested a decline in the number of upbeat and positive lyrics over time, giving way to a rise in anger, disgust or sadness.

“Rap music has become more angry than the other genres,” Zangerle noted.

Songwriting has also come to celebrate self-obsession, the pros pointed out — words like “mine” and “me” are far more popular today.

How do I format a quotation of song lyrics?

Format a quotation of song lyrics the same way you would format a quotation of poetry. If the quotation consists of fewer than four lines, run it into the text, placing quotation marks around the lines and separating the lines from each other with a forward slash with a space on either side of it. 

Bob Dylan famously sang that “[t]he answer, my friend, is blowin’ in the wind / The answer is blowin’ in the wind.”

Work Cited

Dylan, Bob. “Blowin’ in the Wind.” Bob Dylan, 2018, www.bobdylan.com/songs/blowin-wind/.

If the quotation consists of four or more lines, set the quoted lines apart from the text as an extract.

In “Blowin’ in the Wind,” Bob Dylan asks:

    How many roads must a man walk down
    Before you call him a man?
    Yes, ’n’ how many seas must a white dove sail
    Before she sleeps in the sand?

Dylan, Bob. “Blowin’ in the Wind.” Bob Dylan, 2018, www.bobdylan.com/songs/blowin-wind/.

COMMENTS

  1. Song lyrics have become simpler and more repetitive over the ...

    Scientific Reports - Song lyrics have become simpler and more repetitive over the last five decades. ... Thereby, we investigate the following research questions in this paper: (RQ1) Which trends ...

  2. Why are song lyrics becoming simpler? a time series analysis of ...

    Introduction. Music is a human universal [1, 2], and it is known to influence cognition, affect, and behavior [3-5]. Because songs, and particularly popular song lyrics, can be so rich in meaning [6, 7], social scientists have long explored the ways that such lyrics intersect with some fundamental social processes, including identity formation and person perception [8-13].

  3. Introduction: The challenges of the song lyric

    First, because song lyrics are an integral part of popular music, which itself is a domain of human experience that has enormous significance. Since its emergence through revolutionary media at the beginning of the twentieth century, and particularly since its explosion in popularity and availability in the late 1950s, popular music as we now ...

  4. Why are song lyrics becoming simpler? a time series analysis of lyrical

    Introduction. Music is a human universal [1, 2], and it is known to influence cognition, affect, and behavior [3-5]. Because songs, and particularly popular song lyrics, can be so rich in meaning [6, 7], social scientists have long explored the ways that such lyrics intersect with some fundamental social processes, including identity formation and person perception [8-13].

  5. Song Lyrics Really Are Getting Simpler and More Repetitive, Study Finds

    Lyrics that seem like simple gibberish at first can actually be clever wordplay, James points out, noting Missy Elliott's backward lines in the 2002 hit song "Work It." And lyrics are just ...

  6. New Study of Song Lyrics Reveals They've Become Simpler

    European researchers analyzed the words of more than 12,000 English language songs across multiple genres including rap, country, pop, R&B, and rock from a period of 40 years starting in 1980.

  7. Quantitative Sentiment Analysis of Lyrics in Popular Music

    Popular music has been changing significantly over the years, revealing clear, audible differences when compared with songs written in other eras. A pop music composition is normally made of two parts: the tune and the lyrics. Here we use a digital humanities and data science approach to examine how lyrics changed between the 1950s and the more recent years, and apply quantitative analysis ...

  8. Sentiment Analysis of Song Lyrics Using Clustering

    A sentiment analysis was performed on lyrics of popular music artists of various periods by analyzing the positivity, negativity, and neutrality present in the lyrics. Figure 1 shows the most relevant words for the band Radiohead. Fig. 1. Happy and sad words for Radiohead songs.

  9. Quantitative Sentiment Analysis of Lyrics in Popular Music

    A dataset based on the song lyrics of the top-5 charting weekly songs in the UK and USA from January 1999 to August 2020 was computer-analysed for interpersonal variables, such as satisfaction and ...

  10. Song lyrics are getting simpler, more repetitive: Study

    Song lyrics have become simpler and more repetitive since 1980, study finds Mar 28, 2024 Low resting heart rate in women is associated with criminal offending, unintentional injuries

  11. (PDF) A Critical Discourse Analysis of song "Look What ...

    The data of this study were the lyrics of Taylor Swift's song "Look What You Made Me Do" and the official video clip on YouTube uploaded by VIVO on August 27th, 2017. Both the lyrics and ...

  12. Song lyrics getting simpler, more repetitive, angry and self-obsessed

    Researchers analysed the words in more than 12,000 English-language songs across several genres from 1980 to 2020 You're not just getting older. Song lyrics really are becoming simpler and more ...

  13. PDF Lyrics Information Processing: Analysis, Generation, and Applications

    Lyrics Information Processing: Analysis, Generation, and Applications. Kento Watanabe and Masataka Goto, National Institute of Advanced Industrial Science and Technology (AIST). Abstract: In this paper we propose lyrics information processing (LIP) as a research field for technologies focusing on lyrics text ...

  14. (PDF) Natural language processing of lyrics

    Topic Modelling Among the works addressing this task for song lyrics, Mahedero et al. [39] define five ad hoc topics (Love, Violent, Antiwar, Christian, Drugs) into which they classify their ...

  15. When words matter: A cross-cultural perspective on lyrics and their

    Music and lyrics co-occur in most societies around the world (Clarke, 1952; Levitin, 2006; MacDonald et al., 2012). Emotions aroused by instrumental music have been explored in a broad set of studies (e.g., Juslin & Laukka, 2004; Sloboda, 1992; Zentner et al., 2008). Despite how common listening to music with lyrics is, research on the link between lyrics and emotions is scarce (Juslin, 2005 ...

  16. On exploiting transformers for detecting explicit song lyrics

    First, we empirically assess the performance of various transformer-based language models on a large (∼800K songs) available dataset for explicit song lyrics detection [25], comparing their performance against strong competitive baselines such as logistic regression, the fastText classifier [14], and a 1D Convolutional Neural Network (CNN ...

  17. Song lyrics are getting simpler, more repetitive: study

    Before detailing how lyrics have become more basic, the study pointed out that US singer-songwriting legend Bob Dylan -- who rose to fame in the 1960s -- has won a Nobel Prize in literature.

  18. Words Matter: How Lyrics Help Songs Top the Charts

    Knowledge at Wharton Podcast Words Matter: How Lyrics Help Songs Top the Charts June 1, 2018 • 8 min listen. Wharton research uses natural language processing to study why some songs catch on ...

  19. song lyrics Latest Research Papers

    This paper aims to explain the initial process of musical composition with the theme Kelampan Bajang including 1) Bekuliq, 2) Bekayaq, 3) Matur Tampiasih, 4) Sepi Alam Dese, 5) Lalo Midang, and 6) Merariq. All song lyrics are written in the Sasak language.

  20. Song lyrics have become angrier since the 80s, study shows

    Song lyrics have become angrier over the past four decades, while the words in rap, country, pop, R&B, and rock songs have also become simpler and more repetitive, according to research published ...

  21. PDF Using Song Lyrics 1 RUNNING HEAD: Using Song Lyrics TUNING INTO

    Using Song Lyrics 1 ... 61-683 Research Paper Summer 2013 November 13, 2013 . ... The research includes findings that answer the questions, "Is there a significant difference in students' attitudes when using the Houghton Mifflin English textbook (1999) or the Write

  22. Analyzing a Song

    English teachers, teaching your students how to analyze song lyrics needs to be a "go-to" strategy, a step toward deeper analysis of more complex texts. Whether you're teaching poetry, persuasive essays, or some other writing unit, analyzing song lyrics will give your students an opportunity to look at the different ways that language can be used to capture emotions and tell stories.

  23. Music lyrics have become more repetitive since the 1980s, study says

    (CNN) - Music lyrics have gotten simpler, angrier and more repetitive, according to a new study. Researchers compared English-language rap, pop, country, rock and R&B songs from the 1980s to 2020.

  24. A Corpus-based study of Taylor Swift's songs: the linguistic

    TSC consists of song lyrics of different genres, including country, country rock, pop, country pop, folk and folk pop. ... In this paper, I report on an analysis of lyrics based on a chart corpus ...

  25. How to Cite a Song in MLA

    To cite a song accessed through an online streaming service, list the performer (or group) as author, the song title in quotation marks, the name of the site in italics, and the URL where the song can be found. Omit "the" from a band name, e.g. "Beatles," not "the Beatles." If relevant, use a timestamp to indicate a specific part ...

  26. Lyrics containing the term: research paper

    A list of lyrics, artists and songs that contain the term "research paper" - from the Lyrics.com website. Login . The STANDS4 Network. ABBREVIATIONS; ANAGRAMS; BIOGRAPHIES; ... you is wrong then nothing right for me No research paper but your type is what I need You tryna play it safe but know I can see When that sweat drip. Relics (feat ...

  27. How to Cite Song Lyrics in Different Formats With Examples

    In-Text Citation Rules for Citing Music Lyrics in APA 7. APA style requires writers to use parenthetical or narrative in-text citations. Basically, writers should use the artist's surname and production date. In some cases, one may include the track number in the in-text citation where applicable.

  28. Song lyrics are worse, more angry than ever, research shows

    To draw their conclusions, a team of researchers from Europe pored over the words to roughly 12,000 songs in English across a variety of genres from rap to rock to R&B, written between 1980 and ...

  29. How do I format a quotation of song lyrics?

    Format a quotation of song lyrics the same way you would format a quotation of poetry. If the quotation consists of fewer than four lines, run it into the text, placing quotation marks around the lines and separating the lines from each other with a forward slash with a space on either side of it. Bob Dylan famously sang that " [t]he answer ...

  30. Discourse Analysis of a Song Lyric Entitled "We Will Not Go Down"

    This research built on previous research about song lyrics (e.g., Vitria, 2013; AlAfar, 2016; Risdianto, 2016; Miller, 2017; Yastanti & Setiawati, 2018) the research about song lyrics analysis for example ...