The Seven Words You Can Never Say on Television: Increases in the Use of Swear Words in American Books, 1950-2008

Evidence is accumulating that American culture has become more individualistic since the 1950s. In the present research, we focused on one plausible manifestation of individualism, the use of swear words in cultural products. We examined trends in the use of the seven words identified by George Carlin in 1972 as the “seven words you can never say on television” in the Google Books corpus of American English books from 1950 to 2008. We find a steady linear increase in the use of swear words, with books published in 2005-2008 twenty-eight times more likely to include swear words than books published in the early 1950s. Increases for individual swear words ranged from 4 to 678 times (ds = 6.58-45.42). These results suggest that American culture has become increasingly accepting of the expression of taboo words, consistent with higher cultural individualism.

Despite the interest in trends in swearing, direct empirical evidence for the increasing use of swear words is lacking. In fact, studies of spoken frequencies of swear words suggest few changes in their use between the late 1970s and mid-2000s (Jay, 2009). It is unclear if cultural products actually do feature more taboo words, possibly indicating that American culture has become more accepting of crude language. In this article, we examine changes in the use of swear words since 1950 in written language, using the large Google Books database of 5 million books (Michel, Kui, Presser, et al., 2010).
Studying changes in language use in books is important because cultural products such as books are a useful place to observe and quantify cultural change. As Lamoreaux and Morling (2012) argued, it is important to study cultural products for at least three reasons. First, culture includes the context as well as the person, and cultural products capture culture "outside the head." Second, cultural products are not subject to the biases that plague self-report measures, such as reference group and social desirability effects. Third, and perhaps most important, cultural products shape individuals' ideas of cultural norms and "common sense." People's behavior is often influenced by their beliefs about what others in their culture believe and do, even if these assumptions are erroneous (e.g., Zou et al., 2009). Cultural products are likely one of the most common sources for perceptions of cultural norms. Thus, if swear words are used more often in books, it might indicate an increasing acceptance of these words in the culture.

Cultural Shifts Relevant for Changes in the Use of Swear Words
Why might the use of swear words increase in American books? As noted above, individualism is a cultural system that favors the self more highly than the collective, and a growing body of research suggests that American culture has become increasingly individualistic (Grossmann & Varnum, 2015; Twenge, 2014). One key factor may be self-expression. Swear words allow the free expression of emotion, especially anger (Jay, 2009). Due to the greater valuation of the rights of the individual self, individualistic cultures favor more self-expression in general (Kim & Sherman, 2007) and allow more expression of personal anger in particular (Safdar et al., 2009). Thus, a more individualistic culture should be one with a higher frequency of swear word use.
In addition, swear words are also known as taboo words, as they are "sanctioned or restricted on both institutional and individual levels" (Jay, 2009, p. 153). Taboos, usually defined as social rules or inhibitions, are less prominent in individualistic cultures, which usually name fewer behaviors as social taboos (Triandis, 1995). Consistent with a rise in individualism, other behaviors once considered taboos have faded in American culture. For example, social taboos against premarital and homosexual sex have decreased since the 1990s (Twenge, Sherman, & Wells, 2015). Other traditional social rules, such as those stigmatizing working mothers and atheists, have also become less prominent (Donnelly et al., 2016; Twenge, Carter, & Campbell, 2015). However, little research has explored the loosening of social taboos in cultural products. A study of song lyrics (DeWall et al., 2011) examined antisocial words including swear words but was limited to only 10 songs per year and did not examine swear words separately from antisocial words such as "kill" and "hate." Research suggests that swearing is linked to personality traits such as extraversion, dominance, narcissism, and neuroticism (Fast & Funder, 2008; Holtzman, Vazire, & Mehl, 2010; Schwartz et al., 2013). A recent study found that swearing was associated with high extraversion and low agreeableness (Kennison & Messer, 2017), a personality profile empirically linked to high individualism via an association with grandiose narcissism (e.g., Miller et al., 2011; Paulhus, 2001). This profile is also linked to individualism conceptually, as high extraversion, especially boldness and assertiveness, and low agreeableness, especially low modesty and high grandiosity, are connected to an individualistic drive to stand outside and above the group (Triandis & Suh, 2002).
Several studies also link narcissism directly to self-reported individualism across cultures (Cai, Kwan, & Sedikides, 2012; Konrath, Bushman, & Grove, 2009; Meisel, Ning, Campbell, & Goodie, 2016). Average levels of extraversion and dominance (Terracciano, 2010; Twenge, 2001), narcissism (Twenge & Foster, 2010), and neuroticism (Twenge, 2015) have all increased among individuals in the United States. Thus, the frequency of swearing may increase as well.

The Present Study and Hypotheses
The Google Books Ngram viewer allows the examination of language use in 5 million books (Michel et al., 2011). The Ngram database reports usage frequency by dividing the number of instances of the word in a given year by the total number of words in the corpus in that year, thus correcting for changes in the number of published works and their length. To avoid confounds between year and country of publication, we drew from a corpus restricted to books published in one country (the United States, the American English corpus). Books are also an ideal cultural product in which to examine language use, as they have stayed relatively unchanged as a medium, in contrast to the significant changes in broadcast media, with the movement from three main networks to cable to streaming video.
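The normalization described above can be illustrated with a minimal sketch. The function name and the counts below are hypothetical and chosen only for illustration; they are not values from the Ngram corpus, and the actual database supplies these frequencies precomputed.

```python
def relative_frequency(word_counts, total_counts):
    """Return {year: occurrences of the word / total words in the corpus that year}."""
    freqs = {}
    for year, count in word_counts.items():
        total = total_counts.get(year)
        if total:  # skip years with a missing or zero corpus total
            freqs[year] = count / total
    return freqs

# Hypothetical counts, for illustration only (not corpus values).
word_counts = {1950: 5, 1980: 40, 2008: 300}
total_counts = {1950: 1_000_000, 1980: 2_000_000, 2008: 5_000_000}

freqs = relative_frequency(word_counts, total_counts)
for year in sorted(freqs):
    print(year, freqs[year])
```

Because each year's count is divided by that year's total, a year with more or longer books does not by itself produce a higher frequency.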
In such research, a key question is which words to examine. Two issues are of particular importance: (a) a list of words that is somewhat objective, and (b) a list without a strong present fashion bias (i.e., words that have become fashionable or popular in recent years). This latter issue is problematic because any increase seen in the data might simply reflect fashion trends rather than broader psychological changes. We attempted to address these issues by analyzing a list of swear words chosen by someone else (to provide some objectivity vis-à-vis the investigators) and chosen in a previous historical period (to avoid present fashion bias). To that end, we used the "seven words you can never say on television" popularized by comedian George Carlin in 1972 (shit, piss, fuck, cunt, cocksucker, motherfucker, and tits; Bella, 2012). This is a short yet reasonably comprehensive list of swear words considered taboo in polite society. It differs somewhat from other lists of commonly used taboo words such as that of Jay (2009) as it focuses on swear words rather than simply taboo words (Jay's list is fuck, shit, hell, damn, goddamn, Jesus Christ, ass, oh my god, bitch, and sucks). Some of the taboo words on Jay's list (Jesus Christ, hell) would be problematic to examine in frequency databases as indicators of cultural change as they are also used in other contexts where they are not taboo. Nevertheless, the two words that appear on both lists (fuck and shit) account for up to half of the uses of taboo words (Jay, 2009). Three of the words on Carlin's list (cocksucker, cunt, and fuck) are the same as the three identified by college students as the most taboo (Jay & Janschewitz, 2008). Carlin's list thus identifies the most taboo words, making it a conservative test of the hypothesis that the use of taboo words has increased.
We began examining swear words in 1950, a common starting point for the postwar era. Michel et al. (2011) noted that the Google Books database is more reliable after 1900, and most research examining changes in social rules begins with data in the 1970s or later (e.g., Baunach, 2012; Donnelly et al., 2016; Twenge, Campbell, & Carter, 2015). Thus, examining data after 1950 covers the decades in which social rules began to change, plus an extra two decades in case swear words changed earlier than other indicators. Given previous research and cultural studies pointing toward the relaxation of social taboos in American society, we hypothesize that the use of swear words will increase in American books between 1950 and 2008.

Method
We examined the American English (2009) corpus from the Google Books Ngram database (see Note 1). The Google Books corpus contains 4% of books published since the 1800s. These books were likely not truly randomly selected (Michel et al., 2011); however, we assume these books were not selected in a way dependent on word use frequency that also varied systematically with year. In addition, the Ngram database is by far the largest database available of digitized books. As described in more detail in Michel et al. (2011), Google used 100 sources such as university libraries and publishers to generate a comprehensive catalog of books. The books were digitally scanned and the corpus was winnowed of serial publications, multiple editions, and books with poor print quality, unknown publication dates, or miscoded language (e.g., a book listed in the library catalog as being written in English that was not actually in English). Country of publication (in this case, the United States) was determined by 100 bibliographic sources (Michel et al., 2011).
Our unit of analysis was the frequency of the use of a word in a specific year. We then tested for changes in those frequencies over time by examining the correlation between year and frequency, with the n in each analysis of 58 (the number of years). Our results thus refer to the annual change in the frequency of the use of the seven swear words (shit, piss, fuck, cunt, cocksucker, motherfucker, and tits). We also examined two composites of the seven words. The first adds the use frequencies together (a composite of means); thus, it is more influenced by the words used more frequently. The second adds the Z-scores of each of the seven words together; thus, it counts each word equally. The seven words formed a reliable index, Cronbach's alpha = .71. We did not use smoothing in our analyses or figures as we wished to capture the exact frequency in each year.
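As a rough sketch of these computations, the code below correlates a yearly series with year, builds the two composites (summed frequencies and summed Z-scores), and computes Cronbach's alpha across words. The data and the names word_a and word_b are hypothetical placeholders. Whether alpha was computed on raw or standardized frequencies is not stated in the text; this sketch assumes raw items and uses population standard deviations for the Z-scores.

```python
from statistics import mean, pstdev, variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def z_scores(values):
    """Standardize a series (assumption: population SD)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def composites(series):
    """series maps word -> yearly frequencies, all in the same year order."""
    columns = list(series.values())
    sum_comp = [sum(vals) for vals in zip(*columns)]   # weighted toward frequent words
    z_cols = [z_scores(c) for c in columns]
    z_comp = [sum(vals) for vals in zip(*z_cols)]      # each word counts equally
    return sum_comp, z_comp

def cronbach_alpha(series):
    """Cronbach's alpha treating words as items and years as cases (raw items assumed)."""
    columns = list(series.values())
    k = len(columns)
    item_var = sum(variance(c) for c in columns)
    total_var = variance([sum(vals) for vals in zip(*columns)])
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical toy data, for illustration only.
years = [1950, 1951, 1952, 1953, 1954]
series = {"word_a": [1.0, 2.0, 3.0, 4.0, 5.0],
          "word_b": [10.0, 20.0, 30.0, 40.0, 50.0]}

sum_composite, z_composite = composites(series)
r_sum = pearson_r(years, sum_composite)
r_z = pearson_r(years, z_composite)
alpha = cronbach_alpha(series)
```

In the toy data the summed composite is dominated by word_b, whereas the Z-score composite weights both words the same, which is the distinction between the two composites drawn above.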
By definition, correlations represent the direction and fit of the linear relationship between the variables of interest (here, frequency and year). However, it is also important to know the simple magnitude of the change from the first part of the time period to the last. Thus, we include a second effect size, d, based on the difference between use in the first 4 years of the time period (1950-1953) and use in the last 4 years of the time period (2005-2008) divided by the standard deviation. We also used these means to calculate how many times more common the use of the word (or words) was in the late 2000s compared with the early 1950s.
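The two magnitude measures can be sketched as follows. The text does not specify which standard deviation serves as the denominator of d; this sketch assumes the pooled standard deviation of the two 4-year windows, which is one common choice and may differ from the one actually used. The yearly values are hypothetical.

```python
from statistics import mean, variance

def fold_change(early, late):
    """How many times more common the word is in the late window than the early one."""
    return mean(late) / mean(early)

def cohens_d(early, late):
    """Mean difference divided by the pooled SD of the two windows (an assumption)."""
    n1, n2 = len(early), len(late)
    pooled_var = ((n1 - 1) * variance(early) + (n2 - 1) * variance(late)) / (n1 + n2 - 2)
    return (mean(late) - mean(early)) / pooled_var ** 0.5

# Hypothetical yearly frequencies (arbitrary units), for illustration only.
early = [1.0, 2.0, 3.0, 2.0]      # stands in for 1950-1953
late = [55.0, 56.0, 57.0, 56.0]   # stands in for 2005-2008

ratio = fold_change(early, late)
d = cohens_d(early, late)
print(ratio, round(d, 2))
```

With only four observations per window and little year-to-year variation within each window, the denominator is small, so d can become very large even when the fold change is moderate.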
The American English corpus does not note any changes in the types of books (fiction vs. nonfiction). As a substitute, we obtained the percentage of books published in the United States each year that were fiction from the Statistical Abstract of the United States (U.S. Census, 2004); statistics were available only for 1960-2002. We will use these statistics as controls in the analyses to rule out the possibility that any changes over time are caused by shifts in types of books. However, we have no way of knowing if these percentages are the same as those in the database. In addition, the 1982 edition of the Abstract notes that an increase in the number of books between 1980 and 1981 was "due in part to a major improvement in the recording of paperbound books," and more of these paperback books are likely to be fiction. Thus, the measurement differed with time, so these analyses should be interpreted with caution. Fortunately, the percentage of fiction books did not vary much by year, ranging from a low of 7% to a high of 15%. As an alternative, we considered analyzing the English Fiction corpus of Google Books; however, this corpus includes all books in English, creating the possibility of confounding year with country of origin (if, for example, the corpus included a higher percentage of American books in later years; see Note 2). In addition, our interest was not specific to either nonfiction or fiction books.
We also examined whether the frequency of swearing in books covaried with the violent crime rate, obtained from the FBI Uniform Crime Reports, 1960-2008.

Results
American books became significantly more likely to use each of the seven swear words over the years since 1950, with a linear change evident in most (see Table 1 and Figures 1 and 2). Motherfucker was used 678 times more often in the mid-2000s compared with the early 1950s, shit 69 times more often, and fuck 168 times more often. In total, American books used the seven taboo words 28 times more often in the mid-2000s than in the early 1950s. Effect sizes (ds) were also very large, ranging from 6.58 to 45.83. The words vary widely in frequency of use (see Figure 3), but all increase over time (see Figure 1).
The results were unchanged when controlled for the percentage of books that were fiction, r for mean composite with year = .98, p < .001; r for Z-score composite with year = .97, p < .001.
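The text does not state how the control was applied. One standard option, assumed here purely for illustration, is a first-order partial correlation between year and frequency with the percentage of fiction held constant. All values below are hypothetical.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def partial_r(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5

# Hypothetical values, for illustration only. Frequency is constructed as a
# year trend plus half the fiction deviation, so part of its variance is
# attributable to the covariate.
year = [2000, 2001, 2002, 2003, 2004]
pct_fiction = [12.0, 9.0, 8.0, 9.0, 12.0]
frequency = [9.0, 8.5, 9.0, 10.5, 13.0]

r_zero_order = pearson_r(year, frequency)
r_partial = partial_r(year, frequency, pct_fiction)
print(round(r_zero_order, 3), round(r_partial, 3))
```

In this constructed example the covariate adds variance to frequency that is unrelated to year, so removing it raises the year correlation; with real data the partial correlation can move in either direction.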
Total swearing in books was positively correlated with the violent crime rate, r for mean composite = .59, p < .001; r for Z-score composite = .67, p < .001.

Discussion
American books contained dramatically more swear words in the late 2000s than they did in the early 1950s. Readers of books in the late 2000s were 28 times more likely than those in the early 1950s to come across one of the "seven words you can never say on television." These findings suggest a notable decline in social taboos against swear words consistent with previous research finding evidence for increasing individualism (e.g., Greenfield, 2013). American culture increasingly values individual self-expression and weaker social taboos, and these trends are manifested in the increasing use of swear words. If books reflect broader cultural trends, it suggests that other cultural products such as movies and TV shows may also demonstrate increases in the use of swear words (a potential future topic for research; that said, any increases in swear words in broadcast media may be confounded with the introduction of media not regulated by the Federal Communications Commission, such as premium cable and streaming video). Overall, these findings are consistent with the observation that American culture has become more accepting of crude and coarse language.
Several studies have found that swear words are more emotional and distracting than nonswear words (Bertels, Kolinsky, Bernaerts, & Morais, 2011; Colbeck & Bowers, 2012). This suggests that swear words are powerful ways of attracting attention. However, as they become more common, they may lose their power. This prediction that the attentional power or "shock value" of swear words has declined could not be tested in these data but is an interesting question for future research. These data also suggest two more historical points. First, the trend toward the use of swear words in books began before George Carlin's 1972 comedy routine. His work therefore captured and possibly amplified an emerging cultural trend. Second, this massive shift in the use of swear words in books occurred despite the U.S. federal government's efforts to reduce profanity on television. In this case, it seems the government was unable to constrain free expression broadly, such as in books (although it almost certainly has been able to do so on network TV). We have not examined other media in this research, but it is plausible that the cultural push for individual expression has in part resulted in the boom of less regulated media such as satellite radio and cable television.
In summary, the use of swear words was significantly more common in American books in the late 2000s compared with the early 1950s, increasing in a primarily linear fashion over this time period. The size of this effect, expressed as a d, is massive. This change is consistent with a cultural shift from more collective or communal values to more individualistic, self-expressive values.

Note. The y axis reflects the actual frequency of the words as a percentage of words in books in that year.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes
1. ... Greenfield, 2013; Kesebir & Kesebir, 2012; Twenge et al., 2012b, 2013. In addition, the 2012 "American English" database contains a sharp, anomalous downturn in several swear words around 2005 inconsistent with the general trend. We could not determine whether this downturn was due to an error in this database or to some other cause, so we continued to rely on the 2009 database. Nevertheless, both databases showed dramatic increases in the use of swear words. For example, the use of the word "shit" increased 57 times between 1950-1953 and 2005-2008 in the "American English" database and 69 times in the "American English (2009)" database we used in the primary analyses.
2. Nevertheless, results were similar, though somewhat smaller in magnitude, in the "English Fiction (2009)" and "English Fiction" databases, which include only fiction books. For example, the use of the word "shit" increased 52 times between 1950-1953 and 2005-2008 in the "English Fiction (2009)" database and 38 times in the "English Fiction" database, compared with 69 times in the "American English (2009)" database we used in the primary analyses. These two databases also show the downturn after 2005 observed in the "American English" database. Thus, all four databases show a dramatic increase in the use of swear words, but the increase may have been attenuated during the mid-2000s.