Linking musical metaphors and emotions evoked by the sound of classical music

Musical meaning is often described in terms of emotions and metaphors. While many theories address one or the other, very little empirical data is available to test a possible link between the two. In this article, we examined the metaphorical and emotional contents of Western classical music using the answers of 162 participants. We used generalized linear mixed-effects models, correlations, and multidimensional scaling to connect emotions and metaphors. Each metaphor was associated with specific emotions, subjective levels of entrainment, and acoustic and perceptual characteristics. How these constructs relate to one another could be based on embodied knowledge and the perception of movement in space. For instance, metaphors that rely on movement are related to emotions associated with movement. In addition, the measures in this study could also be represented by underlying dimensions such as valence and arousal. Music writing and music education could benefit greatly from these results. Finally, we suggest that music researchers consider musical metaphors in their work, as we provide an empirical method for doing so.

experiences. Metaphors based on the grounded image schemata appear at level two, while level three ascribes emotional qualities to the music. In addition, emotional connotation can be grounded in the image-schematic level two when, for instance, an ascending sequence of notes, tapping into the VERTICALITY schema, is interpreted as "majestic" (Cooke, 1959). At the fifth level, culture plays an important role in shaping musical meaning. Western European music, for example, repeatedly features themes such as the hunt, the military, and the pastoral, which can be related to literature, social history, and the fine arts (Monelle, 2006). Finally, the last level relates to individual differences and makes musical meaning such a personal experience.
The use of embodiment as a main framework to explain the link between emotions and metaphors in the context of musical meaning has already been explored in the two main aforementioned theories: conceptual blending (Fauconnier & Turner, 2008) and multilevel grounding in musical semantics (Antovic, 2018). Conceptual blending is the mapping of two input spaces, based on a generic space, that creates a blend from which additional meaning emerges. In the case of music, these input spaces have been associated on the one hand with musical features (e.g., notes, performance gestures, musical structure) and on the other hand with different aspects of either an affective event (Spitzer, 2018) or an object or concept (e.g., a path) (Antovic, 2015). The act of explaining one domain (or input space) in terms of another is the foundation of Conceptual Metaphor Theory (CMT; Lakoff & Johnson, 1980). In this theory, a metaphor is defined as a cross-domain mapping in the conceptual system (Lakoff, 1993). These mappings are based on dynamic cognitive constructs resulting from the embodiment of physical experiences, called image schemata. While not all conceptual metaphors are necessarily grounded in bodily experiences, most might be (Goschler, 2005). Similarly, these embodied experiences could support cross-domain mappings with affective domains. Juslin, for the BRECVEMA system, defined emotions themselves as "embodied phenomena that serve to guide action" (Juslin, 2013). Consequently, metaphors and emotions could share similar embodied cognitive structures that link both of them to the music in a similar fashion. In the second theory, multilevel grounding in musical semantics, the elaboration of musical meaning is based on a hierarchical process covering successively physiological, image-schematic, connotational, conceptual, elaborated cultural, and individual aspects (Antovic, 2018). The third level, the connotation level, refers to emotions associated with image schemata. 
For example, an ascending musical sequence (VERTICALITY) could be perceived as "majestic" (Cooke, 1959). The fourth level, the conceptual level, is entirely referential or extramusical, connecting music to the experience of the world and creating conceptual metaphors. Both levels are based upon the image schematic level, referring to the embodied experiences. Consequently, looking at both theories, we suggest that embodied cognition might be at the basis of the link between musical metaphors and emotions in the context of musical meaning.
In this work, we have demonstrated that musical meaning in the form of metaphors and emotions can be extracted from Western classical excerpts. Furthermore, we highlighted that musical metaphors and emotions seem to be connected in certain ways. Going back to the different theories introduced in this Supplementary Material, we want to link the results presented here to what other researchers have theorized. First, concerning BRECVEMA, visual imagery, one of the eight mechanisms eliciting emotions when listening to music, seems to be a subset of all the possible cross-domain mappings (Juslin, 2013). Indeed, its visual nature somewhat limits the diversity of possible metaphors elicited by music. Moreover, the mechanism operates during music listening, while the metaphorical meaning of a piece can also be explored after listening. Despite these limitations, we have shown that metaphors are associated with different emotions, as assumed by the BRECVEMA model. The directionality and timing of such interaction could unfortunately not be observed in this study and remain an important point to clarify in future research. Second, as hinted by Koelsch, listeners were also able to decode the extra-musical meaning associated with our excerpts, both the iconic and the indexical signs (Koelsch, 2011). The coherence in their answers points to the ability of our participants to pick up on the same characteristics in music, such as acoustic and perceptual cues. Lastly, the last two theories introduced in this paper, conceptual blending and the hierarchical system of six contextual constraints, both posit metaphors as a primary building block to explain meaning and musical emotions. Once again, the order of the mechanisms by which the creation of meaning occurs could not be studied in this assessment. 
Future studies could focus on the temporality of musical emotions and metaphors, as they most likely evolve over time, similar to emotional prosody (Pell & Kotz, 2011; Schaerlaeken et al., 2018).

Online Supplementary Material B
Music excerpts were selected based on a pilot study with 20 participants (9 females, age M = 26.2, SD = 6.8) and with the help of a music expert. The list of excerpts presented in the pilot study was drawn from our previous experiment (Schaerlaeken et al., 2019). All excerpts were pieces of Western classical music from different periods and different genres. We selected two excerpts per emotion described by the Geneva Emotional Music Scale (GEMS; Zentner et al., 2008). For each of the 9 emotions, we selected the final excerpts following two criteria: 1) high ratings in the pilot study for a specific emotion, preferably with high discriminability when possible, and 2) consensus with the description of the excerpt as assessed by a group of musicians including a professor from the Haute Ecole de Musique of Geneva. Since the "Wonder" and "Transcendence" scales were not well recognized in the pilot study, we completed our list with four excerpts from another study, chosen based on the results obtained there for these emotions (Eliard et al., 2017). The final list contained 18 excerpts (Table A). During the study, participants also rated how well they knew each excerpt. Low familiarity would indicate that the metaphors and emotions collected for these excerpts are not the result of episodic memories irrelevant to the music itself.
Table A. Western classical excerpts presented in the experiment, with the associated style and year of composition. Excerpts are sorted by how well they are known by the participants (0 = not at all, 4 = neither known nor unknown, 8 = fully known).

Online Supplementary Material C
The descriptors linked to acoustical and perceptual features were principal components (PCs) from three principal component analyses (PCAs) computed on the acoustic features, the perceptual features, and the entrainment questionnaire, respectively. The first PCA resulted in two components for the acoustical features that cumulatively explained 48.3% of the variance (27.05% and 21.27%, respectively). The first PC was positively associated with the spectral centroid and brightness. The second PC was negatively associated with the intensity of the signal (RMS) and the roughness. The second PCA, based on the perceptual features, also resulted in two components, which captured 65.3% of the variance (39.7% and 25.6%, respectively). The first PC was positively associated with melody and negatively with dissonance. The second PC was positively associated with rhythm and articulation. Finally, the last PCA integrated the entrainment questionnaire into a single component that explained 85.5% of the variance and was positively associated with feeling animated, wanting to move, and feeling the beat.
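As a minimal sketch of how these variance figures fit together, the Python snippet below converts eigenvalues into percentages of explained variance and checks that the per-component percentages reported above sum to the reported cumulative values. The function names and the toy eigenvalues are hypothetical and are not taken from the original analysis; only the percentages come from the text.

```python
def explained_variance_percent(eigenvalues):
    """Return the percentage of total variance carried by each component."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


def cumulative(percentages):
    """Running total of per-component percentages, rounded to 2 decimals."""
    running, out = 0.0, []
    for p in percentages:
        running += p
        out.append(round(running, 2))
    return out


# Hypothetical eigenvalues, used only to illustrate the conversion.
toy_percent = explained_variance_percent([2.0, 1.0, 1.0])

# Percentages reported in the text for the retained components of each PCA.
reported = {
    "acoustic": [27.05, 21.27],
    "perceptual": [39.7, 25.6],
    "entrainment": [85.5],
}
cumulative_reported = {name: cumulative(p) for name, p in reported.items()}

for name, values in cumulative_reported.items():
    print(f"{name}: cumulative variance explained = {values[-1]}%")
```

The acoustic components sum to 48.32%, which matches the reported 48.3% after rounding to one decimal.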

Online Supplementary Material D
The ratings for both the GEMMES and GEMS scales did not follow a normal distribution. Both were zero-inflated because our participants tended to associate each excerpt with only a subset of the proposed items, leaving the rest at the minimum value. This had two consequences. First, statistical comparison of the raw ratings required non-parametric testing (permutation testing and Spearman correlation). Second, we artificially created a binomial distribution by coding each scale of each trial as "1" or "0". A scale was set to "1" if the corresponding raw rating was greater than or equal to the middle of the rating scale (4, labelled for the participants as "neither relevant nor irrelevant"), and set to "0" otherwise. It is therefore important to note that, for the models computed, an estimated value of 0.5 corresponds to the chance level. Significant effects therefore deviated from this value either positively or negatively.
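The binarization rule described above can be sketched in a few lines of Python. This is an illustration only: the function names and the toy ratings are hypothetical, and the 0 to 8 rating range is an assumption inferred from the stated midpoint of 4.

```python
SCALE_MIDPOINT = 4  # labelled "neither relevant nor irrelevant" for participants


def binarize(rating, midpoint=SCALE_MIDPOINT):
    """Map a raw rating to 1 (at or above the midpoint) or 0 (below it)."""
    return 1 if rating >= midpoint else 0


def proportion_selected(ratings, midpoint=SCALE_MIDPOINT):
    """Share of trials in which the scale was coded as 1."""
    flags = [binarize(r, midpoint) for r in ratings]
    return sum(flags) / len(flags)


# Hypothetical raw ratings for one scale across eight trials (assumed 0-8 range).
toy_ratings = [0, 0, 0, 4, 7, 0, 2, 8]
toy_flags = [binarize(r) for r in toy_ratings]

print(toy_flags)
print(proportion_selected(toy_ratings))
```

Because a rating exactly at the midpoint is coded as 1, the rule is inclusive at 4. The resulting proportion is the quantity compared against the 0.5 chance level.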

Online Supplementary Material E
Z test for permutations between two groups within our participants pool.
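For readers unfamiliar with permutation testing, a minimal Python sketch of a two-group permutation test follows. It is a generic illustration under stated assumptions (absolute difference in group means as the test statistic, a two-sided p-value, hypothetical toy data and function name) and does not reproduce the exact statistic reported in this table.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical ratings for two groups of participants (assumed 0-8 range).
low_group = [0, 0, 1, 0, 1, 0, 0, 1]
high_group = [7, 8, 6, 7, 8, 8, 7, 6]

p_different = permutation_test(low_group, high_group)
p_identical = permutation_test(low_group, low_group)
print(p_different, p_identical)
```

The approach makes no normality assumption, which is why it suits zero-inflated ratings.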

Online Supplementary Material F
Reliability of each subscale over the excerpts presented to each participant (Cronbach's alpha)
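Cronbach's alpha is defined for k items as alpha = k / (k - 1) * (1 - sum of item variances / variance of the summed score). A minimal Python sketch of that formula is given below. The function name and toy data are hypothetical, and how rows and columns map onto participants, excerpts, and items in this study is an assumption.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one rating per item in each row."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across rows.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the summed score of each row.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three rows (e.g., respondents) rated on two items.
toy_scores = [[2, 3], [4, 4], [6, 5]]
alpha_toy = cronbach_alpha(toy_scores)

# Items that move together perfectly give the maximum value of 1.
alpha_perfect = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
print(round(alpha_toy, 3), round(alpha_perfect, 3))
```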

Online Supplementary Material G
To evaluate whether the participants accurately labelled the emotionally connoted excerpts we presented, we performed a GLMM using the binary values. A model encompassing the interaction between the labels participants gave and the labels pre-selected by the music expert, along with the associated main effects, statistically outperformed the model with only the main effects ($\chi^2(83, N = 81) = 3069.5$, $p < 0.001$, $R^2_m = 0.31$, $R^2_c = 0.44$; $\mathrm{AIC}_{\mathrm{GEMS*ExcerptGEMS}} = 13019$, $\mathrm{AIC}_{\mathrm{GEMS+ExcerptGEMS}} = 15960$; $\mathrm{BIC}_{\mathrm{GEMS*ExcerptGEMS}} = 13640$, $\mathrm{BIC}_{\mathrm{GEMS+ExcerptGEMS}} = 16102$). Participants gave significantly higher ratings for the specific emotion associated with the excerpt presented (see Figure).
Table A. Contrast between each emotional sub-scale and the chance level (0.5) on each emotion represented by the selected excerpt. A significant difference represents a value above or below 0.5, the chance level in a binomial distribution. All contrasts are FDR-corrected [ns: non-significant].
Table B. Contrasts between the emotional sub-scale corresponding to the emotion portrayed in the excerpts and all the other subscales for the same excerpts taken together. All contrasts are FDR-corrected.
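The information-criterion comparison between the two GLMMs in this section can be restated with a short Python sketch. Only the AIC and BIC values are taken from the text; the dictionary layout, model labels, and function names are hypothetical conveniences, and the snippet does not refit any model.

```python
# AIC and BIC values as reported in the text for the two candidate models.
reported_fit = {
    "interaction": {"aic": 13019, "bic": 13640},   # GEMS*ExcerptGEMS
    "main_effects": {"aic": 15960, "bic": 16102},  # GEMS+ExcerptGEMS
}


def preferred_model(models, criterion):
    """Return the model with the lowest value of the given criterion."""
    return min(models, key=lambda name: models[name][criterion])


def criterion_gain(models, criterion, baseline, candidate):
    """How much lower the candidate's criterion is than the baseline's."""
    return models[baseline][criterion] - models[candidate][criterion]


delta_aic = criterion_gain(reported_fit, "aic", "main_effects", "interaction")
delta_bic = criterion_gain(reported_fit, "bic", "main_effects", "interaction")

print(preferred_model(reported_fit, "aic"), delta_aic)
print(preferred_model(reported_fit, "bic"), delta_bic)
```

Lower values indicate a better trade-off between fit and complexity, so both criteria favour the interaction model. BIC penalizes the additional parameters more heavily than AIC, which is consistent with its smaller margin.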