Affective priming with musical chords is influenced by pitch numerosity

Previous studies using an affective priming paradigm have shown that valenced chords (e.g., consonant–positive; dissonant–negative) facilitate the evaluation of similarly valenced target words. The role of numerosity (the total number of pitches in a chord) and timbre has not yet been systematically investigated in previous priming studies using consonant/dissonant chords. An experiment was conducted in which 40 participants evaluated positive and negative target words with consonant/dissonant chords used as affective primes. Eight distinct chords (four consonant and four dissonant) were used as primes; the consonant and dissonant chords were equally divided to comprise either two (i.e., interval) or four (i.e., tetrad) pitches. Each chord was played with two distinct timbres (piano and harmonium), resulting in a total of 16 chords. Results showed that congruent chord–word pairings resulted in faster reaction times, and this finding was in line with previous research using consonant/dissonant chords as primes. However, this effect was present only with tetrad chords, suggesting that numerosity influences affective priming done with chords. There were no significant effects of timbre or the musical sophistication of the participants. Arguments are made as to why higher pitch numerosity in chords (resulting in acoustic complexity) might influence the evaluation of valenced target words.


Introduction
The contrast between consonance and dissonance is a crucial feature of Western music. Although the exact acoustic and cultural components of consonance and dissonance and their order of importance are notoriously contentious (see e.g., Harrison & Pearce, 2019), there is agreement over the notion that 'consonant' typically means harmonious, agreeable and stable, whereas 'dissonant', by contrast, means disagreeable, unpleasant and in need of resolution (see Tramo et al., 2001). The current study is concerned with the perception of isolated vertical intervals (two concurrent pitches) and chords (three or more concurrent pitches). For ease of communication, the term 'chord' will be used as a broad category comprising both intervals and actual chords, as this is a common procedure in studies dealing with the perception of simultaneous pitch combinations (see e.g., Bowling et al., 2018). On an affective level, many self-report studies have demonstrated that dissonant chords are perceived as more negative and unpleasant than their consonant counterparts (see e.g., Arthurs et al., 2018; Costa et al., 2000; Maher, 1980). Zentner and Kagan (1996) suggest that the preference for consonance over dissonance is already present in infants, implying that humans might at least to some extent be biologically prepared to prefer consonance over harsh dissonance.
As a research method, the affective priming paradigm consists of two stimuli (the prime and the target) presented in succession. Participants then evaluate the target stimulus according to a binary division such as 'positive' and 'negative'. The experimental manipulation is whether or not the prime and target share an affective property such as valence. Although scholars disagree on the exact mechanism responsible (for a review, see e.g., Herring et al., 2013), target stimuli are typically evaluated more quickly when preceded by a prime of the same affective category than when preceded by one of the opposite category. Affective priming paradigms have consistently been found to be a robust measure of attitudes to prime stimuli (see Fazio, 2001) and avoid many of the limitations of self-report measures as outlined by, for example, Zentner and Eerola (2010).
Only a handful of affective priming studies have investigated the effect of consonance/dissonance on the evaluation of valenced target words. Sollberger et al. (2003) and Steinbeis and Koelsch (2011) found that participants evaluated target words more quickly if the words were preceded by a similarly valenced chord (e.g., consonant-positive) compared with affectively incongruent chord-word pairs (e.g., dissonant-positive), and this finding was independent of musical training. Costa (2013) later tested the chord stimuli of Sollberger et al. (2003) using emotional pictures as the targets instead of words and, curiously, in this setting the priming with consonant/dissonant chords was not effective.
These previous studies using consonant/dissonant chords as primes have not yet been systematic in identifying what aspects of the chords lead to priming effects. Not one of them has directly explored the role of numerosity (the total number of pitches in a chord) and timbre in consonant/dissonant chords used as affective primes. It has been theorised that numerosity affects the perceived consonance of chords; the addition of pitches to a chord typically increases its acoustic roughness (Bowling & Purves, 2015), an acoustic component which is seen as prevalent in dissonant, but not in consonant, musical chords (see e.g., Hutchinson & Knopoff, 1979; Plomp & Levelt, 1965). This would imply that dissonant chords with higher numerosity and hence higher roughness could be more congruent with negative target words compared with dissonant but less rough intervals. Also, on a cognitive level, it has been suggested that stimuli that are easier to process result in more positive affective reactions (e.g., Winkielman & Cacioppo, 2001); according to this logic, a consonant interval consisting of only two pitches might be easier to process than a consonant tetrad consisting of four pitches, resulting in higher congruency with positive target words. In the previous affective priming studies using consonant/dissonant chords, Sollberger et al. (2003) tested three- and four-pitch chords for dissonance, but only three-pitch chords for consonance. The results showed no difference in congruency effects according to pitch numerosity. Steinbeis and Koelsch (2011), in turn, used exclusively four-pitch chords for both consonance and dissonance. Costa (2013) later used the same three-pitch chords as Sollberger et al. (2003). Consequently, two-pitch chords (intervals) have never been tested in an affective priming setting to investigate whether chord numerosity plays a role in the evaluation of valenced target words.
Also, previous affective priming studies with consonant/dissonant chords have used piano timbre exclusively, and this can be seen as a possible limitation. Timbre has been shown to strongly affect the perception of single chords in self-report studies (see Arthurs et al., 2018; Lahdelma & Eerola, 2016b) in which a piano timbre was rated more positively than, for example, that of strings or organ. Moreover, the timbre of isolated musical instrument sounds has been demonstrated to convey distinct emotions in an affective priming setting (Liu et al., 2018).

Participants
The participants for the experiment were recruited through Prolific Academic (https://prolific.ac), an online crowdsourcing platform targeted especially at research purposes. Previous research suggests that Prolific participants consistently complete questionnaires carefully and that the platform has high reliability (see Palan & Schitter, 2018; Peer et al., 2017). Informed consent was obtained from all participants before the start of the experiment via an online checkbox, and the study was approved by the institutional ethics committee.
Forty participants (24 female; mean age = 32.2 years, SD = 9.9 years) completed the priming task. Participants were pre-filtered via Prolific as right-handed (see Hardie & Wright, 2014; Kalyanshetti & Vastrad, 2013) native English speakers to avoid any confounds. The participants' musical sophistication was measured using five self-report rank items (Which title best describes you?) taken from the Ollen Musical Sophistication Index (Ollen, 2006). The five items were (a) Non-musician (7 participants); (b) Music-loving non-musician (23 participants); (c) Amateur musician (9 participants); (d) Semi-professional musician (1 participant); and (e) Professional musician (0 participants). For the benefits of using this strategy to assess musical sophistication, see Zhang and Schubert (2019).

Materials
The stimuli were chosen on the basis of an empirical rank ordering of consonance/dissonance ratings of intervals and tetrads conducted by Bowling et al. (2018); the two most consonant/dissonant intervals and tetrads were chosen to represent each category (e.g., consonant interval), resulting in a total of eight distinct chords (Table 1). As per the procedure used by Bowling et al. (2018), the fundamental frequencies (f0) of the pitches in each chord were adjusted so that the mean f0 of all pitches was 263 Hz (middle C). Each of the eight chords was played with both the piano and the harmonium timbre, resulting in 16 chords altogether. All chords were played in equal temperament. The piano timbre was chosen because the piano is a common and highly familiar instrument; also, it has been used in the three previous affective priming studies applying consonant/dissonant chords. The harmonium timbre was chosen to represent a categorically different and less familiar timbre compared with piano, as it has a slow attack and a sustained sound as opposed to the piano's sharp attack and fast decay. No a priori hypothesis was put forward with regard to potential differences in valence ratings. The chords were exactly 800 ms in length, including a 75 ms fadeout. The chords were generated using Ableton Live 9 (a music sequencer software; Ableton, Berlin). The piano chords were created using the Synthogy Ivory Grand Pianos II plug-in (Synthogy.com) with Steinway D Concert Grand as the applied sound font. The harmonium chords were created with Noiiz Player (a sampler instrument plug-in, noiiz.com) using the Harmonium sound library with Harmonium II as the applied sound font. No reverb was used, and a fixed velocity was applied. All stimuli were normalised to -3 dB with Adobe Audition CC 2019 (a digital audio workstation) in order to control for any amplitude differences due to numerosity and timbre dissimilarities.
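The f0 adjustment described above can be illustrated with a minimal sketch. It assumes that "mean f0" refers to the arithmetic mean of the fundamentals and that each chord is specified as semitone offsets from its lowest pitch; the function name and the example offsets are hypothetical and are not taken from Table 1.

```python
def chord_frequencies(semitone_offsets, mean_f0=263.0):
    """Return equal-tempered fundamentals (Hz) whose arithmetic mean is mean_f0.

    semitone_offsets: offsets from the lowest pitch, e.g. [0, 7] for a
    perfect fifth (illustrative input, not the study's stimulus list).
    """
    # Frequency ratio of each pitch relative to the lowest pitch.
    ratios = [2 ** (s / 12) for s in semitone_offsets]
    # Choose the lowest frequency so that the mean of all pitches equals mean_f0.
    lowest = mean_f0 * len(ratios) / sum(ratios)
    return [lowest * r for r in ratios]


# Example: a perfect fifth (two pitches) and a four-pitch chord, both centred on 263 Hz.
fifth = chord_frequencies([0, 7])
tetrad = chord_frequencies([0, 4, 7, 12])
```

Centring every chord on the same mean f0 keeps the overall pitch height comparable across intervals and tetrads, so that differences between primes are not simply differences in register.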
The chords were rendered as 44.1 kHz, 32 bits per sample waveform audio files and were then converted to constant bit rate 320 kbps high-quality stereo mp3 files for compatibility with the survey design used in the experiment (see Procedure). The stimuli can be found online at https://osf.io/ghve9/?view_only=248966eb3fa9418cb246381de3bc5460. The positive/negative target words were taken from Warriner et al. (2013). Altogether, 16 words were chosen according to valence (positive versus negative). The words were all similar in length, consisting of two syllables, with the exception of one word ('rest') consisting of one syllable. The positive words were climax, lively, gentle, rest, excite, payday, comfy and relax; the negative words were rabid, hijack, coma, saggy, arrest, fatal, dismal and morgue.
Numerosity and roughness. As pitch numerosity is closely linked to the question of acoustic roughness (see Bowling & Purves, 2015), roughness calculations were conducted on the chord stimuli. Figure 1 shows the theoretical roughness values of the eight different chords (collapsed across timbre) assessed with the model of Hutchinson and Knopoff (1978); this model has recently been demonstrated to be the most reliable predictor of consonance/dissonance ratings among several roughness models (see Harrison & Pearce, 2019). As can be seen from Figure 1, both the consonant and dissonant tetrads are rougher than their interval counterparts, and this difference is especially prominent with the dissonant chords (see Table 1 for full details).
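The kind of calculation behind such roughness values can be illustrated with a simplified sketch in the spirit of the Hutchinson and Knopoff model. Everything numerical here is an assumption rather than something reported in the text: the critical-bandwidth approximation, the weighting curve and its cut-off follow a parametrisation often used in later implementations of the model, and the synthetic spectra (11 harmonics with 1/n amplitude roll-off) stand in for the real piano and harmonium spectra. Coincident partials are not merged, so the values will not match Figure 1.

```python
import math


def harmonic_tone(f0, n_harmonics=11):
    """(frequency, amplitude) pairs for one tone; 1/n roll-off is an assumption."""
    return [(f0 * n, 1.0 / n) for n in range(1, n_harmonics + 1)]


def critical_bandwidth(f1, f2):
    """Approximate critical bandwidth (Hz) at the mean frequency of two partials."""
    return 1.72 * ((f1 + f2) / 2.0) ** 0.65


def dissonance_weight(f1, f2, cutoff=1.2):
    """Weight in [0, 1] that peaks when the partials are 0.25 critical bands apart."""
    y = abs(f2 - f1) / critical_bandwidth(f1, f2)
    if y >= cutoff:
        return 0.0
    return ((y / 0.25) * math.exp(1.0 - y / 0.25)) ** 2


def roughness(f0s, n_harmonics=11):
    """Amplitude-weighted sum of pairwise weights, normalised by spectral energy."""
    partials = [p for f0 in f0s for p in harmonic_tone(f0, n_harmonics)]
    numerator = sum(
        a1 * a2 * dissonance_weight(f1, f2)
        for i, (f1, a1) in enumerate(partials)
        for f2, a2 in partials[i + 1:]  # each pair of partials counted once
    )
    denominator = sum(a * a for _, a in partials)
    return numerator / denominator


# Two equal-tempered intervals centred on a mean f0 of 263 Hz.
low_m2 = 2 * 263.0 / (1 + 2 ** (1 / 12))
low_p5 = 2 * 263.0 / (1 + 2 ** (7 / 12))
minor_second = [low_m2, low_m2 * 2 ** (1 / 12)]
perfect_fifth = [low_p5, low_p5 * 2 ** (7 / 12)]
print(roughness(minor_second) > roughness(perfect_fifth))
```

Under these assumptions the minor second comes out rougher than the perfect fifth, because its fundamentals fall close to the most dissonant separation within a critical band, whereas the fundamentals of the fifth lie outside the cut-off.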

Procedure
The experiment was programmed and conducted on PsyToolkit, a web-based service designed for setting up, running and analysing online questionnaires and reaction-time experiments (see Stoet, 2017). The web-based reaction time approach has been validated to generate data that are comparable to lab-based data (Hilbig, 2016). Before the experimental task, participants were asked to provide information about their demographic background (age, gender and musical sophistication). For the priming task, participants were presented with a series of words, which had to be classified as positive (by pressing the 'm' key) or negative (by pressing the 'z' key). Each word was preceded by a chord. The task comprised a 10-item practice block followed by an experimental block of 256 items (16 primes × 16 targets) presented in a random order. Participants received feedback during the practice block to indicate whether or not their responses were correct; no feedback was provided during the experimental block. For each item, the screen initially contained a fixation cross for 500 ms; at 500 ms, the fixation cross disappeared and the auditory prime (duration 800 ms) sounded. At 200 ms after the onset of the auditory prime, the target word appeared in the centre of the screen and remained for 2000 ms, during which time participants were able to respond. Responses slower than 2000 ms were classed as 'timeouts'. Following the experiment, participants were presented with an online debriefing screen and issued with an anonymous participant code (Figure 2).
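The structure of the experimental block can be sketched as follows. This is an illustration of the 16 × 16 design and its timing constants only, not the PsyToolkit script used in the study; the prime labels (two exemplars per consonance-by-numerosity cell) and all function and field names are hypothetical.

```python
import itertools
import random

POSITIVE_WORDS = ["climax", "lively", "gentle", "rest", "excite", "payday", "comfy", "relax"]
NEGATIVE_WORDS = ["rabid", "hijack", "coma", "saggy", "arrest", "fatal", "dismal", "morgue"]

# Timing of a single trial in milliseconds, as described in the Procedure.
FIXATION_MS = 500             # fixation cross shown before the prime
PRIME_MS = 800                # duration of the chord
PRIME_TO_TARGET_SOA_MS = 200  # target word appears 200 ms after prime onset
RESPONSE_WINDOW_MS = 2000     # slower responses count as timeouts


def build_primes():
    """Return the 16 primes: 2 consonance levels x 2 numerosities x 2 exemplars x 2 timbres."""
    return [
        {"consonance": c, "numerosity": n, "exemplar": e, "timbre": t}
        for c, n, e, t in itertools.product(
            ("consonant", "dissonant"), ("interval", "tetrad"), (1, 2), ("piano", "harmonium")
        )
    ]


def build_trials(seed=None):
    """Cross every prime with every target word and shuffle the resulting 256 trials."""
    targets = [(w, "positive") for w in POSITIVE_WORDS] + [(w, "negative") for w in NEGATIVE_WORDS]
    trials = []
    for prime, (word, valence) in itertools.product(build_primes(), targets):
        # Congruent pairings are consonant-positive and dissonant-negative.
        congruent = (prime["consonance"] == "consonant") == (valence == "positive")
        trials.append({**prime, "word": word, "valence": valence, "congruent": congruent})
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials(seed=0)
```

Because the design is fully crossed, exactly half of the 256 trials are congruent, so congruency is not confounded with any particular prime or target word.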

Reaction time
Data pre-treatment. Reaction times shorter than 250 ms were deleted (Hermans et al., 2001). To calculate upper outliers, each participant's data were fitted to an exponentially modified Gaussian distribution (Ratcliff, 1993); responses slower than the 95th percentile of this distribution were deleted as if they were incorrect responses. Finally, as is common practice (Whelan, 2008), reaction times were log transformed prior to analysis.
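The pre-treatment steps can be sketched for a single participant as below. The text does not state how the exponentially modified Gaussian was fitted; this sketch swaps in a simple method-of-moments estimate (with the sample skewness clamped to keep the estimates well defined) in place of whatever procedure was actually used, and finds the 95th percentile by bisection. All function names are hypothetical.

```python
import math
import statistics


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exgauss_cdf(x, mu, sigma, tau):
    """CDF of the ex-Gaussian (normal with mean mu, SD sigma, plus exponential with mean tau)."""
    u = (x - mu) / sigma
    r = sigma / tau
    return normal_cdf(u) - math.exp(-u * r + 0.5 * r * r) * normal_cdf(u - r)


def exgauss_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau); an assumed stand-in for the fit."""
    mean = statistics.fmean(rts)
    sd = statistics.pstdev(rts)
    skew = sum((x - mean) ** 3 for x in rts) / (len(rts) * sd ** 3)
    skew = min(max(skew, 0.1), 1.99)  # clamp so that sigma and tau stay positive
    tau = sd * (skew / 2.0) ** (1.0 / 3.0)
    sigma = math.sqrt(sd * sd - tau * tau)
    return mean - tau, sigma, tau


def exgauss_quantile(q, mu, sigma, tau):
    """Invert the CDF by bisection (assumes tau is not tiny relative to sigma)."""
    lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma + 50.0 * tau
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if exgauss_cdf(mid, mu, sigma, tau) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def pretreat(rts, lower_ms=250.0, upper_quantile=0.95):
    """Drop RTs below lower_ms and above the fitted quantile, then log-transform."""
    kept = [rt for rt in rts if rt >= lower_ms]
    mu, sigma, tau = exgauss_moments(kept)
    cutoff = exgauss_quantile(upper_quantile, mu, sigma, tau)
    return [math.log(rt) for rt in kept if rt <= cutoff], cutoff
```

Calling `pretreat` on one participant's list of reaction times in milliseconds returns the log-transformed values that survive both cut-offs, together with the participant-specific upper cut-off.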
Analysis of reaction time data. The data were subjected to a 2 (consonance-dissonance) × 2 (timbre) × 2 (numerosity) repeated-measures linear mixed effects ANOVA. Several authors suggest avoiding standardised effect sizes with linear mixed-effect models, advocating instead for simple effect sizes, which are reported here (Baguley, 2009; Rights & Sterba, 2019; Wilkinson, 1999). The analysis revealed no significant main effects of consonance-dissonance, timbre or numerosity. However, as predicted, there was a significant interaction between Consonance and Target Valence, F(1,272) = 12.97, p < .001, revealing clear congruency effects. Post hoc tests confirmed that reaction times to positive words were faster when the words were preceded by consonant chords (mean reaction time (RT) = 586 ms, SD = 117 ms) than when they were preceded by dissonant chords (mean RT = 594 ms, SD = 112 ms), t(34) = 2.28, p = .03, difference in mean RT = 8 ms; similarly, reaction times to negative words were marginally faster when the words were preceded by dissonant chords (mean RT = 588 ms, SD = 116 ms) than by consonant chords (mean RT = 595 ms, SD = 113 ms), t(34) = 1.87, p = .06, difference in mean RT = 7 ms. This interaction was further qualified by a significant three-way interaction Numerosity × Consonance × Target Valence, F(1,272) = 4.42, p = .03. Post hoc tests indicated that the interaction Consonance × Target Valence was present in the case of tetrads, F(1,68) = 11.44, p = .001, but that this interaction was not present in the case of intervals, F(1,68) = 0.88, p = .35. Considering tetrads only, positive targets were evaluated significantly more quickly following a consonant prime compared with a dissonant prime, t(136) = 2.58, p = .01, difference in means = 15 ms; the effect was marginal in the case of negative targets, where there was a non-significant trend for targets to be evaluated more quickly following a dissonant prime compared with a consonant prime, t(136) = 1.78, p = .08, difference in means = 11 ms. These differences were not present in the case of intervals, where the difference in mean reaction times was 0.4 ms for positive targets and 4 ms for negative targets. Figure 3 demonstrates graphically the presence of the effect in the case of tetrads, but not in intervals. There was no significant interaction between timbre and target valence, nor were any other two-, three- or four-way interactions significant. In order to test for group differences between musicians and non-musicians, we calculated a priming index, defined as the difference in mean response times for incongruent minus congruent conditions (see e.g., Blair et al., 2006). There was no significant difference in priming index between musicians and non-musicians, t(18.2) = 0.37, p > .05.
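The priming index can be illustrated with a short sketch. The trial representation (a reaction time paired with a congruency flag) and the example values are made up for illustration and are not data from the study.

```python
from statistics import fmean


def priming_index(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials.

    trials: iterable of (reaction_time_ms, is_congruent) pairs for one participant.
    A positive value indicates faster responses on congruent trials.
    """
    trials = list(trials)
    congruent = [rt for rt, is_congruent in trials if is_congruent]
    incongruent = [rt for rt, is_congruent in trials if not is_congruent]
    return fmean(incongruent) - fmean(congruent)


# Made-up example: two congruent and two incongruent trials.
example = [(580.0, True), (590.0, True), (600.0, False), (596.0, False)]
index = priming_index(example)  # 598.0 - 585.0 = 13.0 ms
```

Computing one index per participant reduces the congruency effect to a single number, which is what allows the musician and non-musician groups to be compared with a simple two-sample test.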

Accuracy rates
Results for accuracy rates (AR) broadly mirrored those for RT. There was a significant main effect of target valence, F(1,272) = 8.97, p = .003. Post hoc tests suggested that negative targets were evaluated more accurately (mean AR = 95.3%) than positive targets (mean AR = 93.4%), t(34) = 3.53, p = .001. All other main effects were non-significant. The two-way interaction Consonance × Target Valence proved marginally significant, F(1,272) = 3.26, p = .07. Post hoc testing indicated that positive targets were evaluated marginally more accurately following a consonant prime (90.3%) compared with when following a dissonant prime (88.9%), t(34) = 1.7, p = .09; there was no significant difference in AR to negative targets when following consonant (80.7%) versus dissonant primes (81.5%). Furthermore, the three-way interaction Numerosity × Consonance × Target Valence was highly significant, F(1,272) = 9.99, p = .002. As with RT, the interaction Consonance × Target Valence was present in the case of tetrads, F(1,68) = 8.71, p = .004, but not in the case of intervals, F(1,68) = 0.38, p = .58. Considering tetrads alone, ARs for positive words were significantly higher when preceded by consonant tetrads (mean AR = 90.5%) compared with when preceded by dissonant tetrads (mean AR = 79.7%), t(34) = 2.12, p = .04; however, negative words were classified more accurately when preceded by consonant tetrads (mean AR = 87.7%) than dissonant tetrads (mean AR = 82.5%), t(34) = 2.77, p = .048. No such interaction was present in the case of intervals. Rather than presenting congruency effects as is the case with RT, the effect in AR seems to be to reduce the influence of an over-riding main effect of Target Valence, F(1,272) = 195, p < .001: post hoc testing suggested that positive words (AR = 89.6%) were classified more accurately than negative words (AR = 81.1%), t(34) = 7.5, p < .001.

Discussion
The current study has demonstrated for the first time that pitch numerosity plays a role in consonant/dissonant chords when they are used as primes for valenced target words. This finding has a plausible explanation in the case of dissonant chords: it has been suggested that the addition of pitches to a chord typically increases its acoustic roughness (Bowling & Purves, 2015), which results in more perceived dissonance. It is likely that this high amount of roughness makes the dissonant tetrads effective primes for valenced target words. Interestingly, numerosity plays an equally important role in the case of consonant chords, where again only the tetrads were effective primes: consonant chords with a smaller number of pitches (intervals) did not create more congruency with positive words (cf. Winkielman & Cacioppo, 2001) even though they contain less roughness than the consonant tetrads. As the consonant intervals represented pure consonance (the perfect fifth, the perfect octave), they were possibly less loaded in terms of positive valence than the major chord tetrads: major chords have been demonstrated to contain a high amount of positive valence in self-report studies (see e.g., Lahdelma & Eerola, 2016a). However, in the previous affective priming studies with consonant/dissonant chords, Sollberger et al. (2003) used a perfect fifth + octave sonority as the consonant chord (three pitches), which was not a major chord, whereas Steinbeis and Koelsch (2011) used major chords with pitch doublings (four pitches). Both of these studies demonstrated a congruency effect for consonant chords presented with positive target words, implying that it is not necessarily the major chord that is driving this congruency effect.
The finding that intervals are not effective primes compared with tetrads is striking given that certain intervals have been demonstrated to convey distinct affective connotations to Western listeners in self-report studies (see e.g., Oelmann & Laeng, 2009; Krantz et al., 2004). The dissonant intervals used in the current experiment (the minor second and the major seventh) have consistently been demonstrated to be among the most dissonant intervals (see e.g., Hutchinson & Knopoff, 1979; Kameoka & Kuriyagawa, 1969; Malmberg, 1918). The dissonance in these intervals is mirrored in their perceived affective connotations on the basis of self-report studies: Maher (1980) found that the minor second interval was perceived as more unstable, displeasing and restless when compared with the consonant perfect fifth and the perfect octave. Interestingly, in Maher's study the participants failed to discriminate any of the 14 presented intervals (all the intervals ranging from the unison to the octave + the minor ninth) on the How-do-you-feel scale. Maher (1980) concludes that the various intervals do not give rise to differential degrees of felt activation and liveliness, a notion that is in line with the current experiment's finding that intervals are not as effective primes as tetrads. On the basis of the current findings, intervals are possibly not acoustically complex enough to elicit congruency effects in a reaction time setting when used as primes, at least when presented alongside more complex chords (in this case, tetrads).
Based on these results, further research is suggested specifically on the effect of intervals that are culturally loaded in terms of valence (e.g., the positive major third, and the negative minor third and tritone) to see if intervals by themselves can act as affective primes due to their conventional affective connotations, despite their lesser acoustic complexity. Although the role of timbre was shown to be non-significant in the current study, timbre offers multiple dimensions (e.g., harmonics, envelope, spectral shape) that might contribute to affective priming done with chords. Hence, more diverse timbres should be explored to see if they interact with consonant/dissonant chords when used as primes in an affective priming paradigm.

Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was carried out with funding from the Osk Huttunen Foundation awarded to the first author, and with additional funds from Durham University's PVC for Arts and Humanities Research Award.