Music-Colour Synaesthesia: A Sensorimotor Account

This article presents a sensorimotor account of music-colour synaesthesia, proposing a radically different perspective than is commonly provided. Recent empirical and theoretical work in music cognition moves away from cognitivist accounts, rejects representationalism and embraces an embodied standpoint. It has been shown that some forms of synaesthesia may be elicited from a concept alone and are often accompanied by shapes and textures. It is from this perspective that a skilful engagement with the environment and relevant sensorimotor contingencies may be identified. Here the role of embodied and enactive perception in general music cognition is extended to music-colour synaesthesia, and an argument is made for how the attributes of bodiliness and grabbiness might be found in a sonic environment, and how music listening might be perceived as an act of doing.

focusing on music-colour synaesthesia. Next, developments of research in general music cognition are outlined that describe engagement with music, including music listening, as an act of doing. The third section discusses how music-colour synaesthesia may indeed be explained from a sensorimotor perspective. The final section discusses challenges presented by synaesthesia to sensorimotor theory, and how these may be resolved, arguing for a shift in perspective of music-colour synaesthesia from being regarded as special to being illustrative of how typical, but individualised, music cognition may develop.

Synaesthesia
Researchers are beginning to challenge the assumption that a single mechanism underlies all forms of synaesthesia (Auvray & Deroy, 2015), and Simner (2012), for example, rejects a one-for-all explanation. Although synaesthesia has been described as a merging of the senses, not all types are cross-sensory. Evidence shows that it can be activated by a concept alone (van Leeuwen et al., 2015); the inducer does not have to be physically present (Meier, 2014). For example, Dixon and colleagues (2000) found that synaesthetes responded to the concept of 7 triggered by the presentation of 5+2 in the absence of the actual number, and Ward, Tsakanikos, and Bray (2006) found that the concept of musical notation was enough to elicit a synaesthetic response. These findings have led to the development of alternative theories such as ideaesthesia (Jürgens & Nikolić, 2012; Nikolić, 2009), meaning "sensing concepts" (Mroczko-Wąsowicz & Nikolić, 2014, p. 4), connecting sensation and phenomenal experiences, or qualia, with their conceptual triggers or semantic inducers. Qualia is the term used to describe the qualities of subjective experience associated with certain sensory stimuli (see Jackson, 1982). In music, this might include the difference in experience between a melody played on a piano and the same melody played on a French horn (see also Curwen, 2018). Synaesthetic experiences have been described by Wager (1999) as "extra qualia" (p. 264) that manifest themselves differently from synaesthete to synaesthete. Attempts to explain extra qualia associated with synaesthesia challenge philosophies of mind such as representationalism: "the view that the phenomenal character of an experience supervenes on its representational content [or] the way an experience seems to its subject is . . . the way the experience represents the world as being" (Wager, 1999, p. 263). Tye (1998) ventures that phenomenal character must be identical to its representational content.
The challenge to representationalism posed by synaesthesia is that it appears at odds with reality. Experiencing green in response to music does not mean the music is actually green, nor that the synaesthete is experiencing green-ness (Auvray & Deroy, 2015). There is much disagreement on this topic (see Curwen, 2018).
Differences between higher and lower synaesthetes, and between associators and projectors, further highlight the diversity of synaesthetic manifestations. The distinction between higher and lower synaesthetes was first proposed by Ramachandran and Hubbard (2001), based on evidence of cross-activation mechanisms operating at different times and in different locations in the brain. Lower synaesthetes were thought to process information at an early stage in the fusiform areas that manage form and colour perception, while higher synaesthetes were thought to process information at a later stage in areas that manage the conceptual aspects of colour. Associators describe their experience as being in the mind's eye (Dixon et al., 2004; Dixon & Smilek, 2005) or as knowing the colour (Ward et al., 2007), while projectors claim to see colours projected outside the body into external space (Smilek et al., 2001). It appears reasonable to consider associators as higher synaesthetes and projectors as lower synaesthetes, but this has not been substantiated by research and in some studies the two dimensions of synaesthesia are shown to be orthogonal (Ward et al., 2007). The crucial distinction between associators and projectors lies in the experience of the concurrent, while that between higher and lower synaesthetes lies in the nature of the inducer. Lower synaesthetes can perceive colours in their mind's eye, and higher synaesthetes can be projectors. These examples, among many others, illustrate that the potential scope of music-colour synaesthesia is very wide.

Coloured hearing
Tone-colour synaesthesia is the type of music-colour synaesthesia examined most often: single tones and chords, isolated from any musical context, elicit specific colours. It has been attributed to neurological mechanisms, specifically bottom-up processing (see Music-Colour Synaesthesia below). However, the synaesthetic experience of music is not just about the ability to assign a colour to individual tones or chords, nor is it just about sound (Mills et al., 2003). Timbre, tempo, and emotion mediate the experience, and some synaesthetes have to hear an entire musical piece to produce a synaesthetic response (Curwen, 2018). In this article I focus on higher, concept-driven, forms of music-colour synaesthesia.
According to Peacock (1985) there are four broad categories of musical inducer: compositional style, timbre, tonality, and pitch (tone). Concurrents are almost always experienced as colour, but can also be experienced as shapes, spatial layouts and textures (Eagleman & Goodale, 2009). Synaesthetes often experience more than one form simultaneously and in various combinations. For example, GS, the sole participant in Mills et al.'s (2003) study, experienced large blocks of darker colours when hearing heavy metal music, and unpleasant colour combinations when hearing music she disliked. The particular colours were influenced by changes in instrumentation or timbre, and higher and lower pitches were associated with lighter and darker colours, respectively. She also experienced shapes, texture, and movement creating landscapes that she referred to as maps, that moved at speeds in accordance with the tempo of the music and were often easier for GS to follow than standard musical notation.
There are some similarities between synaesthetic experiences and typical non-synaesthetic cross-modal associations, such as those between pitch height and lightness (Eitan & Timmers, 2010; Ward, Huckstep, & Tsakanikos, 2006) and pitch height, size, and brightness (von Hornbostel, 1925; Marks, 1974, 1987). Marks (1975) argues that the cross-modal mechanisms underlying synaesthesia and general cognition are so similar that the former can be seen as a kind of shorthand for the latter (p. 325). Krueger (2011, 2014) argues that music helps individuals realise emotional and social experiences in everyday life. According to the original extended mind thesis (EMT), cognitive processes are not confined to the head, nor even the body, but extend into the external world (Clark & Chalmers, 1998): a pen and paper aids calculation, a cane helps the blind navigate, and musical instruments facilitate music production (Chemero, 2018). The EMT has been expanded to propose a dynamic coupling system between subject and environment (Sutton, 2010; Kirchhoff, 2012). Colombetti and Roberts (2015) explain that a "self-stimulating, coupled relationship is instantiated" (p. 1259) between a saxophonist and their instrument demonstrating that the relationship extends beyond "skull and skin" (p. 1244) to affective states: the sound produced by the instrument affects the player's emotional experience, influencing what they play next. Reybrouck (2017) suggests that musical sense-making derives from the mediation of cognitive events by sensorimotor interactions with the physical world, which may be multimodal. The addition of a visual modality to an auditory modality may make a listener aware of previously hidden auditory information, modifying their perceptual experience. Interpreting synaesthesia in similar terms challenges the prevailing view that music-colour synaesthesia is a form of cross-sensory imagery with neurological origins.

Music-Colour synaesthesia and general cognition
Notwithstanding the rarity of synaesthetic experiences, and their characteristic automaticity and consistency, music-colour synaesthesia may not be so different from typical music cognition. Barsalou's perceptual symbol systems theory (PSS) explains how a conceptual system grounded in perception might work (1999, 2003). When people perceive objects, their basic sensorimotor experiences are captured as sensory symbols so that patterns of activation established during early sensory processing can be re-enacted subsequently. Conceptual systems derive from category knowledge, each category relating to a different component of experience (Barsalou, 2003). Individual concepts are the product of repeated re-enactments of the pattern of activation established when the object was perceived for the first time. These re-enactments are only partial, however; certain elements of the original pattern of activation evoked by the perception of the object remain, but context determines the nature of each subsequent re-enactment, adding further multimodal sensorimotor information to previous perceptions of the object (Jacobson, 2013). As Zbikowski (2010) explains: . . . a simulator for the category of conjoined musical events associated with the term "perfect authentic cadence" could be established simply through multiple encounters with exemplars of the category . . . such a simulator would not only include auditory information, but also extend to sensorimotor information about the feeling of performing these events . . . introspective states associated with such cadences, and physical responses to hearing them (Zbikowski, 2010, pp. 35-36).
In music-colour synaesthesia, the concurrent might be mediated by new information that influences the conceptual content of the stimulus. This information might consist of timbre, emotion, tonality or musical style (Mroczko-Wąsowicz & Danko, 2014). There are many different types of synaesthesia, however, and in accordance with Simner (2012) who argues against a one-for-all explanation, and Auvray and Deroy (2015) who suggest that each type be considered separately, my aim here is not to propose a single underlying mechanism. Some synaesthesias may be of neurological origin, others genetic. Nevertheless, it is likely that higher, concept-driven forms of music-colour synaesthesia are based on experience, albeit to varying extents.

Cognitivist approaches
Cognitivist accounts of musical experience propose representations and computations based on an input-output model (Lerdahl & Jackendoff, 1983; Nussbaum, 2007): the sensory system receives a stream of external information from which internal representations of the real world are created (i.e., music processing). They do not consider the role of the body; Fodor (1983) argued for the modularity of mind, while more recent neurological explanations focus on the role of areas of the brain thought to be specialised for music (e.g., Peretz & Coltheart, 2003). These explanations are criticised (e.g., Di Paolo et al., 2017; Fuchs, 2017; Gallagher, 2017; Varela et al., 1991) yet their core tenets are still endorsed. For example, Huron (2006) explains musical experience in terms of an internal appraisal mechanism based on "some sort of weighted sum" (p. 110) of different representations of listeners' expectations, while Sloboda and Juslin (2010) explored the relationship between real emotions and their representation in the form of emotional responses elicited from music. Yet musical experience is intersubjective and influenced by what individual listeners feel and what they do (Leman, 2008; Reybrouck, 2006). People fulfil their social needs and experience well-being through musical engagement not by processing internal representations but by exploring the external, musical, environment (Krueger, 2009, 2011; Reybrouck, 2010; Schiavio et al., 2017). Clarke (2005) applied ecological psychology, based on Gibson's (1966, 1979) theory of affordances, to music perception. Affordances, according to Gibson, are what the environment "offers the animal, what it provides or furnishes, either for good or ill . . . I mean by it something that refers to both the environment and the animal . . . it implies the complementarity of the animal and the environment" (1979, p. 127).
Consequently, perception is seen not as an internal process but as the result of the ongoing relationship between the animal's, or agent's, whole perceptual system, rather than isolated receptors such as the ear or the eye, and its environment.

Ecological approaches
But what are musical affordances? Most scholars agree that music affords movement or some form of entrainment (e.g., Clarke, 2005; Leman, 2008; Krueger, 2014; Reybrouck, 2005, 2012). DeNora describes musical affordances as "moods, messages, energy levels, and situation" (2000, p. 44). A musical event may be an interactive exploration of different "sonic affordances", and a provider of emotional and social affordances without which our ability to relate to others would be significantly diminished (Krueger, 2011). Krueger (2014) subsequently put forward a model of the musically extended mind, explaining that music affords not just movement but also entrainment, observed as moving in time to the beat and sharing the experience with others. As an "emotion-extending resource" (2014, p. 9) musical affordances offer access to extended experiences and expressivity beyond those available to us in non-musical situations.
In their theory of musical affordance, Menin and Schiavio (2012) propose an embodied, motor-based account emphasising the intentional nature of the relationship between musical subjects and objects. When a skilled guitarist demonstrates a motor form of intentionality by using their fingers on the strings in such a way as to reproduce a musical sequence, "this sensory-motor process not only represents the basis of musical understanding, but it can also shed light on the notion of musical affordance, relying on a sub-cognitive, pre-linguistic, intrinsically motor form of intentionality" (p. 210). Applying Ramstead et al.'s (2016) cultural affordances framework to music performance, Einarsson and Ziemke (2017) argue that it is "the situation as a whole that has affordances" (p. 10), offering a still broader perspective.

Embodied approaches
The rejection of the distinction between action and perception is at the heart of Gibson's ecological theory, and key to theories of embodiment (Chemero, 2009; Leitan & Chaffey, 2014; Wilson & Golonka, 2013). Their proponents acknowledge the constitutive (i.e., essential) role of the body in driving cognitive processes and dismiss the key role of mental representations in the cognitive economy of the living system (Thompson, 2007; Varela et al., 1991). Should information be there for the taking in the environment, why would nature build an internal mechanism to do the same job (Rowlands, 2003)? Nevertheless, those who propose embodied approaches to music cognition have struggled to discard all aspects of representation. For example, representational structures are retained in Leman's (2008) theory of embodied music cognition, which otherwise describes the relationship between agent and environment, to capture the richness of musical experience (see Schiavio & Menin, 2013). Recent research frames more "radically embodied" (van der Schyff et al., 2018, p. 13) and enactive approaches to musical engagement in the context of musical emotion, communication, and meaning in community music making, and musical creativity (Schiavio et al., 2017, 2019; van der Schyff et al., 2018). In contrast to approaches presenting music cognition as a series of internal (i.e., computational, neural) processes and representations, these approaches propose the direct, circular interaction between the agent's body and its social, cultural, and physical environment (Reybrouck, 2014; van der Schyff et al., 2018). Matyja (2010) highlights the importance of O'Regan and Noë's (2001) sensorimotor contingency theory (SCT) in an enactive approach to music cognition. SCT is an account of perceptual consciousness that attempts to explain qualia without reference to representation.
It presents a new approach to perception emphasising the influence of motor actions on changes in sensory stimulation (Bishop & Martin, 2014). Focusing on visual experience, "the central idea of our new approach is that vision is a mode of exploration of the world that is mediated by knowledge of what we call sensorimotor contingencies [emphasis in original]" (O'Regan and Noë, 2001, p. 940).

Sensorimotor theory
SCT explains how an agent explores the environment, and how their attention is attracted by (unpredictable) attributes of the environment described as bodiliness, grabbiness and insubordinateness. The extent to which an agent makes use of sensorimotor contingencies determines the quality of their experience: Bodiliness refers to the objectively quantifiable way in which bodily changes modify sensory input; for example, turning your head alters visual input, but has no effect on thoughts. Insubordinateness is the fact that bodily changes, though they have a systematic effect, do not completely determine sensory changes (sensory input can change without bodily changes occurring). Grabbiness concerns the fact that, due to basic properties of sensory systems, sudden transitory changes in sensory input strongly grab our attention and cause perceptual processing to be focused on the sudden event (Degenaar & O'Regan, 2015, p. 2).

Sensorimotor theory and music
How can we relate this to music? If an act of listening to music is an interaction with the sonic environment (Krueger, 2009), relevant sensorimotor contingencies may be obtained in the following ways: bodiliness: turning towards the sound we hear; grabbiness: being alerted at a key or instrumentation change; insubordinateness (relating to aspects of the sonic world beyond our control): music stopping unexpectedly, equipment failure, instrument failure.
If SCT is about actively exploring the environment, doing and interacting, how can it be applied to music listening when we appear not to be doing anything? Yet doing denotes not only bodily movement, but also thinking, imagining, and standing still (Beaton, 2013). Listening to music offers affordances in the form of our memory of previous experiences of musical engagement (Myin, 2016). Stewart et al. (2003) report empirical evidence for a sensorimotor role in reading and playing music in a functional imaging study with non-musicians. Twelve learners undertook 15 weeks of musical training and carried out an explicit music reading task requiring them to press keys on an electronic keyboard. Their results were compared with those of a control group. Activation in the superior parietal lobe (SPL) was observed in the learners but not the control group. The researchers conclude that reading music involves a sensorimotor translation of the notation to appropriate keypresses.
Some have argued that formal musical training is needed to achieve deep musical understanding (e.g., Kivy, 2002), whilst others have shown that an implicit understanding of the structure of Western tonal music can be acquired through exposure alone (Krumhansl, 2010). Importantly, formal musical training is not needed to obtain relevant sensorimotor knowledge. For example, Peñalba-Acitores illustrates how bodiliness and grabbiness might emerge during typical musical engagement (Peñalba, 2011, p. 222): . . . our perception roams around different aspects of the material, exploring melodies, instruments, chords, structure, and style; and we are aware of that exploration through bodiliness . . . we will know that we are experiencing a crescendo because of increasing tension in the muscles; and we will experience rhythm because of the way that it allows us to synchronize our movements (virtual or actual) with the beat; This constitutes bodiliness.
Grabbiness, by contrast, captures the idea that the environment guides the subject in perception . . . In an orchestral piece, a listener might be more likely to be "grabbed" by timbre . . . Or we may be "grabbed" by the unexpected change from minor to major in a tierce de Picardie. Peñalba-Acitores also suggests that listeners unfamiliar with Indian music find it difficult to listen to at first because they attempt to apply the sensorimotor skills they have learned from listening to Western tonal music.

Neurological theories
There are two primary neurological explanations for the cause of synaesthesia, the disinhibited feedback theory (Grossenbacher & Lovelace, 2001) and the hyperconnectivity theory (Ramachandran & Hubbard, 2001). The disinhibited feedback theory suggests that a breakdown of the barriers that normally keep modules and their processing completely separate permits a free flow of information from primary sensory areas to associated areas such as the parietal lobe or limbic system. In this way, if feedback signals are not inhibited, later stages of processing can influence earlier stages of processing (Neufeld et al., 2012). According to the hyperconnectivity theory, both intermodal and intramodal synaesthesias are caused by a bottom-up process arising from unusual direct connections between different modules of the brain such as visual and auditory areas. In infancy, there are many more connections between brain areas than in adulthood. These extra connections are normally pruned as the brain matures, yet in synaesthesia it is thought that this process is not completed fully, leaving some unusual connections behind (see Ramachandran & Hubbard, 2001, pp. 9-10). Both theories emphasise the role of genetic factors in causing synaesthesia and leave little role for learning in its development. Yet synaesthesia may not just arise from inadequate neural pruning and weakened inhibitory re-entrant feedback. The reasons for such disinhibition or connectivity may be various.

Role of concept
As previously mentioned, some synaesthesias can arise from a conceptual stimulus (Dixon et al., 2000; Mroczko-Wąsowicz & Werning, 2012; Mroczko-Wąsowicz & Nikolić, 2014; Ward, Tsakanikos, & Bray, 2006) implying that synaesthetes may not be predisposed to synaesthesia but, rather, have learned to assign meanings to certain stimuli for the purpose of strengthening their knowledge and understanding of abstract concepts (van Leeuwen et al., 2015). Individual synaesthetes frequently disagree as to the specific colours and imagery associated with musical inducers, suggesting that the learned meanings assigned to stimuli are neither random nor universally applied.

Role of body, action and environment
The similarities that exist between synaesthetic pairings and the cross-modal associations commonly made by the general population between colour, music, emotion, pitch-height and pitch size indicate that synaesthetes and non-synaesthetes employ comparable mental processes (Gallace & Spence, 2006; Isbilen & Krumhansl, 2016; Marks, 1987, 2004; Mondloch & Maurer, 2004; Palmer et al., 2016; Palmer et al., 2013; Tsiounta et al., 2013; Walker et al., 2010; Ward, Huckstep, & Tsakanikos, 2006). Both groups are exposed to similar learned cultural and environmental associations. For example, large objects make bigger and louder sounds on impact than smaller ones, and higher pitches are associated with smaller animals (Spence, 2011). Parise and Spence (2009) hypothesise that, according to Bayesian theory, pairings such as those between pitch and size are based on the individual's prior knowledge that these cross-modal associations "go together" (p. 2) in the natural environment. Music-colour synaesthesia may simply be a typical musical experience with "extra qualia," as described by Wager (1999, p. 264), for some people. Nevertheless, as we have seen, the phenomenological experience of seeing the colour green when hearing the pitch class A presents a challenge to representationalism (Alter, 2006; Auvray & Deroy, 2015; Brogaard, 2016; Curwen, 2018; Rosenberg, 2004; Wager, 1999). Embodied and enactive music cognition research rejects explanations of musical experience as a series of internal cognitive processes (Reybrouck, 2005, 2012) and argues rather that "musical experience . . . is not something that is done to us" [but instead is] "something we do" (Krueger, 2011, p. 2). A more holistic and embodied approach to understanding musical experience (Schiavio & van der Schyff, 2016) can be applied to the development of a sensorimotor explanation for music-colour synaesthesia.

Synaesthesia and sensorimotor contingency theory
Deroy and Spence (2013) suggest that the cross-modal correspondences underlying synaesthesia may be grounded in sensorimotor associations:
. . . most people match angular shapes with the word "takete" while matching rounded shapes with the word "maluma" . . . it is the sharp vocal transitions made by the mouth when uttering the plosive sounds in "takete" that people map onto the sharp/angular shape . . . this cross-modal correspondence would then become "embodied" and grounded in sensorimotor associations (p. 1249).
However, it is important to acknowledge the challenges that synaesthesia poses to the basic assumptions of sensorimotor theory, which Mroczko-Wąsowicz (2015) highlights in the following three objections:
1. Synaesthetic concurrents are generated internally, and do not arise from light reflecting from a surface, changing the angle of the head, or eye saccades.
2. Synaesthetic colour does not adapt away as in colour inversion in normal vision. It is generally consistent and unchanging.
3. Synaesthetes can tell the difference between synaesthetic colours and veridical colours, suggesting that synaesthetic colours lack perceptual presence (i.e., of being real and existing in the world).
However, higher, concept-driven forms of music-colour synaesthesia can be reconciled to sensorimotor theory and the above objections resolved, as discussed below.

Synaesthetic colours are generated internally
Synaesthesia has little to do with vision per se: "synesthetic visual responses to music aren't affected by shutting or moving the eyes" (Ward, 2013, p. 51) and concurrents do not arise from normal vision. The colours experienced by associators are often described as being in the mind's eye, or of knowing a colour. Yet this does not mean that a synaesthetic experience is not the result of an individual's interaction with their direct environment. Affordances derived from interactive patterns in previous experiences of musical engagement can be understood as "something we do now, in the light of what we have done before" (Myin, 2016, p. 100). Similarly, Barsalou's PSS theory proposes that a pattern of activation established during early basic sensorimotor processing can be re-enacted subsequently (1999, 2003). For example, in other synaesthesia research, Mroczko-Wąsowicz and Werning (2012) presented two semiprofessional swimmers who associated synaesthetic colours with four swimming strokes (breaststroke, crawl, butterfly and back stroke). In a Stroop-like task (Stroop, 1935) both named colours faster when shown photographs of swimmers in stroke-congruent colours. Simply thinking about or imagining the swimming stroke was enough to re-enact the original activation pattern and elicit a colour response; the strokes did not have to be executed physically. Beaton (2013) argues that colour does not exist independently in the external world, nor just in our minds, but as the result of our interaction with the world (see also Varela et al., 1991) emphasising that learned associations are an important part of individuals' personal experience with colour, and what it means to them. For example, Gilbert et al. (2016) found that participants deliberately matched similar colours to similarly valenced emotion terms. Participants' choices differed significantly as a function of emotion and were moderated by sex and age.
When participants in another study were asked to judge the effects of colour on emotion (Wilms & Oberfeld, 2018), the combination of hue, saturation, and brightness was important, and Palmer and Schloss (2016) found that preferences for a particular colour were influenced by interaction with an object of the same colour. Barsalou (2003) also emphasises the role of personal experience when re-enactments of original sensorimotor patterns are tailored to the context of the individual's current situation. Personal experience may explain disagreements between synaesthetes as to the colours associated with the same inducer.
Visual perceptual experiences can be obtained from stimuli other than those produced by light reflecting from a surface, as demonstrated by the visually impaired with tactile-vision substitution systems (TVSS), as described by Bach-y-Rita and Kercel (2003), or other forms of sensory substitution such as The vOICe (e.g., Pasqualotto & Esenkaya, 2016). For example, the profoundly deaf Scottish virtuoso percussionist, Dame Evelyn Glennie, writes of how "my whole body is similar to an ear, every surface has learnt to become a conduit, bringing meaning and sense to my brain" (2016, para. 11).
It can therefore be argued that what underpins music-colour synaesthesia is the development of a conceptual system, in the form of sensorimotor features associated with a musical inducer the first time it is heard, and enriched through repeated exposure. For example, the individual may experience, either as an instrumentalist or a listener, an intense emotional response to the music, associated with a certain group of colours that in turn become associated with it. Different individuals presented with the same inducer experience it in different contexts, however, which would explain why different synaesthetes see different colours. A synaesthetic colour experience might be the result of subsequent re-enactments of the conceptual system while remembering, producing or listening to the music. For example, Ward, Tsakanikos and Bray (2006) showed that synaesthetic colour was determined by musical context rather than mode of presentation or form of stimulus. In the notation of Western music, a dot presented on the middle line of a five-line stave has different meanings depending on the clef (e.g., B in treble, D in bass). Synaesthesia is elicited at the conceptual level, so the same colour will be assigned to a D whether it is shown on the middle line of a bass stave, below the bottom line of a treble stave or as a letter, whether upper- or lower-case.
The variation in colour preferences from synaesthete to synaesthete for the same stimuli reflects the differences in the meaning of the concept surrounding the stimuli for each individual. According to Gardenfors (2004), concepts "are intrinsically dynamic entities, arising and adapting continuously as the agent engages with its environment . . . concepts are never free-floating entities but are always concepts for a particular agent, who comes with her own perspectival biases" (p. 170). Barrett (2011) makes an interesting link between Gibson's affordances and Jacob von Uexküll's Umwelt (1992). The notion of Umwelt is described by Barrett as "the world as it is experienced by a particular organism" (p. 80) implying, as in Gibson's theory, that the same environment will not offer the same affordances to each animal. An agent is only sensitive to those stimuli relevant to it from within its own environmental niche. For example, humans do not need to see ultraviolet light or hear very high-pitched sounds, neither do we need a heightened sense of smell to detect and avoid other predators in the same environment. None of these things forms part of our Umwelt. Reybrouck (2001, 2005) considers music listening relevant to Umwelt research, arguing that "dealing with music can be considered as a process of knowledge acquisition" (2001, p. 623) dependent on the individual listener's previous interactions with their sonic environment and the meanings that listener might attribute to the sounds. Reybrouck claims that "what is really important is not the acoustical description of the sound, but the sounds as they are experienced by the listener" (2001, p. 618). Non-synaesthetes are happy to accept reality as it is presented to them: without unobtainable synaesthetic experiences (Eagleman, 2012). Similarly synaesthetes, whose phenomenal experience of music includes colours, accept their wider Umwelt.

Synaesthetic colour does not adapt away
Kohler's (1964) study describes the experience of wearers of coloured goggles who report an adaptation over time until the goggles no longer interfere with their normal vision. Yet, as Ward (2012) points out, it is often overlooked that the adaptation does not mean that the colours have returned to normal. The colours are still inverted, but the wearer has had to adjust to them to be able to navigate their environment with relative ease. In the case of synaesthesia, seeing a synaesthetic colour does not impair the synaesthete's ability to see veridical colour. An associator's colours held in the mind's eye do not arise from light reflecting off a surface (Ward, 2012), so there is no need for them to disappear to accommodate veridical colours. Synaesthetes can see veridical and synaesthetic colours equally well. For example, in Ward, Tsakanikos and Bray's (2006) study, synaesthetes were shown musical notes in congruent and incongruent colours and asked to name their synaesthetic colour, ignoring the veridical colour. Although a Stroop effect was observed, naming synaesthetic colours did not interfere with the synaesthetes' normal vision. At no time did they see the written notation as anything other than its veridical colour, black. Notably, Mroczko-Wąsowicz (2015) suggests that there is no real need for synaesthetic colours to adapt away, as they do not carry the same colour information about the objects within the synaesthete's environment as veridical colours (see p. 10). Synaesthetic colours associated with a musical event may instead provide information about a concept, emotion, meaning, property or a previous action.

Synaesthetic colours lack perceptual presence
If a synaesthete is aware that the colours they see are not veridical, how can their experience be viewed as an interaction with the real world, and how can the potential lack of perceptual presence (of being or existing in the world) be responded to? According to Noë (2001), perceptual presence can be explained by the practical mastery of sensorimotor contingencies. For example, we know what the reverse side of a tomato looks like even though we cannot see it. Our sensory responses, elicited by the tomato, are such that we know how the tomato will behave in a variety of situations.
But how can we account for phenomena such as synaesthesia, in which "raw sensory experience ('qualia') remains but perceptual presence is lacking" (Seth, 2014, p. 98)? Seth's Predictive Perception account of Sensorimotor Contingencies (PPSMC) accounts for normal perception and synaesthesia through the interpretation of counterfactuals and predictive processing. Generative models predict an outcome based on how sensory inputs would change in various action situations, even if those actions did not actually happen. The richness of these counterfactually encoded sensorimotor contingencies determines the degree of perceptual presence. Counterfactuals might be understood as statements of what would occur if something other than the present state of affairs happened (e.g., the glass would break if I were to drop it on the floor). A predictive model based on counterfactuals takes into account not just the likely cause of a sensory input, but the likely cause of a sensory input based on a repertoire of possible actions (Seth, 2014). Used in everyday science (Beaton, 2013), counterfactuals inform us how an interaction might occur between an agent and its environment in various scenarios. The interaction may, or may not, be carried out, but it could possibly happen in reality. Seth's theory claims that the lack of perceptual presence in synaesthesia is due to poor counterfactuals owing to a smaller, or non-existent, repertoire of likely real-world sensory inputs.
This interesting account has received a number of responses (Froese, 2014; Hohwy, 2014; Madary, 2014; Metzinger, 2014; O'Regan & Degenaar, 2014; Rouw & Ridderinkhof, 2014; van Leeuwen, 2014). Madary (2014) compares counterfactual richness in visual and other sensory modalities where, in the latter cases, only some counterfactual information may be required for perceptual presence. For example, there are far more counterfactual possibilities available to us in vision than from sound. We can visually observe an object from many different angles by moving our eyes alone, but fewer options are available to us from an auditory perspective. This might suggest that auditory input need be far less counterfactually rich than visual input to achieve perceptual presence. Madary proposes a modification to Seth's proposal. Instead of the degree of presence depending upon the degree of richness of counterfactual information, "some counterfactual information" regarding sensorimotor contingencies is required for perceptual presence (p. 132). However, Madary questions whether even this modification might explain the apparent lack of presence in synaesthetic concurrents, asking whether there are any sensorimotor contingencies in synaesthetic concurrents at all. Madary suspects not, but one might argue otherwise. Froese's (2014) main challenge to Seth's explanation is that counterfactual predictive processing only appears to address perceptual presence and not the appearance of reality, and that it is "the absence of the latter, and not of the former, that is an essential property of synesthetic experience" (p. 126). By not distinguishing sufficiently between the two, Froese argues that Seth does not account for types of synaesthesia that in fact might present some kind of perceptual presence.
While it may be more difficult to attribute sensorimotor contingencies to some forms of synaesthesia than others, music-colour synaesthesia is often accompanied by shapes, textures and moving landscapes (Eagleman & Goodale, 2009) similar to the tunnels reported in spatial sequence synaesthesia (Gould et al., 2014). Although "this contextual space appears as distinct from the real world" (Froese, 2014, p. 127), counterfactuals by themselves cannot explain the lack of reality of concurrents, as the experience of counterfactuals may remain surprisingly rich. An enactive account of perception does not attempt to explain veridical experience in terms of internal representation, but by how that experience "is shaped by the real world" (p. 127). Indeed, such phenomenal experience, as reported in music-colour synaesthesia, might offer a perceptual presence in a similar form to those Beaton (2013) attributes to visual memory. Counterfactuals may operate even in imaginary contexts, and it is only the lack of reality that distinguishes synaesthetic colour from veridical. Beaton (2013) claims that both visual memory and visual imagination are themselves types of interaction with the world and gives examples of how we are "poised to act" (p. 306) even when encountering hallucinations of a unicorn, or an illusion of a tomato. Although the unicorn does not and cannot exist in the world, we can gain an understanding, by using counterfactuals, of how we would act if these objects were actually present (Shoemaker, 1994). For example, O'Regan and Degenaar compare a synaesthetic experience to the ability "to vividly imagine things that are absent", suggesting that the relevant cortical activity remains "dangling" (2014, p. 131), in the sense that the cortical activity is not related to current sensorimotor events happening in the environment (Hurley & Noë, 2003).
Indeed, Schubotz (2007) provides evidence supporting an explanation as to how a simulation of events, including auditory events, may be realised in our sensorimotor system, even those that we are unable to reproduce ourselves.
According to O'Regan and Degenaar (2014), "sensorimotor theory itself already has the resources to explain synaesthesia" (p. 131). Bodiliness, grabbiness and insubordinateness are able to go beyond the idea of explaining perceptual presence in terms of counterfactual richness, and are equipped to explain all sensory experience. The argument here is that sensorimotor contingencies can be attributed to cases of music-colour synaesthesia. We can imagine our interaction with tunnels and shapes in a moving spatial landscape or the expected feel of certain textures. Resonating with Barsalou's PSS theory, a fragment of the original sensorimotor state might be placed in memory (Jacobson, 2013), explaining why a synaesthetic concurrent may appear disembodied from the original sensorimotor qualities of its inducer. It may appear that a synaesthetic concurrent "seems to have little to do with the SMCs underwriting the perception of the inducer" (Seth, 2014, p. 105), but once the sensorimotor contingencies of experiencing red have been mastered, then red can be re-enacted in an atypical way.

Music-colour synaesthesia as a variant of general human cognition
Although never directly referring to an enactive or sensorimotor approach, Sollberger (2013) stresses that the extent to which an individual is able to interact with their environment has an important functional role when characterising a perceptual experience as veridical. Furthermore, if the perceiver is able to distinguish and interact with external objects in the world, then how the experience is embodied, whether as a taste or a coloured sound, is irrelevant (Sollberger, 2009, pp. 151-152). Acknowledging that different types of synaesthesia might require different explanations for their cause, Sollberger argues that not all forms of synaesthesia should be viewed as a non-veridical experience, and that some synaesthesias should be viewed as a "normal variant of human perception" (2013, p. 171). Sollberger refers to music-colour synaesthesia as a form that presents to the perceiver something about their real-world environment. Specifically, synaesthetes with this type of synaesthesia are able to meet the following two conditions:
They literally attribute the sensory properties of the synesthetic experiences to the distal stimulus itself.
They do not take their synesthetic experiences to be nonveridical, e.g., illusory or hallucinatory (2013, p. 173).
Sollberger (2013, p. 174) cites experiences from music-colour synaesthetes collected by Cytowic (2002):
A person who sings with little phrasing or variation in volume has a straight line voice. A baritone has a round shape that I feel. This is so obvious, it's all very logical. I thought everyone felt this way. When people tell me they don't, it's as if they were saying they don't know how to walk or run or breathe (Cytowic, 2002, p. 28).
The shapes are not distinct from hearing them - they are part of what hearing is. The vibraphone, the musical instrument, makes a round shape. Each is like a little gold ball falling. That's what the sound is; it couldn't possibly be anything else (Cytowic, 2002, p. 69).
The colours and shapes experienced in music-colour synaesthesia are a fundamental part of the phenomenal character of musical experience, or the "what it is like" (Nagel, 1974, p. 437) to hear music, for synaesthetes (Chalmers, 1996; Shoemaker, 1994). Synaesthetes are often surprised to learn that not everyone experiences music as they do and cannot imagine experiencing music in any other way. Perhaps the question we should be asking is not "what is it like to have synaesthesia?" but "what is it like not to have synaesthesia?" Historically, synaesthesia has been examined as a sensory/perceptual condition distinct from typical cognition. Yet many different types and sub-categories of synaesthesia exist, and purely neurological explanations remain inconclusive. The similar mental processes employed by synaesthetes and non-synaesthetes (Simner, 2012) suggest that certain synaesthesias can be described in terms of typical cognition. Recent research in general music cognition has moved away from a cognitivist approach, and explains musical experience via embodied and enactive theories (Schiavio et al., 2017; Schiavio et al., 2019). It is possible that certain forms of synaesthesia may arise from purely neurological or genetic factors. However, the direction of research in general music cognition and in synaesthesia points to a sensorimotor approach as a promising next step in explaining the phenomenon of synaesthesia associated with music.

Conclusion
I have argued that music-colour synaesthesia should be examined not as a separate and distinct condition, but as a continuation of typical perception and cognition. Research in general music cognition has embraced embodied and enactive accounts that include an active role for the body and its situated power of action. Central to these accounts is how engagement with music might be regarded as an act of doing, in accordance with Gibson's theory of affordances and sensorimotor theory, and how certain attributes of a musical environment attract a subject's attention in the form of their bodiliness, grabbiness and insubordinateness (Peñalba, 2011; Krueger, 2009). Yet actively exploring and interacting with a musical environment should not be restricted to physical acts of engagement we more commonly associate with doing, but can extend to imagining and thinking (Beaton, 2013; Myin, 2016). Deep musical understanding can be acquired without formal musical training through attentive everyday listening (Krumhansl, 2010). I have described how a synaesthete can obtain a mastery of sensorimotor contingencies and become poised to act (Beaton, 2013), and how the quality of the synaesthetic experience is governed by a level of "dangling cortical activity" (Hurley & Noë, 2003, p. 158). The similarities in the progression of research in synaesthesia and general music cognition show how music-colour synaesthesia might be reconciled with a sensorimotor account and viewed as a "normal variant of human perception" (Sollberger, 2013, p. 171). The phenomenon of shapes, colours and textures experienced on hearing music represents something about the real-world environment for the synaesthete, and is integral to their perception, experience, knowledge, and musical understanding.
These claims present an opportunity for future empirical research. Starting from the hypothesis that synaesthesia associated with music is mediated by concept and context but grounded in sensorimotor action, the commonalities between the mechanisms underlying music-colour synaesthesia and general music cognition might be tested. A group of non-synaesthete musicians could undertake sufficient training until they were able to produce a series of triads on a keyboard to which they could reliably assign a specific colour. A Stroop-like task would test whether congruent colour-triad trials were completed faster and with fewer errors than incongruent colour-triad trials. In a second study, the newly trained group might be compared to a group with pre-existing music-colour synaesthesia. Ward, Tsakanikos, and Bray (2006) suggest that the involvement of the superior parietal lobe (SPL) in sensorimotor transformations is likely to be important in synaesthesia. The application of transcranial magnetic stimulation (TMS) disrupts neural processing in the SPL. Both groups would undertake the Stroop task whilst receiving TMS over the right intraparietal cortex. It is expected that in the incongruent trials, interference would no longer be observed in either group, as the TMS would unbind the colour-triad associations. I hope this article will encourage a shift in thinking about music-colour synaesthesia as well as promote investigations of the role of our sensorimotor system and actual as well as imagined interactions with the environment in various forms of synaesthesia.
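As an illustration only, the congruent versus incongruent comparison in the Stroop-like task proposed above might be scored along the following lines. This is a minimal sketch under stated assumptions, not an analysis specified in this article: the function name `stroop_interference`, the trial fields `congruent`, `rt_ms` and `correct`, and the example values are all hypothetical, and a real study would add inferential statistics across participants.

```python
from statistics import mean


def stroop_interference(trials):
    """Summarise a hypothetical colour-triad Stroop-like task.

    Each trial is a dict with three assumed fields:
      'congruent' (bool), 'rt_ms' (float), 'correct' (bool).
    """
    def summarise(subset):
        # Mean response time is computed over correct trials only.
        correct_rts = [t["rt_ms"] for t in subset if t["correct"]]
        return {
            "mean_rt_ms": mean(correct_rts),
            "error_rate": sum(not t["correct"] for t in subset) / len(subset),
        }

    congruent = summarise([t for t in trials if t["congruent"]])
    incongruent = summarise([t for t in trials if not t["congruent"]])
    return {
        "congruent": congruent,
        "incongruent": incongruent,
        # Positive values mean incongruent trials were slower or more error-prone.
        "rt_interference_ms": incongruent["mean_rt_ms"] - congruent["mean_rt_ms"],
        "error_interference": incongruent["error_rate"] - congruent["error_rate"],
    }


# Made-up trials for illustration; these are not data from any study.
example_trials = [
    {"congruent": True, "rt_ms": 600.0, "correct": True},
    {"congruent": True, "rt_ms": 640.0, "correct": True},
    {"congruent": False, "rt_ms": 700.0, "correct": True},
    {"congruent": False, "rt_ms": 760.0, "correct": False},
]
result = stroop_interference(example_trials)
```

On this sketch, the prediction for the TMS condition would correspond to both interference scores shrinking towards zero in each group.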