Auditory imagery ability influences accuracy when singing with altered auditory feedback

In this preliminary study, we explored the relationship between auditory imagery ability and the maintenance of tonal and temporal accuracy when singing and audiating with altered auditory feedback (AAF). Actively performing participants sang and audiated (sang mentally but not aloud) a self-selected piece in AAF conditions, including upward pitch-shifts and delayed auditory feedback (DAF), and with speech distraction. Participants with higher self-reported scores on the Bucknell Auditory Imagery Scale (BAIS) produced a tonal reference that was less disrupted by pitch shifts and speech distraction than musicians with lower scores. However, there was no observed effect of BAIS score on temporal deviation when singing with DAF. Auditory imagery ability was not related to the experience of having studied music theory formally, but was significantly related to the experience of performing. The significant effect of auditory imagery ability on tonal reference deviation remained even after partialling out the effect of experience of performing. The results indicate that auditory imagery ability plays a key role in maintaining an internal tonal center during singing but has at most a weak effect on temporal consistency. In this article, we outline future directions in understanding the multifaceted role of auditory imagery ability in singers’ accuracy and expression.

Supplemental Table 2. Participant-chosen songs as performed with the reference tempo and key centre agreed at the start of the trials.
Reference tempo and the first two bars for tonal reference were provided at the start of each trial. Timing factors and complexity measures are presented for each piece (NB: ambitus = melodic range (semitones), complebm = melodic complexity, nPVI = durational variability of note events).

Supplemental Results (SR)
SR-1: Examining Groups by BAIS Score, Formal Training, and Primary Instrument
Preliminary analysis was conducted to establish any group differences in terms of musical imagery ability as measured by BAIS-V and BAIS-C scores. The groups were divided based on indications made at the time of sign-up and again on the Gold-MSI of (1) being primarily a vocalist or an instrumentalist and (2) having been formally trained on the principal instrument or not.
Formal Training. For BAIS-V score, there was no significant difference between the participants with formal training (M = 5.26, SD = 0.7) and those without (M = 5.0, SD = 0.69), t(14) = 0.63, p = .54. This was also the case for BAIS-C, t(14) = 1.38, p = .24, although variance between the participants with formal training (M = 5.42, SD = 0.64) and those without (M = 4.64, SD = 1.07) was unequal. Given the lack of any statistically significant differences in BAIS scores between vocalists and instrumentalists, or between participants with and without formal training, no further distinction was made between participants on these grounds. All participants were examined as a single cohort of confident, skilled singers regularly performing vocals in some capacity.
Aggregating BAIS Scores. Additionally, participants' BAIS scores from the two subscales had a strong positive correlation (r = .76, R² = .57, p < .001; Figure). This is consistent with previous research using the questionnaire (Halpern 2015; Pfordresher and Halpern 2013) and indicates that individuals with the ability to produce a more vivid auditory image may also have greater control over that image. Given this strong relationship between the BAIS subscales, BAIS-V and BAIS-C scores were averaged for each participant and this aggregate BAIS score (referred to simply as "BAIS score") is used in all further analyses.
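The aggregation step can be illustrated with a minimal sketch. It assumes each participant has one BAIS-V and one BAIS-C score; the values below are hypothetical placeholders, not the study's data, and the function name pearson_r is introduced here only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical subscale scores for six participants (NOT the study's data).
bais_v = [4.2, 5.1, 5.9, 4.8, 6.3, 5.5]
bais_c = [4.0, 5.4, 5.6, 4.5, 6.1, 5.9]

r = pearson_r(bais_v, bais_c)
r_squared = r ** 2  # with a single predictor, R^2 is the square of r

# Aggregate BAIS score: per-participant mean of the two subscales.
bais_score = [(v + c) / 2 for v, c in zip(bais_v, bais_c)]
```

Averaging is only reasonable when the subscales move together, which is why the correlation is checked before the aggregate score is formed.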

SR-2: Examining Potential Covariates in Self-Selected Music
We also determined whether the participants' self-selected pieces might introduce covariates into the analyses.
Music Complexity. Several elements of complexity were used to compare the different participant-chosen pieces; these measures were calculated using the MIDI toolbox functions for MATLAB (Eerola and Toiviainen 2004) and included the melodic range in semitones (ambitus), melodic complexity (complebm, derived from the expectancy-based model of melodic complexity by Eerola and North (2000) and Schaffrath (1995)), durational variability of note events (nPVI, by Grabe and Low (2002) and Patel and Daniele (2003)), note density as number of notes per beat (notedensity), and tonal stability of notes in a melody (tonality, by Krumhansl (1990)).
Correlations between these features in each song and the respective participant's BAIS score, years of study, and years of performance experience were mostly weak (|r| < 0.39). The exceptions were moderate correlations between ambitus and years of study (r = -0.48), note density and melodic complexity (r = .66), BAIS score and years of performance (r = .6), and durational variability of note events with note density (r = -.52), melodic complexity (r = -0.44), and BAIS score (r = 0.43) (Figure SR-2.1). However, these were all found to be non-significant (p > .05). As the complexity of the pieces in these dimensions did not correlate significantly with any of our participant measures, we assume that the participant-selected music does not introduce confounding factors in our primary analyses.
Interaction with Delays. Additionally, potential interactions with DAF were examined; given previous research on timing accuracy with delays (Pfordresher and Palmer 2002), it is possible that the event timing of some of the chosen pieces might result in DAF occurring at a binary subdivision of the beat, thus creating a less distracting delay for some participants. For instance, if the IOI between the notable beats of the measure was 400 ms, the 200 ms DAF condition would result in delays occurring at a subdivision of the beat and could potentially benefit a participant's accuracy in timekeeping. In order to determine whether this was an applicable factor in any of the participant-selected pieces, the IOI between notable beats in each piece was determined using the reference tempo in BPM (provided at the start of each performance) and the meter of the piece. The IOI of notable beats, as well as the measure length, in milliseconds is presented in Supplemental Table 2.
In both cases, the length of the beat and the measure for all pieces would not provide reasonable binary subdivisions with the two DAF conditions; we therefore assume that interactions with delays are unlikely to be a confounding factor in the comparison of participant experience and performance outcomes in the DAF conditions.
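The subdivision check described above amounts to simple arithmetic, sketched here under stated assumptions. The function names, the 1 ms tolerance, and the example tempi are hypothetical; only the 400 ms IOI and 200 ms delay example comes from the text.

```python
def beat_ioi_ms(bpm):
    """Inter-onset interval between beats, in milliseconds, at a given tempo."""
    return 60_000 / bpm


def measure_length_ms(bpm, beats_per_measure):
    """Duration of one measure in milliseconds."""
    return beat_ioi_ms(bpm) * beats_per_measure


def is_binary_subdivision(interval_ms, delay_ms, tol_ms=1.0):
    """True if delay_ms equals interval_ms / 2**k for some k >= 1, within tol_ms."""
    if delay_ms <= tol_ms:
        raise ValueError("delay_ms must exceed tol_ms")
    candidate = interval_ms / 2
    # Keep halving until the candidate falls below the delay.
    while candidate >= delay_ms - tol_ms:
        if abs(candidate - delay_ms) <= tol_ms:
            return True
        candidate /= 2
    return False


# Example from the text: a 400 ms beat IOI (150 BPM) and a 200 ms delay.
ioi = beat_ioi_ms(150)                        # 400.0
aligned = is_binary_subdivision(ioi, 200)     # True: 200 ms is half the beat

# Hypothetical piece at 120 BPM: beat IOI 500 ms, so 200 ms does not align.
misaligned = is_binary_subdivision(beat_ioi_ms(120), 200)  # False
```

The same check can be run against the measure length by passing measure_length_ms(bpm, beats_per_measure) as the interval.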

SR-3: Control Testing between NF and HF Tasks
As all AAF conditions were delivered via headphones, it was first necessary to test whether using headphones had a significant impact on accuracy compared to singing without headphones. For each of the three accuracy measures, the non-AAF performances done in the Normal Feedback (NF) control conditions were compared with the Headphones Feedback (HF) performances in a Welch Two-Sample t-test. There was no statistically significant difference between these performances for any of the accuracy measures: TRD, t(82) = 0.32, p = .75; CV, t(86) = 0.27, p = .79; MBs, t(54) = 0.34, p = .73. Given there appears to be no effect of headphones within the feedback loop, we use HF performances as controls for further analyses and assume their use does not confound responses to AAF stimuli.
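For readers unfamiliar with the Welch test, the following is a minimal sketch of how its statistic and degrees of freedom are computed. It is not the analysis code used in the study; the function name welch_t and the sample values are hypothetical, and the p-value lookup against the t distribution is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a, se2_b = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical accuracy scores for two feedback conditions (NOT the study's data).
nf_scores = [0.42, 0.37, 0.51, 0.45, 0.40]
hf_scores = [0.44, 0.35, 0.49, 0.47, 0.41]
t_stat, dof = welch_t(nf_scores, hf_scores)
```

Unlike Student's t-test, this form does not pool the two variances, so it remains valid when the NF and HF performances differ in spread or in number of observations.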
To ensure that BAIS score would not be a confounding factor within this control performance (e.g., confident singers should be able to perform well without AAF, regardless of imagery skill) and to justify using an average control score for the group in group-adjusted analyses, a correlation analysis was performed. There were no significant or strong correlations found between participants' aggregate BAIS scores and their controls. For TRD: Normal task, r = 0.28, p = .29; Toggled, r = 0.038, p = .89; TVD, r = -0.17, p = .56. For CV: Normal task, r = 0.28, p = .29; Toggled, r = 0.34, p = .2; TVD, r = 0.11, p = .72. For MBs: Toggled, r = 0.31, p = .24; TVD, r = -0.053, p = .86. We therefore assume control performance accuracy itself is not affected by BAIS score.

SR-4: Additional Details for Individual-Adjusted TRD Analysis
TRD scores were transformed with the Ordered Quantile Normalizing Transformation (ORQ) (Peterson and Cavanaugh 2019). There was no significant three-way interaction between the independent variables; a planned simple two-way fit for each BAIS group showed the effect of condition on TRD was significant for the low-BAIS group, F(3,144) = 3.70, p = .013. Bonferroni-adjusted pairwise comparisons show a significant difference between the two groups, t(162) = -2.97, p < .001, in the Whole Tone Pitch Shift condition. Low-BAIS participants had higher TRD compared to their control performances than high-BAIS participants. The Toggled & Voice Distraction task differed significantly between the groups, t(160) = 3.69, p < .001, with low-BAIS participants again having higher TRD compared to their control performances. Full-factorial results for individual-adjusted TRD are found in Table SR
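As background on the ORQ transformation mentioned in this section, the sketch below shows the rank-based mapping as it is commonly described: each value is replaced by the standard normal quantile of (rank - 0.5) / n. This is an illustration only, not the study's code. The function names are hypothetical, and giving tied values their average rank is an assumption of this sketch.

```python
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def orq_transform(values):
    """Map each value to the standard normal quantile of (rank - 0.5) / n."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in average_ranks(values)]


# Hypothetical skewed deviation scores (NOT the study's data).
skewed = [0.1, 0.2, 0.25, 0.4, 3.5]
normalised = orq_transform(skewed)
```

Because only the ranks are used, the outlying value 3.5 is pulled in to an ordinary upper-tail quantile, which is the property that makes the transformation useful before fitting an ANOVA.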

Figure SR-1.1. Positive correlation between participant scores on the BAIS subscales, with linear (black) and loess (grey) regression lines.

Figure SR-2.1. Correlation matrix between participant experience measures (BAIS score, years of performance experience, and years of theory study) and respective complexity measures for each piece calculated with MIDI toolbox functions (NB: ambitus = melodic range (semitones), complebm = melodic complexity, nPVI = durational variability of note events).

Table 4.
Participant Demographics: Participants are ordered by aggregate BAIS score, demonstrating the median split and the relation of other demographic information. -4.1.

Table SR-5.1. Full-factorial results from analysis of the effect on individual-adjusted CV by interaction between BAIS Group, Condition, and Task.

SR-6: Additional Details for Individual-Adjusted MBs Analysis
Full-factorial results for individual-adjusted MBs are found in Table SR-6.1.
Two-Way Interaction by Group (ANOVA Type II)

Table SR-6.1. Full-factorial results from analysis of the effect on individual-adjusted MBs by interaction between BAIS Group, Condition, and Task.

SR-9: Additional Details for Group-Adjusted MBs Analysis
Two-way interaction by group revealed a significant effect of task for the low-BAIS group only, F(1,93) = 5.57, p = .02. Full-factorial results for the group-adjusted MBs can be found in Table SR-9.1.
Two-Way Interaction by Group (ANOVA Type II)
(c) Group Pairwise Comparisons (Bonferroni-adjusted), Task

Table SR-9.1. Full-factorial results from analysis of the effect on group-adjusted MBs by interaction between BAIS Group, Condition, and Task.