Introduction
Inattentional deafness (ID) refers to the neglect of unexpected auditory information. This is a safety-critical issue, particularly in scenarios that rely on auditory warnings (e.g.,
Bliss, 2003). For example, Dehais and colleagues (2014) reported that 11 out of 28 highly trained pilots failed to notice the auditory alarm for landing gear failure that occurred simultaneously with a buffet-inducing windshear.
Typically, ID is attributed to the reduced availability of cross-modal attentional resources to process auditory information, caused by high perceptual load in the competing visual modality (
Macdonald & Lavie, 2011;
Molloy, Griffiths, Chait, & Lavie, 2015;
Raveh & Lavie, 2015). Thus, the demands of visuomotor control caused by sudden windshear, in the example provided previously (i.e.,
Dehais et al., 2014), consumed the available mental resources that would otherwise have gone toward recognizing and responding to the auditory alarm.
This account is supported by both psychophysical and neuroimaging evidence. To test for ID, participants are often required to perform visual tasks of varying perceptual difficulty while irrelevant sounds are presented in the background (
Macdonald & Lavie, 2011; Raveh & Lavie, 2015). Those who experience high visual load (e.g., discriminating two lines by their lengths, 3.6° vs. 3.8°) are less likely to hear unexpected sounds than those who perform an easier task (e.g., discriminating two lines by their colors, blue vs. green). Besides behavioral results, Molloy and colleagues (2015) reported that increasing visual search difficulty attenuated auditory evoked responses in magnetoencephalographic (MEG) recordings to irrelevant tones. In other words, information processing demands in the visual modality reduced brain responses and thus the ability to detect irrelevant stimuli in the auditory modality. This finding agrees with neuroimaging studies conducted in experiments resembling flight control scenarios (
Dehais, Roy, Gateau, & Scannella, 2016;
Giraudet, St-Louis, Scannella, & Causse, 2015;
Scannella, Causse, Chauveau, Pastor, & Dehais, 2013). In an EEG/ERP study, participants were presented with video clips of a primary flight display with flight indicators and required to decide whether landing was feasible while responding to auditory targets when they occurred. Here, participants were more likely to miss alarms when the simulated scenario presented indicator values that suggested degradation of aircraft status (i.e., heading, magnetic declination, wind speed). More importantly, ERP responses to the presentation of target tones in such high-load aviation-decision scenarios exhibited a smaller P300 component—namely, a positive deflection in the Pz electrode recording, around 450 to 600 milliseconds post sound presentation—than in low-load scenarios (
Giraudet et al., 2015).
The amplitude of ERP components to visual or auditory stimuli can be treated as an index of information processing—namely, how aware one is of the presented stimuli. An influential account of the functional distinction between these components was provided by
Parasuraman and Beatty (1980), whereby the early negative deflection (i.e., N100) is likely to reflect event detection, while the later positive deflection (i.e., P300) is associated with both event detection and recognition. Given this, the reported finding of
Giraudet et al. (2015) suggests that high-load scenarios that are encountered in the visual domain reduce the brain’s capacity to recognize task-relevant events in the auditory domain.
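For readers unfamiliar with how such component amplitudes are quantified, the computation can be sketched as follows. This is an illustrative sketch on simulated single-electrode data, not the pipeline of any study cited here; the sampling rate, trial count, and signal shape are assumptions chosen for demonstration.

```python
import numpy as np

# Quantify a P300-like component as the mean amplitude of the average
# ERP in a post-stimulus window (here 450-600 ms, as at electrode Pz).
rng = np.random.default_rng(0)
fs = 250                            # assumed sampling rate in Hz
t = np.arange(-0.2, 0.8, 1 / fs)    # epoch from -200 to 800 ms

# Simulate 40 single-trial epochs: noise plus a positive deflection
# peaking around 500 ms (a crude stand-in for the P300).
p300 = 5.0 * np.exp(-((t - 0.5) ** 2) / (2 * 0.05 ** 2))
epochs = rng.normal(0.0, 2.0, size=(40, t.size)) + p300

# Baseline correction: subtract each trial's mean pre-stimulus voltage.
baseline = epochs[:, t < 0].mean(axis=1, keepdims=True)
epochs = epochs - baseline

# Grand average across trials, then mean amplitude in the window.
erp = epochs.mean(axis=0)
window = (t >= 0.45) & (t <= 0.60)
p300_amplitude = erp[window].mean()
print(round(p300_amplitude, 2))     # a clearly positive value (in µV)
```

Comparing this window-amplitude measure between high-load and low-load conditions is the standard way of testing whether a manipulation attenuates the component.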
Dual-task paradigms are often employed to study resource conflicts across operational domains (e.g., driving while using the phone). With EEG/ERP measurements, it is possible to investigate not only the behavioral consequences of resource conflicts but also the potential conflicts of information processing at the neural level (e.g.,
Wickens, Kramer, & Donchin, 1984). In the context of steering, increasing the difficulty of a primary visuomotor control task results in larger ERP amplitudes (i.e., P300) to secondary task stimuli if they are presented visually, while smaller P300 amplitudes are associated with secondary task stimuli that are presented in the auditory modality (
Sirevaag, Kramer, Coles, & Donchin, 1989; Wickens, Kramer, Vanasse, & Donchin, 1983). This concurs with a basic tenet of attentional load theory (
Lavie, 1995,
2005) whereby perceptual load in one modality biases the allocation of cross-modal resources to this modality at the cost of another.
Thus far, ID has been attributed to a lack of available resources for processing auditory information. However, cross-modal competition is not a necessary condition for this to happen. A lack of obvious task demands in the auditory domain could also diminish the brain’s capacity to respond to, process, and identify auditory information. In other words, while ID could result from an
active fatigue of cross-modal resources, which is the favored account thus far, it could also result from the
passive fatigue of resources selective for auditory processing (see
Desmond & Hancock, 2001;
May & Baldwin, 2009). In the context of driving, prolonged exposure to a monotonous environment (e.g., a straight road) has been shown to result in degraded steering performance (
Thiffault & Bergeron, 2003), which is referred to as a consequence of
underload as opposed to
overload. According to one account, underload conditions cause operators to withdraw resources from a task and induce them to rely on mental schemas of the task scenario instead (
Gimeno, Cerezuela, & Montanes, 2006). Notably, auditory alarms tend to occur infrequently across many operational scenarios (
Cummings, Gao, & Thornburg, 2016)—for example, in the supervision of nuclear power plants (
Carvalho, dos Santos, Gomes, Borges, & Guerlain, 2008), air traffic control (
Thompson et al., 2006), and anesthesiology (
Watt, Maslana, & Mylrea, 1993). Thus, the operational requirement of constant vigilance for rare auditory warnings makes inefficient use of limited mental or attentional resources (
Desmond & Hancock, 2001;
Gimeno et al., 2006;
Manly, Robertson, Galloway, & Hawkins, 1999). For warning sounds to be effective, relevant auditory alarms should occur neither too frequently nor too infrequently. If auditory alarms occur too frequently, operators might disregard warning sounds completely. This phenomenon, termed
alarm fatigue, has been reported especially in the health care domain (
Cvach, 2012) and corresponds with the concept of active fatigue, as mentioned previously—when the sheer number of auditory alarms overwhelms the operator (i.e., drains their resources), auditory warnings are ignored. Similarly, when auditory alarms occur infrequently, passive fatigue is likely to occur, and resources are withdrawn from the seemingly irrelevant (auditory) modality.
Given this, we would like to revisit the first example that was provided for ID (i.e., ID for aviation warnings during flight control;
Dehais et al., 2014). In this study, the authors observed that participants who had experienced and noticed a critical auditory alarm in the first trial were five times more likely to detect it in subsequent trials, even in windshear conditions that imposed high visuomotor demands. Given this, we posit that ID results from a combination of active fatigue—due to the cross-modal demands from the visual domain, such as vehicle handling (
Dehais et al., 2014), visual search (
Raveh & Lavie, 2015), and aviation landing decisions (
Giraudet et al., 2015)—as well as passive fatigue in the auditory modality due to the absence of obvious task demands.
How can we evaluate the possibility that the absence of obvious task demands in the auditory domain reduces our capacity for processing sounds? In the current work, we do so by measuring our participants’ involuntary neural responses to task-irrelevant sounds in their auditory environment. Complex environmental sounds (e.g., human laughter, dog barks) are known to generate characteristic ERPs (termed
distraction potentials;
Escera & Corral, 2003) even when they bear no task relevance. It is believed that the distraction potential consists of neural components that are responsible for how we detect these unexpected events (i.e., N1), orient our attentional resources to these events (novelty-P3), and reorient the resources back to the task at hand (i.e., reorientation negativity; RON) (
Escera & Corral, 2003,
2007;
Horváth, Winkler, & Bendixen, 2008;
Wetzel & Schröger, 2014). Furthermore, recent evidence suggests that the novelty-P3 consists of two subcomponents that are functionally distinct. While an early subcomponent (early novelty-P3; e-nP3) was shown to be determined by how unexpected the eliciting sound is, in terms of the difference of its physical properties with respect to its environment, a later subcomponent (late novelty-P3; l-nP3) was shown to be determined by the relevance of the eliciting sound (
Gaeta, Friedman, & Hunt, 2003;
Strobel et al., 2008). These results suggest that only the earlier subcomponent of the novelty-P3 is directly related to the orientation of attention to unexpected events. Its later subcomponent, on the other hand, resembles the well-known P300, an ERP component that is also elicited by task-relevant auditory stimuli (for a summary of P300, see
Polich, 2007). Interestingly, similarities between l-nP3 and P300 were also shown in terms of their neural origin. Independent component analysis as well as scalp current density analysis revealed the involvement of posterior-parietal neural regions in the generation of both l-nP3 and P300 (
Debener, Makeig, Delorme, & Engel, 2005;
Yago, Escera, Albo, Giard, & Serra-Grabulosa, 2003). The spatial topography of l-nP3 and P300 is typically linked to working memory updating operations (
Brázdil, Rektor, Daniel, Dufek, & Jurák, 2001;
Knight, 1996). Thus, e-nP3 and l-nP3 might underlie different attentional processes, respectively, the attentional orienting to an unexpected event (e-nP3) and the updating of working memory (l-nP3).
In previous work, we established that visuomotor control demands can diminish the late neural responses (i.e., e- and l-nP3) to task-irrelevant environmental sounds (
Scheer, Bülthoff, & Chuang, 2016). Others have shown similar findings with visual tasks, such as playing Tetris (
Dyke et al., 2015;
Miller, Rietschel, McDonald, & Hatfield, 2011). This reflects cross-modal demands of the visual modality on auditory processing. In the current work, we required half of our participants to perform an auditory detection task for target pure tones while performing a visuomotor control task (i.e., compensatory tracking of roll disturbances with rotorcraft dynamics). Given the theorizing thus far, we hypothesize that selective ERP components to task-irrelevant environmental sounds will be larger when the auditory modality is task-relevant compared to when participants are not required to monitor it. Furthermore, we believe that such an effect would reflect the allocation of modality-specific resources to the auditory modality and should be independent of cross-modal demands imposed by a visuomotor task. Finally, the affected ERP component(s) will allow us to infer which stage of auditory information processing suffers a reduced capacity under our manipulation of auditory irrelevance, independently of general cross-modal task demands. The implications of this are discussed in more detail after the results are presented.
Discussion
We found that task relevance of the auditory modality selectively increased the l-nP3 potential. This suggests that auditory relevance increased the likelihood that working memory is updated for the occurrence of environmental sounds (
Cycowicz & Friedman, 1998;
Gaeta et al., 2003;
Strobel et al., 2008). ERP components that underlie the detection and the orientation to environmental sounds, namely, N1 and e-nP3, respectively, were not affected by auditory relevance. Critically, this influence of auditory relevance on l-nP3 was independent of whether or not participants were required to perform a visuomotor task. Therefore, we conclude that auditory relevance enhances the likelihood that we update our working memory for environmental sounds regardless of the cross-modal demands of a concurrent visuomotor task. This supports our hypothesis that ID is not solely caused by high workload demands in the visual domain.
In this work, we specifically analyzed the ERPs that were generated as a response to environmental sounds. Our goal was to investigate our participants’ capacity for processing unexpected and task-irrelevant auditory events. Such ERP waveforms have been termed
distraction potentials because they indicate our available capacity to engage with events that have no immediate relevance (
Escera & Corral, 2003). To recapitulate, the deflections in the distraction potential are, respectively, associated with our capacity to detect (N1), orient to (e-nP3), and update our working memory for (l-nP3) changes in our (auditory) environment. The current results show that only the l-nP3 component was selectively enhanced by auditory task relevance (
Figure 2). It should be pointed out that the ERP analysis that is employed in this study is data driven. This means that we did not restrict our analyses to a priori ERP components.
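A data-driven analysis of this kind can be illustrated with a simple permutation scheme that compares two conditions at every timepoint, rather than within a preselected component window. The sketch below uses simulated data and a max-statistic null distribution; it is an illustration of the general approach, and the data, trial counts, and threshold are assumptions rather than the study's actual analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_times = 30, 200

# Condition B carries an extra positive deflection over timepoints 120-150.
cond_a = rng.normal(0, 1, (n_trials, n_times))
cond_b = rng.normal(0, 1, (n_trials, n_times))
cond_b[:, 120:150] += 1.5

def max_abs_diff(a, b):
    """Maximum absolute difference of condition means across all timepoints."""
    return np.abs(a.mean(axis=0) - b.mean(axis=0)).max()

observed = max_abs_diff(cond_a, cond_b)

# Build a null distribution by shuffling condition labels; taking the
# max statistic across timepoints controls for multiple comparisons.
pooled = np.vstack([cond_a, cond_b])
null = np.empty(500)
for i in range(500):
    perm = rng.permutation(2 * n_trials)
    null[i] = max_abs_diff(pooled[perm[:n_trials]], pooled[perm[n_trials:]])

p_value = (null >= observed).mean()
print(p_value)   # small: the deflection is detected without a predefined window
```

The point of such a procedure is that no component window has to be specified in advance; whichever part of the waveform differs between conditions can reach significance.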
If we assume that the chain of ERP components, which compose the distraction potential, reflects the consecutive steps that are necessary to process auditory events, the current results suggest that auditory task-irrelevance selectively reduces our capacity to update our representation of our surroundings. It does not impair our ability to detect or orient toward changes in the environment. To understand and ideally improve our ability to detect changes in our environment, the other stages of auditory information processing, reflected by the components of the distraction potential, should also be taken into account. Current evidence suggests that the detection of and attention-orienting to an unexpected auditory event increases with increasing difference of an auditory event from its immediate environment. For example, larger deviations in an unexpected auditory event’s physical properties from the expected event tend to be reflected in larger amplitudes of the early negative ERP components (N1 and mismatch negativity [MMN]) (
Rinne, Särkkä, Degerman, Schröger, & Alho, 2006) and e-nP3 (
Gaeta et al., 2003). Such findings could be used to improve the detectability of warning sounds by making them more distinct from their immediate environment. Besides detecting unexpected auditory events and updating our working memory, it is also relevant whether and how operators are able to orient their attention away from the unexpected auditory event and back to the main task. This process is reflected by the RON component. The main task in this study was a continuous manual tracking task that did not contain discrete events. Thus, it precluded an evaluation of reorientation of attention from the auditory modality back to the primary task. Future research could employ a step-tracking task instead to directly evaluate the RON component to understand the influence of auditory relevance on the efficiency of reorienting attentional resources to the main task.
High cross-modal demands of the visual domain can impact the different stages of auditory processing in a more general fashion or more selectively, depending on how visual demands are manipulated in the first place. In a previous study that is directly comparable to this work, we demonstrated a more general cross-modal influence of the concurrent visuomotor task on distraction potentials than is currently observed (
Scheer et al., 2016). Specifically, the requirements of the visuomotor task attenuated e-nP3, l-nP3, as well as the RON while sparing N1/MMN. The influence of cross-modal demands on auditory processing has been suggested to depend on whether the demands of the visual task are manipulated at either the perceptual or cognitive level (
Lavie, 1995,
2005). Manipulations of high perceptual load in the visual task have been found to selectively decrease N1/MMN, the argument being that reduced auditory sensitivity stems from participants’ inability to even detect the occurrence of auditory events in the first place (
A. F. Kramer, Trejo, & Humphrey, 1995;
Scannella et al., 2013;
Singhal, Doerfling, & Fowler, 2002). On the other hand, manipulating the cognitive demands of the visual task—for example, working memory load in a visual n-back task (
SanMiguel, Corral, & Escera, 2008) or the complexity of an aviation decision task (
Giraudet et al., 2015)—can selectively decrease later components such as P3 or RON. Although there are different reasons for why and how high visual task demands might induce ID, it appears that auditory irrelevance has a more specific impact. It reduces our capacity to update our representation of the auditory environment, which is a plausible factor that could give rise to ID.
Our current results demonstrate that auditory relevance increased the capacity for auditory processing at the l-nP3 stage independent of visuomotor demands. The experimental manipulation here did not create conditions that resulted in substantial conflict between the visuomotor and auditory task in any way that was apparent at the behavioral (i.e., visuomotor performance) or subjective (i.e., NASA-TLX workload) level (see
Figure 3). Therefore, we believe that auditory relevance has a modality-specific influence on resource capacity.
It continues to be debated whether attentional resources are shared between the modalities (i.e., cross-modal) or specific to them (i.e., modality-specific) (e.g.,
Keitel, Maess, Schröger, & Müller, 2013;
Talsma, Doty, Strowd, & Woldorff, 2006;
Wahn & König, 2017). Experimental evidence exists for both assumptions. Numerous dual-task studies have shown that increased demands in a task presented in one modality often decrease performance levels in a concurrent task that is presented in another modality (
A. F. Kramer, Wickens, & Donchin, 1983;
Sirevaag et al., 1989;
Wickens et al., 1983). Nonetheless, the capacity of modality-specific resources can also be manipulated, as in this study, without influencing the availability of resources in a separate modality.
Keitel et al. (2013) employed a more direct approach than we have currently adopted whereby concurrent streams of visual and auditory lexical items were presented and participants were explicitly instructed to attend either to the visual or auditory stream or both. Steady-state EEG/MEG responses indicated that a shift of attention to either sensory stream of information could raise neural activity to that modality without diminishing activity in the unattended modality. Similarly, in our study, we find that l-nP3 to irrelevant sounds can be enhanced by introducing modality relevance independent of cross-modal visuomotor demands. Thus, it is likely that both cross-modal and modality-specific resources exist (cf.
Talsma et al., 2006). Our current results suggest that both can influence the phenomenon of ID, and that increasing the capacity of modality-specific resources by making the modality relevant could mitigate the risk of ID.
The current findings have at least three important implications for human factors applications. To begin, decreased l-nP3 could be used to index the risk of ID. This means that the operational scenarios that carry the risk of ID could be evaluated without relying on the observation of behavioral
misses, which occur rarely, if at all. Task-irrelevant environmental sounds can be embedded in many operational scenarios without compromising their integrity. Future research in signal processing and state classification could also be motivated to perform this assessment in real time, instead of the offline analysis that was performed here. Recent progress in the design of classification algorithms for ERPs is promising and shows that a classification of the state of the human operator is possible even with single trials (
Blankertz, Lemm, Treder, Haufe, & Müller, 2011;
Freeman, Mikulka, Prinzel, & Scerbo, 1999;
Wilson & Russell, 2003). For example, mental workload can be classified with an accuracy of more than 70% after only three presentations of the stimulus of interest using ERP measures (
Brouwer et al., 2012). More promising than the risk evaluation of ID is the potential prevention of its occurrence. Our findings show that unexpected auditory information generates larger l-nP3 responses when the auditory modality contains a simple task that neither interferes with visuomotor control nor increases perceived workload. Requiring pilots to perform simple and frequent tasks in the auditory modality could heighten their awareness of the auditory environment, even in situations that pose high visual demands. This could prevent the occurrence of ID to critical auditory warnings (e.g.,
Dehais et al., 2014).
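As an illustration of how single-trial assessment of operator state might proceed, the sketch below classifies simulated trials from one window-amplitude feature using a nearest-mean rule. Operational classifiers (such as the regularized approaches cited above) are considerably more sophisticated; the window, effect size, and trial structure here are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_times = 100, 150
window = slice(80, 110)     # assumed late post-stimulus window (cf. l-nP3)

# Simulate single trials: "aware" trials carry a larger late deflection.
aware = rng.normal(0, 1, (n, n_times))
aware[:, window] += 1.2
unaware = rng.normal(0, 1, (n, n_times))

def feature(trials):
    """Mean amplitude in the analysis window, one value per trial."""
    return trials[:, window].mean(axis=1)

# Train on the first half of trials, test on the held-out second half.
f_aw, f_un = feature(aware), feature(unaware)
threshold = (f_aw[:50].mean() + f_un[:50].mean()) / 2

correct = (f_aw[50:] > threshold).sum() + (f_un[50:] <= threshold).sum()
accuracy = correct / 100
print(accuracy)   # well above the 0.5 chance level
```

In a real-time setting, the same feature would be computed on incoming epochs and compared against thresholds learned from calibration data, flagging periods in which late responses to probe sounds collapse.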
The current work is limited in that we did not directly observe ID in the overt behavior of our participants. This would have been challenging given the scarcity of its occurrence. Nonetheless, previous studies provide us with sufficient reason to believe that the increased amplitude of l-nP3 to an auditory stimulus reflects the heightened awareness of the auditory environment, which consequently relates to an ability to respond to the given stimulus (
Gaeta et al., 2003). Here, increases in l-nP3 corresponded with a necessity to produce an overt reaction to unexpected auditory events relative to situations where no responses were required.
Our findings have implications that are beyond the identification and mitigation of ID. Recent years have witnessed an increasing interest in auditory displays, namely, the presentation of complex data through non-speech sounds (
Hermann, 2008;
G. Kramer, Walker, & Bargar, 2010). However, it remains unclear whether the auditory presentation of complex data would interfere with visual data processing. Our current study suggests that modality-specific resources exist and that parallel processing of visual and auditory information can occur without interference. Furthermore, our results suggest that environmental sounds can result in an update of working memory content even when no response to these sounds is required. Auditory displays could therefore produce a background awareness of the system state even when operators are involved in a visual task and do not have to respond to the auditory events.
To conclude, the current findings demonstrate that irrelevance of the auditory modality selectively diminishes l-nP3 responses to environmental sounds. We believe that this is a concomitant factor in the occurrence of ID in the real world, given the rare occurrence of auditory warnings and hence a default perception of the auditory modality as task-irrelevant. Auditory irrelevance, and the reduced ability to update our representation of the auditory environment that it entails, is an independent factor that does not interact with visuomotor demands.