Imagery Rescripting Versus Extinction: Distinct and Combined Effects on Expectancy and Revaluation Learning

Anxiety disorders are effectively treated with exposure therapy, but relapse remains high. Fear may reinstate after reoccurrence of the negative event because the expectancy of the aversive outcome (unconditioned stimulus [US]) is adjusted but not its evaluation. Imagery rescripting (ImRs) is an intervention that is proposed to work through revaluation of the US. The aim of our preregistered study was to test the effects of ImRs and extinction on US expectancy and US revaluation. Day 1 (n = 106) consisted of acquisition with an aversive film clip as US. The manipulation (ImRs + extinction, extinction-only, or ImRs-only) took place on Day 2. Reinstatement of fear was tested on Day 3. Results showed expectancy learning in both extinction conditions but not in the ImRs-only condition and no enhanced revaluation learning in ImRs. The combination of ImRs and extinction slowed down extinction but did not protect against reinstatement, which pleads in favor of stand-alone interventions in clinical practice.

memory representation is created (CS → no US) that competes with and inhibits the original CS → US memory representation (Bouton, 2002). This results in an ambiguous meaning of the CS after extinction: The CS now predicts both the occurrence and the nonoccurrence of the US (Bouton, 1993). The CS → US trace is inhibited but can easily win the retrieval competition under certain circumstances, which leads to return of fear (i.e., relapse). One way in which this can happen is during reinstatement, when an encounter with the original US results in a return of the CR after successful extinction, which reinstates the CS → US association in memory (Bouton, 2002; Vervliet et al., 2013).
According to Davey (1997), the CR can be influenced by two processes: US expectancy and US revaluation. US expectancy is assumed to be addressed in extinction learning, in which the expectation of the US occurring as signaled by the CS is lowered. However, US revaluation is thought not to be directly targeted by extinction. During US revaluation, the mental representation of the US changes (e.g., in valence and/or meaning). Imagery rescripting (ImRs) is an intervention that has been proposed to work through US revaluation (Arntz, 2011, 2012). During ImRs, the outcome of an aversive event in memory is mentally rescripted into a more positive one. There are several protocols on how to apply ImRs. Smucker et al. (1995) developed the treatment to mitigate symptoms of posttraumatic stress disorder (PTSD) related to childhood abuse. Their intervention includes (a) reliving the traumatic scene (i.e., prolonged imaginal exposure) and (b) changing the memory using mental imagery (i.e., ImRs). Arntz and Weertman (1999), on the other hand, suggested that prolonged exposure is not necessary because ImRs "is not based on extinction, but on processing new, corrective information about the meaning of the event" (p. 719). As a result, their protocol includes (a) affective-memory activation, (b) the patient intervening in the scene as the adult self, and (c) the patient experiencing this intervention by the adult self from the child perspective.
As a therapeutic technique, ImRs is effective in several psychological disorders, such as social anxiety disorder (e.g., Frets et al., 2014; Norton & Abbott, 2016) and PTSD (Raabe et al., 2015; for a review and meta-analysis, see Morina et al., 2017). The question remains, however, whether the proposed revaluation mechanism is indeed what makes ImRs effective. Only two studies assessed this directly. Given that these studies used healthy participants and applied an experimental (i.e., lab) version of ImRs, we refer to this version of ImRs as "ImRs exp." That is, ImRs has been adjusted to fit into the fear-conditioning paradigm. For example, we used a newly acquired fear memory instead of an autobiographical memory, a standardized script instead of a personalized one, and no adult and child perspective. Note that memory activation and alteration are still included in ImRs exp. Dibbets et al. (2012) were the first to study working mechanisms of ImRs exp and extinction in a 1-day fear-conditioning paradigm. In their study, participants read a script describing an aversive event (i.e., a little boy dies after a car accident), after which a picture of a car (CS) was paired with a picture of a mutilated boy (US) in Context A. In the manipulation phase, all four groups went through an extinction phase, three of them in a new Context B (ABA groups) and one in the acquisition Context A (AAA group). Participants received either a script with a positive ending related to the US to imagine during extinction (ABA-ImRs exp), a US-unrelated script to imagine during extinction (ABA-imagery control), or regular extinction (ABA-no and AAA-no). Return of fear was tested in Context A in all groups. They found that adding ImRs to extinction led to less return of fear and more US revaluation. Note that they also found that ImRs exp led to slower extinction compared with an extinction-only condition.
A possible explanation may be that ImRs exp required additional cognitive efforts (i.e., mental imagery), which resulted in a more complex and thus longer learning process. Dibbets et al. (2018), using the same paradigm and population, found that the aversiveness of the US representation was decreased after (vs. before) ImRs exp (without extinction). These findings provide preliminary evidence for revaluation learning in ImRs exp .
Our main aim was to test the distinct and combined effects of extinction and ImRs exp on expectancy learning (targeting the CS) and revaluation learning (targeting the US). We included an ImRs exp -only condition, an extinction-only condition (EXT-only), and an ImRs exp + extinction combination (ImRs exp +EXT) condition to optimally delineate specific effects. We employed a 3-day fear-conditioning paradigm, which allows for consolidation of learned associations (Stickgold, 2005; Walker & Stickgold, 2004), thereby promoting translation to clinical practice, in which fear memories are usually consolidated before treatment (see James et al., 2015; Siegesleitner et al., 2019). An emotional memory was formed on the first day using an aversive film and an acquisition phase. Following Kunze et al. (2015), we used a meaningful reinforced CS (CS+; picture from the film) and US (fragment from the film) to increase ecological validity of the fear-conditioning paradigm. The manipulation (EXT-only, ImRs exp -only, and ImRs exp +EXT) took place on the second day, and a fear reinstatement test took place on the third day. Regarding expectancy learning (as measured by US expectancy and physiological measures; Hypotheses 1-3), we expected a larger decrease in fear from premanipulation to postmanipulation in both extinction conditions (EXT-only and ImRs exp +EXT), compared with the ImRs exp -only condition (Hypothesis 1). Second, in line with Dibbets et al. (2012), we expected slower extinction in the ImRs exp +EXT condition compared with the EXT-only condition (Hypothesis 2). Third, we hypothesized that the ImRs exp +EXT condition would result in lower fear reinstatement than either intervention alone because of the combination of expectancy learning and revaluation, which in theory should address the entire CS → US → CR association (Hypothesis 3).
Fourth, and finally, we expected that the ImRs exp conditions (ImRs exp -only and ImRs exp +EXT) would show more US revaluation compared with the EXT-only condition, as measured by US aversiveness and US-related emotion ratings (Hypothesis 4). Exploratively, we looked at revaluation as a mediator between the ImRs exp manipulation and fear.

Participants
A total of 120 participants between 18 and 40 years old were recruited between March 2019 and February 2020. Participants were screened online. Exclusion criteria were a history of physical or sexual assault or abuse, PTSD symptoms (with or without diagnosis), a diagnosis of one or more psychiatric disorders, and serious medical problems. All participants gave written informed consent and received either 16 euros or course credits for their participation. Fourteen participants were excluded from all analyses because of technical problems (n = 8), noncompliance (n = 2), or incomplete data (n = 4). The final sample consisted of 106 participants (27 male; mean age = 22.17 years, SD = 2.85), and the random condition allocation was as follows: ImRs exp +EXT (n = 34), ImRs exp -only (n = 36), and EXT-only (n = 36). The study was approved by the faculty's ethics committee (FETC18-133) and was carried out in accordance with the Declaration of Helsinki.

Material and measures
Film. A short film clip (1.27 min) from Irréversible (Noé, 2002) was shown. The film clip consists of men fighting and shouting in a club, ending in a violent attack with a fire extinguisher. The film clip was rated as aversive and induced distress and anxiety in previous studies (Arnaudova & Hagenaars, 2017; Krans & Bos, 2012).
Conditioning stimuli. Two different pictures were used as CSs. The reinforced CS (CS+) was a picture of a fire extinguisher. The unreinforced CS (CS-) was a picture of a fire reel. The US consisted of a 2-s part of the film clip, including sound, depicting the beating with the fire extinguisher. A meaningful CS+ (i.e., relevant to the violence depicted in the film clip) was chosen to induce a stronger and more meaningful fear association between the CS+ and the US. This increases ecological validity because it mimics real life more closely than a neutral or unassociated CS+ (Carpenter et al., 2019; Kunze et al., 2015). Moreover, Kunze et al. showed delayed extinction compared with a nonmeaningful fear-conditioning procedure (i.e., with a neutral film clip), which improves the assessment of differences between conditions in extinction curves (i.e., Hypothesis 2).

Questionnaires.
PTSD symptoms. To screen for PTSD symptoms, we used the Dutch version of the Primary Care PTSD screen for DSM-5 (PC-PTSD; Prins et al., 2016). The questionnaire consists of six questions that can be answered with yes or no. The first question assesses the presence of potential traumatic events. If yes, five additional questions assess PTSD symptoms according to the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 2013).
Neuroticism. Neuroticism is a personality trait that is associated with enhanced fear learning (e.g., Hur et al., 2016). To check whether neuroticism scores were equally distributed between conditions, the neuroticism scale of the short version of the Eysenck Personality Questionnaire (EPQ) was used (Sanderman et al., 1991, 2012). This scale consists of 12 questions that can be answered with either yes (1) or no (0), which creates a sum score ranging from 0 (low neuroticism) to 12 (high neuroticism). Internal consistency was moderate in our study, Cronbach's α = .69.
Imagery ability. Mental imagery is an important component of ImRs. Therefore, the ImRs exp manipulation could theoretically work better for people with higher imagery ability (see Dibbets et al., 2012). Moreover, there is evidence that mental imagery may enhance fear responses in a conditioning paradigm. To measure mental imagery ability, we used the Plymouth Sensory Imagery Questionnaire (Psi-Q; Andrade et al., 2014). This questionnaire contains seven sets of three items each; every set assesses imagery in a different sensory modality (sight, smell, sound, touch, taste, bodily sensation, and emotion). Participants rated the vividness of their imagery on an 11-point scale from 0 (no image at all) to 10 (as vivid as real life). For the randomization check, a mean score across modalities was created. Internal consistency was good in the current sample, Cronbach's α = .89.
ImRs exp checks. Participants in the ImRs exp conditions were asked how well they were able to imagine the script and how credible they found the story of the script on two scales from 0 (not at all) to 10 (very much). They were also asked to indicate which of the two memories was stronger (script or film memory).
Stimulus ratings. US expectancy was used as an explicit measure of fear and was assessed using a visual analogue scale (VAS) ranging from 0 (not at all expecting the film clip) to 100 (definitely expecting the film clip).
Revaluation learning was assessed with measures of US aversiveness, US-related positive and negative emotions, and, exploratively, US vividness. Participants were instructed to imagine the US as vividly and in as much detail as possible and then provide their ratings.
Participants rated US aversiveness on a VAS ranging from 0 (not at all unpleasant) to 100 (extremely unpleasant).
Positive and negative emotions (PNE) regarding the imagined US were rated using a VAS ranging from 0 (not at all feeling like this) to 100 (very much feeling like this). Emotions were chosen on the basis of a pilot study. Positive emotions (PE) included "proud," "hopeful," "strong," and "satisfied." Negative emotions (NE) were "scared," "guilty," "shocked," and "powerless." The means of the four positive and the four negative emotions were used to indicate PE and NE, respectively. For reasons of brevity, we refer to PE and NE as PNE unless it concerns one of the specific scales.
Participants rated US vividness on a VAS ranging from 0 (not at all vivid) to 100 (extremely vivid).
CS+ and CS- aversiveness ratings were collected by showing the two pictures accompanied by a VAS ranging from 0 (not at all unpleasant) to 100 (extremely unpleasant).
Physiological measures. An ActiveTwo electroencephalography system (BioSemi, Amsterdam, the Netherlands) was used to measure fear-potentiated startle (FPS) and skin-conductance response (SCR) as additional indicators for fear as a part of expectancy learning. Two reference electrodes were placed on the forehead, approximately 1 cm below the hairline (Blumenthal et al., 2005). The signal was recorded with ActiView software (BioSemi, Amsterdam, the Netherlands).
FPS was measured with electromyography of the left orbicularis oculi muscle using two 4-mm Ag/AgCl electrodes filled with Signa gel (Parker Laboratories, Fairfield, NJ) electrolyte conductive gel. SCR was measured using two 5-mm Ag/AgCl electrodes filled with Signa gel placed on the proximal part of the palm of the left hand, with approximately 1.5 cm between the electrodes (Boucsein et al., 2012).

Procedure
The experiment took place on 3 consecutive days, with 23 to 25 hr between each session (see Fig. 1).
Typical conditioning trial. Each CS trial started with an 8-s CS presentation. US expectancy was rated during the first 7 s of each CS presentation by showing the scale at the bottom of the screen. The startle probe was white noise presented 7.5 s after CS onset through headphones at 100 dB(A) for 50 ms. In reinforced trials, the US was presented directly after CS+ offset. The intertrial interval (ITI) was a black screen with a white fixation cross that appeared for 15 to 25 s (M = 20 s). Noise-alone (NA) trials were similar to the ITI, except that a startle probe was presented at trial onset. Trials were always randomly presented in blocks of one CS+, one CS-, and one NA trial, which results in a maximum of two consecutive trials of the same stimulus.
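The block-wise randomization described above can be sketched as follows. This is a minimal illustration, not the software used in the study; the function names and the fixed seed are hypothetical, and the check simply confirms that shuffling within blocks caps runs of the same stimulus at two.

```python
import random

STIMULI = ("CS+", "CS-", "NA")

def make_trial_sequence(n_blocks, rng):
    """Build a trial order from shuffled blocks of one CS+, one CS-, and one NA trial."""
    sequence = []
    for _ in range(n_blocks):
        block = list(STIMULI)
        rng.shuffle(block)  # random order within each block
        sequence.extend(block)
    return sequence

def longest_run(sequence):
    """Return the length of the longest run of identical consecutive stimuli."""
    if not sequence:
        return 0
    longest = current = 1
    for previous, following in zip(sequence, sequence[1:]):
        current = current + 1 if previous == following else 1
        longest = max(longest, current)
    return longest

rng = random.Random(0)  # fixed seed only to make this sketch reproducible
acquisition_order = make_trial_sequence(8, rng)  # 8 blocks, 24 trials
print(acquisition_order, longest_run(acquisition_order))
```

A run of two can only arise across a block boundary (the last trial of one block and the first trial of the next), which is why no additional constraint is needed.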

Day 1.
Preparation. The first day consisted of the film and the acquisition phase for all participants. After participants signed the informed consent form, electrodes were attached, and headphones were put on. Participants then rated CS aversiveness (No. 1, prefilm) and completed the EPQ and the Psi-Q. Imagery was practiced with a neutral imagery exercise in which participants imagined a neutral script about preparing their lunch as vividly and with as much detail as possible (following Hagenaars & Arntz, 2012). The experimenter guided the practice session by focusing on present tense, first-person perspective, and sensory details. Then, participants watched a film clip. In line with trauma-film instructions (see Arnaudova & Hagenaars, 2017), they were instructed to immerse themselves in the film. After the film, CS aversiveness (No. 2, postfilm) was measured again. Then, eight NA startle probes followed to habituate participants' startle responses.
Acquisition phase. Participants were instructed to learn which of the two CSs would be followed by the film clip and to indicate their expectancy of the film clip on the scale during the CS presentation. Participants started with a practice phase to get used to the time-limited ratings. During this phase, a picture of a swing and a picture of a slide were both presented once, unreinforced, including US expectancy ratings and a startle probe. Then, the acquisition phase started. Both CSs and NA trials were presented eight times. The CS+ was paired with the US six out of eight times (i.e., reinforcement rate of 75%, first and fifth trials unreinforced; see Lonsdorf et al., 2017). The CS- was never paired with the US. After acquisition, participants were shown the two CSs and asked to indicate which image was followed by the film clip to check for explicit contingency awareness.
ImRs exp -only. Participants in the ImRs exp -only condition were asked to read a script with a positive ending related to the film. They were instructed to immerse themselves in the script, as if they were witnessing the situation in reality. In the script, the participant witnesses the violent attack with the fire extinguisher and calls the security guard, who stops the fight. The injuries do not seem too bad. Participants were instructed on screen to close their eyes and start imagining the script as vividly and in as much detail as possible and to end their imagery when they heard a tone (i.e., after 8 s). They repeated the imagery of the script 12 times.
EXT-only. Participants in the EXT-only condition were told that they would see the same pictures as the day before and were asked to rate their US expectancy again. They received 12 unreinforced CS+, CS-, and NA presentations (i.e., 36 trials in total).
ImRs exp +EXT. Participants in the ImRs exp +EXT condition were given the same script and instructions as the ImRs exp -only condition. Then, they were told that they would see the same pictures as the day before and were asked to rate their US expectancy again. They received 12 unreinforced CS+, CS-, and NA presentations (i.e., 36 trials in total). Participants imagined the positive script from CS+ offset until a signal tone (i.e., after 8 s).
Day 3. The third day started with attaching the electrodes and putting on the headphones. The experiment started with US (No. 4, prespontaneous recovery) and CS (No. 6, prespontaneous recovery) ratings, and eight NA startle probes were presented.

Spontaneous recovery phase.
No expectancy measures were included in the manipulation phase for the ImRs exp -only condition. Therefore, we included a spontaneous recovery phase to assess premanipulation compared with postmanipulation US expectancy for all groups.
Participants were told that they were going to see the two different pictures from Day 1 (i.e., CS+ and CS-) and that they had to rate their expectancy of the film clip again. Then, four trial blocks were presented, all unreinforced. After these four blocks, US (No. 5, prereinstatement) and CS (No. 7, prereinstatement) ratings were assessed.
Reinstatement phase. Directly after the US and CS ratings and with no further instructions, participants were shown a fixation cross for approximately 30 s. Then, three unexpected US presentations followed. Approximately 20 s after the last US presentation, four trial blocks followed, all unreinforced. After this, US (No. 6, postreinstatement) and CS (No. 8, postreinstatement) ratings were assessed. Participants were debriefed, and all electrodes were removed.

Data preparation
FPS. FPS data were filtered (28-500 Hz), rectified, and filtered again (15.9 Hz; Blumenthal et al., 2005). Because of a variable delay in startle-probe delivery (0-100 ms), we extended the recommended window for peak detection to 20 to 200 ms after probe onset. Peak data were baseline corrected by subtracting a baseline value, defined as the mean signal from 30 ms before probe onset through 20 ms after probe onset. Peaks were then standardized into z scores using each participant's mean response and standard deviation on each day, across stimuli (Blumenthal et al., 2005).
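The baseline correction and within-participant standardization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the input is assumed to be already filtered and rectified, inclusive window boundaries are assumed, and the sample standard deviation is used because the text does not specify which estimator was applied.

```python
import statistics

def baseline_corrected_peak(samples, times_ms):
    """Peak in the 20-200 ms window minus the mean of the -30 to 20 ms baseline.

    `times_ms` holds sample times relative to probe onset (hypothetical layout).
    """
    baseline = [s for s, t in zip(samples, times_ms) if -30 <= t <= 20]
    window = [s for s, t in zip(samples, times_ms) if 20 <= t <= 200]
    return max(window) - statistics.fmean(baseline)

def zscore_within_day(peaks):
    """Standardize one participant's peaks from one day, pooled across stimuli."""
    mean = statistics.fmean(peaks)
    sd = statistics.stdev(peaks)  # sample SD is an assumption
    return [(p - mean) / sd for p in peaks]

# Toy epoch sampled every 10 ms from -30 to 200 ms with a single peak at 100 ms.
times = list(range(-30, 201, 10))
epoch = [5.0 if t == 100 else 1.0 for t in times]
print(baseline_corrected_peak(epoch, times))
print(zscore_within_day([1.0, 2.0, 3.0]))
```

Standardizing within participant and day removes between-person and between-session differences in overall startle magnitude, so that comparisons between CS+, CS-, and NA trials are made on a common scale.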
SCR. SCR data were filtered (low-pass filter = 10 Hz; notch filter = 50 Hz), after which entire-interval responses were calculated by taking the peak in a 1- to 7-s window after CS onset and applying a baseline correction with the mean of 2 s before CS onset (Pineles et al., 2009). A response criterion of 0.02 μS was applied (Landkroon et al., 2020). Peaks were subsequently range corrected and square root transformed (Boucsein et al., 2012).
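The SCR scoring steps can be illustrated with a short sketch. Two conventions here are assumptions rather than details given in the text: responses below the 0.02 μS criterion are scored as zero, and range correction divides each response by the participant's largest response. The function name is hypothetical.

```python
import math

RESPONSE_CRITERION_US = 0.02  # microsiemens, as stated in the text

def score_scr(raw_amplitudes):
    """Apply the response criterion, range correction, and square-root transform."""
    if not raw_amplitudes:
        return []
    # Assumed convention: responses below the criterion count as zero responses.
    thresholded = [a if a >= RESPONSE_CRITERION_US else 0.0 for a in raw_amplitudes]
    largest = max(thresholded)
    if largest == 0.0:
        return [0.0 for _ in thresholded]  # no scorable response to scale by
    # Assumed reference for range correction: the participant's largest response.
    return [math.sqrt(a / largest) for a in thresholded]

print(score_scr([0.01, 0.25, 1.0]))
```

Range correction places every participant's responses on a 0 to 1 scale, and the square-root transform reduces the positive skew that is typical of SCR amplitudes.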

Data analyses
Prehypothesis analyses.
Randomization check. Sex differences between conditions were tested with a χ² test. Differences in age, neuroticism (EPQ), imagery ability (Psi-Q), and Day 1 US aversiveness ratings were analyzed with one-way analyses of variance (ANOVAs).
For all analyses, Greenhouse-Geisser (ε < .75) or Huynh-Feldt (ε > .75) correction was applied in case of violation of sphericity. The α level was set at .05 for all analyses. In case of a significant main effect, post hoc tests with Bonferroni correction (α = .05 divided by 3 = .017) were conducted. Effect sizes were reported in case of a significant effect only.
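The decision rules in this paragraph can be written out as a small sketch. The function names are hypothetical, and assigning ε = .75 exactly to Huynh-Feldt is an assumption because the text specifies only ε < .75 and ε > .75. Both corrections work by multiplying the numerator and denominator degrees of freedom by ε, which is why fractional degrees of freedom appear in the reported F tests.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after Bonferroni correction."""
    return alpha / n_comparisons

def sphericity_correction(epsilon):
    """Name the correction used for a given epsilon estimate."""
    # The boundary case epsilon == .75 is assigned to Huynh-Feldt by assumption.
    return "Greenhouse-Geisser" if epsilon < 0.75 else "Huynh-Feldt"

def corrected_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by epsilon."""
    return df_effect * epsilon, df_error * epsilon

print(round(bonferroni_alpha(0.05, 3), 3))
print(sphericity_correction(0.62), sphericity_correction(0.91))
print(corrected_df(2, 100, 0.5))
```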
FPS. There was a main effect of stimulus, F(1.82, 165.19) = 43.57, p < .001, ηp² = .324, 95% CI = [.209, .420]. The mean startle amplitude was higher for the CS+ than for the CS-, t = 4.39, p < .001. The mean startle amplitudes for both the CS+ and the CS- were higher than for NA trials, ps < .001. This stimulus differentiation did not change over time (Stimulus × Time interaction), F(14, 1232) = 0.78, p = .689. There were no interactions with condition, ps > .139, which indicates that there was no evidence for differences between conditions regarding acquisition in terms of FPS (see Fig. 3).

Hypothesis 1: premanipulation compared with postmanipulation
Hypothesis 1 stated that US expectancy, FPS, and SCR would decrease more from premanipulation to postmanipulation in the ImRs exp +EXT and EXT-only conditions than in the ImRs exp -only condition.
FPS. The Stimulus × Time × Condition interaction was not significant, F(4, 168) = 1.23, p = .302. The Time × Condition interaction showed a trend, F(2, 84) = 3.02, p = .054, ηp² = .067. Post hoc t tests showed that the ImRs exp +EXT condition tended to have a larger increase in FPS in general compared with the EXT-only condition, p = .024. There were no other significant differences among conditions, ps > .122.
SCR. The Stimulus × Time × Condition and Time × Condition interactions were not significant, F(2, 103)s < 1.69, ps > .191, which indicates no evidence for changes from premanipulation to postmanipulation in SCR.
In conclusion, in line with Hypothesis 1, explicit US expectancy decreased from the final acquisition to first spontaneous recovery trial (i.e., premanipulation vs. postmanipulation) in both EXT groups but not in the ImRs exp -only condition. However, this pattern could not be observed in the FPS and SCR data.

Hypothesis 2: extinction rate
Hypothesis 2 stated that the ImRs exp +EXT condition would show slower extinction than the EXT-only condition, as measured by US expectancy, FPS, and SCR. In line with Hypothesis 2, the ImRs exp +EXT condition showed a slower extinction rate regarding US expectancy for the CS+ than did the EXT-only condition (Extinction Trial 12 vs. Trial 1, p = .018). See Figure 2 for a graphical presentation of the extinction curves.
To gain more insight into our extinction data, we performed an exploratory post hoc χ² test to check whether the conditions differed in the number of participants reaching the extinction criterion, defined as a US expectancy rating ≤ 25 for the CS+ on the last extinction trial (see Dibbets et al., 2012). Significantly more participants did not reach the criterion in the ImRs exp +EXT condition (n = 8) compared with the EXT-only condition (n = 2), χ²(1, N = 70) = 4.61, p = .032. This was also true after correction for baseline responding to the CS-, χ²(1, N = 70) = 5.48, p = .019; ImRs exp +EXT: n = 7; EXT-only: n = 1.
F(10.07, 604.11) = 2.11, p = .022, ηp² = .034, 95% CI = [< .001, .050]. The ImRs exp +EXT condition showed a larger general decrease in FPS than did the EXT-only condition (Extinction Trial 12 vs. Trial 1, p = .030; see Figure 3). This is contrary to Hypothesis 2.
To conclude, in line with Hypothesis 2, explicit US expectancy ratings show slower extinction in the ImRs exp +EXT condition compared with the EXT-only condition. However, the FPS data showed, contrary to Hypothesis 2, a larger decrease for both CSs in the ImRs exp +EXT condition. The SCR data revealed no evidence for differences between the EXT conditions.
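The two χ² statistics for the extinction criterion can be reproduced from the reported counts with a short sketch. It assumes a Pearson χ² without continuity correction and that all participants in the ImRs exp +EXT (n = 34) and EXT-only (n = 36) groups entered the test, so the "criterion reached" cells are derived from the group sizes rather than taken from the text. The function name is hypothetical.

```python
import math

def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with df = 1
    return chi2, p

# Rows: ImRs exp +EXT, EXT-only; columns: criterion not reached, criterion reached.
chi2_raw, p_raw = pearson_chi_square_2x2(8, 26, 2, 34)
chi2_adj, p_adj = pearson_chi_square_2x2(7, 27, 1, 35)  # after CS- baseline correction
print(round(chi2_raw, 2), round(p_raw, 3))
print(round(chi2_adj, 2), round(p_adj, 3))
```

Under these assumptions the sketch yields values that round to the reported statistics (4.61 and 5.48), which suggests that no continuity correction was applied. With expected cell counts near 4 to 5 in the "not reached" column, an exact test would be a reasonable sensitivity check.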
To gain more insight into the unexpected reduced reinstatement of US expectancy for the CS+ in the ImRs exp -only group, we conducted exploratory post hoc analyses on the spontaneous-recovery data. That is, a 4 (Time: Spontaneous Recovery Trials 1-4) × 3 (Condition: ImRs exp -only, EXT-only, ImRs exp +EXT) repeated measures ANOVA was conducted on US expectancy for the CS+. This analysis revealed a significant Time × Condition interaction, F(4.02, 198.94) = 4.22, p = .003, ηp² = .079. However, subsequent t tests on the difference scores (Trial 4 - Trial 1) did not show any group differences after α correction, ps > .029.
FPS. The Stimulus × Time × Condition interaction was not significant, F(4, 174) = 0.25, p = .908, but there was a significant Time × Condition interaction, F(2, 87) = 3.45, p = .036, ηp² = .073, 95% CI = [< .001, .182]. Post hoc t tests showed effects at trend level after α correction. The ImRs exp -only condition tended to show less overall reinstatement than did the ImRs exp +EXT and EXT-only conditions, ps < .024. The ImRs exp +EXT and EXT-only conditions did not differ from each other on FPS at reinstatement, t < 1. Thus, Hypothesis 3 was not confirmed in the FPS data.
SCR. The Stimulus × Time × Condition and Time × Condition interactions were not significant, F(2, 103)s < 1.01, ps > .367, which indicates no evidence for differences in SCR at prereinstatement compared with postreinstatement between conditions. This does not confirm Hypothesis 3.
In conclusion, in contrast to Hypothesis 3, we did not find lower reinstatement in the ImRs exp +EXT condition, but we did find it in the ImRs exp -only condition instead, at least on US expectancy (CS+) and for startle (CS+ and CS-) at trend level.
In conclusion, and counter to Hypothesis 4, we did not find evidence for the expected differences between conditions on US aversiveness and PNE. On the contrary, we found a decrease in PE for the ImRs exp -only condition, but not for the other conditions, from prereinstatement to postreinstatement.
The change in CS aversiveness at premanipulation compared with postmanipulation differed for each condition, as evidenced by a significant Time × Condition interaction, F(2, 103) = 3.40, p = .037, ηp² = .062, with a marginally significant difference for each stimulus (Stimulus × Time × Condition interaction), F(2, 103) = 2.93, p = .058, ηp² = .054. Participants in the ImRs exp +EXT (p < .001) and EXT-only (p = .002) conditions showed a larger decrease in CS (i.e., average of CS+ and CS-) aversiveness than participants in the ImRs exp -only condition. The ImRs exp +EXT and EXT-only conditions did not differ from each other, p = .795. Within-groups analyses showed that in the ImRs exp +EXT and EXT-only conditions, the CSs were rated on average as less aversive at postmanipulation compared with premanipulation, ps ≤ .001, whereas no evidence was found for a change in CS aversiveness in the ImRs exp -only condition at premanipulation compared with postmanipulation, p = .659.
CS aversiveness across stimuli changed differently at prereinstatement compared with postreinstatement between conditions (Time × Condition interaction), F(2, 101) = 4.66, p = .012, ηp² = .084, and showed no evidence for differences for each stimulus (Stimulus × Time × Condition interaction), F(2, 103) = 2.64, p = .077, ηp² = .050. Participants in the ImRs exp +EXT condition showed a stronger increase in CS aversiveness than participants in the ImRs exp -only condition, t(65.92) = 2.69, p = .009. The EXT-only condition did not differ from either other condition, ps > .083. Within-groups analyses showed that participants in the ImRs exp +EXT and EXT-only conditions rated both CSs on average as more aversive at postreinstatement compared with prereinstatement, ps < .001. The ImRs exp -only condition did not show a significant change in CS aversiveness at prereinstatement compared with postreinstatement, p = .652. See Table 2 for descriptive statistics.
ImRs exp compliance checks. Participants generally reported being able to imagine the script in both ImRs exp conditions (M = 7.55, SD = 1.26); there was no evidence for differences between conditions, t < 1. Mean credibility of the script was 5.62 (SD = 2.57), and there were no significant condition differences, t < 1. Most participants (73.6%) indicated that their memory of the film was stronger than their memory of the script after the experiment; 17.0% had a stronger memory of the script, and 9.4% indicated that both memories were equally strong. This did not significantly differ between conditions, χ²(2, N = 53) = 2.86, p = .240.

Discussion
In the present study, we compared the effects of ImRs and extinction procedures on expectancy and revaluation learning in a 3-day fear-conditioning paradigm. We expected that extinction would mainly target expectancy learning (i.e., US expectancy, FPS, and SCR) and that ImRs exp would mainly target revaluation learning (i.e., US aversiveness and PNE). We did find evidence that expectancy learning occurred in both extinction conditions. However, we did not find evidence for enhanced revaluation learning in the ImRs exp conditions.
Our US expectancy data diverged from our physiological data. Therefore, we discuss US expectancy results first. As expected, compared with the conditions that contained an extinction procedure (Hypothesis 1), the ImRs exp -only condition resulted in a smaller decrease in US expectancy from premanipulation to postmanipulation. This suggests that ImRs exp and extinction indeed have different working mechanisms. Whereas extinction largely relies on expectancy learning, ImRs exp does not. This may have important implications for clinical practice; specific treatments with distinct working mechanisms may be tailored to different patients (see, e.g., Fisher, 2015; Fisher et al., 2019). Currently, it is unclear whether exposure should precede ImRs (e.g., protocols of Smucker et al., 1995, vs. Arntz & Weertman, 1999). Our results imply that ImRs as a solo intervention does not have the same effects as exposure on expectancy learning. Therefore, if expectancy learning seems necessary for successful treatment, exposure should be included in the treatment procedure.
In line with our second hypothesis, we found the expected slower extinction of US expectancy in the combined ImRs exp +EXT condition compared with the EXT-only condition. This is in line with Dibbets et al. (2012), who found slower extinction in an ImRs exp + extinction group compared with an extinction-only group. In addition, combining ImRs exp and extinction led to more participants with unsuccessful extinction compared with participants in the EXT-only condition in both Dibbets et al. (2012) and the current study. There are several possible explanations for this slower extinction in the combination group. First, it may reflect more complex learning, involving divided attention. A previous study indeed found slower extinction if participants had a simultaneous secondary task during extinction with high cognitive load compared with low cognitive load (Raes et al., 2009).
Although our participants did not have to allocate their attention to both tasks (updating US expectancy and executing mental imagery) at the same time, participants still had two tasks during extinction. This attention division may increase cognitive load, perhaps in the form of task-switching costs, which could have interfered with extinction. Second, the script included visualizing at least part of the US (i.e., the fire extinguisher) following the CS+, which perhaps could have counteracted extinction learning given that both meanings of the CS+ (predicting the absence and the presence of the US) were rehearsed during the experimental phase in the combination group. For clinical practice, this implies that applying ImRs and exposure simultaneously may not be useful. Rather, therapists may look at the individual needs (e.g., changing appraisals of the negative event vs. reducing overestimation of the negative event reoccurring) of their patient and stick to one strategy at a time. Note that this does not mean that a patient cannot benefit from both strategies. If both ImRs and exposure are deemed necessary, our data imply that one intervention should precede the other instead of combining both.
Regarding Hypothesis 3, we did not find the expected reduced reinstatement for the combined group relative to either manipulation alone. All groups showed reinstatement of US expectancy, and ImRs exp -only participants showed the least reinstatement. There may be several explanations. First, as stated before, ImRs exp does not seem to target expectancy learning, and reinstatement was measured in US expectancy ratings. The potential additional effects of ImRs exp when combined with extinction may not be reflected in these ratings, which results in no differences in reinstatement between the two extinction groups. We did, however, find differences in extinction of US expectancy and impaired extinction in the combined group. In line with this impaired expectancy learning, it could be expected that reinstatement is also stronger in the ImRs exp +EXT group compared with the EXT-only group, which was not the case. The reinstatement procedure might have been too strong to reliably observe group differences (see van Dis et al., 2019). Participants across all groups seemed shocked when presented with the unexpected USs after the spontaneous recovery phase, and many of them indeed indicated after the experiment that they did not expect the US at all. Yet reinstatement was lower in the ImRs exp -only condition, which suggests that the procedure was not too strong for this group. This may have been due to higher US expectancy on the final spontaneous recovery trial because some sort of ceiling effect may have occurred (although the actual ceiling was not reached in terms of the scale that was used to assess US expectancy).
Alternatively, the ImRs exp -only manipulation might have resulted in a new memory (i.e., the new script), which, in turn, may have led to enhanced subsequent extinction learning in the spontaneous recovery phase and thereby reduced reinstatement. Exploratory analyses did show differences in the US expectancy curve in the spontaneous recovery phase, although our data did not provide clear results regarding the exact nature of these differences. Note that expectancy rates did not differ between groups after reinstatement, which suggests that a limited number of extinction trials may be effective in having ImRs exp -only participants "catch up" on expectancy learning. Future research could use a counterbalanced within-subjects design including phased ImRs exp and extinction, making sure that US expectancy extinguishes sufficiently for everyone and controlling for order effects of the two manipulations.
Finally, we did not find the expected enhanced US revaluation for ImRs exp (Hypothesis 4); that is, across measures, there were no group differences on US aversiveness or on positive and negative emotions at premanipulation compared with postmanipulation. The absence of group differences is in line with several other lab studies (e.g., Dibbets et al., 2018; Kunze et al., 2019) and implies that US revaluation may not be a working mechanism specific to ImRs exp . However, the processes leading to US revaluation might differ between extinction and ImRs. That is, presentation of the CS+ in the extinction groups may have evoked a mental representation of the US that led to some form of habituation and, in turn, revaluation of the US (for a similar argument, see Dibbets et al., 2018). On the other hand, imagination of the script in the ImRs exp conditions might have resulted in a change in meaning of the US, which is then reflected in a change in aversiveness and emotion ratings (Arntz, 2012). Alternatively, revaluation in ImRs may take place on the CS+ rather than the US. In that case, ImRs exp should have led to reduced aversiveness regarding the CS+. This was not observed in our data. Rather, extinction appears to have decreased CS aversiveness (i.e., CS+ and CS-) after the manipulation, but there was no evidence for this decrease for the ImRs exp -only condition. The same holds for reinstatement, in which CS aversiveness increased in both extinction conditions but not in the ImRs exp -only condition. This may imply that extinction (or expectancy learning) is needed for CS revaluation, whereas US revaluation takes place after extinction and after ImRs. Hence, our data imply that CS revaluation is not a working mechanism of ImRs. Because this is an exploratory finding, replication is required.
Another mechanism of ImRs may be altered memory-associated cognitions or core beliefs, which are part of the US → CR memory representation (and not simply an assessment of the US alone). For example, after rescripting a memory of a violent event, the perpetrator may remain negative, but the meaning of the event itself in terms of associated beliefs (e.g., of mastery or self-compassion; Arntz, 2012) may change. Two studies using a trauma-film paradigm found evidence for this. Hagenaars and Arntz (2012) found fewer negative cognitions about the world and less self-blame after ImRs exp compared with an imagery rehearsal control condition. In addition, Siegesleitner et al. (2020) found an increase in mastery after ImRs exp compared with imagery rehearsal. A fear-conditioning paradigm may not adequately assess such higher order cognitions. Future studies may include cognition measures, rather than mere US valence measures, in a different paradigm, such as the trauma-film paradigm, or in ImRs of autobiographical memories.
Remarkably, our physiology data were not fully in line with our US expectancy data. No group differences could be observed on SCR, as opposed to US expectancy, and the FPS data partially showed the same results as US expectancy (i.e., for acquisition) but showed the opposite pattern for Hypothesis 2 (i.e., extinction rate). Diverging results of implicit and explicit measures are, however, not uncommon (Boddez et al., 2013). For example, Haesen and Vervliet (2015) found diverging SCR and US expectancy results, which suggests that SCR is not simply a physiological measure of US expectancy. Other researchers (e.g., Sevenster et al., 2014; Soeter & Kindt, 2010) did observe an association between explicit US expectancy and SCR but not FPS. It has also been suggested that FPS is a measure of valence rather than US expectancy (Bublatzky et al., 2013; Lang, 1995), although Mertens and De Houwer (2016) found that FPS changed in line with contingency instructions. These differences might be due to the large amount of noise in physiological data and may indicate the limited reliability of physiological measures (Ney et al., 2018; for a similar argument, see Landkroon et al., 2019).
Our study had several limitations that should be mentioned. First, participants imagined a standardized script in the ImRs exp conditions. This script may have been more relevant to some participants than others. The large variability in credibility ratings seems to support this. The ImRs exp effects therefore may have been reduced. Second, although participants had an imagery practice phase, the experimenter did not guide them through the actual rescripting phase. Even though participants indicated that they could imagine the script quite well, some participants may have executed this task better than others. Third, the spontaneous recovery phase consisted of multiple trials, as is recommended and has been done in previous studies (see Lonsdorf et al., 2017). This procedure allowed the detection of reinstatement effects in all three conditions. However, as a consequence, this may have resulted in extinction learning during the spontaneous recovery phase in the ImRs exp -only condition. Thus, reduced reinstatement of US expectancy in the ImRs exp -only condition may not be solely attributable to the ImRs manipulation. Fourth, our study assessed only age and sex as demographic data, which makes it impossible to evaluate our results in light of different ethnic, cultural, and socioeconomic backgrounds. Because our experiment was quite extensive and answering questions regarding ethnic, cultural, or socioeconomic backgrounds may be a sensitive issue, we decided to stick to basic demographics to not burden our participants more than necessary.
Our study also has several strengths. First, we used a 3-day fear-conditioning paradigm; thus, the manipulations and reinstatement tests took place on consolidated acquisition and manipulation memories, respectively, which promotes translation to clinical practice. Second, we used an audiovisual, meaningful US (following Kunze et al., 2015), which mimics clinical practice more accurately than standard conditioning paradigms. In addition, we used specific measures for US evaluations (i.e., US aversiveness and US-related emotions) instead of commonly used state measures.
In conclusion, our study confirmed that extinction targets expectancy learning, whereas ImRs exp alone does not. We did not find evidence for enhanced US revaluation after ImRs exp . Furthermore, ImRs exp combined with extinction may hamper the speed and effectiveness of extinction. Adding ImRs exp to the extinction procedure did not buffer against reinstatement. We found reduced reinstatement for ImRs exp -only, but this may be distorted by the lack of extinction before reinstatement in this group. Further research is needed to specify effects and mechanisms of ImRs exp and extinction. Our results may also have important clinical implications because tailoring specific treatments to specific patients may be more useful than combining different treatment strategies.

Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Funding
M. Woelk is supported by Research Foundation-Flanders (FWO-Vlaanderen) Project G069918N.

Open Practices
All data have been made publicly available via OSF and can be accessed at https://osf.io/9wcyt. The design and analysis plans for the experiments were preregistered at OSF and can be accessed at https://osf.io/9wcyt. This article has received badges for Open Data and Preregistration. More information about the Open Practices badges can be found at https://www.psychologicalscience.org/publications/badges.