Differential Effects of Aging on Autobiographical Memory Tasks

This study examined the role of aging in the recall and recognition of autobiographical memories. Young and older adults submitted personal events during a period of 3 months to an Internet diary. After this period, they performed a cued-recall test based on what, who, and where retrieval cues. Three months later, participants completed a recognition test in which the descriptions of half the entries were altered. The results indicated no age differences on the cued-recall task, but several age differences on the recognition task. Older adults were more likely to accept altered entries as authentic, particularly when these changes had been subtle. However, despite their lower performance, older adults were more confident in the accuracy of their decisions. The results suggest that different mechanisms underlie the recall and recognition of autobiographical memories, and that only tasks that subtly tap into source monitoring abilities are affected by cognitive aging processes.

Other studies have demonstrated similar age-related declines in source monitoring. Older adults produced more thoughts and feelings for imagined (e.g., imagining you visited a seminar room) and experienced (e.g., actually visiting a seminar room) events than young adults, who in turn reported more perceptual and spatial information about the two kinds of events (Hashtroudi, Johnson, & Chrosniak, 1990). Whereas the recollection of perceptual and spatial information helped to identify the source of the memory, the recollection of thoughts and feelings did not improve source monitoring. Other studies have demonstrated a similar impaired ability among older adults to access perceptual, spatial, and temporal details of memories (e.g., Gras et al., 2011; Henkel et al., 1998).
A lowered ability to access perceptual, spatial, and temporal details of an original event among older adults may be exacerbated when the original event is relatively remote, misinformation is presented, or participants are instructed to elaborate on misleading details. Frost, Ingraham, and Wilson (2002) demonstrated that misinformation acceptance increased over longer time periods and when participants were encouraged to mentally reconstruct the initial event or visualize misleading details. In Dijkstra and Misirlisoy (2009), older adults performed a recognition task of altered memories submitted 1 year earlier. Substantial false recognition rates of altered memories (39%) were found, especially when the reported events were remote and the altered reports contained changes not essential to the content of the memory. False recognition of foil events was also found to increase in a diary study among college students with an increase in delay and when foils and original records were semantically similar (Barclay & Wellman, 1986).
In short, results of studies using various source monitoring and recognition tasks have demonstrated converging evidence on memory deficits under certain experimental conditions (i.e., remote events, similarity between original and altered or foil events) and in older age. This memory deficit does not appear to be a general impairment in older adults but a more limited ability to access details of the original experience, particularly for remote events. One or two subtle (peripheral) changes in the memory report would hardly affect the reconstruction process, whereas one or two substantial (central) changes would (Dijkstra & Misirlisoy, 2009). Moreover, with changes in peripheral details, the correct decision would involve judgments based on verbatim information from the original reports, whereas central changes alter the gist of the experience. As older adults tend to base their decisions more on gist than on verbatim information (because this strategy is less demanding on attentional resources), central changes would be noticeable for older adults whereas peripheral changes may go unnoticed (Koutstaal & Schacter, 1997).

Cued Recall
In contrast to the established age deficit in source monitoring and recognition, there is evidence for relative age invariance for other memory tasks, such as semantic memory. Piolino, Desgranges, Benali, and Eustache (2002) had young and older adults recall episodic details of recent and remote personal events as well as semantic personal information (e.g., names) from the same time periods. Episodic recall was found to decline more over time with age than semantic recall. A similar result was obtained by Levine, Svoboda, Hay, Winocur, and Moscovitch (2002), in which young adults produced more episodic details for autobiographical memories than older adults, whereas semantic details were produced in equal quantities among young and older adults. In this study, specific probes about the event, time, place, sensory information, and emotion contributed to a reduction of age differences in episodic richness for memories from the past year. Such probes may function as a form of support that reduces age differences in the retrieval of episodic details.
The effect of retrieval support on autobiographical memory recall with cues (who, what, where, when) taken from the original report has been demonstrated to aid the recall of the remainder of the memory in studies with young adults (Burt, 1992), older adults (Catal & Fitzgerald, 2004), and both young and older adults (Dijkstra & Misirlisoy, 2006; Kristo et al., 2009). These studies demonstrated better performance with the what retrieval cue than with other retrieval cues (Burt, 1992; Catal & Fitzgerald, 2004), equal facilitation of the what and who cue (Dijkstra & Misirlisoy, 2006; Kristo et al., 2009), and better performance with multiple cues than with one retrieval cue (Wagenaar, 1986). Together, these findings support the idea that retrieval cues may help reinstate access to details of the original experience, and hence, support accurate retrieval of the memory and its details.

The Present Study
Because of the contrasting results with regard to age differences in source monitoring and recognition in comparison with cued-recall tests with retrieval support, the present study sought to examine these differential effects of aging more closely using a diary study. Using diary entries as a memory base enables the assessment of veridical recall of earlier submitted entries as well as recollection accuracy of authentic and altered diary entries. Moreover, potential confounds with regard to the remoteness of memories and the type of memories (everyday vs. unique memories) across age groups could be avoided.
All diary entries were personal events reported in the same frequency (i.e., three or four events per week) and within the same time frame (i.e., 3 months) by young and older adults. After this period, participants performed a cued-recall test based on what, who, and where retrieval cues. We presented the cued-recall test immediately after the recording phase to prevent potential floor effects for this relatively difficult test. Three months later, participants completed a recognition test in which the descriptions of half the entries were altered. Retention time was set at 6 months after the start of the diary entry phase to prevent potential ceiling effects for this relatively easy task.
We expected differential effects of aging on the two autobiographical memory tasks. No differences were expected between the age groups in the performance on the cued-recall task, because previous research indicated similar benefits from retrieval support for young and older adults (Dijkstra & Misirlisoy, 2009). There were, however, differences predicted with regard to the efficacy of cues (e.g., Lancaster & Barsalou, 1997). Better retrieval was predicted with multiple retrieval cues over single retrieval cues (cf. Wagenaar, 1986), because multiple cues contain a larger part of the reconstruction of the memory. Moreover, what cues were expected to be more successful than other cues, because they contribute more to the reconstruction of the original experience (the event itself) than where and who cues (Burt, 1992; Catal & Fitzgerald, 2004). The when cue was not included in the study, because earlier findings have shown that this retrieval cue is not helpful (Dijkstra & Misirlisoy, 2006).
However, age differences were predicted in the performance on the recognition task, because source monitoring is involved. A recognition task for an earlier report of an event that may contain alterations requires not only access to the source of the original experience but also the additional step of evaluating the given report against what is remembered from the original experience. The detection of subtle alterations is highly taxing on cognitive resources that are in shorter supply among older adults. Therefore, age differences were expected in the acceptance of peripheral changes that did not alter the memory itself relative to central changes that altered the memory. In addition, age differences were expected when there was only one change in the altered report, as two changes would be more noticeable. No age differences were expected for unaltered entries, because no comparison with altered details would be needed (Koutstaal & Schacter, 1997; Koutstaal, Schacter, Galluccio, & Stofer, 1999).
In short, the current study examined the role of age in cued recall and recognition accuracy for authentic and altered diary entries. An additional area of interest was the confidence level with which the recognition decisions were made, which could provide a deeper understanding of the relative ease of the decision-making process. Age differences in these outcomes would support the idea of different mechanisms underlying the recall and recognition of autobiographical memory.

Participants
Participants in this study were recruited among students and adults aged 60 and above who lived in or near the city of Rotterdam, the Netherlands. Students received course credit, whereas older adults volunteered their participation. Informed consent was obtained from all participants. Moreover, the study complied with the requirements from the ethics committee of the Erasmus University and consolidated standards on reporting trials. Participants had to have access to email and the Internet, be willing to come to the laboratory twice, and be prepared to keep a diary for 3 months. Twenty-seven young adults and 32 older adults started the study, but two young adults and eight older adults dropped out, leaving 25 young adults (M age = 20.60 years, SD = 2.61 years, range = 18-26 years) and 24 older adults (M age = 65.90 years, SD = 3.06 years, range = 60-71 years). The group of young adults consisted of 22 female and three male participants, whereas the group of older adults consisted of 19 female and five male participants.

Materials and Procedure
The study consisted of four stages. At the beginning of the study, participants came to the laboratory for an initial testing session. During this session, the participants first completed the Mini-Mental State Examination (MMSE), a verbal fluency task, and a memory span task. Subsequently, the procedure for recording personal events in an online diary was explained. These events had to be specific (i.e., not have taken more than several hours) and recent (i.e., occurred that day or up to 2 days before). The descriptions had to be at least 40 words long and contain what, who, and where components. The participants also had to provide ratings, such as the frequency of occurrence (ranging from once per day to once in a lifetime) and the intensity of the emotional reaction during the event (ranging from completely unemotional to extremely emotional), on 7-point scales. Participants practiced entering one or two personal events on the website. The descriptions and the ratings of these events were checked immediately to ensure that the participants fully understood the recording procedure. The participants left with instructions on how to continue these diary entries.
For 3 months, participants recorded three or four personal events per week. During this period, entries were regularly checked to ensure that participants kept recording a sufficient number of events. When this was not the case (and this happened only occasionally), participants received an email, encouraging them to increase their recording rate. At the end of the recording phase, participants were contacted to set the time and date of the second session at the laboratory, where further testing would occur.
During the second session, participants completed a cued-recall test, in which the activity (what), people (who), and location (where) of the events were used as cues. Participants were initially presented with one of the three possible cues of a previously submitted event (e.g., who had been involved) and asked to give the remaining two (e.g., what the event was about and where it had happened). After the first question, the participant was presented with two cues and asked to give the remaining one (cf. Kristo et al., 2009).
Questions about 18 personal events (about half the total number of events that had been submitted) had to be answered. The events were divided on the basis of their date of occurrence over three time periods (i.e., first six, middle six, and last six) and the presentation order of the cues was counterbalanced over these time periods. Scores for cued recall were calculated as follows: 2 points for a correct answer, 1 point for a partly correct or less specific answer, and 0 points for an incorrect answer or no answer. As two answers could be provided after one cue (e.g., if the cue was who, answers had to be provided for what and where) and only one answer after two cues (e.g., if the cues were who and where, an answer had to be provided for what), a maximum of 4 points could be earned when one retrieval cue had been provided and a maximum of 2 points when two retrieval cues had been provided.
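The scoring rule described above can be illustrated with a minimal sketch. The rating labels ("correct", "partial", "incorrect") and the function names are hypothetical choices made for this illustration and are not taken from the study materials; only the point values and maxima follow the description in the text.

```python
# Points per answer as described in the text:
# correct = 2, partly correct or less specific = 1, incorrect or no answer = 0.
# The label strings themselves are hypothetical.
POINTS = {"correct": 2, "partial": 1, "incorrect": 0}
COMPONENTS = {"what", "who", "where"}


def max_score(cues):
    """Maximum points for a question: 2 points per component left to recall."""
    return 2 * len(COMPONENTS - set(cues))


def score_trial(cues, ratings):
    """Score one cued-recall question.

    cues: the one or two components presented as retrieval cues.
    ratings: dict mapping each remaining component to a rating label.
    """
    cues = set(cues)
    if not cues <= COMPONENTS or len(cues) not in (1, 2):
        raise ValueError("cues must be one or two of what/who/where")
    targets = COMPONENTS - cues
    if set(ratings) != targets:
        raise ValueError("ratings must cover exactly the uncued components")
    return sum(POINTS[ratings[target]] for target in targets)


# One cue (who): two answers are required, so the maximum is 4 points.
one_cue = score_trial({"who"}, {"what": "correct", "where": "partial"})
# Two cues (who, where): one answer is required, so the maximum is 2 points.
two_cues = score_trial({"who", "where"}, {"what": "correct"})
```

Because the one-cue and two-cue questions have different maxima, comparisons between them are easier to read when scores are expressed relative to the value returned by max_score.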
Three months after the second session (M = 93.8 days), participants were contacted again, this time to complete an online recognition test. For this test (cf. Dijkstra & Misirlisoy, 2009), participants were presented with 16 descriptions of earlier recorded events. Events that had been used in the cued-recall test were not used in the recognition test. Half of the 16 descriptions were unaltered entries; the other half had been altered with plausible substitutes (see Note 1). In four altered descriptions, one or two peripheral details were altered (e.g., I wore my hair in a ponytail/hanging down when I went to the ball) and, in the four remaining altered descriptions, one or two central elements related to the gist of the event were altered (e.g., I had an exam today about statistics at the university. It went a lot better/worse than expected). Participants indicated whether the presented descriptions were exactly the same as the descriptions they had entered (yes or no), how confident they were in this decision (on a 5-point scale ranging from not confident at all to highly confident), and how often they had talked and thought about the events on 7-point scales.

Results
To assess whether the events used in the cued-recall and the recognition test were similar across age groups, comparisons were made on the diary entry ratings. As could be expected, properties of the experiences recorded in the online diary were similar across age groups (ps ≥ .057). Age differences were only found for the length of the descriptions and the frequency of occurrence ratings. Descriptions from young adults contained fewer characters (M = 360.9, SD = 60.2) than those from older adults (M = 471.5, SD = 131.5), t(45) = 3.78, p < .001, Cohen's d = 0.974, 95% CI = [51.71, 169.46]. Moreover, young adults recorded more frequently occurring events (M = 3.60, SD = 0.49) than older adults (M = 4.52, SD = 0.52), t(45) = 6.29, p < .001, Cohen's d = 1.356, 95% CI = [0.626, 1.216]. These properties could potentially affect performance on the cued-recall and the recognition test. The results reported below were therefore also tested with multilevel analyses (Wright, 1998). Although the length of the descriptions and frequency of occurrence of the personal events varied, none of the results reported below changed because of this variation.
To ensure that differential findings of aging would not be caused by the time of the test, we calculated the correlations between the scores and the age of the events on the individual trials of the cued-recall and the recognition test and compared these correlations across the age groups. The scores on trials of the cued-recall test were affected by age of the event. Events were between 2 and 106 days old (M = 41.8 days), and events that had happened recently were remembered better than events that had happened longer ago, r(846) = −.165, p < .001. This effect of event age was present in both young adults, r(450) = −.186, p < .001, and older adults, r(396) = −.152, p = .002. These two correlations did not differ from each other, Z = −0.51, p = .610. Unlike the cued-recall test, the scores on the trials of the recognition test were not affected by age of the event. The events were between 83 and 218 days old (M = 137.1 days), but events that had happened more recently were not remembered better than events that had happened longer ago, r(736) = −.044, p = .230. When the effect of event age was examined separately for young and older adults, neither correlation was significant, r(368) = −.017, p = .751 and r(368) = −.058, p = .264. These two correlations did not differ from each other, Z = −0.55, p = .582. Because the correlations did not differ between the age groups, any differential findings of aging on the cued-recall and the recognition test will not be caused by the time of the test.
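The comparison of two independent correlations reported above can be sketched with Fisher's r-to-z transformation. The text does not name the procedure that was used, so the method here is an assumption, as is the reading of the reported degrees of freedom as n − 2 (giving n = 452 and n = 398 trials for the cued-recall test). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent correlations.

    Uses Fisher's r-to-z transformation (assumed; not named in the text).
    Returns the z statistic and its two-tailed p value.
    """
    z1 = math.atanh(r1)  # Fisher transformation of the first correlation
    z2 = math.atanh(r2)  # Fisher transformation of the second correlation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Cued-recall test: young adults r(450) = -.186, older adults r(396) = -.152.
# Sample sizes are inferred as df + 2, which is an assumption.
z_recall, p_recall = compare_correlations(-0.186, 452, -0.152, 398)
print(f"Z = {z_recall:.2f}, p = {p_recall:.3f}")
```

Under these assumptions the sketch yields a Z of about −0.51, in line with the reported value; the p value can differ from the reported one in the third decimal because the correlations are entered in rounded form.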

Cued-Recall Test
The first hypothesis predicted age invariance on the cued-recall test. However, differential effectiveness of retrieval cues and better retrieval on the cued-recall test with more than one retrieval cue were expected. What cues were expected to facilitate retrieval more than who and where cues, because what cues have better reconstruction properties. Furthermore, after two retrieval cues, the provided context should sufficiently aid participants to reinstate the initial experience.
To assess the effect of the number of cues, a repeated-measures ANOVA was conducted, in which age group was the between-subjects factor and the number of cues was the within-subjects factor. The results demonstrated a main effect of the number of cues on the score (see Table 1), F(1, 45) = 59.13, p < .001, ηp² = .568. After one cue, participants recalled relatively less information than after two cues. There was, as expected, neither a main effect of age group (p = .447) nor an age group by number of cues interaction effect (p = .352). Both age groups had similar benefits from one retrieval cue and improved their performance in the same way after two retrieval cues. With regard to the type of retrieval cue used, differences were found in how effective the cues were for retrieval. An age group by type of cue ANOVA indicated a main effect of type of cue, F(1, 45) = 47.16, p < .001, ηp² = .512. Figure 1 shows the average scores after one retrieval cue for each cue type and age group. Performance after one cue (maximum = 24) was, as expected, better with the what than with the who cue, t(46) = 6.86, p < .001, Cohen's d =

Recognition Test
In contrast to the cued-recall test, the second hypothesis predicted age differences on the recognition test. Specifically, age-related differences in recognition were expected for the type of entry, the type of alteration, and the number of alterations with age differences predicted for subtler and fewer changes. We first examined how well participants could distinguish between authentic and altered entries by calculating the d′ and response bias of both groups (see Table 2). Answers were scored as hits (authentic entries correctly recognized as authentic), false alarms (altered entries recognized as authentic), misses or incorrect rejections (authentic entries recognized as not authentic), and correct rejections (altered entries correctly recognized as not authentic). Participants could accurately distinguish between authentic and altered entries (d′ = 0.680). Young adults did not outperform older adults (p = .073). Judging from the criterion, all participants were inclined to consider entries as authentic (c = −0.360), but this tendency was stronger among older adults, indicating that they had a stronger bias, t(44) = 2.29, p = .027, Cohen's d = 0.646, 95% CI = [0.073, 1.136].
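As an illustration of how sensitivity and response bias follow from the four answer categories defined above, the following minimal sketch computes d′ and the criterion c with the standard equal-variance signal detection formulas. The log-linear correction for extreme rates is an assumption (the text does not state how rates of 0 or 1 were handled), and the counts in the example are invented for illustration only.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one participant.

    hits / misses refer to authentic entries; false_alarms / correct_rejections
    refer to altered entries, following the scoring described in the text.
    """
    n_authentic = hits + misses
    n_altered = false_alarms + correct_rejections
    # Log-linear correction (assumed) keeps the rates away from 0 and 1,
    # where the inverse normal transform would be undefined.
    hit_rate = (hits + 0.5) / (n_authentic + 1)
    false_alarm_rate = (false_alarms + 0.5) / (n_altered + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


# Hypothetical participant with 8 authentic and 8 altered entries.
d_prime, criterion = sdt_measures(
    hits=7, misses=1, false_alarms=5, correct_rejections=3
)
```

With this sign convention, a negative criterion indicates a tendency to answer "authentic", which corresponds to the direction of the bias reported above.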
An age group by type of entry ANOVA yielded a main effect of type of entry (authentic vs. altered) on recognition accuracy (i.e., proportion of correct answers), F(1, 44) = 24.45, p < .001, ηp² = .357. The main effect of age group (p = .149) and the interaction between type of entry and age group (p = .065) were not significant.
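For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom. The short sketch below uses the standard conversion formula; the function name is hypothetical, and the sketch is offered as a consistency check rather than as the authors' own computation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio: F * df1 / (F * df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of type of entry on recognition accuracy, F(1, 44) = 24.45.
entry_effect = partial_eta_squared(24.45, 1, 44)
# Main effect of number of cues on cued recall, F(1, 45) = 59.13.
cue_effect = partial_eta_squared(59.13, 1, 45)
print(f"{entry_effect:.3f}, {cue_effect:.3f}")
```

Both values agree with the effect sizes reported in the text when rounded to three decimals.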
An age group by type of alteration ANOVA revealed a main effect of type of alteration, F(1, 44) = 38.34, p < .001, ηp² = .466 (see Table 2). Participants performed, as pre-
Additional analyses were conducted on the confidence ratings regarding the recognition decisions. One would expect lower performance to coincide with lower confidence. Surprisingly, older adults were more confident than young adults regarding their decisions even though their accuracy rates were lower. An age group by type of entry ANOVA on the confidence ratings indicated a main effect of type of entry (authentic vs. altered), F(1, 44) = 12.98, p = .001, ηp² = .228, a main effect of age group, F(1, 44) = 16.97, p < .001, ηp² = .278, and an age group by type of entry interaction, F(1, 44) = 7.11, p = .011, ηp² = .139. Figure 3 shows the results of this analysis. There were no age differences in the extent to which participants had talked and thought about the personal events they had entered into the diary (p = .721 and p = .777, respectively). The age differences for recognition accuracy and confidence levels can therefore not be explained by differences in rehearsal.

Discussion
The present study examined differential effects of aging on cued-recall and recognition tasks. No age differences were expected for the cued-recall task, because retrieval support was hypothesized to benefit young and older adults equally. Age differences were, however, predicted for the recognition of subtly altered components as it involves source monitoring, a task on which older adults typically perform worse than young adults.
The predictions regarding cued recall were supported by the data. As expected, there were no age differences, but there were differences for the number of cues (one vs. two) and the type of cue (what vs. where and who). Multiple retrieval cues yielded better performance than single cues, because more details became available for recall. Moreover, what cues facilitated cued recall more than who and where cues. Retrieval is more difficult for everyday events than, for example, for unique events, because they do not stand out in distinctiveness. Therefore, it is not surprising that the what cue turned out to be the most effective cue as it allows a reconstruction along a more central component of the original event (what happened) than components that might be less essential to the event, such as the location of the event (where it happened) or other people who were involved (who was involved). These findings support earlier results with regard to the use of different types or different numbers of retrieval cues (Burt, 1992; Catal & Fitzgerald, 2004; Wagenaar, 1986). The contribution of our findings to the literature is that, for recent everyday memories, content-based retrieval cues that directly relate to the most important components of the memories (Dijkstra & Misirlisoy, 2006) substantially aid recall, regardless of the age of the person remembering the event. This result is an important finding, because older adults tend to report fewer details of autobiographical memories (McDonough & Gallo, 2013), yet they performed similarly to their younger counterparts on the cued-recall task when retrieval support was provided in our study.
Age differences were predicted for the recognition of altered entries as it involved a form of source monitoring that is generally more taxing on available cognitive resources and more susceptible to errors among older adults than young adults. As expected, there was no age difference regarding authentic entries, with all participants being able to correctly recognize their own entries submitted several months ago. Evaluating authentic entries is less taxing on cognitive resources than evaluating altered entries, because the original and evaluated entries are identical. Older adults were, as hypothesized, indeed biased to consider subtly altered entries as authentic and, therefore, less accurate in determining the authenticity for entries with peripheral changes. With regard to substantial changes (central or multiple changes), all participants were generally accurate in noticing these changes. We had also predicted that older adults would have more difficulties than young adults recognizing one change, but they actually performed worse when there had been two changes.
Surprisingly, older adults seemed not to be aware of this deficit in evaluating subtle changes in their reports, because their confidence ratings were higher than those of young adults, even though the latter group performed better on this task. Changes seemed subtle enough that older adults did not notice them and were certain about decisions regarding the authenticity of the entries. Apparently, older adults are less able to reflect upon their own memory performance.
Inflated confidence judgments in older adults have been demonstrated before. Karpel, Hoyer, and Toglia (2001) examined age differences in the qualitative characteristics of real and suggested memories. Young and older participants watched one or two sequences of slides depicting a theft. After a 15-min filler task, they were asked questions about objects and events from the slides and indicated their confidence in the answers. Two questions contained misleading information suggesting the presence of certain objects. Two more rounds of questions followed, in which participants were asked about the presence of real and suggested objects, the confidence in their answer, and the vividness of the memory for the objects. As expected, older adults indicated that they had seen the suggested objects more often than younger adults, demonstrating less efficient source monitoring. Despite their lower performance, older adults gave higher ratings of confidence and vividness. An explanation could be that older adults may be more vulnerable to a possible interference of newly introduced information that competes with original information.
Another possibility is that overconfidence in older adults may be related to the task instructions. For example, larger age differences have been found for intuitive than probability confidence judgments (Hansson, Rönnlund, Juslin, & Nilsson, 2008). Older adults may think intuitively that they perform just as well as their younger counterparts, but, when they are forced to evaluate the specifics of the task and judge their actual performance in detail, their confidence may drop to more realistic levels. Another possibility may be that older participants are more optimistic regarding their recognition performance, because they are not often tested on their actual performance in real life, whereas younger participants, who are mostly psychology students, are tested on their actual performance regularly (i.e., exams and research participation).
The combined results showed that there were no age effects on the initial cued-recall test but that there were effects on the subsequent recognition test. It is possible that this interaction between age group and type of test actually reflects an interaction between age group and time of test. It could be that young and older adults show no differences on how well they remember personal events that have happened between 2 and 106 days ago but that they display differences on how well they remember personal events that have happened between 83 and 218 days ago. However, when we calculated the correlations between the scores and the age of the events on the individual trials of the cued-recall and recognition tests, we did not find differences between the age groups, suggesting that the differential findings of aging were not caused by the time of the test.
The present findings contribute to our understanding of the different mechanisms that underlie cued-recall and recognition processes. Cued recall is a one-step retrieval process that can be augmented with content-based retrieval cues to access other important components of the original experience. Recognition of altered entries requires another step in the retrieval process that involves an evaluation of aspects of the original experience with competing information that has replaced the original information. It may be that this subsequent step of updating that places high demands on cognitive resources is particularly taxing on older adults, particularly when the alterations are subtle. This age deficit may be part of a more general deficit in binding, retrieval, and evaluation of earlier recorded materials in this age group (e.g., Chalfonte & Johnson, 1996; Kuhlmann & Touron, 2012), and hence, cannot be remedied as easily as is the case with retrieval cues.
A promising avenue for future research would be to examine in greater detail strategies that may aid the source monitoring in recognition tasks. For example, an instruction to reactivate perceptual, spatial, and emotional aspects of the original experience as best as possible may stimulate the activation of the same brain structures relevant during the initial experience and help the source monitoring process. Previous research has shown that subjective ratings of reliving coincided with higher activation levels in the auditory and visual association cortex, and activation levels in the amygdala were positively associated with subjective ratings of emotional intensity (Danker & Anderson, 2010; Daselaar et al., 2008). If individuals in general, and older adults in particular, can learn to reactivate the experience better, for example, by focusing on perceptual and sensori-motor details of the experience, false recognition may be reduced. The use of embodied retrieval cues, such as assuming a similar body position as in the original experience, has already been shown to be effective in the speed and rate with which memories are being retrieved and remembered later (Dijkstra, Kaschak, & Zwaan, 2007). Reactivating the original experience would then be a first but also major step in the process of attenuating age differences in recognition tasks.

Conclusion
This study demonstrated age invariance with regard to the cued recall of autobiographical memories that had been recorded up to 3 months earlier and age differences in the detection of subtle changes in memory reports that had been recorded up to 6 months earlier. Young and older adults were equally aided by retrieval cues that facilitated the reconstruction of the original experience. The question remains as to whether older adults can be aided in their recollection efforts when the to-be-evaluated information is different from the source. This deficit may reflect a more general age-related impairment in reality monitoring that is particularly prominent for past events (McDonough & Gallo, 2013). Embodied retrieval cues may offer a way to better distinguish between authentic and altered event reports.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by a Toptalent grant awarded to Katinka Dijkstra.

Note
1. To ensure that altered descriptions did not differ on plausibility from authentic descriptions, 82 participants, who had not participated in the main study, were asked to rate the descriptions used in the recognition test on this dimension. We divided the 736 descriptions over 16 conditions. Each condition consisted of 23 altered and 23 authentic descriptions and was taken by four to six participants. The participants rated the descriptions on a scale that ranged from 0 to 100. The altered and authentic descriptions did not differ on plausibility (M = 55.64, SD = 15.40, for altered descriptions vs. M = 56.21, SD = 15.09, for authentic descriptions), p = .612.