Can Working Memory and Inhibitory Control Predict Second Language Learning in the Classroom?

The role of executive functioning in second language (L2) aptitude remains unclear. Whereas some studies report a relationship between working memory (WM) and L2 learning, others have argued against this association. Similarly, being bilingual appears to benefit inhibitory control, and individual differences in inhibitory control are related to online L2 processing. The current longitudinal study examines whether these two components of executive functioning predict learning gains in an L2 classroom context using a pretest/posttest design. We assessed 25 university students in language courses, who completed measures of WM and inhibitory control. They also completed a proficiency measure at the beginning and end of a semester and reported their grade point average (GPA). WM was positively related to L2 proficiency and learning, but inhibitory control was not. These results support the notion that WM is an important component of L2 aptitude, particularly for predicting the early stages of L2 classroom learning.

Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

For decades, researchers and practitioners alike have been interested in predicting which learners are likely to succeed in acquiring a foreign language (L2). A variety of variables have been proposed and explored empirically (for a review, see Dornyei, 2006), with the goal of identifying separable components of language learning ability, that is, language aptitude (Dornyei, 2005; Skehan, 2002). Working memory (WM) and inhibitory control abilities have both been identified as likely contributors to language aptitude (e.g., DeKeyser & Koeth, 2011; Hummel, 2009; Wen & Skehan, 2011). A number of studies have reported relationships between WM and aspects of L2 learning (see Juffs & Harrington, 2011, for a comprehensive review). However, only Sagarra (2000) has examined the longitudinal effects of WM in classroom second language acquisition, and none has investigated the predictive utility of WM and inhibitory control together within a longitudinal design. Thus, the current study sought to provide preliminary evidence of the predictive validity of these two executive functions as a first step in a research program aimed at elucidating the specific cognitive processes that support learning of L2 grammar and lexicon in a classroom context.
Drawing on theoretical developments in cognitive psychology, researchers have shown increased interest in examining how WM contributes to language aptitude (see Dornyei, 2006; Hummel, 2009; Miyake & Friedman, 1998).
WM refers to a specific set of cognitive processes that are crucial to the processing, storage, and retrieval of information in memory (e.g., Baddeley & Hitch, 1974). The multicomponent model of WM includes a short-term storage component (i.e., slave systems) and an attentional control component known as the central executive (Baddeley & Hitch, 1974). Although these components of WM are correlated, they are empirically and conceptually distinguishable (Engle, Tuholski, Laughlin, & Conway, 1999). Indeed, more contemporary models emphasize the role of the central executive as the primary determiner of individual differences in WM (e.g., Engle, 2002). WM is a capacity-limited system (e.g., Cowan, 2005), such that there is an upper limit to the amount of information that can be actively maintained in the focus of attention. Without active rehearsal, that information fades from WM due to decay and/or interference processes. It is the executive control component (or central executive in Baddeley's parlance) that is responsible for manipulating the contents of WM. There is a growing body of evidence for the relationship between WM and L2 proficiency. Individual differences in WM have been correlated with L2 proficiency as measured by the Test of English as a Foreign Language (TOEFL) scores (Harrington & Sawyer, 1992), reading comprehension tasks (e.g., Miyake & Friedman, 1998), as well as with the use of feedback from recasts in conversational interactions (Mackey, Adams, Stafford, & Winke, 2010;Mackey, Philp, Egi, Fujii, & Tatsumi, 2002). Furthermore, evidence from psycholinguistic studies of online language processing suggests that using an L2 imposes cognitive processing demands that necessitate the control of attention by WM (e.g., Hernandez & Meschyan, 2006;see Kroll & Linck, 2007). Indeed, differences in WM are known to be related to L2 online processing (for a review, see Michael & Gollan, 2005). 
Although not all studies have found a relationship between WM and L2 skills (Chun & Payne, 2004;Mizera, 2006;Taguchi, 2008), a recent meta-analysis of studies examining the relationship between WM and L2 processing and proficiency outcomes estimated a population effect size (ρ) of .255 (Linck, Osthus, Koeth, & Bunting, 2014). These results suggest that individuals with greater WM resources are better equipped to handle the cognitive processing demands of mastering an L2. It is important to note that the capacity and efficiency of the short-term memory component of WM (specifically, phonological short-term memory), independent of the central executive, are also related to several L2 learning outcomes, including vocabulary learning in the lab (Atkins & Baddeley, 1998), vocabulary use and production skill (O'Brien, Segalowitz, Collentine, & Freed, 2006), and oral fluency development (O'Brien, Segalowitz, Freed, & Collentine, 2007), although, here, we focus exclusively on the central executive component of WM.
Notably, not all researchers are convinced of the role that WM plays in L2 learning. For example, Juffs (2004) reported finding no evidence of a relationship between performance on a reading span task and online L2 sentence comprehension and subsequently suggested that researchers have overstated the usefulness of WM measures in accounting for differences in L2 learning. It is important to note, though, that Juffs included native English speakers in his sample but failed to find WM effects on English "garden path" sentences, even though such effects have been well documented in the monolingual literature (e.g., Novick, Trueswell, & Thompson-Schill, 2005). The fact that Juffs failed to replicate the robust effect of WM in the L1 raises the possibility that methodological factors (such as an unreliable measure of WM) could have prevented the detection of an effect in the L2. Notwithstanding, the available correlational data from numerous studies provide evidence of the relationship between WM and L2 learning (e.g., Harrington & Sawyer, 1992; Mackey et al., 2002; Miyake & Friedman, 1998) but do not establish the predictive validity of WM in a classroom context.
Several studies have examined the relationship between WM (i.e., executive control) and learning longitudinally, although they have employed laboratory learning tasks rather than examining L2 learning within a naturalistic classroom context. These studies have examined constructs such as artificial grammar learning (Kempe, Brooks, & Kharkhurin, 2010; Martin & Ellis, 2012; Misyak & Christiansen, 2012), computerized feedback within a laboratory context (Lado, 2008), or performance on a lexical decision task (Akamatsu, 2008). Additional studies have focused on fluency gains in speech production rather than vocabulary and grammar learning (Payne & Ross, 2005) or provided manipulated instruction in a single instructional session (Bergsleithner, 2007). Two studies have examined grammar learning in an L2 classroom context. Sagarra (2000) found that WM, as measured by the reading span task, was unrelated to grammar knowledge on standardized measures. Unlike the present study, that study did not report on the relationship between WM and changes in performance (i.e., learning). Kormos and Safar (2008) did find a relationship between WM and various end-of-semester outcomes, but they did not assess proficiency at the start of the semester and thus could not estimate the degree of learning over the semester. The current study aims to fill a void in the literature by providing a preliminary test of the executive control component of WM as a predictor of L2 grammatical and lexical learning in the classroom using a pretest/posttest design.
The use of ecologically relevant measures, such as outcomes linked to a course's curriculum, can be highly relevant for discovering important components of L2 learning. Specifically, to better understand the individual difference factors that explain variability in learning outcomes within the classroom, it is important to measure linguistic content that is the focus of instruction in that course. For example, a recent longitudinal study examined naturalistic L2 learning in an immersion context, where implicit learning was hypothesized to support the learning of the morphosyntactic properties of the language (Frost, Siegelman, Narkiss, & Afek, 2013). The authors found that implicit learning abilities predicted learning across two semesters; however, they did not directly assess the predictive utility of WM. In the Frost et al. (2013) study, 27 native English speakers studying abroad in Israel were administered measures of L2 morphological processing and semantic priming along with a measure of statistical learning. Scores on the statistical learning task were correlated with performance on the morphological processing measures but not the semantic priming measure, indicating that implicit learning was related specifically to the learning of the novel morphosyntactic properties of Hebrew. Notably, the authors also reported a preliminary study demonstrating that their implicit learning measure was uncorrelated with measures of working memory and general intelligence, suggesting that the relationship they discovered between implicit learning and morphosyntactic processing was independent of these other factors. It is important to note that their implicit learning measure explained between 20% and 32% of variance across the different learning outcomes, and this variance is presumably independent of WM (based on the results of their preliminary study). Thus, a substantial amount of variance in learning outcomes remained unexplained.
It is also worthwhile to note that naturalistic L2 learning within an immersion context and explicit L2 learning in a classroom context likely place different demands on learning and memory functions. Thus, it is plausible that WM and other executive functions (such as inhibitory control) may significantly contribute to gains in proficiency within the classroom context.
There is growing evidence from research on executive functions that WM and inhibitory control both contribute to the cognitive control of memory and attention but, importantly, that they support different aspects of cognitive control (e.g., Miyake et al., 2000). Inhibitory control refers to one's ability to ignore distracting but irrelevant information or to suppress more habitual responses to perform a less dominant response (e.g., Friedman & Miyake, 2004). These skills may be particularly relevant to L2 learning given the evidence for non-selective activation of both languages and the resulting potential interference between languages (e.g., Dijkstra & Van Heuven, 2002;Kroll, Sumutka, & Schwartz, 2005). Consequently, inhibitory control has also been implicated in L2 comprehension and production processes (e.g., Abutalebi & Green, 2007) independent of WM. There is correlational evidence that individual differences in inhibitory control are related to cognate effects during L2 picture naming (Linck, Hoshino, & Kroll, 2008) and language switch costs during trilingual language switching (Linck, Schwieter, & Sunderman, 2012). However, a recent cross-sectional investigation of aptitude for high-level language proficiency found that working memory but not inhibitory control contributed to successful discrimination of high proficiency L2 learners from less successful L2 learners (Linck et al., 2013). Taken together, these results suggest that inhibitory control supports L2 processing, although it remains unclear whether individual differences in inhibitory control might predict the learning trajectory for mastering an L2. To the best of our knowledge, this hypothesis has not yet been systematically tested within a longitudinal design.

The Current Study
This experiment was designed to examine the predictive validity of executive functions for L2 learning of grammar and vocabulary in a classroom context. Using a longitudinal (pretest/posttest) design, we tested the hypothesis that individual differences in WM (specifically, executive control) and inhibitory control are related to L2 proficiency at the beginning and end of a semester-long language course for university students enrolled in an introductory language class. Critically, with this design, we were also able to examine whether a learner's WM and inhibitory control would be related to his or her degree of learning (i.e., change in proficiency) across the semester as a first step for establishing the predictive validity of executive functions for L2 learning. If adult learners' L2 proficiency is related to individual differences in executive functioning, then we expected WM and inhibitory control to be significantly correlated with L2 proficiency at pretest and at posttest. Furthermore, WM and inhibitory control should account for variability in the amount of learning during the semester, as measured by the change in L2 proficiency between pretest and posttest.
Our study investigated beginning learners of Spanish drawn from either a first semester or third semester course, both of which focused on grammar, reading, and writing. By including a more diverse sample of low proficiency learners, the findings from our study may be more representative of the true relationships and may generalize to a broader range of early learners because we minimize the risk of range restriction, which is known to attenuate measured relationships (e.g., Ghiselli, 1964;Hunter & Schmidt, 1990).
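To illustrate why range restriction matters, the following minimal sketch applies the standard textbook formula for direct range restriction on one variable (often called Thorndike's Case 2). The formula, the function name `restricted_r`, and the example values are assumptions introduced here for illustration; they are not taken from the study.

```python
from math import sqrt


def restricted_r(full_r, sd_ratio):
    """Correlation expected after direct range restriction on one variable.

    full_r   : correlation in the unrestricted group
    sd_ratio : restricted SD divided by unrestricted SD (0 < sd_ratio <= 1)
    """
    return full_r * sd_ratio / sqrt(1 - full_r**2 + (full_r * sd_ratio) ** 2)


# Illustrative values only: an unrestricted correlation of .40 observed in
# samples whose spread is 100%, 75%, and 50% of the unrestricted spread.
for ratio in (1.0, 0.75, 0.5):
    print(f"SD ratio {ratio:.2f}: expected r = {restricted_r(0.40, ratio):.3f}")
```

With no restriction (ratio of 1.0) the correlation is unchanged, and it shrinks as the sampled range narrows, which is the attenuation the combined first and third semester sample was intended to limit.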

Participants
Native English speakers were recruited from Spanish language courses at a large American university. Participants were enrolled in a first semester or third semester introductory Spanish course. In total, 30 students (18 female, 12 male) participated in the pretest, which was held in one of three evening sessions during the sixth week of the semester. A total of 25 students (8 first semester, 17 third semester) completed the posttest session in a lab outside of class time during the penultimate week of the semester, approximately 8 weeks after the pretest.

Materials
L2 proficiency measures. The criterion measure included 20 multiple-choice and fill-in-the-blank items from the grammar and vocabulary section of the Diplomas de Español como Lengua Extranjera (intermediate level), a standardized test of grammar and vocabulary knowledge published by the Instituto Cervantes (http://diplomas.cervantes.es/candidatos/modelo.jsp). The test was selected in consultation with subject matter experts in second language acquisition and foreign language instruction, with the goal of measuring proficiency at a similar level, thereby allowing direct comparisons of accuracy within and between language groups. Alternate versions were constructed for the pretest and posttest sessions. The dependent variable in the analysis was a percent correct score computed across all test items. The test-retest reliability of the proficiency scores was .72.
Working memory. Working memory was measured with the operation span task (Turner & Engle, 1989), which is a complex span task that requires simultaneous processing of simple arithmetic operations and storage of words in memory. In this task, participants first view an equation (e.g., (7 × 2) − 5 = 9) and indicate with a button press whether the equation is correct or incorrect, and then briefly view a to-be-remembered word before the next equation is presented. Operation-word pair trials are presented in sets ranging from two to six trials. At the completion of a given set, participants must recall as many of the two to six words from that set as possible. Three sets of each set length were presented, for a total of 60 trials. Participants received one point for each correctly recalled word from trials on which a correct operation judgment was made, thereby requiring participants to adequately attend to both the processing and storage components of the task to score highly. Because the primary processing task involves solving math equations rather than processing language, this WM measure is arguably less dependent on language skills per se and thus was chosen to minimize variance due to differences in L1 language proficiency (cf. reading span task of Daneman & Carpenter, 1980).
Inhibitory control. Given recent claims that inhibitory control is an important executive functioning component for L2 use (e.g., Abutalebi & Green, 2007; Bialystok, Craik, & Luk, 2008), the study also included the Simon task (e.g., Simon & Rudell, 1967) as a measure of inhibitory control. In the Simon task, participants view a series of colored boxes (red or blue) on a computer screen and must respond based on the color but not location of the box. On congruent trials, the colored box appears on the same side as the required response. But on incongruent trials, the box appears on the side opposite the required response. Because participants must suppress the natural tendency to respond to the location of the stimulus, this mismatch in stimulus and response locations typically leads to slower correct responses on incongruent trials relative to congruent trials (known as the Simon effect; Simon & Rudell, 1967). Participants completed three blocks of 42 trials, with an equal number of trials in the congruent, incongruent, and neutral (presented at fixation) conditions. The Simon effect (response time [RT] difference between incongruent and congruent trials) was computed for each participant and served as the measure of inhibitory control.
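The two scoring rules described above can be sketched as follows. This is a minimal illustration rather than the scoring script used in the study: the data layout, the function names `ospan_score` and `simon_effect`, and the toy values are hypothetical, and averaging only correct Simon responses is an assumption based on the reference to "slower correct responses."

```python
from statistics import mean

# Three sets at each set length from two to six trials gives 60 trials.
TOTAL_OSPAN_TRIALS = sum(3 * set_length for set_length in range(2, 7))


def ospan_score(trials):
    """One point per recalled word on trials with a correct equation judgment.

    trials: list of (judgment_correct, word_recalled) booleans
            (a hypothetical layout chosen for this sketch).
    """
    return sum(1 for judged_ok, recalled in trials if judged_ok and recalled)


def simon_effect(trials):
    """Mean incongruent RT minus mean congruent RT, in milliseconds.

    trials: list of (condition, rt_ms, correct) tuples. Only correct
    responses are averaged (an assumption); neutral trials are ignored.
    """
    def mean_rt(condition):
        return mean(rt for cond, rt, ok in trials if cond == condition and ok)

    return mean_rt("incongruent") - mean_rt("congruent")


# Toy data, not taken from the study.
ospan_trials = [(True, True), (True, False), (False, True), (True, True)]
simon_trials = [
    ("congruent", 400, True),
    ("congruent", 420, True),
    ("incongruent", 450, True),
    ("incongruent", 470, True),
    ("incongruent", 900, False),  # error trial, excluded from the mean
    ("neutral", 430, True),
]

print("Operation span score:", ospan_score(ospan_trials))
print("Simon effect (ms):", simon_effect(simon_trials))
```

A larger Simon effect indicates more interference from the irrelevant location, so the sign would need to be flipped if higher scores are meant to indicate better inhibitory control.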
Self-report questionnaire. The questionnaire assessed the participants' prior experience with learning language, including whether they had any study abroad experience. All participants confirmed they were native English speakers and had not studied abroad for more than 2 weeks. Participants were also asked to self-rate their L1 and L2 proficiency levels in the four skills of reading, writing, speaking, and listening. To assess academic performance, participants were asked to supply their current university grade point average (GPA) and their standardized test scores (i.e., Scholastic Aptitude Test [SAT], American College Test [ACT]). However, we had to exclude the standardized test score data from the analysis due to substantial missing data, as well as a number of inconsistencies in the reporting of the test scores.

Procedures
The pretest session took place during an out-of-class session in the sixth week of the semester. After signing an informed consent form, participants completed two paper-and-pencil L2 proficiency measures and also reported their GPA and SAT scores. The posttest session took place 2 months later in a lab equipped with computers to administer the computerized tasks using E-Prime experiment generation software (Psychology Software Tools, Pittsburgh, PA). Participants first completed the posttest form of the two paper-and-pencil L2 proficiency measures. They were then seated in front of a computer and completed the self-report questionnaire, the Simon task, and the operation span task.
The measures of WM and inhibitory control were administered during the posttest session due to time constraints. Because executive functions tend to be relatively stable traits (e.g., Miyake & Friedman, 2012), the timing of this test is not likely to impact the inferences of this study. Although some evidence of the cognitive consequences of bilingualism indicates improvements to executive functions due to experience using multiple languages (see Bialystok, 2010;Diamond, 2013), these effects are thought to accrue over a much longer period of time than a single classroom semester. Therefore, we are confident that the administration of our individual difference measures during the posttest session did not impact the conclusions of our analysis.

Analysis
Separate correlation analyses were conducted with each of the three criterion measures (pretest and posttest proficiency scores, as well as a change score, i.e., posttest − pretest) and the two individual difference measures of WM and inhibitory control to examine the bivariate linear relationships. In addition, exploratory regression analyses were conducted to examine the multivariate relationships of both WM and inhibitory control with each criterion measure, and all conclusions reported below were upheld in those more complex analyses (see the appendix). Given the converging evidence across analyses, we focus below on examining zero-order correlations between the criterion measures and individual difference measures.
Table 1 provides descriptive statistics for the predictor and criterion measures. On the pretest proficiency measure, the mean score was 41.4% correct (SD = 14.2). The mean score on the posttest proficiency measure was approximately 47.7% (SD = 17.0), and the mean improvement score (i.e., posttest − pretest) indicated an average improvement of 6.3 percentage points (SD = 15.6). Comparing the first and third semester students, the distribution of scores largely overlapped (pretest: first semester range = 13.3%-73.3%, third semester range = 30%-60%; posttest: first semester range = 13.3%-80%, third semester range = 30%-70%), further justifying the combined analysis of students from both course levels.
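As an illustration of the change score and the zero-order correlations described in this section, the sketch below computes posttest − pretest differences and a Pearson correlation with each criterion. The scores are invented for five hypothetical learners and the helper `pearson_r` is written out here for self-containment; neither comes from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical percent-correct scores and WM scores (not the study's data).
pretest = [40.0, 33.3, 46.7, 53.3, 26.7]
posttest = [46.7, 33.3, 60.0, 66.7, 26.7]
wm = [45, 38, 52, 55, 40]

# Change score: posttest minus pretest for each learner.
change = [post - pre for pre, post in zip(pretest, posttest)]

for label, criterion in [("pretest", pretest), ("posttest", posttest), ("change", change)]:
    print(f"WM vs {label}: r = {pearson_r(wm, criterion):.2f}")
```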

Results
The mean Simon effect in our sample (43 ms) falls well within the range typically reported in the literature (most studies have reported sample mean effects in the range of 20 to 50 ms; Lu & Proctor, 1995). The sample average WM score (47 out of a maximum of 60) is also in line with findings from other studies employing the operation span task with L2 learners (e.g., Linck, Kroll, & Sunderman, 2009). Visual inspection of the distributions suggested that both WM and inhibitory control scores were normally distributed in this sample.
The mean reported GPA was 3.43 (out of 4), indicating that participants were fairly high performing in their university courses. Visual inspection of the distributions indicated that GPA scores were highly negatively skewed, with most scores falling above 3.0 and only two scores falling below 2.5. GPA data were also missing for three participants. Given these issues, we excluded GPA from the main analyses reported below.
The correlation matrix for the individual difference measures is provided in Table 2. In this sample, none of the individual difference measures were significantly correlated, although the correlations were in the expected direction. Prior to analysis, inhibitory control scores were reverse-coded, so that a higher score for both individual difference measures indicated better performance.

Working Memory Is Related to Proficiency and Learning
The results of the correlation analyses are reported in Table 3, and Figure 1 displays scatterplots of WM scores and the pretest (Panel A), posttest (Panel B), and change scores (Panel C) along with their respective correlation lines. As predicted, WM was related to performance and, marginally, to learning: WM was positively correlated with performance at posttest (r = .40, p = .049, with the 95% confidence interval excluding zero), although not at pretest (r = .06, p = .79). For the change score measuring learning over the semester, WM also had a positive correlation (r = .38, p = .059, with the 95% confidence interval primarily above zero); this effect was similar in magnitude to the posttest correlation, although the statistical test was only marginal (perhaps caused by the decreased reliability typically found in difference scores, particularly when the two scores are positively correlated; see Edwards, 2001). In contrast, the correlations involving inhibitory control were all near zero, non-significant, and had 95% confidence intervals spanning a wide range of both positive and negative correlation values. These results indicate that WM, but not inhibitory control, was related to L2 proficiency at posttest and, critically, that WM was (marginally) related to learning over the course of the semester.
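The confidence intervals mentioned above can be approximated with the Fisher z transformation. The text does not state how the intervals were computed, so the method, the function name `fisher_ci`, and the use of n = 25 for both correlations are assumptions made for this sketch.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    z = atanh(r)                      # transform r to the z scale
    se = 1 / sqrt(n - 3)              # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


# Reported WM correlations; n = 25 is assumed for both.
for label, r in [("posttest", 0.40), ("change score", 0.38)]:
    lower, upper = fisher_ci(r, n=25)
    print(f"WM vs {label}: r = {r:.2f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions, the lower bound for r = .40 falls just above zero and the lower bound for r = .38 falls just below zero, which matches the description of one interval excluding zero and the other lying primarily above it.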

Discussion
The goal of this study was to provide preliminary longitudinal data on the predictive validity of two specific cognitive processing abilities, WM and inhibitory control, for L2 learning in a classroom context. Participants completed an L2 proficiency measure at the beginning and again at the end of a semester. They were also administered individual difference measures of WM and inhibitory control. We found that WM was positively related to L2 proficiency, while inhibitory control did not account for a significant amount of variability. Specifically, greater WM resources were significantly related to greater L2 proficiency at posttest and (marginally) to gains in L2 proficiency across the semester (i.e., posttest − pretest changes).
Note. Working memory and inhibitory control measures were standardized prior to analysis. Inhibitory control scores were reverse-coded so that higher scores for both measures indicate better abilities. CI = confidence interval. † p = .06. **p < .01.
For the last two decades, there have been claims that greater WM resources could lead to better L2 learning (e.g., Harrington & Sawyer, 1992; Miyake & Friedman, 1998). Yet, to the best of our knowledge, there have been no published studies demonstrating that a learner's WM can predict learning of L2 grammar and vocabulary over time within a university classroom learning context. The present longitudinal study provides evidence of the predictive validity of WM for L2 classroom learning, thus contributing empirical support to claims that WM is an important component of L2 aptitude (e.g., Hummel, 2009; Miyake & Friedman, 1998), at least at early stages of L2 learning. These results also suggest that individual differences in WM may have a larger impact on learning than inhibitory control, which has been linked to L2 processing differences (e.g., Abutalebi & Green, 2007; Linck et al., 2012). Future research will need to examine WM along with other cognitive processes that have been found to predict learning outcomes, such as implicit learning ability (Frost et al., 2013), to determine the relative contributions of these factors. We note that WM was not related to performance on the pretest. This might be due to the variability in the students' L2 knowledge and learning experiences prior to taking the courses. Students in the first semester course had limited exposure to Spanish. Students in the third semester course could have entered after completing the first and second semester courses at the university or through a language placement test.
Given the variability in the students' proficiency at the start of the semester, as well as the variable amount of time between their last Spanish exposure and the start of the course (allowing for some attrition), it is not entirely surprising that the proficiency test at the start of the semester did not correlate with WM. By contrast, following a semester in which the students have had a relatively homogeneous Spanish classroom learning experience, the correlation between WM and L2 proficiency and learning is evident.
Unlike WM, inhibitory control was unrelated to L2 proficiency. A similar null result was found in a recent investigation of high-level language aptitude, in which high-level language learners were distinguished from less successful learners based on measures of working memory and phonological short-term memory but not inhibitory control (Linck et al., 2013). This finding is hard to reconcile with the growing body of literature implicating inhibitory control in online language processing (e.g., Guo, Liu, Misra, & Kroll, 2011;Linck et al., 2012). These differences in effects may indicate a more important explanatory role for inhibitory control in fine-grained online language processing outcomes than in coarser measures of language proficiency. For example, a recent study of trilingual language switching found that individuals with better inhibitory control abilities experienced smaller language switch costs when switching into or out of the dominant L1 (Linck et al., 2012). This study was motivated by claims in the literature that inhibitory control is an important mechanism for language control during lexical access (e.g., Green, 1998) and in particular during language switching (e.g., Meuter & Allport, 1999). When focusing on language processing measures where specific cognitive processes are implicated (e.g., bilingual lexical access), effects of inhibitory control are found (for other evidence of inhibition during language processing tasks, see Morales, Paolieri, & Bajo, 2011;Van Assche, Duyck, & Gollan, 2013). In contrast, coarser measures of language proficiency may allow other mechanisms or strategies to compensate for a cognitive processing deficiency. For example, perhaps individuals with weaker inhibitory control abilities can compensate during L2 learning by relying more heavily on strategies (e.g., mnemonics) or by more efficiently engaging other cognitive processes (e.g., WM) to support the multifaceted, complex task of language learning.
Another potential explanation of the null inhibitory control results is that the learners in our study may have been at too early a stage of L2 learning for inhibitory control effects to emerge. For example, they may not have had enough experience with controlling the two languages, or they may have still been relying heavily on their L1 to aid L2 use (i.e., L1 transfer; see MacWhinney, 2005) such that L1 inhibition was not necessary. An open question is whether individual differences in inhibitory control might serve as a predictor in more intensive learning contexts such as linguistic immersion, where it has been argued that L1 inhibition supports L2 learning (Linck et al., 2009). It is possible that enhanced inhibitory control abilities in highly proficient bilinguals are just a by-product of practice (Bialystok & DePape, 2009). Alternatively, it is possible that at more advanced proficiency levels or with more extensive time on task, we might find that having better inhibitory control abilities may yield benefits to L2 learning. This is an issue that should be addressed in future research.
Aside from these theoretical considerations, methodological limitations could have driven the lack of inhibitory control effects. Some have argued that the Simon task may not provide the best measure of inhibitory control, perhaps in part because of low reliability as reported in other studies (e.g., Paap & Greenberg, 2013). Indeed, reliability of the Simon effect in our sample (.62) was also less robust (although still within an acceptable range for analysis), and this would limit the magnitude of its relationship with the proficiency measure. This accords with the notion advanced by Blumenfeld and Marian (2013) that bilinguals must manage potential conflict among competing representations (within the lexicon) much more so than between representations and responses. Thus, Stroop-like inhibition tasks, during which stimulus-stimulus conflict must be resolved, may be more relevant to bilingual language control than the control mechanisms employed in Simon-like tasks, in which stimulus-response conflict must be resolved. Future work would therefore benefit from employing a Stroop task to investigate the role of inhibitory control in longitudinal studies of L2 learning.
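The point that low reliability limits the observable relationship can be made concrete with the classical attenuation formula, in which an observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below plugs in the two reliability values reported in this article (.62 for the Simon effect, .72 for the proficiency scores); treating both as classical reliability coefficients, and the function name `attenuated_r`, are assumptions made for illustration.

```python
from math import sqrt


def attenuated_r(true_r, reliability_x, reliability_y):
    """Observed correlation expected under classical test theory."""
    return true_r * sqrt(reliability_x * reliability_y)


SIMON_RELIABILITY = 0.62        # reported reliability of the Simon effect
PROFICIENCY_RELIABILITY = 0.72  # reported test-retest reliability

# Upper bound on the observable correlation, reached only if the true
# correlation between the underlying constructs were perfect.
ceiling = attenuated_r(1.0, SIMON_RELIABILITY, PROFICIENCY_RELIABILITY)
print(f"Maximum observable r: {ceiling:.2f}")

# Hypothetical true correlations (not estimates from the study).
for true_r in (0.2, 0.4):
    observed = attenuated_r(true_r, SIMON_RELIABILITY, PROFICIENCY_RELIABILITY)
    print(f"true r = {true_r:.2f} -> expected observed r = {observed:.2f}")
```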
The final sample size of 25 students is smaller than ideal (although we note that it is on par with other studies of this kind; for example, Frost et al., 2013), as was the number of items in each administration of the proficiency measure. However, even given these methodological constraints, the strong and positive WM correlations (and regression results; see the appendix) affirm that WM is an important predictor of L2 proficiency, whereas the lack of inhibitory control effects could simply reflect low power and/or restricted range in this sample. Consequently, our future work will attempt to cross-validate and extend these findings with a larger independent sample and more robust criterion measures to verify that the reported relationships are not sample-dependent. Furthermore, we hope to collect longitudinal data from learners at different levels of L2 proficiency to determine whether the cognitive abilities associated with positive L2 learning outcomes remain stable over time (see Linck et al., 2013, for data from a cross-sectional study indicating that WM is indeed related to success at higher proficiency levels). Future work is also needed to assess the predictive contributions of these cognitive abilities in the context of other important variables, including language experience (e.g., frequency of L2 use, prior language learning), to provide a richer understanding of their relative contributions to learning. Nevertheless, our study has provided preliminary results that suggest this methodology can productively be used to study the cognitive abilities related to L2 learning outcomes.
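To give a rough sense of what low power means at this sample size, the sketch below approximates the power of a two-sided test of a correlation using the Fisher z transformation. This calculation is not reported in the study; the function name `correlation_power` and the candidate effect sizes are assumptions chosen for illustration, and the normal approximation is only approximate at small n.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a population correlation r
    with n participants, using the Fisher z (normal) approximation."""
    if n <= 3:
        raise ValueError("n must be greater than 3 for the Fisher z approximation")
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Fisher z-transformed effect divided by its standard error 1/sqrt(n - 3)
    shift = math.atanh(r) * math.sqrt(n - 3)
    # Probability of landing in either rejection region
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical population correlations; n = 25 matches the final sample
    for r in (0.1, 0.3, 0.5):
        print(f"r = {r:.1f}, n = 25: approximate power = {correlation_power(r, 25):.2f}")
```

Under this approximation, a medium-sized correlation of .30 would be detected only about a third of the time with 25 participants, which is consistent with the caution above about interpreting the null inhibitory control result.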

Conclusion
This study fills a gap in the literature by providing direct evidence for claims that the executive control component of WM is a predictor of L2 classroom learning (e.g., Harrington & Sawyer, 1992; cf. Juffs, 2004; Miyake & Friedman, 1998). Using a longitudinal (pretest/posttest) design, we found evidence that learners with greater WM resources were more likely to succeed in learning their L2. The results of this study indicate that measures of WM are likely to improve the predictive utility of tests of language aptitude. Given that one goal of aptitude research is to identify factors that affect an individual's ability to develop L2 proficiency over time, additional studies using longitudinal designs will contribute more data on the components of aptitude that predict subsequent learning. This line of research could also be extended to other learning contexts (e.g., immersion learning, blended learning) to determine whether WM and other potential components of L2 aptitude, such as inhibitory control, are differentially predictive of L2 learning outcomes across different learning contexts.

Descriptive Statistics by Course
In the main text, descriptive statistics were reported for the entire sample of participants. Below, the descriptive statistics (Ms and SDs) are reported separately for the two course levels (see Table A1). Inferential group comparisons were conducted using independent samples t tests assuming unequal variances. The groups differed on age, L1 reading and writing self-ratings, and L2 writing and listening self-ratings. However, no differences were found on the variables that were included in the substantive analyses reported above.
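An independent samples t test assuming unequal variances is commonly computed as Welch's t test. The following is a minimal sketch of that computation, not the authors' analysis code; the function name `welch_t` and the example ages are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom
    for two independent groups with possibly unequal variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n)
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical ages for two course levels (illustration only)
    course_one = [18, 19, 19, 20, 21]
    course_two = [20, 22, 23, 25, 27]
    t, df = welch_t(course_one, course_two)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

The resulting t is evaluated against a t distribution with the (generally non-integer) degrees of freedom returned by the function; the p value itself is omitted here to keep the sketch within the standard library.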

Regression Analysis
To further explore the relationships between the two executive functions and the L2 outcomes, simultaneous regression analyses were conducted in which working memory (WM) and inhibitory control were entered together, so that both could account for variability in the outcome measures (see Table A2). Prior to analysis, the WM and inhibitory control scores were standardized to z scores to facilitate interpretation and comparison of the effect sizes (Gelman & Hill, 2007). The direction and significance of the WM-criterion relationships essentially replicate those reported in the bivariate correlation analyses in the main text, even when controlling for inhibitory control. Similarly, inhibitory control did not explain variance in outcomes even when controlling for WM.

Note. Working memory and inhibitory control were standardized prior to analysis. † p = .075. *p < .05. ***p < .001.
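The simultaneous regression described above can be sketched as an ordinary least squares fit with both z-scored predictors entered at once. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the example scores are hypothetical, and the two-predictor normal equations are solved in closed form to avoid external libraries.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def simultaneous_regression(wm, ic, outcome):
    """Regress outcome on z-scored WM and inhibitory control entered together.
    Returns (intercept, b_wm, b_ic)."""
    z_wm, z_ic = zscore(wm), zscore(ic)
    y_bar = mean(outcome)
    y_c = [y - y_bar for y in outcome]
    # Predictors are centered, so the normal equations reduce to a 2x2 system
    s11 = sum(a * a for a in z_wm)
    s22 = sum(b * b for b in z_ic)
    s12 = sum(a * b for a, b in zip(z_wm, z_ic))
    s1y = sum(a * y for a, y in zip(z_wm, y_c))
    s2y = sum(b * y for b, y in zip(z_ic, y_c))
    det = s11 * s22 - s12 ** 2
    if abs(det) < 1e-12:
        raise ValueError("predictors are perfectly collinear")
    b_wm = (s1y * s22 - s2y * s12) / det
    b_ic = (s2y * s11 - s1y * s12) / det
    # With centered predictors the intercept equals the outcome mean
    return y_bar, b_wm, b_ic


if __name__ == "__main__":
    # Hypothetical scores (illustration only, not data from the study)
    wm_scores = [42, 55, 38, 61, 47, 50]
    simon_effects = [31, 18, 44, 25, 37, 22]
    posttest = [12, 16, 10, 19, 13, 15]
    print(simultaneous_regression(wm_scores, simon_effects, posttest))
```

Because both predictors are on the z-score scale, each slope is the expected change in the outcome per one standard deviation of that predictor with the other held constant, which is what makes the two effect sizes directly comparable.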