Exploring the Effects of Automated Written Corrective Feedback on EFL Students’ Writing Quality: A Mixed-Methods Study

Despite a large number of studies on the adoption of automated writing evaluation (AWE) systems, the effects of automated written corrective feedback (AWCF) on English as a Foreign Language (EFL) students’ writing have been insufficiently documented. Given the significance of AWCF in EFL writing, this study employed a mixed-methods approach to examine such effects. Using a quasi-experimental design, it explored how AWCF through Grammarly affected EFL students’ writing quality. A total of 67 EFL students from two intact university English classes participated, with a treatment group receiving two rounds of Grammarly feedback and teacher feedback and a comparison group receiving teacher feedback only. The results of the posttest writing task revealed that the students from the treatment group did not significantly outperform the students from the comparison group in syntactic and lexical complexity, accuracy, and fluency. A follow-up questionnaire consisting of fixed-response and open-ended questions was administered to the treatment group after the posttest to elicit the students’ perceptions of the effects of Grammarly feedback on their writing. The qualitative findings supported and provided deeper insights into the quantitative results. The study concludes with a discussion of its limitations and implications.


Introduction
Among the different language learning skills, writing plays a crucial part at various levels of learning and testing (J. Zhang, 2019). Students' writing ability can accurately reflect their language proficiency level (Jin & Yang, 2006). At the same time, however, most English as a Second Language (ESL) students have stated that “expressing ideas in correct English” (Evans & Green, 2007, p. 8) is the biggest obstacle in their English learning. Against this background, Bitchener and Ferris (2012) have contended that written corrective feedback (WCF) plays a key role in second language (L2) writing because it might serve as a useful tool for L2 learners to improve their writing performance. Given the importance of English writing skills and the difficulties facing English learners, it is meaningful to continue examining the potential effects of WCF on students' writing quality.
As researchers continue to investigate the effects of various types of feedback on student writing, feedback sources such as automated written corrective feedback (AWCF) offered by automated writing evaluation (AWE) systems have also gained much attention. The availability of large corpora of students' writing samples and the development of natural language processing have allowed AWE systems to provide AWCF on student writing. In this line of research on the use of AWCF in student writing, researchers have noted both merits and drawbacks of the feedback. For example, AWCF can help students address language-related issues in their writing (J. Li et al., 2015), and AWE systems can provide immediate feedback on student writing (Fang, 2010). The immediate feedback and the scoring feature of AWE systems might serve as an incentive for student revision (P. L. Wang, 2013) because students are likely to revise their writing multiple times to gain a satisfying score. In fact, one advantage of AWE systems is that students can revise their drafts as many times as they want after submission (Warschauer & Ware, 2006), which might in turn help to enhance students' writing quality by cultivating their autonomy in the process of writing, assessing, and revising (Chen & Cheng, 2008).
However, despite the advantages of AWCF, researchers have also pointed out several disadvantages. For example, Aluthman (2016) claimed that AWCF could sometimes be too complex for ESL learners at lower proficiency levels to follow. In addition to feedback complexity, Lai (2010) contended that some AWE systems could provide formulaic or repetitive feedback, which could result in student confusion. In such a case, it would be hard for students to ask for clarification because AWCF might not consider the social aspect of feedback (M. J. Wang & Goodman, 2012). Moreover, Cheville (2004) noted that AWCF might only draw students' attention to surface levels of their writing (e.g., grammar and mechanics), neglecting deeper levels such as content or organization. This type of feedback could not lead to the overall development of student writing, and the types of essays on which AWCF could be provided are limited (Ware, 2011). Because of these potential advantages and disadvantages of AWCF, it seems necessary for researchers to take student perceptions of the feedback into account when investigating its effectiveness. One reason is that considering student perceptions could keep L2 writing practitioners informed of what their students regard as the advantages and disadvantages of AWCF, so that they can maintain the advantages and find ways to deal with the disadvantages. In this way, practitioners might be able to help students benefit most from AWE feedback.
Many studies have shown the effectiveness of AWCF in improving students' writing quality (e.g., Dikli & Bleyle, 2014; El Ebyary & Windeatt, 2010; J. Li et al., 2015). For example, J. Li et al.'s (2015) study revealed that Criterion feedback could lead to the improvement of student writing both from one draft to the next and from the first draft to the final one. Moreover, research has shown that AWCF could be effective for students at different proficiency levels. For example, Kim's (2014) study demonstrated that AWCF helped both high- and low-level students to significantly improve their writing quality from the first draft to the revised draft. Despite the effectiveness of AWCF for student writing, there has been a lack of research on how a combination of AWCF with teacher WCF can affect students' writing quality, particularly for Chinese English as a Foreign Language (EFL) students. Investigating the effects of this combination is necessary because such a combination could be more in line with the ecological validity of the L2 writing classroom (Stevenson & Phakiti, 2014), and AWCF should be employed as a complement to teacher WCF, not a replacement (Ware, 2011).
However, to the best of our knowledge, little research has been conducted to examine how the combination affects student writing in comparison to teacher WCF only. One of the few exceptions is Wilson and Czik's (2016) study, in which the authors investigated whether there was any difference in students' writing quality between those who received AWCF with teacher WCF and those who received teacher WCF only. The results showed that there was no significant difference between these two feedback conditions in the writing quality of students' final drafts. In terms of writing motivation, students who received both AWCF and teacher WCF exhibited greater writing persistence than their counterparts who received teacher WCF only. Despite these pedagogically informative findings, one limitation of the study was that it did not take into consideration students' perceptions of the AWE feedback they had received. Taken together, the purposes of the present study are to examine the effects of AWCF on students' writing quality, how students perceive it, and the relationship between their perceptions and writing quality. Writing quality in this study was operationalized as complexity (i.e., syntactic and lexical complexity), accuracy, and fluency (CAF). The present study is worthwhile because of the rapidly increasing application of AWE systems to L2 writing and the crucial role of L2 writing in the overall development of students' language proficiency.
The research questions of the present study are:
1. Is there any difference between Grammarly feedback with teacher WCF and teacher WCF only in the writing quality of Chinese EFL students with lower proficiency levels?
2. What is the relationship between the degree of satisfaction about Grammarly feedback and writing quality for Chinese EFL students with lower proficiency levels?
3. How do Chinese EFL students with lower proficiency levels perceive the Grammarly feedback?
Literature Review

AWCF and L2 Writing
Modest evidence has been reported that AWCF has a positive effect on students' writing quality, and two lines of studies can be identified regarding this effectiveness (Stevenson & Phakiti, 2014). The first line consists of within-group studies that presented evidence of increases in students' writing scores and decreases in students' errors across multiple drafts produced by the same student writers (El Ebyary & Windeatt, 2010; Liao, 2016; A. Lu & Li, 2016; Parra & Calero, 2019; Thi & Nikolov, 2022). For example, Liao's (2016) study showed that AWCF was effective in helping reduce students' linguistic errors in both revisions and new pieces of writing. The reasons for the effectiveness of AWCF might be that AWE tools are able to provide immediate feedback on student writing (Dikli, 2006), develop student awareness of the writing process (Matsumoto & Akahori, 2008), and cultivate students' writing autonomy (Y. J. Wang et al., 2013). However, despite these potential merits of AWCF, researchers have noted some of its drawbacks, such as its excessive focus on linguistic issues in writing (Warschauer & Ware, 2008) and its failure to reflect social, multimodal, and contextual aspects of writing (Vojak et al., 2011). Therefore, to mitigate these drawbacks, further research is needed on combining AWCF with other sources of feedback, such as teacher feedback, and on examining how these combinations affect students' writing quality.
The second line consists of between-group studies that investigated either the effects of AWCF versus no feedback or the effects of AWCF versus teacher WCF on students' writing quality. First, studies that compared AWCF with no feedback have reported the effectiveness of AWCF (e.g., Barrot, 2023; Franzke et al., 2005; Grimes, 2008). Franzke et al. (2005) examined the effect of Summary Street feedback on students' writing quality and found that students receiving the feedback performed significantly better than students receiving no feedback in text quality, content, organization, and stylistic quality. Second, studies comparing AWCF with teacher WCF have also revealed the superiority of the former over the latter (e.g., S. Wang & Li, 2019; Y. J. Wang et al., 2013; Warden, 2000). For example, S. Wang and Li (2019) examined the effects of Writing Roadmap (WRM), one type of AWE system, on student writing. The results indicated that students receiving WRM feedback outperformed students receiving teacher WCF in language form, contextual structure, and writing quality when WRM was used to assess student essays. Moreover, when student essays were assessed by the teacher, the WRM feedback helped students produce essays of significantly better writing quality than the teacher WCF did.
Although these between-group studies demonstrated the effectiveness of AWCF for students' writing quality, several design deficiencies should also be noted. Specifically, Stevenson and Phakiti (2014) contended that studies comparing the effects of AWCF with teacher WCF did not clarify how teacher feedback was provided, which made it hard to attribute any observed effects to the provision of one type of feedback. Moreover, they claimed that this type of between-group study should integrate AWCF into the classroom setting because AWCF should be used to complement teacher WCF instead of replacing it. In other words, more research is needed to explore the effects of combining AWCF with teacher WCF on students' writing quality, and whether this combination is more useful than teacher WCF alone. This type of research is needed because, given the prevalence of AWE systems in the current digital era, L2 writing teachers may want to stay informed of how effective AWCF is and what role it should play in L2 writing instruction. This is particularly necessary in the EFL writing domain, where many students write their essays through AWE systems. Based on the empirical studies discussed so far, it seems that AWCF may have a positive impact on students' writing quality under certain conditions. However, these empirical studies provide a general picture of the potential effectiveness of AWCF at the cost of insights into how students actually perceive the feedback or how the feedback leads to the improvement of their writing quality. For example, although students who receive AWCF could outperform their counterparts who do not receive such feedback, some students might benefit much more from the feedback than others (see D. R. Ferris, 2006). Multiple factors resulting from individual differences may contribute to this likelihood. Among these factors, students' perception of feedback needs to be explored because perception could mediate the extent to which students use feedback to improve their writing (Wilson & Czik, 2016). Therefore, to obtain a deeper understanding of the effectiveness of AWCF, researchers may want to take student perception of the feedback into consideration.

Student Perception of AWCF
Studies of the effect of AWCF on student writing should deal not only with how effectively an AWE system works but also with how a student perceives or internalizes the feedback (Jiang & Yu, 2022). Although studies of student perception of teacher WCF have indicated that students in general have a positive attitude toward the feedback (Sinha & Nassaji, 2022), studies of student perception of AWCF have yielded mixed findings because students usually hold both positive and negative attitudes toward the feedback. Specifically, students in some studies stated that AWCF was helpful for their writing (e.g., Bai & Hu, 2017; S. Huang & Renandya, 2020; Z. Li et al., 2014). For example, many students in Bai and Hu's (2017) study made it clear that Pigai feedback, one type of AWCF widely used by Chinese EFL learners, was able to enhance the mechanics and grammar of their writing. In contrast, other studies have demonstrated students' negative attitudes toward AWCF (e.g., Chen & Cheng, 2008; Cheng, 2017; Lai, 2010). Specifically, because little social learning is involved in the provision of AWCF compared with other sources of feedback, such as peer or teacher feedback, students might experience “dehumanizing instruction” (Lai, 2010, p. 442), which might cause “frustration to students and limited their learning of writing” (Chen & Cheng, 2008, p. 94). In addition to the social learning aspect, students also reported that AWCF could be confusing and could fail to provide specific and meaningful information about their written errors (Lai, 2010). Employing a questionnaire and focus group interviews, Cheng (2017) investigated how students in his study perceived the AWCF they received. The students in that study claimed that the AWE system might not accurately assess their writing or consider their feelings when providing feedback, and that low scores given by the AWE system could demotivate them from writing (Cheng, 2017). In a similar vein, Scharber et al. (2008) revealed a more complex picture of how students perceive AWCF, in which students' positive attitudes toward the feedback might turn negative. Specifically, students might first feel engaged in processing the feedback because they wanted to improve the writing scores given by AWE systems by applying the feedback to revisions. However, when they encountered inaccuracies in the AWCF, they became frustrated with it and used it less. It seemed that students' “subjective experience” (Scharber et al., 2008, p. 27) with AWCF played an indispensable role in whether they might continue to use the feedback and how much they might benefit from it. Taken together, because of the mixed findings about students' perceptions of AWCF and the importance of these perceptions for student use of the feedback, a study on the effectiveness of AWCF should include an investigation of student perception of the feedback and of the relationship between that perception and students' writing quality.

Research Design
This study adopted an embedded mixed-method design (Creswell & Creswell, 2018) to examine the impact of AWCF on university EFL students' writing quality and to explore students' perceptions of the feedback. Both quantitative and qualitative questions were answered by this embedded design. Because the qualitative data in this study were collected after the quantitative data, the design was also, in Creswell and Creswell's terms, an “explanatory sequential core design” (p. 381). In other words, the embedded design was used to further explore why certain results were generated, to help elaborate on differences in outcome measures, and to elicit participant views on the AWCF so that possible changes might be made for students to benefit most from it.
For the first research question, the independent variable was the feedback condition. Specifically, students in the experimental group received teacher WCF and AWCF, while students in the control group received teacher WCF only. The dependent variable was students' writing quality in the post-test writing task. For the second research question, eight variables associated with CAF measures and students' perceptions of AWE feedback were used to address the relationship between EFL students' degree of satisfaction about AWCF and writing quality. Quantitative and qualitative methods were used to analyze students' post-test writing task and a questionnaire. The qualitative data in this study were used to triangulate the quantitative data in order to gain further insights into the potential effects of AWCF on students' writing quality. To ensure the inter-rater reliability of the error ratio calculation, 15% of the students' essays were scored by another EFL writing instructor in addition to the researcher of the present study. The Pearson correlation coefficient between the two sets of scores was .87. The remaining student essays were then randomly distributed between the two scorers.

Context and Participants
This study was conducted at an institute located in southwest China. The participants were non-English majors with a lower level of English proficiency. They were sophomores with an average age of 19, including 32 male and 35 female students. During the study, they were enrolled in a College English course that was compulsory for all non-English majors in the institute. The course lasted a whole semester of 16 weeks, and the student participants met for 1.5-hr sessions twice a week. An instructor who had been teaching the course for 19 years and held a Ph.D. in applied linguistics taught both classes using similar teaching materials. The participants had been learning English for about 9 years, and none of them had any experience of living or studying abroad. Their majors were electronic business and food science and engineering, and their main goals in studying English were to complete their undergraduate study or to pass the College English Test-Band 4 (CET-4), which is a nationally standardized English test for non-English majors. CET-4 scores have been used by researchers to gauge EFL students' English proficiency because of the well-documented validity and reliability of the CET-4 (e.g., Gao & Min, 2021; S. Huang & Renandya, 2020). As such, the CET-4 scores of the students in the present study were collected and examined, and the results showed that their scores were all below 425 (roughly 50 in TOEFL iBT). In addition, both classroom observation and the scores of their English final exam in the previous semester indicated that the students in this study were at a lower level of English proficiency. None of the students had previous experience of using Grammarly feedback to revise essays.

Research Procedure
In this study, a sample of 67 EFL students was divided into an experimental group of 30 students and a control group of 37 students. The experimental group received WCF from both the teacher and Grammarly, while the control group received WCF only from the teacher. The principal investigator also served as the teacher of both groups to ensure that the students received the same instruction on English learning. Over a span of 12 weeks, the students in both groups were asked to complete four writing tasks (i.e., a pre-test writing task, writing tasks 1 and 2, and a post-test writing task) and two revision tasks (i.e., revisions of writing tasks 1 and 2) in total.
At week 1, the principal investigator, who was also the teacher of the two classes, provided a 40-min training session for the students in the experimental group, who would receive AWCF from Grammarly. The session was intended to familiarize the students with how Grammarly provided feedback on their essays. Specifically, the session was conducted in a language lab where each student had access to a computer. The teacher-researcher used three students' essays as examples to explain the use of Grammarly feedback. Students in the control group were not given the training session since they did not receive Grammarly feedback. After the training session, students in both groups were asked to complete the pre-test writing task to examine whether they started the experiment at a similar level of writing quality. The treatment ran from week 3 to week 9. Specifically, the teacher assigned writing task 1 to the students in both groups at week 3. All students completed the task on paper in class. Then the students in the experimental group submitted their writing to the teacher for higher-level WCF on content and organization before they submitted it to Grammarly for lower-level WCF on spelling, grammar, and punctuation. In contrast, the students in the control group submitted their writing only to the teacher for both higher- and lower-level WCF. The lower-level WCF provided by the teacher was comprehensive and direct because of the potential advantages of direct WCF (see L. J. Zhang & Cheng, 2021) and the ecological validity of comprehensive WCF in writing classes (Q. Liu & Brown, 2015). The other reason for using direct WCF was that both direct WCF and Grammarly feedback are explicit, which makes the two feedback conditions more comparable. Moreover, according to Lee's (2008) study, lower proficiency students generally preferred explicit feedback, such as direct WCF. All students revised their writing based on the given WCF at week 5. The procedure for writing task 1 was repeated for writing task 2, which was assigned at week 7 and revised at week 9. At week 11, all students were asked to produce their post-test writing, which was typed into computers by a research assistant and stored as Word files for later analyses. To check the accuracy of the research assistant's typing, the researcher went through all the typed texts and addressed any typing issues. Then a questionnaire was administered to the students in the experimental group at week 12 (see Table 1).
The directions and topics of the writing tasks were adapted from the writing tasks of past CET-4 tests to ensure the tasks' validity and reliability (Teng & Zhang, 2020). All four tasks were argumentative essays. According to Y. Huang and Jun Zhang (2020), argumentative essays are widely used to assess students' writing proficiency in large-scale English proficiency tests in China, and the CET-4 is one such test for Chinese non-English majors. In fact, non-English majors usually spend a large amount of time practicing argumentative essays to get a good grade in the CET-4 because of the exam-driven educational system in China (L. Zhang, 2016). In addition, an argumentative essay was chosen as the writing task because research has shown that such a task might elicit relatively long or complex sentences from students and that students might tend to generate more language-related issues when they produce such sentences (Q. D. Liu, 2016). In a similar vein, the relatively complex structure of an argumentative essay may pose a challenge to language learners (Connor, 1990; Schiffrin, 1985). Therefore, this study asked students to produce argumentative essays under the assumption that providing more opportunities to practice may help them improve their writing skills in this type of essay.
Questionnaire. A questionnaire was administered to collect student perceptions about the AWE system. The questionnaire consisted of two parts, with the first part including 10 five-point Likert scale questions (see  2017) study. The reasons for adapting Huang and Renandya's questionnaire were two-fold. First, their questionnaire was administered to EFL students who might have a similar profile to the students of this study. Second, the design of their questionnaire was based on several AWCF studies, and the questionnaire was appropriately implemented in their study to elicit EFL students' perceptions of AWCF.
The 10 Likert scale questions covered three constructs: perceived comprehensibility (three items), perceived usefulness of the feedback for revision (three items), and perceived usefulness of the feedback for English writing performance (four items). The internal consistency of the question items was above the standardized benchmark: calculated as Cronbach's alpha, the reliability coefficient was .75 for the first construct, .81 for the second, and .72 for the third. The Likert scale questions were translated into Chinese and were scored from 5 (strongly agree) to 1 (strongly disagree). For the second part, students were allowed to answer the two open-ended questions in English, Chinese, or a mixture of the two so that their perceptions of Grammarly feedback could be gathered more fully.
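The reliability coefficients reported for the three constructs can be illustrated with a minimal sketch of Cronbach's alpha, computed as k/(k - 1) multiplied by (1 minus the sum of the item variances divided by the variance of the total scores). The function name and the response matrix below are hypothetical and are not the study's data; the study itself ran its analyses in SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one construct.

    responses: list of rows, one row per respondent,
    each row holding that respondent's score on every item.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*responses)]
    # Variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert answers from five respondents on a three-item construct.
example = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(example), 3))
```

With real questionnaire data, each construct's items would be passed separately, so three calls would yield the three coefficients.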
Then the students' Chinese answers were translated into English. Following Strauss and Corbin (1998), the answers to the two open-ended questions were analyzed through open coding and axial coding. For open coding, different concepts were identified, coded, and summarized after the students' answers had been read multiple times. Then, for axial coding, the concepts were analyzed and grouped into themes.
To ensure the accuracy of the translation, a professor of translation translated the Chinese version of the questionnaire back into English. Close examination of the two versions showed a high level of accuracy. Then five non-English-major EFL students were asked to complete a pilot test of the questionnaire to check its face validity, after which the wording of several items was adjusted. For the final Chinese version of the questionnaire, a satisfactory reliability was generated with

CAF Measures
A variety of measures were used to investigate the students' English writing quality because of the multicomponential nature of CAF constructs (Housen et al., 2012). Specifically, in line with Norris and Ortega (2009), syntactic complexity was measured through four indices: (1) the mean length of T-units (MLT), (2) dependent clauses per T-unit (DC/T), (3) coordinate phrases per clause (CP/C), and (4) complex nominals per clause (CN/C). These four indices were selected because syntactic complexity has multiple components and its assessment needs to incorporate measures of subordination, coordination, and phrasal complexity (Housen et al., 2012; Johnson, 2017). The L2 Syntactic Complexity Analyzer (L2SCA) was used to compute the four syntactic complexity indices (X. Lu, 2010). Following Link et al. (2022) and Vasylets and Marín (2021), lexical complexity was assessed along the dimensions of lexical diversity and sophistication, and was computed with Coh-Metrix 3. This study employed the measure of textual lexical diversity (MTLD) to address the former and the log frequency of content words (LCW) to address the latter. MTLD was chosen because it is a valid measure of L2 proficiency (Yoon & Polio, 2017) and is least affected by essay length (Mazgutova & Kormos, 2015; McCarthy & Jarvis, 2010). LCW is the mean log frequency of content words in the CELEX database (McNamara et al., 2014). LCW was chosen because it is a more reliable indicator of lexical sophistication than the raw frequency of content words (Kormos, 2011).
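The four syntactic complexity indices are simple ratios of structure counts, which the following minimal sketch makes explicit. It assumes the counts (words, T-units, clauses, dependent clauses, coordinate phrases, complex nominals) have already been obtained from a parser-based tool; the class, the function, and the numbers are hypothetical illustrations and do not reproduce the L2SCA interface or the study's data.

```python
from dataclasses import dataclass


@dataclass
class SyntacticCounts:
    """Structure counts for one essay (hypothetical container)."""
    words: int
    t_units: int
    clauses: int
    dependent_clauses: int
    coordinate_phrases: int
    complex_nominals: int


def syntactic_indices(c: SyntacticCounts) -> dict:
    """Return MLT, DC/T, CP/C and CN/C as ratios of the raw counts."""
    if c.t_units <= 0 or c.clauses <= 0:
        raise ValueError("an essay needs at least one T-unit and one clause")
    return {
        "MLT": c.words / c.t_units,                # mean length of T-unit
        "DC/T": c.dependent_clauses / c.t_units,   # subordination
        "CP/C": c.coordinate_phrases / c.clauses,  # coordination
        "CN/C": c.complex_nominals / c.clauses,    # phrasal complexity
    }


# Hypothetical counts for a single 150-word essay.
essay = SyntacticCounts(words=150, t_units=10, clauses=15,
                        dependent_clauses=5, coordinate_phrases=3,
                        complex_nominals=12)
print(syntactic_indices(essay))
```

Keeping the denominators distinct (T-units for MLT and DC/T, clauses for CP/C and CN/C) mirrors how the indices are defined in the text.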
In line with a number of studies (e.g., Chandler, 2003; Karim & Nassaji, 2020), this study employed an error ratio to examine students' writing accuracy. The error ratio is calculated by dividing the total number of errors in an essay by the total number of words and multiplying the result by 100. Multiple types of errors were counted in this study, including grammar, vocabulary, spelling, and punctuation errors. One advantage of an error ratio is that it takes differences in essay length into account. Finally, fluency was assessed by the total number of words composed by the students within the 30-min time limit.
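As a worked illustration of the accuracy and fluency measures, the sketch below computes an error ratio (errors per 100 words) and a word count. The function names and figures are hypothetical, and splitting on whitespace is an assumed tokenization rule, since the text does not specify how words were counted.

```python
def error_ratio(error_count: int, word_count: int) -> float:
    """Errors per 100 words: (errors / words) * 100."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    if error_count < 0:
        raise ValueError("error_count cannot be negative")
    return error_count / word_count * 100


def fluency(text: str) -> int:
    """Total number of words, using whitespace splitting (an assumption)."""
    return len(text.split())


# Hypothetical essay: 12 errors of any type in 150 words.
print(error_ratio(12, 150))
print(fluency("Online shopping is convenient for college students"))
```

Because the ratio is normalized by length, a 300-word essay with 24 errors receives the same accuracy score as the 150-word essay with 12 errors.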

Treatment
This study used Grammarly to provide WCF for students in the experimental group. Powered by artificial intelligence (AI), Grammarly provides free help with spelling, grammar, and punctuation in students' writing.
Although Grammarly can be accessed through MS or as an app on smartphones, this study was conducted through the Grammarly website (https://app.grammarly.com). This study used the free online version of Grammarly rather than the premium version. The free version has several features that should be noted. First, it gives students' writing a score ranging from 1 to 100 based on writing quality. Second, students can set their writing goals in terms of audience and formality. Third, lines of different colors are used to underline different flaws in student writing, with red lines representing correctness, blue lines representing clarity, green lines representing engagement, and purple lines representing delivery. Fourth, Grammarly provides explicit WCF in terms of correctness. This type of explicit WCF might make it easier for students to correct errors by themselves. It is also worth noting that the explicit WCF is highly accurate when correcting the errors it detects (Paul & Woll, 2020), particularly for error types commonly made by EFL students, such as errors involving determiners, prepositions, and spelling (Ranalli & Yamashita, in press).

Data Analysis
SPSS Version 25 was employed to perform the statistical analyses in this study. Independent samples t-tests were used to answer the first research question, that is, to examine whether there was a difference in writing quality between the two groups in the post-test writing task. Pearson product-moment correlation analyses were used to answer the second research question, which explored the relationship between EFL students' degree of satisfaction about AWCF and writing quality. The degree of satisfaction was derived from students' answers to the ten Likert-scale questions, which formed the first part of the questionnaire. Qualitative analyses of the two open-ended questions in the second part of the questionnaire were conducted to answer the third research question, through categorization of the data and identification of emerging themes (L. J. Zhang & Cheng, 2021).

Compare Writing Quality Between the Experimental Group and the Control Group
To ensure the comparability of the two groups' writing quality at the beginning of the intervention, an independent samples t-test was conducted on the students' pre-test writing task. The results indicated that there was no significant difference between the experimental group (M = 71.700, SD = 11.55) and the control group (M = 73.676, SD = 7.83) in writing quality (t(65) = −0.832, p > .05). Then a series of independent samples t-tests was computed to address the first research question, which asked whether Grammarly feedback combined with teacher WCF could lead to better writing quality than teacher WCF only. Tables 3 and 4 show the descriptive statistics and Levene's test results, respectively. Levene's test results revealed that the equal variances assumption held between the two groups for MLT, DC/T, and CP/C in the syntactic complexity measure, for MTLD and LCW in the lexical complexity measure, and for both error rates and word count in the measures of accuracy and fluency. However, the equal variances assumption was violated for CN/C in the syntactic complexity measure. Therefore, to examine whether there were significant differences in the eight variables between the two groups, independent samples t-tests were conducted for MLT, DC/T, CP/C, MTLD, LCW, error rates, and word count, while a Mann-Whitney U test was applied for CN/C. The results are shown in Table 5. Because eight variables were tested, both the independent samples t-tests and the Mann-Whitney U test used a Bonferroni-adjusted significance level of .006 (.05/8).
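The pre-test comparison and the adjusted significance level can be checked by hand from the summary statistics given in the text. The sketch below is a minimal illustration, assuming the pooled-variance (equal variances) form of the independent samples t-test; the function names are hypothetical, and the study's own analyses were run in SPSS.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent samples t statistic from summary data.

    Returns the t statistic and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after dividing alpha by the number of tests."""
    return alpha / n_tests


# Pre-test means, SDs and group sizes as reported in the text.
t_stat, df = pooled_t_from_summary(71.700, 11.55, 30, 73.676, 7.83, 37)
print(f"t({df}) = {t_stat:.3f}")                                # t(65) = -0.832
print(f"adjusted alpha = {bonferroni_alpha(0.05, 8):.5f}")      # adjusted alpha = 0.00625
```

The negative sign simply reflects that the experimental group's pre-test mean was slightly lower than the control group's; the magnitude is far below any conventional critical value.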

Examine the Relationship Between Chinese EFL Students' Degree of Satisfaction About AWCF and Writing Quality
A Pearson correlation was calculated to examine the relationship between the variables of the CAF measures and students' perceptions of Grammarly feedback (see Table 6). Weak, non-significant correlations were found for the variables of DC/T (r(28) = .208, p > .05), CP/C (r(28) = .247, p > .05), CN/C (r(28) = .158, p > .05), MTLD (r(28) = .155, p > .05), LCM (r(28) = .228, p > .05), errors (r(28) = .099, p > .05), and fluency (r(28) = .159, p > .05). Students' perceptions of Grammarly feedback were thus not related to these seven variables of the CAF measures. In contrast, a moderate positive correlation was found for the variable of MLT (r(28) = .436, p < .05) under the measure of syntactic complexity, indicating a significant linear relationship between MLT and students' perceptions of Grammarly feedback. Students who had more positive attitudes toward Grammarly feedback tended to have higher MLT values in their writing.
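Whether a reported correlation reaches significance at the .05 level can be checked by converting r to a t statistic, t = r * sqrt(df / (1 - r^2)), and comparing it with the critical value. The sketch below takes the reported coefficients and the reported df of 28 as given, and uses the standard two-tailed critical value of 2.048 for df = 28. The dictionary labels and function name are introduced here for illustration only.

```python
import math

DF = 28
T_CRIT = 2.048  # two-tailed critical t for alpha = .05 at df = 28 (t-table value)


def r_to_t(r, df):
    """Convert a Pearson r to the equivalent t statistic."""
    return r * math.sqrt(df / (1 - r ** 2))


# Correlations as reported in the text above.
reported_r = {
    "MLT": 0.436, "DC/T": 0.208, "CP/C": 0.247, "CN/C": 0.158,
    "MTLD": 0.155, "LCM": 0.228, "Errors": 0.099, "Fluency": 0.159,
}

significant = [name for name, r in reported_r.items()
               if abs(r_to_t(r, DF)) > T_CRIT]

# Smallest |r| that would be significant at this df and alpha.
r_crit = T_CRIT / math.sqrt(T_CRIT ** 2 + DF)

print(significant, round(r_crit, 3))
```

With these inputs only MLT exceeds the threshold, which matches the pattern of results reported above. The check uses the unadjusted .05 level, as the correlation analysis in the text does.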

Investigate Students' Perceptions About Grammarly Feedback
The third research question asked how students perceived Grammarly feedback on their writing. To answer this question, a questionnaire consisting of fixed-response and open-ended questions was administered to the students who received both teacher and Grammarly feedback. Twenty-nine students submitted their responses to the questionnaire, as one student did not respond. Their responses to the fixed-response questions are summarized in Tables 7 to 9, while their responses to the open-ended questions are summarized in Table 10. As Tables 7 to 9 show, more than half of the students strongly agreed or agreed that they could understand Grammarly feedback (58.6%) and that they knew how to revise based on Grammarly feedback (62.1%). A similar proportion of students also strongly agreed or agreed that Grammarly feedback could help them correct grammar mistakes (65.5%), get higher scores for their compositions (62%), improve the quality of their compositions (65.5%), realize their writing problems (65.5%), and improve their grammar (51.7%). In contrast, less than half of the students strongly agreed or agreed that Grammarly feedback was clear (48.2%), could help them enlarge their vocabulary (37.9%), and could enhance their writing performance (44.8%). The results revealed that although the majority of students noted that Grammarly feedback was beneficial to their writing, there were still quite a few students who did not hold positive attitudes toward the feedback.
Table 10 summarizes student responses to the two open-ended questions in the questionnaire. Twenty-nine students answered the two questions. For the first question, 11 (38%) students claimed that they most liked the grammar and vocabulary feedback offered by Grammarly, seven (24%) students most liked grammar feedback, nine (31%) students most liked vocabulary feedback, and two (7%) gave no response. For the second question, six (20%) students stated that they least liked convention and punctuation feedback, three (10%) students least liked vocabulary feedback, three (10%) students least liked the operational system of Grammarly, two (7%) students least liked grammar feedback, two (7%) students least liked the premium function of Grammarly, one (3%) student least liked the scoring function of Grammarly, and 12 (41%) students offered no response. The results suggested that while most students were in favor of certain type(s) of feedback provided by Grammarly, more than half of the students were still unsatisfied with some functions offered by Grammarly.
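The tallies reported for the two open-ended questions can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the dictionary keys are shorthand labels introduced here. It confirms that the categories for each question sum to the 29 respondents and computes the share of students who named a least-liked feature.

```python
RESPONDENTS = 29

# Counts as reported for the "like most" question.
liked_most = {
    "grammar and vocabulary": 11, "grammar": 7,
    "vocabulary": 9, "no response": 2,
}

# Counts as reported for the "like least" question.
liked_least = {
    "convention and punctuation": 6, "vocabulary": 3,
    "operational system": 3, "grammar": 2, "premium function": 2,
    "scoring function": 1, "no response": 12,
}


def share(count, total=RESPONDENTS):
    """Percentage of respondents represented by a count."""
    return 100 * count / total


totals_match = (sum(liked_most.values()) == RESPONDENTS
                and sum(liked_least.values()) == RESPONDENTS)

# Students who named at least one least-liked feature.
named_dislike = RESPONDENTS - liked_least["no response"]

print(totals_match, named_dislike, round(share(named_dislike), 1))
```

On these counts, 17 of the 29 respondents (roughly 59%) named a least-liked feature, which appears to be the basis for the statement that more than half of the students were unsatisfied with some function.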

Discussion
The primary goal of the present study was to explore whether AWCF combined with teacher WCF could lead to better writing quality than teacher WCF only for EFL students with a lower level of English proficiency. The findings revealed that students who received AWCF with teacher WCF might not outperform students who received teacher WCF only in writing quality. The results were in line with previous research that reported the ineffectiveness of AWCF on student writing (S. Huang & Renandya, 2020; Ware, 2014; Wilson & Czik, 2016), and contradicted previous research that reported the effectiveness of AWCF on student writing (Barrot, 2023; Thi & Nikolov, 2022).
There might be three possible explanations for the findings of this study. First, students' lower proficiency levels could prevent them from benefiting from Grammarly feedback (Ghufron & Rosyida, 2018; Koltovskaia, 2020; Lin & Griffith, 2014; Shang, 2022). Grammarly feedback is provided in English, which is not the students' mother tongue, so it may be hard for them to process the feedback effectively. This assumption is supported by some students' comments on the feedback. For example, S22 commented that ''sometimes I have to use translator to help me understand Grammarly feedback because it is in English.'' S23 said that ''I cannot revise my writing based on Grammarly feedback because I cannot make right revision whatever I do.'' According to sociocultural theory, for feedback to be helpful for students' writing, it should fall within their zone of proximal development (ZPD), which is defined by Vygotsky (1978) as ''the distance between the actual developmental level as determined by [students'] independent problem solving and the level of potential development as determined through problem solving [with the help of more advanced external sources]'' (p. 84). In the present study, students' drafts submitted to Grammarly could be seen as reflecting their current linguistic knowledge, and revisions based on Grammarly feedback could be regarded as drafts that they can produce with the help of the feedback. In other words, feedback from Grammarly may serve as guidance that helps students accomplish something they cannot yet accomplish independently. Grammarly feedback might thus be used as a tool to bridge the gap between students' current writing level and an ideal level. However, if students fail to understand Grammarly feedback, which is a preliminary step for the ZPD to work, it is unlikely that the feedback could serve as this type of bridge.
Second, students' unfamiliarity with Grammarly might be another reason why they did not benefit from the feedback. For example, S20 commented ''it is inconvenient for me to use Grammarly because I am not familiar with it.'' In fact, EFL students in China most frequently use Pigai, which provides feedback in students' mother tongue. Similarly, S22 said ''I only used Grammarly twice, so I do not even know what types of feedback it provides. Sometimes the feedback on formatting, such as space, overwhelms me since I do not pay much attention to space issues when writing on paper.'' S3 commented ''I do not like Grammarly because it does not provide keyboard for me to write on it with my smartphone. In addition, Grammarly is not made in our country, and sometimes I cannot logon it due to issues of internet connection.'' Previous studies have shown a relationship between the characteristics of web-based learning systems and students' perceived ease of use (Ke et al., 2012; Nikou & Economides, 2017; Zhai & Ma, 2022), indicating that students may not want to use a learning system if they think it is not easy to use. Moreover, if students are unfamiliar with or overwhelmed by AWE feedback, they may feel demotivated to use it (e.g., Sommers, 2013; Wilson et al., 2021), which, in turn, can make it hard for them to benefit from the feedback. According to social cognitive theory (Bandura, 1977, 2012), this type of demotivation may be detrimental to student learning.
Third, research has shown that students may not attend to AWE feedback if they perceive it as not being useful for their writing (R. Li et al., 2019; Zhai & Ma, 2022). In my study, the questionnaire findings suggested that the majority of students may not have attended to Grammarly feedback, since more than half of the students did not strongly agree or agree that Grammarly feedback was useful in enhancing their writing performance. This lack of attention might be one reason why the feedback was not beneficial to the students' writing. Long (1996) noted the necessity of taking learners' attention into account when considering the relationship between the positive or negative evidence available to language learners and their language acquisition. Referring to the interaction hypothesis, Schmidt (1995, 2001) has discussed why attention is crucial for language acquisition based on the notion of awareness, which consists of two levels: noticing (i.e., a lower level of awareness) and understanding (i.e., a higher level of awareness). In the EFL writing domain, the level of noticing is for learners to be aware of any new information provided, while the level of understanding is for them to revise their language errors in writing (Bitchener, 2017). It is at the level of noticing that attention plays an indispensable role. According to Long (1996), attention is a prerequisite for noticing. In my study, the students may not have paid sufficient attention to Grammarly feedback because they perceived it as not being useful. Their perception might therefore have made it less likely for them to learn from the feedback and improve their writing quality, since they might not even have noticed the feedback.
In addition to exploring the effect of Grammarly feedback on students' writing, this study also examined the relationship between students' perception of the feedback and their writing quality in eight variables of the CAF measures, with no such relationship found in seven variables. This finding was partially supported by the finding of Sinha and Nassaji (2022) that no correlation was observed between students' feedback perception and their writing accuracy, and by the finding of Shang (2022) that no correlation was observed between students' feedback perception and the syntactic complexity and grammatical accuracy of their writing. In contrast, this finding contradicted that of Rummel and Bitchener (2015), who claimed a connection between students' perception and their feedback retention. The reason for the contradiction might lie in the different ways participants were assigned in the two studies. Specifically, in Rummel and Bitchener's (2015) study, the student participants were divided into different feedback groups based on their feedback preferences, while the current study randomly divided the student participants into different feedback groups. In this case, the students in the current study were assigned to receive Grammarly feedback not because they preferred the feedback, but because they were asked to receive it. Therefore, it is possible that no relationship was observed between students' perception and their writing quality in seven variables simply because they did not want to receive the feedback. In addition, the current study also demonstrated a significant correlation between students' perception and the variable of MLT in students' writing, indicating that the majority of students favored the idea that Grammarly feedback could prompt them to produce long sentences. Indeed, several characteristics of AWE systems, such as immediate direct feedback (Dikli, 2006) and multiple revision opportunities (Warschauer & Ware, 2006), make it more likely for students to produce long sentences after revising their writing. This finding, however, is not consistent with the finding of Shintani's (2016) study, which indicated that corrective feedback may not lead to better quality of student writing in syntactic features.

Conclusion, Limitations, and Implications
This study contributes to the research on the effect of AWCF (i.e., Grammarly feedback) on EFL students' writing quality by employing a sequential explanatory mixed-methods design. The results mainly reveal that the students receiving both AWCF and teacher WCF may not outperform the students receiving only teacher WCF in writing quality. Moreover, students' answers to the questionnaire consisting of fixed-response and open-ended questions provide further insights into why the addition of AWCF to teacher WCF did not result in a significant improvement in the students' writing quality.
The present study has several limitations. First, this study did not examine how students implemented AWCF in their revisions, so it is hard to know how they made use of the feedback and what obstacles they encountered when revising. Future research can explore how AWCF affects the amounts and types of errors in student revisions to gain a deeper understanding of the feedback, and can use think-aloud protocols to identify patterns in how students process the feedback they receive. Second, research has validated the potential role of individual differences (e.g., motivation and working memory) in student writing (see Kormos, 2012). However, this study did not trace the possible changes in such differences resulting from the provision of AWCF. Future research, for example, can investigate whether providing students with the feedback can enhance their writing motivation and consequently improve their writing quality. Third, the sample in this study consisted of students with a lower level of English proficiency, so the findings of this study cannot be generalized to students with other proficiency levels. Fourth, this study only examined the effects of AWCF on argumentative writing. Given the importance of considering the effect of writing genre and tasks (e.g., Graham et al., 2016; Schoonen, 2012), future studies could explore the effect of AWCF across different writing genres and tasks.
Despite the limitations, the present study could offer several implications for EFL writing teachers. First, the integration of AWCF into EFL writing instruction does not necessarily lead to the improvement of students' writing quality. Thus, teachers should be cautious about introducing AWE systems to EFL students, particularly to students with lower proficiency levels (Xu & Zhang, 2022). If they do want to apply AWCF to their writing instruction, they may need to think of effective ways to help students benefit most from it. For example, teachers may want to introduce students with lower proficiency levels to AWE systems that operate in their mother tongue so that it could be easier for them to understand the feedback provided. Second, based on the students' responses to the fixed-response questions in the questionnaire, it seemed that more than half of the students were in favor of a potentially positive role of AWCF in their writing. However, because of their lower level of English proficiency, they might not be ready to apply the feedback effectively when revising their writing, which, in turn, made it unlikely for them to improve their writing quality. Thus, when asking students to receive AWCF, teachers may want to monitor their revision process and give timely support to individual students. For example, in EFL settings with relatively small class sizes, teachers can provide individualized feedback for students through one-on-one writing conferences. In addition, students may be required to revise their writing multiple times to enhance their agency (Liao, 2016) and engagement with the feedback (Z. Zhang, 2020). Third, according to the students' responses to the questionnaire, nearly all students mentioned language-related issues (e.g., grammar and vocabulary) as their obstacles in writing, with no students mentioning the importance of organization and content in their writing. However, these aspects are of great significance in making a good piece of writing. In this case, teachers need to make students aware of the importance of these aspects in writing so that they are more likely to develop the ability to produce writing with appropriate organization and content.

and patients. You should write at least 120 words but no more than 180 words.
Whether to take a job or go to a graduate school? (argumentative essay) Directions: Suppose you have two options upon graduation: one is to take a job in a company and the other to go to a graduate school. You are to make a choice between the two. Write an essay to explain the reasons for your choice. You should write at least 120 words but no more than 180 words in 30 min.
The importance of lifelong learning (argumentative essay) Directions: For this part, you are allowed 30 min to write an essay commenting on the saying ''Learning is a daily experience and a lifetime mission.'' You can cite examples to illustrate the importance of lifelong learning. You should write at least 120 words but no more than 180 words.

Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Table 1. The Procedure Used in This Study.

Table 2), and the second part including two open-ended questions: 1. What do you like most about Grammarly? Why? 2. What do you like least about Grammarly? Why? The first part was adapted from S. Huang and

Table 2. Likert-scale Questions Used in This Study.

Table 3. Descriptive Statistics of the CAF Measures From the Experimental and Control Groups.

Table 4. Levene's Test Results of the CAF Measures Between the Two Groups.

Table 8. Perceived Usefulness of Grammarly Feedback for Composition Revision.

Table 9. Perceived Usefulness of Grammarly Feedback for Enhancing Writing Performance.

Table 10. Results of Student Responses to the Open-ended Questions in the Questionnaire.