Exploring Metalinguistic Knowledge of Low to Intermediate Proficiency EFL Students in Japan

In line with recent studies recognizing a positive correlation between metalinguistic knowledge and language proficiency, this study aimed to find out how much metalanguage is understood by low to intermediate proficiency English as a Foreign Language (EFL) learners in Japan. A very simple metalanguage test was designed and administered to 1,180 non-English majors at two Japanese private universities. For some of the participants, metalinguistic ability was compared with standardized test scores. For the 639 participants who took the Test of English for International Communication (TOEIC) Bridge® test, metalinguistic ability correlated most strongly with reading scores (r = .66), and for the 87 participants whose scores from the Visualizing English Language Competency (VELC) test were available, the strongest correlation was also with the reading section (r = .80). The results revealed that even very simple metalanguage, such as noun, adverb, and article, was not recognized by many of the participants.


Introduction
In Japan, the number of 18-year-olds has been steadily decreasing since the 1990s. As the pool of possible candidates diminishes, many schools have had to compete for new students by diversifying their entrance selection methods and, often, lowering their academic entrance standards. According to Japan's Ministry of Education, Culture, Sports, Science and Technology (2012), 92% of students who applied to junior colleges, colleges, or universities managed to enter those schools, whereas only 62% had succeeded in 1990. This has left many universities having to deal with students who are not academically ready for university studies. Ishii, Shiina, Maeda, and Yanai (2007) reported that 87% of the 11,481 university teachers surveyed recognized declining ability among students. Of the 87%, 53% felt that this decline was causing some problems, and nearly 8% felt that the problem was serious enough to disrupt the operation of classes. In addition to the lowered academic standards for incoming university students, changes in Japan's English education seem to have affected students' basic knowledge of English. The new Courses of Study for junior and senior high school English courses, which emphasize communication, were introduced in 1996. Saita (2003), who tracked prefectural English examination results of entering high school students from 1995 to 2002, reported that English proficiency levels steadily declined during the 8 years, with the biggest drop in 1998, when the new students had gone through all 3 years of junior high school education under the new Courses of Study. According to a report by the National Institute of Multimedia Education, of approximately 3,500 incoming university freshmen at 26 different schools who took their English placement test, only 24% of non-English major students at private universities had a high school graduate level of English proficiency, and 33% had less than a junior high school graduate level (Ono, 2005).
These reports are based on paper-based tests of vocabulary and grammar (Saita, 2003) and of listening, vocabulary, grammar, and reading (Ono, 2005), so they do not directly reflect the communication skills of the test takers. However, there is little data suggesting improved English communication proficiency among Japanese university students, and with declining ability in basic vocabulary and grammar, it does not seem reasonable to hope that communication skills alone may have improved (Shite, 2007). The problem here may be that communication-oriented English education replaced the more traditional grammar instruction instead of being added to it.
1 Fukuoka University, Japan
An extensive amount of research on the effects of different types of classroom instruction on language acquisition was conducted in the 1980s and 1990s, and Norris and Ortega (2001) conducted a research synthesis and meta-analysis of 77 studies published in 21 different journals and a number of edited books between 1980 and 1998. Comparisons of average effect sizes from 49 studies with sufficient data indicated that "focused L2 instructional treatments consistently outperformed a range of control/comparison or baseline conditions by an average of nearly one standard deviation unit (d = 0.96)" (p. 192), and in both Focus on Form and Focus on Forms interventions, "treatments involving an explicit focus on the rule-governed nature of L2 structures are more effective than treatments that do not include such focus" (p. 195).
Japanese students have little opportunity to use or be exposed to English outside their classes. In a survey of 3,700 high school students (Benesse, 2006), only 6% answered that they had ever spoken on the phone in English, 18% had written an email in English, and 21% had read something on the Internet written in English. In such an English as a Foreign Language (EFL) environment, explicit instruction plays a crucial role in maximizing learners' language acquisition through practice. In university English classes, difficulty arises when students lack basic knowledge that teachers assume to have been covered in junior and senior high schools. These students not only lack English proficiency but also struggle to understand the metalanguage used to describe the target language in classrooms and in textbooks.
Metalinguistic knowledge is "knowledge of technical terms such as 'verb complement' and semi-technical linguistic terms such as 'sentence' and 'clause'" (Ellis, 2009, p. 38).
To explain the answers for such questions, metalanguage is essential. English translations of the explanations given in the workbook are as follows:
1. As the blank is followed by a noun, "accents," it requires an adjective, "regional," to modify the noun.
2. The phrase "in the last week" at the end of the sentence indicates a period of time in the past, and the "have" in front of the blank shows that it is a present perfect sentence. Present perfect tense is have + past participle, thus the past participle "grown" is required.
Without basic metalinguistic knowledge such as noun, adjective, and present perfect, learners will have difficulty understanding explanations by teachers and textbooks for such test items. Before metalanguage can be used to describe another language, metalanguage itself needs describing and explaining to be understood by the audience (Berry, 2005).
Many recent studies investigating correlations between language proficiency and metalinguistic knowledge have found positive correlations between the two. Elder and Manwaring (2004) found significant correlations (r = .69-.76) between an L2 metalinguistic knowledge test and L2 class achievement of students who were studying Chinese at a university in Australia. Renou (2001), in her study of 64 university-level learners of French, compared the results of oral and written grammaticality judgment tests and a French proficiency test. Highest correlations were found between their judgment tests and cloze sections of proficiency tests, which tested learners' knowledge of vocabulary, grammar, and structure. Roehr (2008) compared the results of a German language test and a metalanguage test given to 60 university students studying German at a British university and found a strong correlation (r = .81). Elder (2009) compared the results of English metalinguistic knowledge tests and three English language proficiency tests: computer-based Test of English as a Foreign Language (TOEFL CBT); the Diagnostic English Language Needs Assessment (DELNA), used at the University of Auckland; and International English Language Testing System (IELTS). Correlations were significant with DELNA's reading component (r = .36), all parts of IELTS, especially the reading section (r = .54), and all parts of TOEFL CBT® (listening r = .49, reading r = .57, structure/ writing r = .57, and total r = .61).
Although many of these previous studies measured metalinguistic knowledge through learners' ability to find and/or describe errors in ungrammatical sentences alone or in addition to their ability to identify parts of speech, the first process may be too difficult for low-proficiency L2 learners. Iida, Teele, and Kuwayama (2005) found that their metalanguage test, which asked the participants to find, correct, and describe errors in sentences, was too difficult for low intermediate participants with an average TOEIC® score of 414. As the English proficiency level of the participants of this study ranged from very low to intermediate level, a simpler metalanguage test was necessary.

Research Questions
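This study aimed to investigate the following two research questions:
1. Is there a correlation between English proficiency test scores and metalinguistic knowledge among low proficiency students?
2. What metalinguistic features can low proficiency students typically recognize?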

Research Method
Participants
A metalanguage test was administered to more than 1,200 first- and second-year non-English majors at two private Japanese universities. English is a compulsory subject in Japanese secondary schools, so the participants had had at least 6 years of formal English education. Removing non-Japanese students, those who completed only the first page of the double-sided test form, and those who did not sign the consent form left 1,180 participants.

Metalanguage Test
The metalanguage test developed by the author had 36 items in four sections: (a) parts of speech; (b) parts of sentences; (c) tenses, voices, and moods; and (d) other. Participants were asked to look at English sentences and choose a term which best described the underlined part or the whole sentence as below: Choose the name of parts of speech for the underlined word. Some choices may be used more than once, and some choices may not be used at all.
For University B, whose students were expected to have higher English proficiency than students at University A, another section with 10 items was added. This section was conducted as a pilot for further research on the relationship between metalinguistic knowledge and grammaticality judgment tests. Following the format used in Elder (2009), and with the possibility of increasing the number of participants in a future study in mind, a multiple-choice format was used. Participants were asked to look at the underlined parts of English sentences and choose the answer that most appropriately described the error in each sentence, as below: Some sentences below are grammatically incorrect. Choose the most appropriate explanation of the error for each sentence. Some choices may be used more than once, and some choices may not be used at all. (See Table 6 for answer choices.)
1. I have a friend who brother is a famous singer.
2. If I see Tom tomorrow, I will ask him about it.
Answer choices included "the sentence is grammatically correct," and 4 of the 10 sentences were in fact correct.
Instructions and answer choices were given in Japanese, as the purpose of this study was to assess the participants' L2 (English) metalinguistic knowledge in L1 (Japanese). Most of the metalinguistic terms tested were sampled from the TOEIC Bridge® official workbook (Educational Testing Service, 2008) and the TOEIC Bridge® official preparation guide (Educational Testing Service, 2002), both of which gave explanations for questions in Japanese. Additional terms that were closely related to the terms used in those books were added. All items were multiple choice questions testing receptive knowledge of metalanguage without any productive items, as Ellis (2004) points out that measuring receptive rather than productive knowledge of metalanguage may have greater validity because "it is learners' understanding of explicit linguistic constructs rather than their ability to articulate metalinguistic rules that is important where language acquisition and use are concerned" (p. 267). The same answer choices were used for all questions in each section, and all sections' answer choices included I don't know to minimize the effect of guessing. Students were encouraged to choose I don't know as the results were intended not only for research but also to understand students' needs in classroom and textbook terminology.
Only simple, high-frequency words were used to minimize test difficulty. A vocabulary analysis using Vocabulary Profile (Cobb, n.d.; Heatley & Nation, 1994) showed that the test consisted mostly of words from the first 1,000 (87%) and second 1,000 (6%) word bands of the General Service List (West, 1953). Although the rest (7%) were categorized as off-list words, these were Tom, New York, movies, and the common Japanese names Keiko, Tama, and Yumiko. It is unlikely that these interfered with participants' understanding of the sentences.
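As a rough illustration of how a band-coverage figure of this kind can be computed, the sketch below counts the share of tokens that fall into each frequency band. It is not the Vocabulary Profile tool itself: the function name, the two tiny word sets, and the sample sentences are hypothetical placeholders and do not reproduce the General Service List or the actual test text.

```python
import re
from collections import Counter


def band_coverage(text, bands):
    """Return the proportion of tokens in each band, plus an 'off-list' share.

    `bands` maps a band name to a set of lowercase words. A token is
    assigned to the first band that contains it.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    names = list(bands) + ["off-list"]
    if not tokens:
        return {name: 0.0 for name in names}
    counts = Counter()
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                counts[name] += 1
                break
        else:
            # No band contained the token.
            counts["off-list"] += 1
    return {name: counts[name] / len(tokens) for name in names}


# Hypothetical toy bands; real word bands contain about 1,000 families each.
bands = {
    "first_1000": {"the", "is", "a", "teacher", "box", "small"},
    "second_1000": {"tomorrow"},
}
profile = band_coverage("Tom is a teacher. The box is small.", bands)
```

In this toy input, the name Tom is the only off-list token, which mirrors the observation that the off-list share of the test consisted largely of proper names.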

Rasch Analysis
Rasch analysis using the Winsteps® software package (Linacre, 2010) was conducted on the data from the metalanguage test. Rasch analysis produces measures of item difficulty and person ability on a common equal interval scale measured in log-odds units, or logits. Figure 1 shows the Winsteps variable map with persons ranked by ability on the left and items by difficulty on the right. A position higher on the scale represents greater ability for a person or greater difficulty for an item. When person ability precisely matches item difficulty, the person has a 50% expectation of success. The M on each side of the axis shows the mean of person ability and item difficulty. Participants' mean ability was higher than mean item difficulty, and it can be seen that persons are distributed a little higher than items. As this test was intended to measure metalanguage recognition by low to intermediate proficiency learners, the scarcity of more difficult items does not significantly affect the quality of this study. All this means is that higher proficiency students did not have problems with the metalanguage that appeared in this metalanguage test.
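The relationship between person ability and item difficulty described above can be sketched with the dichotomous Rasch model, in which the probability of success depends only on the difference between the two values in logits. The code below is a minimal illustration, not the Winsteps estimation procedure. The function name and the person located at 0 logits are assumptions made for the example, while the item difficulties of 3.39 and −3.07 logits are the highest and one of the lowest values reported for this test.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Ability equal to difficulty gives a 50% expectation of success.
p_equal = rasch_probability(0.0, 0.0)

# Hypothetical person at 0 logits facing the hardest and an easy item.
p_hard = rasch_probability(0.0, 3.39)
p_easy = rasch_probability(0.0, -3.07)
```

Because the scale is in log-odds units, a gap of about 3 logits in either direction moves the expected success rate to below 5% or above 95% for such a person.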

Correlations With Proficiency Tests
The 639 participants' TOEIC Bridge® test total scores ranged from 64 to 164, with a mean of 117.0 and a standard deviation of 19.8. According to Educational Testing Service (2006), 110 to 120 on TOEIC Bridge® is considered equivalent to 280 to 310 on TOEIC®. The 86 participants' VELC test total scores ranged from 326 to 583, with a mean of 444.1 and a standard deviation of 58.4. Although VELC test scores can only be converted to TOEIC® scores individually, using all six sub-scores of the two-section test, the conversion guideline (Research group for VELC, 2013) shows that 500 and 400 on the VELC test are approximately 450 and 330 on the TOEIC® test. This indicates that the average English proficiency of the 86 participants from University B is higher than that of the 639 participants from University A.
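The relationships between metalanguage measures and proficiency scores in this section are summarized with Pearson correlation coefficients. As a minimal sketch of that computation, the code below implements the standard formula directly. The function name and the five data pairs are invented for illustration and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: Rasch person measures (logits) and reading scores.
metalanguage_measures = [-1.2, -0.4, 0.3, 0.9, 1.8]
reading_scores = [40, 46, 52, 50, 62]
r = pearson_r(metalanguage_measures, reading_scores)
```

A value of r near 1 would indicate that participants with higher metalanguage measures also tend to have higher reading scores; the coefficient alone says nothing about the direction of any causal link.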
A Pearson correlation was used to analyze the correlations between the participants' measures from the metalanguage test and the listening, reading, and total scores from the TOEIC Bridge® test and the VELC test. Moderate to high correlations were found between the metalanguage test results and all sections of both proficiency tests (Table 1). In accordance with findings from previous studies (Elder, 2009; Elder & Manwaring, 2004; Renou, 2001; Roehr, 2008), the highest correlations were found with the reading sections (r = .66 for the TOEIC Bridge group; r = .80 for the VELC group), which include items testing knowledge of grammar and vocabulary.

Parts of Speech Items
Table 2 lists the results of the parts of speech section of the metalanguage test from the most difficult to the easiest item. The two most difficult items were slowly in "The teacher speaks slowly" and well in "Yumiko can speak English well." The most common erroneous response for both adverb items was adjective. There were two items testing for adjectives, kind in "The teacher is kind" and small in "There is a small box on the desk." Although the difficulties of those items were not as high as those of the adverb items, they were commonly mistaken for adverbs. This confusion between adjectives and adverbs may be considered negligible, especially for low-proficiency learners. However, the difference is often tested in English proficiency tests such as TOEIC® and TOEIC Bridge®. Also, because the differences in their forms are not salient and are often redundant to sentence meaning, they are difficult for learners to notice and require instruction for more successful language learning (Schmidt, 1993, 2001).
The third most difficult item was the in "The box is small," with only 804 correct responses out of 1,180. The most common wrong response was preposition, probably because prepositions are also small function words. The difficulty that Japanese learners have in using English articles has long been recognized and researched (e.g., Akamatsu & Tanaka, 2008; Butler, 2002; Yamada & Matsuura, 1982), and because of this difficulty, special attention is usually given to the teaching of articles. However, it seems futile to try to teach or learn how to use articles when the learners do not know which words are articles, especially when there are only three of them: a, an, and the.
There were two auxiliary verb items, "I will call you tomorrow" and "Yumiko can speak English well," and they were the fifth and sixth most difficult items in this section. The preposition on in "There is a small box on the desk" was one of the easiest items in this section, with 967 correct responses. The easiest items in this section were the three nouns, but even those did not have 100% recognition. While 1,152 participants correctly recognized the name Tom in "Tom is a teacher" as a noun, correct responses for box in "The box is small" decreased to 1,105, indicating that some participants confuse nouns with names. The word teacher in "Tom is a teacher" had even fewer correct responses, probably because of the location of the word in the sentence. Some participants seem to assume that nouns are always subjects.

Parts of Sentence Items
The parts of sentence section was included because the five sentence patterns S (subject) + V (predicate verb), S + V + C (complement), S + V + O (object), S + V + O + O, and S + V + O + C are very commonly used in Japan to teach English sentence structures. Table 3 lists the parts of sentence items in order from the most difficult to the easiest. The seven items tested only complements and objects because subjects and predicate verbs were found to be too easy during piloting. The results show that the majority of participants had difficulty distinguishing between complements and objects, which probably requires grammatical knowledge rather than metalinguistic knowledge alone. The bigger concern here is that lower ability participants chose subjects and predicate verbs, suggesting that those participants did not know what the subjects and verbs of sentences were. Whether the ability to distinguish those five sentence patterns can help improve learners' language skills is an issue beyond the scope of this study, but not recognizing the names of these parts could hinder learning, as they are very frequently used in explanations.

Tense, Mood, and Voice Items
Table 4 lists items from the third section of the metalanguage test (tenses, moods, and voices), also in order of item difficulty. As expected, causative was the most difficult, with only 127 correct responses and the highest difficulty of the whole test, 3.39 logits. The most common erroneous response for this item was past perfect (n = 844). As past perfect is usually taught and memorized as "had + past participle," it is understandable that this item was wrongly recognized as one. At the same time, this suggests possible overgeneralization of grammar terms, of which there are more examples in this section. There was another causative item, "She made her brother clean her room," but this item was significantly easier than the first item, possibly because of the presence of two people.
The common errors for this item were passive (n = 105) and past perfect (n = 68). Distinguishing between sentences with past participles seems to be the problem, as the passive sentence "The book was written by a famous writer" and the present perfect sentence "I have finished my lunch" both had past perfect as the most common error. These overgeneralizations could be a result of memorizing grammar rules without enough input or meaningful output practice. The second and third most difficult items were participle construction sentences, one with a past participle and the other with a present participle. For both items, many participants chose imperative. Imperative sentences were not included as test items because the construction of imperative sentences, starting with a base form of a verb with no subject, seemed too simple. However, the results suggest that participle construction sentences were incorrectly recognized as imperative sentences because they do not have a subject at the beginning and start with participles, which look like verbs. Although it is not a very easy sentence structure for Japanese learners of English to use, the subjunctive sentence "If I were you, I would study harder" was the second easiest item in this section and the fifth easiest in the whole test. The presence of if most likely helped the recognition.
The item with the lowest difficulty in this section was "I will call you in the morning," with 1,113 out of 1,180 participants correctly recognizing it as a future sentence. In the first section of the test, however, will as an auxiliary was not so well recognized. While the item difficulty of will in a future sentence was −3.07 logits, the difficulty of will as an auxiliary was −0.69 logits. This suggests that the future function of the word will was more easily recognized than the grammatical term for it. This could be a result of more communication-focused instruction in secondary schools, where understanding the meaning is more useful than knowing the linguistic term for the word. However, not knowing the grammatical function of the word could mean that learners are not able to generalize its usage (i.e., will is followed by a base form of a verb, it is inverted to the front of the sentence for questions, etc.) to other auxiliary verbs.

Other Items
The last section intentionally included items for which confusion with similar terms was expected, and the results (Table 5) were not surprising. The most difficult item in this section was the antecedent of a relative clause, "The woman who is wearing a red dress is my sister," with only 383 correct responses. Not knowing the term antecedent probably causes very little harm, and it can always be explained as "the noun described by the relative clause." However, among the wrong responses, an overwhelming 511 chose subject pronoun, followed by 77 choosing possessive pronoun and 62 choosing object pronoun, raising concerns about the participants' understanding of pronouns. There were two pronoun items in this section, he and his in "He has finished his work." Even though the subject pronoun he was the easiest item in this section, close to 150 participants answered incorrectly, with 53 choosing I don't know and 41 choosing antecedent. The possessive pronoun his, however, was the third most difficult item in this section. While only 706 recognized it correctly, 243 chose object pronoun and 95 chose subject pronoun. Because there are only a limited number of pronouns, memorizing the terms and forms would be worth the effort, given the confusion and errors caused without that knowledge.
The second most difficult item in this section was watching in "I like watching movies" as a gerund, and as expected, the most common wrong response was present participle. The opposite error, choosing gerund for the present participle sleeping in the sentence "The baby is sleeping," made that item the fourth most difficult in this section. The fifth was finished in "He has finished his work." A verb with identical past and past participle forms was chosen intentionally, and it was indeed confused with the past form. In the opposite case, slept in "The baby slept all night" as a past form of the verb, 275 participants chose past participle. The confusion was also tested with a verb that has different past and past participle forms, written in "The book was written by a famous writer," but this was the second least difficult item in this section. The last set of items in this section was the word who as a relative pronoun and as an interrogative in "The woman who is wearing a red dress is my sister" and "Who is that woman?" respectively. As expected, those two terms were mistaken for each other, with 84 responding interrogative for the relative pronoun and 71 doing the opposite.

Grammaticality Judgment Section
This last section of 10 items was given only to participants from University B (n = 524). As expected, the 10 grammaticality judgment items had higher item difficulties (Table 6). The most difficult item was an error in a participle construction sentence, "Look out the window of a bus, Keiko saw a car accident." However, there were 79 erroneous responses choosing "the verb should be a gerund," which would produce the same form as the correct response, "the verb should be a present participle form." The same can be said for the fifth most difficult item, "Tom is going to have his car fix." There were 32 responses choosing to change the verb to its past form, which is the same as its past participle form.
The second most difficult item in this section was an item correcting a verb tense in a subjunctive sentence, "If you have a lot of money, what would you buy?" with an item difficulty of 2.43 logits. There was also one correct subjunctive sentence in this section, but many participants wanted to change the tense of the verb, making its item difficulty 1.49 logits. This suggests that even though a subjunctive sentence with if was easy to recognize, with an item difficulty of −1.89 logits (Table 4), the correct structures of subjunctive sentences may not be well understood by the participants of this study. Something similar can be said for relative pronouns, which were easy to recognize, with an item difficulty of −1.07 logits in the previous section (Table 5). The erroneous responses for the three relative clause items in this section suggest that recognizing relative pronouns does not guarantee understanding how they are used. The causative sentences were found to be difficult to identify and were confused with perfect sentences in the previous section (Table 4). In this section, the correct causative sentence "I will have my staff call you by tomorrow night" was the third most difficult item, and again, it was confused with a perfect sentence, with 78 participants wanting to change the verb to its past participle form.

Conclusion and Implications
This research addressed two questions: Is there a correlation between English proficiency test scores and metalinguistic knowledge among low proficiency students, and what metalinguistic features can low proficiency students typically recognize? The results found significant correlations between students' proficiency test scores and metalinguistic knowledge, supporting the findings of other recent studies by Roehr (2008) and Elder (2009) of correlations between metalinguistic knowledge and proficiency test results in more advanced learners. The strongest correlations with both the TOEIC Bridge® test and the VELC test were with their reading sections, also supporting previous findings. The participants in this study had difficulty identifying basic parts of speech and parts of sentences, and showed confusion with similar sentence structures and other grammar terms, suggesting that many of them lack the metalinguistic knowledge needed to understand explanations by teachers and textbooks that use similar metalanguage. Also, the results of the grammaticality judgment items suggest that knowing the metalanguage does not equal full understanding of the grammatical structures. This means that teaching metalanguage for its own sake is of little use.
With that in mind, an important point to consider is which metalanguage should be taught and used and which should be avoided. With the spread of communicative language teaching and the controversy over the effectiveness of explicit instruction in the past, the use and knowledge of metalanguage have "suffered increasing marginalization and retreated from their former centrality" (Hu, 2011, p. 63). However, as described in the previous sections, recent studies have shown correlations between learners' metalinguistic knowledge and their general language proficiency. Hu (2011), finding strong correlations (r = .72) between Chinese English as a Second Language (ESL) learners' knowledge of metalanguage and their ability to verbally describe English grammar rules, suggests that "the availability of a working metalanguage was an aid to, rather than a distracting influence on, the participants' internalization of explicit grammar" (p. 73). When a student prepares for an important proficiency test such as TOEIC®, which has a grammar component, how can we explain why one of the answer choices is correct without the use of some metalanguage? How can we guide our students to write or speak more accurately? Is it possible to avoid using metalanguage completely and still give clear and precise explanations? Using metalanguage can also make it possible to generalize grammar rules and allow the learner to apply the same rule in different sentences with different words, as Hu (2010) describes:
. . . the explanatory precision with which a metalinguistic generalization can be made and the efficient delimitation of the contexts to which the generalization applies. Metalingual terms that are appropriately used can pre-empt both under- and overgeneralization of the rules in question. (p. 181)
There are, however, cases in which the use of metalanguage can confuse more than assist. As several examples in the results of the current metalanguage test showed, misunderstanding and overgeneralization of grammar rules can result from giving the rules without sufficient explanations and examples. Also, some of the technical metalanguage, such as antecedent, can easily be replaced with simpler terms. In addition, the terms past participle and present participle could do more harm than good, as "present" participles are not always used in present sentences and "past" participles are not always used in past sentences. In the current study, for example, 199 participants thought "I have finished my lunch" was a past perfect sentence, most likely because of the past participle finished. For such terms that have confusing names but are necessary in instruction, less confusing names that can still be commonly understood by both teachers and learners may have to be established in class.
Note (Table 6). L = I don't know; G = The verb should be a present participle form; A = The sentence is grammatically correct; I = The verb should be a gerund; H = The verb should be a past participle form; D = The verb should be in its past form; B = The adjective should be an adverb; E = The verb should have a third-person -s; J = The relative pronoun should be which; F = The verb should be will + base form (future); K = The relative pronoun should be whose.
The findings in this study can support the importance of metalanguage in second language acquisition, which has been regaining attention (Elder, 2009; Elder & Manwaring, 2004; Hu, 2010, 2011; Iida et al., 2005; Renou, 2001; Roehr, 2008) in a field in which explicit grammar teaching has somehow been sidelined in the shadow of communicative language teaching. Recognizing the importance of metalanguage and explicit knowledge does not mean advocating the teaching of metalanguage for its own sake or a return to a teacher-centered grammar translation methodology. How much or how little metalanguage should be used would vary in different situations, but what is important is that the metalanguage used is understood by everyone. Investigating the metalinguistic knowledge of undergraduates in Hong Kong, Berry (1997) found wide differences between learners in their knowledge, and wide discrepancies between learners' knowledge and their English teachers' expectations, which overestimated learners' actual knowledge. Along with teaching learners the necessary terminology, he emphasizes the importance of giving awareness training to teachers so that they realize that their classroom language could be creating problems in their classrooms. Having access to metalanguage "may help sharpen understanding of linguistic constructs" (Ellis, 2004, p. 240) by allowing teachers and learners to talk about the target language with more precision when explaining, clarifying, practicing, using, and reflecting on the use of the language in different types of teaching and learning styles.