From a simple to a complex aspectual system: Feature reassembly in L2 acquisition of Chinese imperfective markers by English speakers

This article reports on an empirical study on the acquisition of Chinese imperfective markers (zai, -zheP and -zheR) by English-speaking learners at three proficiency levels. Compared to English, Chinese has a richer imperfective aspect in terms of markers (forms) and features (meanings). Results are presented from a grammaticality judgment task, a sentence–picture matching task and a sentence completeness judgment task. We find that advanced learners are successful in reassembling additional semantic features (e.g. the [+durative] feature of zai and the [+atelic] feature of -zheP) when the first language (L1) and second language (L2) functional categories to which the to-be-added features belong are the same. However, advanced learners have problems in differentiating between the interpretations of the progressive zai and the resultant-stative -zheR, and are not sensitive to the incompleteness effect of -zheP, which indicates that discarding L1-transferred features is arduous for learners. Our findings, in general, support the predictions of the Feature Reassembly Hypothesis (Lardiere, 2009). In addition, there is some evidence obtained for L1 influence, which persists at an advanced stage.


I Introduction
Minimalist syntactic approaches (e.g. Chomsky, 1995) have made a principled distinction between functional and lexical categories. Functional categories and their feature sets have been proposed as the locus of all cross-linguistic differences (Borer, 1984). Learning a second language involves learning the new feature bundles in which the various formal features are mapped onto the target language functional morphology (Slabakova, 2009). Formal linguistic approaches to research in second language acquisition (SLA) have focused mainly on the acquisition of features bundled onto functional categories. Form (lexical item)-function (feature) relationships in the first language (L1) and second language (L2) are not always equal. Learning how features are assembled or reassembled into lexical items is something that L2 learners must engage in when restructuring their grammars. The Feature Reassembly Hypothesis (FRH; Lardiere, 2008, 2009) stresses the reconfiguration of features that exist both in the L1 and L2.
English is argued to be a fully-tensed language but formal markers of aspect in English are not predominant in the verb (Brinton, 1988). There is an aspectual opposition between progressive (verb to be and verbal form in -ing) and non-progressive in English, which pervades the whole of the non-stative verbal system (Comrie, 1976). Although Chinese does not have dedicated inflection to express tense, number, gender, or case, it employs morphemes to mark aspect (Klein et al., 2000: 723). Aspect markers are the only kind of morphology-like devices in Chinese (Gu, 1995) and their functions are argued to be complex (e.g. Li and Thompson, 1989; Lin, 2002, 2003; Smith, 1997; Xiao and McEnery, 2004). Aiming to explore detailed mechanisms in the feature reassembly process and learning difficulties that they raise, the present study focuses on the L2 acquisition of Mandarin imperfective markers (zai, -zheP and -zheR) and their features by English-speaking learners. The organization of this article is as follows. We first compare the imperfective marking systems of English and Chinese in Section II, introduce relevant L2 theories and previous acquisition research on Chinese imperfectives in Sections III and IV, respectively, and then report our experimental study as the main body of the article.

II Imperfective marking in English and Chinese
Aspect is a grammatical category that presents a situation from a particular viewpoint and depicts how the situation unfolds over time (Klein, 2009). In more recent research, linguists treat aspect as an interactive system and distinguish grammatical aspect (or viewpoint aspect) and lexical aspect (also known as situation aspect, inherent aspect or Aktionsart/Aktionsarten) (Slabakova, 2005; Smith, 1997), both of which contribute to the aspectual meaning of a sentence.
The most basic grammatical/viewpoint aspect opposition is perfective and imperfective (Comrie, 1976; Smith, 1997). The perfective presents a situation as a single bounded whole, while the imperfective presents it as unbounded, that is, as an open situation. We adopt Smith's (1997) analysis that imperfectives allow inferences about beginnings and endings: imperfectives focus on internal phases of an event or on present external stages (e.g. the preliminary or the resultant stages) of a situation. Specific imperfective meanings will be discussed in detail later. On the other hand, lexical/situation aspect concerns the classification of eventualities, regarding their temporal properties. Vendler's (1957) four-way classification (i.e. states, activities, accomplishments and achievements) represents an early attempt to categorize lexical aspect and is followed by the majority of research concerning tense-aspect systems. Based on Vendler's (1957) model, Smith (1997) uses dynamism, telicity and duration to distinguish five basic situation types (with semelfactive as a newly added class). Characteristics of the situation types can therefore be presented as a cluster of conceptual temporal features (i.e. [±dynamic], [±telic] and [±durative]).
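The feature clusters just described can be laid out explicitly. The following is a minimal illustrative sketch, not part of the study's materials: the names `Situation`, `SITUATION_TYPES` and `feature_string` are hypothetical, and the feature values follow the standard presentation of Smith's (1997) five situation types.

```python
from typing import NamedTuple


class Situation(NamedTuple):
    """Cluster of temporal features for one situation type."""
    dynamic: bool
    telic: bool
    durative: bool


# Five situation types, following Smith's (1997) classification.
SITUATION_TYPES = {
    "state": Situation(dynamic=False, telic=False, durative=True),
    "activity": Situation(dynamic=True, telic=False, durative=True),
    "accomplishment": Situation(dynamic=True, telic=True, durative=True),
    "achievement": Situation(dynamic=True, telic=True, durative=False),
    "semelfactive": Situation(dynamic=True, telic=False, durative=False),
}


def feature_string(name: str) -> str:
    """Render a situation type as a bracketed feature cluster."""
    s = SITUATION_TYPES[name]

    def sign(value: bool) -> str:
        return "+" if value else "-"

    return (f"[{sign(s.dynamic)}dynamic] "
            f"[{sign(s.telic)}telic] "
            f"[{sign(s.durative)}durative]")
```

Each of the five types receives a distinct combination of the three features, which is what allows the features to serve as a classification.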
In terms of the syntactic status, aspect is normally treated as a functional category and has a maximal projection itself, which is AspP (Aspect Phrase). Recent studies on aspect propose that information about lexical and grammatical aspect is located in different AspPs: a vP-internal or inner aspect projection for lexical aspect (e.g. Borer, 2005; MacDonald, 2011; Travis, 1991, 2010) and a vP-external or outer aspect projection for grammatical aspect (e.g. Nossalik, 2010; Slabakova, 2001). Outer aspect has morphological manifestations (such as aspect markers) that inner aspect usually does not (MacDonald, 2006). The two AspPs encode different aspectual features: the semantic features [±telic] are normally checked at the inner AspP and aspectual features such as [±perfective] are checked through overt tense/aspect morphology at the outer AspP (Salaberry, 2008).
Aspect has been widely discussed in English grammar. Three main types of English imperfective are explored in the literature: habitual, 1 progressive and resultant-stative 2 (Comrie, 1976;Smith, 1997). The latter two meanings are expressed by the same form: the auxiliary be and the morpheme -ing, as shown in Table 1. According to Smith (1997), progressive and resultant-stative focus on different intervals of a situation: the former presents an interval of an event that includes neither its initial nor final endpoint, and that precedes the final endpoint; whereas the latter presents an interval of a positional or locative that follows the final endpoint of a change of state. The two viewpoints differ also in dynamism. For example, (A1) in Table 1 means that the event of eating an apple is ongoing, which is dynamic. However, (A2) does not convey that he is in the process of assuming a seated position, but presents a resultant interval: he is already seated, which is non-dynamic and does not consist of successive stages.
Since the form of grammatical aspect remains the same in English imperfectives, lexical aspect plays a decisive role in the resulting aspectual interpretation. The English imperfective marker normally co-occurs with non-stative events. Activities and accomplishments, which refer to a dynamic action, can work with be + -ing to express that an event is in progress, as (1a) below illustrates. However, if the verb indicates a resultative state after a telic event, such as wear in (1b), which specifies the state of having clothes already on the body as the result of putting on, the whole imperfective phrase receives a resultant-stative reading. Verbs that involve a resultant state always have a positional or locative property, such as sit and wear. In addition, when the imperfective marker works with an achievement verb, such as reach in (1c), the verb constellation expresses a predictive situation happening in the future, which is a form of Futurate (Dowty, 1977).
Compared to English, the Chinese imperfective marking system is more complex, consisting of two forms: the preverbal zai 3 and the post-verbal -zhe. As a morpheme, zai is syntactically freer than -zhe: the preverbal zai can be used before a small clause that even includes -zhe, whereas -zhe can only follow a bare verb and function like a suffix. The properties of zai and -zhe are summarized in Table 2.
The preverbal zai acts as a progressive marker, indicating the ongoingness of an event, as illustrated in (A1). Unlike its English counterpart, zai is restricted to durative events and hence not compatible with achievements. As in (A2), zai cannot present the preliminary stages of the event of reaching the mountaintop as be + -ing does in (1c), which shows that zai is sensitive to the durativity parameter. 4 Moreover, zai is able to make an unbounded activity stand alone as a complete sentence as in (A3), whereas the corresponding -zhe sentence (B1) sounds incomplete. The subtle difference between zai and -zhe in the so-called incompleteness effect is overlooked in the majority of theoretical work on Chinese aspect markers, because it does not directly concern grammaticality; for example, both (A3) and (B1) are licit in Huang et al.'s (2009) study. Tsai (2008) is the only influential study so far that systematically examines the phenomenon whereby some sentences with an aspect marker are judged to be incomplete in isolation and relates the incompleteness effects to a syntactic process of 'tense anchoring'. Tsai proposes that, although Chinese verbs are not inflected for overt morphological tense markers, Chinese has weak syntactic tenses, which cannot manifest as specific tense features (e.g. [+past] and [+present]) but can have a sentence tense anchored. Tense anchoring is a process of spelling out an event variable in morphosyntactic terms. One way to achieve 'tense anchoring' is through Asp-to-T raising: the progressive zai is located at the head of the outer AspP and can raise to T to instantiate a lexical tense operator. 5 Our views on Chinese syntactic tenses and the function of zai are in line with Tsai (2008). We argue that the tense anchoring function can be analysed as a semantic feature [+T], which is the motivation of the Asp-to-T movement of zai. It should be checked at the head T position to license a Chinese sentence regarding tense.
In terms of the post-verbal -zhe, we agree with researchers like Zhu (1982), Chen (1999), and Tsai (2008), among others, who argue that imperfective -zhe has two different viewpoints: progressive and resultant-stative. The aspect meaning depends on the verb phrase that works with -zhe.
The progressive -zhe (abbreviated as -zheP) and zai differ in the tense-anchoring function and selection requirements on the predicate. First, sentences composed of an activity and -zheP sound incomplete, and normally indicate a background action in a complex sentence or require the co-occurrence of a sentence-final particle (e.g. the particle ne for a [noteworthy] meaning). For instance, chang zhe ge in (B1) can function as a background clause in a sentence like Ta (chang zhe ge) gan-wan le huor 'He finished this work while singing', or be accompanied by ne in the free-standing sentence Ta (chang zhe ge) ne 'He is singing'. This is because -zheP does not have a [+T] feature like zai does, which can trigger an Asp-to-T movement and implement tense anchoring in the syntactic sense. Secondly, when interacting with the predicate, zai is sensitive to durativity but -zheP to telicity. Telic predicates (i.e. accomplishments and achievements) are not compatible with -zheP, as illustrated in (B2) and (B3).
The resultant-stative -zhe (abbreviated as -zheR) appears only with verbs that express a certain degree of attachment, such as Mixed Telic-Stative verbs (MTS verbs, e.g. chuan/dai 'to wear/ to put on', na 'to fetch/to hold') and positional verbs (e.g. zuo 'to sit'). Mixed Telic-Stative verbs (Li, 1990) are a special type of Chinese verbs that encode the whole process of a telic action and the state resulting from that process. Progressive and resultant-stative markers focus on different intervals of the event. For example, the verb chuan 'to wear/ to put on' can denote both the dynamic action and the resultant state of putting on. In (C1), -zheR presents the state resulting from the process of putting on the coat, and the verb chuan here corresponds to the English verb wear. The aspectual difference between (C1) and (A1) results from the use of different imperfective markers. Moreover, similar to -zheP, -zheR does not have a [+T] feature and sentences such as (C1) sound incomplete. There are several types of strategy to save predicates with -zheR from the incompleteness effects: adverbial quantification, subordination, and locative-existential construals (Tsai, 2008).
It is controversial whether -zheR and -zheP are one marker with two meanings or two separate markers. We argue that the two -zhes should be treated as two different markers, due to the fact that some sentences with -zhe can be ambiguous between progressive and resultant-stative, as illustrated in (2). When working with the MTS verb zhuang 'to load/to be loaded with', -zhe can indicate either an internal interval of the specific action of moving or an external interval of the final state.
(2) # Ta zhuang zhe nei xie qian.
He load ASP that CL money
'He was putting the money (into his pocket/a bag). / He carried the money.'
As for the syntax, questions arise about the syntactic structure of Chinese aspect phrases and the nature of the interaction between the aspect marker and the predicate. Due to the syntactic and semantic differences between the viewpoint forms, a double-aspect (e.g. Huang et al., 2009) or a multiple-aspect structure (e.g. Tsai, 2008) has been proposed to account for the derivation of Chinese imperfective sentences. We agree with the studies that propose each aspect marker can have its own maximal projection, namely AspP (Huang et al., 2009; Tsai, 2008). There is a consensus that the preverbal zai directly fits into a higher Asp head position and its maximal projection (AspP1) is above vP, as shown in Figure 1. However, it is controversial whether the lower AspP that accommodates -zhe is above vP as well and whether -zheR and -zheP are generated in different AspPs. We argue that -zheR and -zheP are located in the same layer, 6 and adopt the analysis of Huang et al. (2009), which proposes the aspectual projection of -zheR and -zheP (AspP2) is between AspP1 and vP.
As we discussed above, the Chinese imperfective markers have some selectional restrictions on the predicate they can co-occur with and have some temporal properties as well. A central question is what set of values the markers can have. It is commonly argued that the situation aspect (static/dynamic; telic/atelic; durative/punctual) is expressed by the predicate and the viewpoint aspect (imperfective/perfective) is indicated in aspect markers. However, unlike the English marker be + -ing, the Chinese zai is restricted to durative verb phrases and -zheP is incompatible with telic events. Aspectual coercion is not triggered by the combination of zai/-zheP and an achievement. It is reasonable to assume that zai is inherently specified as [+durative] and -zheP as [+atelic]. In other words, the Chinese grammar requires that telicity and durativity be assigned to the functional heads by the imperfective morphology. We adopt the outer-and-inner aspect analysis proposed by Travis (2010): while situation aspect is encoded by a vP-internal or inner aspect projection, viewpoint aspect is encoded by a vP-external or outer aspect projection. As presented in Figure 1, AspP1 and AspP2 are outer AspPs, whereas AspP3 is an inner AspP, for lexical aspect only. The outer AspPs that accommodate the markers scope over the inner AspP. If the semantic feature attached to Asp1/2 and that to Asp3 are of the opposite value (e.g. [+telic] vs. [-telic]), the sentence will be ill-formed due to a semantic clash. Moreover, we agree with Tsai (2008) that Chinese is a language without tense morphology but has weak syntactic tense: there is a TP above AspPs. We tentatively argue that zai has a tense-anchoring [+T] feature and can carry out an Asp-to-T movement to get the feature checked at the head T position.
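The semantic clash described above can be illustrated with a small sketch. It is a simplification under stated assumptions rather than a formal implementation of the analysis: the dictionaries `PREDICATE` and `MARKER` and the function `compatible` are hypothetical names, only three predicate types are modelled, and be + -ing is assumed to carry no telicity or durativity specification.

```python
# Situation-aspect features contributed by the predicate (inner aspect).
PREDICATE = {
    "activity": {"telic": False, "durative": True},
    "accomplishment": {"telic": True, "durative": True},
    "achievement": {"telic": True, "durative": False},
}

# Features assumed to be carried by each marker on an outer Asp head.
MARKER = {
    "zai": {"durative": True},   # [+durative]
    "-zheP": {"telic": False},   # [+atelic], encoded here as [-telic]
    "be + -ing": {},             # assumption: no telicity/durativity restriction
}


def compatible(marker: str, predicate: str) -> bool:
    """Return False if any marker feature has the opposite value on the predicate."""
    pred = PREDICATE[predicate]
    return all(pred[feature] == value
               for feature, value in MARKER[marker].items())
```

Under these assumptions, `compatible("zai", "achievement")` and `compatible("-zheP", "accomplishment")` are both false, while `compatible("zai", "accomplishment")` is true, mirroring the contrasts described in the text.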
Looking at imperfectives in English and Chinese at a morpholexical level, we summarize the aspect markers and their features in Figure 2. The Chinese grammatical aspect system is richer than that of English: diverse aspectual meanings are expressed by more than one form. Due to the difference in how temporal-aspectual features are assembled between the two languages, English-speaking learners of Chinese need to reconfigure the feature sets of the Chinese markers in their L2, which is dubbed a feature reassembly process in Section III.
A large body of research on L1 transfer in L2 aspect has been devoted to discussing in what way and to what extent the learner's L1 affects the learning process. Some recent studies (e.g. Domínguez et al., 2017; Gabriele, 2009; Gabriele and McClure, 2011; Roberts and Liszka, 2013) have shown that the aspectual properties and the way of aspectual coding in the L1 transferred into the L2 influence L2 acquisition.
Transferred L1 properties may bring difficulties with the acquisition of aspectual interpretations. For instance, Gabriele and McClure (2011) investigate whether advanced L2 learners can extend beyond the grammatical properties of the L1 by examining the acquisition of the semantics of the imperfective marker te-iru in Japanese by native speakers of Mandarin Chinese. L1 effects have been confirmed as the results of an interpretation task suggest that Chinese learners cannot extend beyond the properties of the L1.
Learning difficulties can also arise from grammatical differences between the L1 and L2. L2 learners are influenced by how aspectual distinctions (imperfective and perfective) are expressed or aspectual meanings are instantiated in their L1. Roberts and Liszka (2013) report a self-paced reading study designed to investigate whether or not advanced French and German learners of English as an L2 are sensitive to tense/aspect mismatches between a fronted temporal adverbial and the inflected verb that follows (e.g. *Last week, James has gone swimming every day) in their on-line comprehension. Aspect is grammaticalized in both French and English but not in German. They hypothesize that the difference in the L1 aspectual marking may impact L2 processing. The online data show that only the French L2 learners were sensitive to the mismatch conditions, whereas the German L2 learners did not show a processing cost at all. They therefore argue that the performance differences between the L2 groups can be explained by influences from the learners' L1: namely, only those whose L1 has grammaticalized aspect (French) were sensitive to the tense/aspect violations online.
Building on the discussion of L1 properties and how they are instantiated in the L1 and L2, some SLA researchers (Choi and Lardiere, 2006; Lardiere, 2008, 2009; Slabakova, 2008) further point out that some learnability challenges are brought by the complexity of the mapping between form and meaning. Rooted in Chomsky's (1995, 2005) Minimalist framework, Lardiere (2009) extends Schwartz and Sprouse's (1994, 1996) Full Transfer Full Access (FTFA) model and proposes a Feature Reassembly Hypothesis (abbreviated as FRH). She argues that one of the greatest challenges for L2 learners is to assemble the right combination of features into the lexical items for a given language. More importantly, feature reassembly is especially difficult in cases where the target features do exist in the L1 but are configured differently. In such cases, L2 learners have to disassociate the feature matrices that have been selected and assembled in their L1 and reassemble them in a way that matches the L2 properties. According to Lardiere (2009), the feature reassembly process can be decomposed into two steps 7:
Step 1: Feature detection and mapping. L2 learners initially look for morpholexical correspondences in the L2 to those in their L1, presumably on the basis of semantic meanings or grammatical functions and then map the feature set of the perceived corresponding L1 item onto the L2 target item.
Step 2: Feature disassociation and reassembly. L2 learners fine-tune the target feature set by adding or deleting relevant features on the basis of the L2 input.
The FRH has been tested by L2 studies on tense and aspect. For example, Domínguez et al. (2017) investigate the acquisition of the Spanish imperfect by English-speaking learners at three different proficiency levels. To converge on the target grammar, English speakers need to dissociate the 'continuous' and the 'habitual' from the Preterit in their L2 Spanish since these meanings are expressed with forms that also convey perfectivity in English. The results show that the learners have problems with the 'continuous' meaning in all tasks, which signals a mapping problem of aspect-related features present in both English and Spanish onto a new form (the Imperfect) and supports the prediction of the FRH.

IV Previous L2 studies on Chinese imperfectives and the research questions
Studies concerning L2 Chinese aspectual systems have mainly focused on whether learners' developmental patterns in aspectual marking support Andersen and Shirai's (1996) Aspect Hypothesis (AH), which predicts that the early use of verbal inflection in first and second language acquisition is strongly influenced by situation aspect conveyed by the verb phrases. L2 researchers have found some developmental patterns in L2 Chinese that confirm some predictions of the AH (see Jin, 2002, 2009; Jin and Hendriks, 2005; Tong, 2012; Wang, 2012; Wen, 1997; Yang et al., 1999), but there are still some results that do not follow the general course predicted by the AH (e.g. Jin, 2009).
Those studies have shed light on a 'universal' aspectual marking pattern and revealed the interaction between aspect markers and situation types in L2 Chinese. However, this approach is quite descriptive and does not have strong explanatory power, especially when the results are at odds with AH predictions. Some studies have pointed out that complex meanings and functions of aspect markers cause problems for L2 learners (e.g. Wen, 1995), but have not clarified how 'complex' the meanings, the forms and the mapping/remapping mechanisms are, and in what way these factors affect learners' behaviours. Moreover, from the perspective of aspect markers, research on both -zhe and zai remains scarce (Jin, 2009;Jin and Hendriks, 2005;Wang, 2012). Hence, in the present study, we do not intend to follow the trend of testing the AH, but turn instead to focusing on the causes of acquisition difficulties and the questions that remain unsolved. To the best of our knowledge, none of the previous studies has systematically investigated the L2 acquisition of Chinese imperfective marking at the morpholexical level in a feature reassembly approach.
As discussed in Section I, the complexity in form-meaning mapping between English and Chinese aspectual properties requires English speakers to reconfigure existing features onto new lexical items in the L2 Chinese. Following the path of the FTFA and the FRH, we assume that the initial state in L2 acquisition is the final state of L1 acquisition and the entirety of the L1 grammar (including features and feature configurations) is transferred to L2. As soon as English speakers have detected that zai, -zheP and -zheR are the morpholexical correspondences to the English be + -ing on the basis of semantic meanings and grammatical functions, they would map the features of be + -ing onto the target markers zai, -zheP and -zheR.
To acquire the progressive marker zai, in the next step, English speakers need to dissociate the [+resultant-stative] feature from zai, as illustrated in Figure 3. They are also expected to reassemble a [+durative] feature from English lexical non-achievement verbs into zai. We therefore hypothesize that English speakers will mistakenly allow the association of zai with an achievement verb at the initial stage and a resultant-stative interpretation of zai sentences. Since the copular verb be in be + -ing is always tense-marked, English speakers may plausibly transfer a tense feature (presumably the present tense) from a tense marker into zai at the initial stage.
The -zheR and the -zheP share the same form and the resulting aspectual meaning of a V-zhe cluster depends on the verb. English natives need to acquire that when -zhe is associated with a verb carrying a certain meaning of attachment, the sentence receives a resultant-stative reading. English speakers are also expected to reconfigure the features of -zheR and -zheP in their L2, due to the difference between the L1 and L2 feature sets. Regarding the -zheR, English speakers need to relinquish the L1-transferred [+progressive] feature (see Figure 4). We predict that beginners are not able to reject a progressive reading of -zheR sentences. For the progressive -zheP, English natives are presumed to discard the L1-transferred [+resultant-stative] feature and further to reassemble a [+atelic] feature onto -zheP (see Figure 5). English-speaking learners may not know the telicity constraint and therefore mistakenly allow the association of -zheP with accomplishments or achievements at the initial stage, since sentences like 'he is building a house' and 'he is winning' are acceptable in English. Furthermore, unlike zai, which can have an unbounded sentence tense-anchored, -zheP/R lacks the [+T] feature. English speakers are predicted not to be sensitive to the incompleteness effects of -zheP sentences at early stages.
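The reassembly tasks predicted above can be summarized as set operations over feature bundles. This is an illustrative sketch only: the feature sets are read off the prose description rather than the figures, the names `L1_BE_ING`, `TARGET` and `reassembly_plan` are hypothetical, and the treatment of [+T] is a simplification, since learners may initially carry over a tense feature from be rather than add [+T] from scratch.

```python
# Step 1: the feature set of be + -ing that is assumed to be mapped
# onto each perceived Chinese counterpart.
L1_BE_ING = frozenset({"+progressive", "+resultant-stative"})

# Target feature sets of the Chinese markers, as described in the text.
TARGET = {
    "zai": frozenset({"+progressive", "+durative", "+T"}),
    "-zheP": frozenset({"+progressive", "+atelic"}),
    "-zheR": frozenset({"+resultant-stative"}),
}


def reassembly_plan(marker: str, l1: frozenset = L1_BE_ING) -> dict:
    """Step 2: list the features to discard and to add for one marker."""
    target = TARGET[marker]
    return {
        "discard": sorted(l1 - target),  # transferred but absent in the L2 marker
        "add": sorted(target - l1),      # required by the L2 marker but not transferred
    }
```

For example, `reassembly_plan("-zheP")` yields one feature to discard ([+resultant-stative]) and one to add ([+atelic]), whereas `reassembly_plan("-zheR")` involves discarding [+progressive] and adding nothing.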
Based on the analysis and predictions above, four main research questions are asked in the present study: [...] feature into zai and be sensitive to the incompleteness effect of -zheP sentences?

V Participants and methods
To test the research questions listed, we examine data from 90 participants through three different tasks: an off-line acceptability judgment task (AJT), a sentence-picture matching task (SPMT) and an on-line sentence completeness judgment task (SCJT).

Participants
The participants consisted of 25 Chinese native speakers (NS) and 65 L2 English-speaking learners of Chinese recruited from universities in the UK and China. On the basis of their performance in a 40-blank Chinese cloze test adopted from Yuan and Dugarova (2012), the learners were classified into three Chinese proficiency groups: 21 beginners, 23 intermediate learners and 21 advanced learners. Table 3 presents information on the participants. ANOVA tests conducted on the proficiency scores show that the participating groups were significantly different from each other (F (3, 86) = 327.1, p < .001). Post-hoc pairwise comparisons revealed that there were statistical differences between all possible pairs of groups in terms of the cloze score (p < .05).
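The degrees of freedom reported for the ANOVA follow directly from the group sizes given above. A minimal check, with `GROUP_SIZES` and `anova_df` as hypothetical names, assuming a one-way between-groups design:

```python
# Group sizes as reported in the Participants section.
GROUP_SIZES = {"native": 25, "beginner": 21, "intermediate": 23, "advanced": 21}


def anova_df(group_sizes: dict) -> tuple:
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    k = len(group_sizes)           # number of groups
    n = sum(group_sizes.values())  # total number of participants
    return k - 1, n - k
```

With four groups and 90 participants, this gives the (3, 86) reported in the text.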

Instruments
In order to minimize any possible effects from vocabulary on their behaviours, all the key vocabulary words in the study were selected from the level A (the easiest level) of the National Syllabus of Graded Words and Characters for Chinese Proficiency (Hanban, 2001), sent to the participants in advance and checked at the beginning of the experiment. Before the main tasks, a prerequisite test was conducted to ensure that the basic aspectual distinctions (perfective or imperfective) and syntactic positions of the markers (following or preceding the verb) had been established in the participants' L2 Chinese. All the task items were piloted before the main study. The test order of the three main tasks was: the SCJT → the SPMT → the AJT. The instructions of the tasks were provided in the participant's native language.
a The acceptability judgment task (AJT). A web-based AJT was administered to all participants, which included the four types listed in Table 4, with each type having 4 tokens. Types A-1 and A-2 are for Question 1, to test whether the learner will reject the association of achievements with zai and that with -zheP. Types B-1 and B-2 are for Question 2, investigating whether the learner can differentiate zai from -zheP in telicity. The participant was asked to decide whether the sentence is 'completely unacceptable', 'probably unacceptable', 'probably acceptable' or 'completely acceptable'. There was also an 'I don't know' option.
b The sentence-picture matching task (SPMT). A sentence-picture matching task was designed for Question 3, to explore whether participants can distinguish between different aspectual meanings of the progressive zai and the resultant-stative -zheR when the markers are associated with the Mixed Telic-Stative verbs (e.g. chuan 'to put on/wear'). The participant was presented with two pictures on the computer screen and then listened to a sentence that could be played only once. They were asked to indicate which picture best matched the sentence provided by ticking the answer on an answer sheet. If they could not understand the meaning of the sentence, they could choose 'I don't know'. Two types of sentences (3) were involved (4 tokens for each type). Another eight sentences were added as fillers.
(3) Type C-1: zai + V MTS [+progressive]
Xiao nühai zai chuan xie.
Little girl ASP put on/wear shoe
'The little girl is putting on her shoes.'
[Picture set]
Xiaoli chuan zhe yi jian waitao.
Xiaoli put on/wear ASP one CL coat
'Xiaoli is wearing a coat.'
[Picture set]
c The sentence completeness judgment task (SCJT). A sentence completeness judgment task was designed for Question 4, to test whether the participant can differentiate zai from -zheP in the tense anchoring feature [+T]. The task was conducted on E-Prime 2.0 and used a self-paced reading format. Each sentence was presented word by word, with the participant pressing the SPACE bar to advance. Each test item started with a plus sign '+' and ended with a string of asterisks '***', followed by a question page on which the participant was asked whether the sentence can stop here standing as a complete Chinese sentence. The participant was instructed to make a judgment by pressing the '' key for yes, the '' key for no, and the '?' key for 'I don't know' on the keyboard. The participant's choices and thinking times (the reaction times for both the ending page '***' and the question page) were recorded. The former type of data is the main data and the latter acts as a supporting one. Longer thinking times are thought to reflect the participant's uncertainty and processing difficulties, which may relate to a violation of expectation, the ungrammaticality of the sentence, or a reanalysis process.
It was crucial to control variables in the design of the SCJT. All the predicates of the critical items were activity verbs, which are atelic. Hence the occurrence of the aspect markers is the only factor left that directly influences the sentence completeness. Six tokens were designed for each marker and divided into two different lists based on a Latin square design. Each list contained 6 critical items (3 tokens for each marker), such as (4), and 34 fillers, half of which were 'incomplete' and the other half 'complete'. The lengths of the critical items were 3-4 words (4-5 characters).
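One way such a counterbalanced assignment could be constructed is sketched below. It rests on an assumption that the text does not state explicitly: that the critical sentences are built from six activity predicates, each of which appears with zai in one list and with -zheP in the other. The item numbers and the function `build_lists` are hypothetical.

```python
MARKERS = ("zai", "-zheP")


def build_lists(n_items: int = 6) -> dict:
    """Assign each item to both markers across two lists (2 x 2 Latin square)."""
    lists = {1: [], 2: []}
    for item in range(1, n_items + 1):
        first_half = item <= n_items // 2
        # List 1: first half of the items with zai, second half with -zheP.
        lists[1].append((item, MARKERS[0] if first_half else MARKERS[1]))
        # List 2: the complementary assignment.
        lists[2].append((item, MARKERS[1] if first_half else MARKERS[0]))
    return lists
```

Under this assumption each list contains three tokens per marker, no participant sees the same predicate twice, and every predicate is tested with both markers across lists.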

Results from the acceptability judgment task
In data analyses for the AJT, the 'I don't know' responses were deleted and treated as missing values. The four acceptability ratings were converted into the numerical values 1, 2, 3 and 4, respectively. As 2.5 is at the middle of the scale, mean scores falling between 2 and 3 suggest indeterminacy. Those that reach 3 or above imply acceptance and those lower than 2 are interpreted as rejection. If the mean score of a learner group and that of the native Chinese group fall in the same range (i.e. ⩽2, 2-3 or ⩾3), that learner group's overall performance was considered to be native-like. The scores were analysed using linear mixed effects (LME) models under nlme (Pinheiro et al., 2014) and ANOVA under lmerTest (Kuznetsova et al., 2017), with Group and Marker as fixed effect factors and Subject and Item as random factors.
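The coding scheme just described can be sketched as follows. This is an illustration rather than the analysis script used in the study: the function names are hypothetical, and a mean of exactly 2 is assumed to count as rejection, following the range listing (⩽2, 2-3, ⩾3).

```python
from typing import List, Optional

# Numerical coding of the four acceptability ratings.
RATING_VALUES = {
    "completely unacceptable": 1,
    "probably unacceptable": 2,
    "probably acceptable": 3,
    "completely acceptable": 4,
}


def mean_score(responses: List[str]) -> Optional[float]:
    """Mean rating, with 'I don't know' responses treated as missing values."""
    scores = [RATING_VALUES[r] for r in responses if r != "I don't know"]
    return sum(scores) / len(scores) if scores else None


def interpret(mean: float) -> str:
    """Map a mean score onto the three interpretive ranges."""
    if mean >= 3:
        return "acceptance"
    if mean <= 2:  # assumption: the boundary value 2 counts as rejection
        return "rejection"
    return "indeterminacy"


def native_like(learner_mean: float, native_mean: float) -> bool:
    """A learner group is native-like if both means fall in the same range."""
    return interpret(learner_mean) == interpret(native_mean)
```

On this scheme, a learner group with a mean of 1.8 and a native group with a mean of 1.2 would both fall in the rejection range and so count as patterning alike, even if the two means differ significantly.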
The LME indicated that the main effects of Group and Marker reached significance (Group: F = 43.76, p < .001; Marker: F = 29.69, p < .001) but the interaction effect was not significant (F = 0.99, p = .39). ANOVA results revealed that the groups differed significantly from each other in mean score for both types (Type A-1: F(3, 355) = 24.97, p < .001; Type A-2: F(3, 352) = 27.15, p < .001). As shown in Table 5, similar to the Chinese natives, the intermediate and advanced groups judged both the zai and the -zhe P sentences as unacceptable (mean scores < 2), albeit to a lesser extent (Type A-1: native vs. intermediate: p < .001; native vs. advanced: p = .001; Type A-2: native vs. intermediate: p < .001; native vs. advanced: p = .098). The beginner group was indeterminate on the zai sentences but tended to reject the -zhe P sentences. Paired-sample t-tests were conducted on the data for the two types within each group. The results showed that only the learner groups found the -zhe P sentences significantly more unacceptable than the zai sentences (beginner: t(68) = 2.278, p = .026; intermediate: t(88) = 2.395, p = .019; advanced: t(82) = 2.604, p = .011; native: t(99) = 1.090, p = .278).
The LME showed a main effect of Group (F = 10.63, p < .001), a main effect of Marker (F = 209.33, p < .001) and a significant interaction effect (F = 12.61, p < .001). ANOVA results revealed significant differences between the groups in the -zhe P sentences (Type B-2: F(3, 348) = 14.99, p < .001) but not in the zai sentences (Type B-1: F(3, 350) = 1.41, p = .24). As presented in Table 6, similar to the Chinese natives, the three learner groups accepted the use of zai with accomplishments (mean scores > 3). However, for the association of -zhe P with accomplishments, the advanced learners showed native-like indeterminacy, while the beginners and the intermediate learners tended to accept this type. Paired-sample t-tests showed that all four groups found the -zhe P sentences significantly less acceptable than the zai sentences (ps < .001).

Results from the sentence-picture matching task
In the SPMT, results are first presented in terms of the distribution of the participants' choices. 8 This task offered participants four choices (Picture A, Picture B, both A and B, and 'I don't know'), which represent different interpretations of the target sentence. Participants' choices were classified into three categories: 'Progressive' for choosing the progressive meaning picture, 'Resultant-stative' for the resultant-stative one and 'Both' for both interpretations, and 'I don't know' choices were deleted and treated as missing values. The interpretive choices were presented through mosaic plots under the vcd package (Meyer et al., 2017) in R. Chi-squared nonparametric tests were further used to compare the participating groups' actual frequencies of aspectual choices. Nonsignificant p values in the chi-squared tests are taken as pointing to essentially similar interpretive choices. As Figure 6 illustrates in the case of the zai sentences, in general, the Chinese natives' data had an overwhelmingly high percentage of the correct progressive interpretation, while the L2 groups did not show the same pattern. The beginner group incorrectly allowed the resultant-stative interpretation of the zai sentences at the rate of 71.1% (a combined percentage of 'Resultant-stative' and 'Both') and the percentage of the correct interpretation (progressive) was very low. At both the intermediate and advanced levels, the correct responses accounted for only around half of the total, which shows that indeterminacy on the aspectual meaning of zai persisted throughout the developmental stages. Chi-square results showed that there was a relationship between Group and Choice (χ² = 100.52, df = 6, p < .001).
Moreover, the patterns of the three learner groups were divergent from that of the Chinese natives (ps < .001).
Regarding the interpretation of -zhe R sentences, chi-square results indicated that there was a relationship between Group and Choice (χ² = 20.487, df = 6, p = .002). As shown in Figure 7, the correct choice (resultant-stative) accounted for a large proportion of the Chinese natives' choices. The three learner groups showed a similar pattern: from the beginner stage to the advanced stage, the percentages of the correct interpretation were above 60%, which indicates that the L2 learners tended to associate -zhe R with a resultant-stative reading from the beginning. Chi-squared tests also showed that the learners' interpretations of -zhe R were significantly different from those of the natives (native vs. [...]).
An individual analysis conducted for the SPMT found an acquisition asymmetry between zai and -zhe R. At all proficiency levels, more L2 learners were able to acquire the interpretation of -zhe R than that of zai: only one intermediate learner and two advanced learners consistently (on 4 out of the 4 tokens) chose the correct interpretation on both the zai and -zhe R sentences, and thus were considered able to differentiate the aspectual meanings of zai and -zhe R.
Figure 6. Interpretations of the progressive zai + V MTS sentences in the SPMT by group.
Note. The area of each box indicates its proportion of the whole. Boxes with dotted outlines mark cells where counts are lower than expected, and boxes with solid outlines mark cells where counts are higher than expected. The Pearson residuals plot to the right uses saturation to indicate inferences that can be made from the data. Individual cells that violate the assumption of independence are more deeply colored. Residuals above 4 or below −4 indicate a difference large enough to reject the null hypothesis that Group and Choice are independent, whereas those between 2 and 4 (or −2 and −4) do not indicate a statistical rejection of the null hypothesis (Zeileis et al., 2007).
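The chi-squared test of independence between Group and Choice, together with the Pearson residuals mentioned in the note to Figure 6, can be sketched as follows. This is a plain-Python illustration rather than the vcd-based analysis reported in the study. The counts are hypothetical placeholders, not the study's data, and the function name `chi_square_independence` is introduced here only for illustration.

```python
def chi_square_independence(table):
    """Return (chi2, df, residuals) for a contingency table given as a list of rows.

    Each Pearson residual is (observed - expected) / sqrt(expected), and the
    chi-squared statistic is the sum of the squared residuals.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            r = (observed - expected) / expected ** 0.5
            res_row.append(r)
            chi2 += r * r
        residuals.append(res_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, residuals


# Hypothetical counts: rows = beginner, intermediate, advanced, native;
# columns = 'Progressive', 'Resultant-stative', 'Both'.
counts = [
    [10, 20, 10],
    [20, 12, 8],
    [22, 10, 8],
    [36, 2, 2],
]
chi2, df, residuals = chi_square_independence(counts)
print(round(chi2, 2), df)
```

A table with four groups and three choice categories gives df = (4 − 1)(3 − 1) = 6, matching the degrees of freedom reported for the Group × Choice tests. Obtaining a p value would additionally require the chi-squared distribution, which is omitted here.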

Results from the sentence completeness judgment task
For the SCJT, results will first be presented in terms of the participants' judgments on sentence completeness. This task offered participants two choices, which were labeled 'Complete' and 'Incomplete'. The judgment choices were assessed through mosaic plots under the vcd package (Meyer et al., 2017) in R. Chi-squared nonparametric tests were then used to compare learners' and natives' patterns in each condition and compare results of the two types within each group. Non-significant p values in the chi-squared tests will be taken to point to essentially similar choices. As supporting evidence, the participants' thinking times were also compared. Linear mixed effects models using nlme were run on the whole dataset of thinking times (in milliseconds), with Group and Marker as fixed effect factors and Subject and Item as random factors.
As shown in Figure 8, only the Chinese natives could correctly judge -zhe P sentences as incomplete and zai sentences as complete. Within-group chi-square tests also showed that they made essentially different judgments on the completeness of the two types (χ²(1, N = 150) = 91.68, p < .001, Cramér's V = .782). However, the learner groups behaved differently from the natives, especially on -zhe P. The beginners tended to judge the -zhe P sentences as complete, while both the intermediate and the advanced groups showed indeterminacy (beginner vs. intermediate: p = .001; beginner vs. advanced: p = .001; intermediate vs. advanced: p = .768). The learner groups tended to judge the zai sentences as complete, albeit to different extents. More specifically, the beginners and the intermediate learners behaved similarly, with around one third of responses being 'Incomplete' (beginner vs. intermediate: p = .871), and the percentage of the correct choice increased slightly at the advanced level (intermediate vs. advanced: p = .08). Chi-square tests conducted within each learner group found a significant difference between the results for zai and -zhe P only in the advanced learners' data, which suggests that they had some sensitivity to the completeness effects (advanced: χ²(1, N = 126) = 11.867, p = .001, Cramér's V = .307).
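As a check on the effect sizes, Cramér's V can be recomputed from the reported chi-square values and sample sizes with the standard formula V = sqrt(χ² / (N · min(r − 1, c − 1))). The sketch below assumes a 2 × 2 table (Marker × Judgment), which is consistent with the reported df = 1. The function name `cramers_v` is introduced here for illustration.

```python
from math import sqrt


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V for an n_rows x n_cols contingency table."""
    return sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Values reported in the text; the 2 x 2 table shape is an assumption.
native_v = cramers_v(91.68, 150, 2, 2)
advanced_v = cramers_v(11.867, 126, 2, 2)
print(round(native_v, 3), round(advanced_v, 3))  # 0.782 0.307
```

Both values agree with the reported effect sizes (.782 for the natives and .307 for the advanced learners).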
The participating groups' reaction time data are presented in Figure 9. The LME found a main effect of Group (F = 2.85, p = .036), a main effect of Marker (F = 14.47, p < .001) and a significant interaction effect (F = 4.29, p = .005). Paired sample t-tests were also conducted within each group. There were no significant differences observed between the reaction times for the zai sentences and those for the -zhe P sentences in the data of the

VII Discussion
The main goal of the present study is to evaluate difficulties in the feature reassembly process when English speakers acquire Chinese imperfective markers. Adopting a full transfer position, we assume that English speakers need to reconfigure the feature sets of the Chinese imperfective markers in their L2 by adding or discarding some features. The L2 patterns observed in the three tasks are discussed and explained in this section. Answers to the four research questions are summarized under the three topics below.

Feature adding: learning additional semantic restrictions
As discussed in Section I, the English and Chinese imperfective markers differ in their interactions with lexical verb classes. The Chinese markers are more complex and impose more restrictions on the predicate, due to the additional features that they carry (i.e. the [+durative] feature of zai and the [+atelic] feature of -zhe P). The two additional features exist in both English and Chinese but are not initially assembled on the English marker. Research questions 1 and 2 address whether English speakers can add the [+durative] feature to the feature set of zai and the [+atelic] feature to that of -zhe P, a process that the FRH predicts to be difficult.
The AJT results revealed that the L2 learners were able to correctly reject both the ungrammatical Type A-1 (zai [+durative] + V achievement) and Type A-2 (V achievement -zhe P [+atelic]) from the intermediate level onward. An individual analysis reveals that successful learners emerged at the intermediate stage but the percentages of successful learners remained low at the advanced stage (28.6% for zai and 42.9% for -zhe P). In terms of the difference between zai and -zhe P in telicity, only the advanced group correctly accepted Type B-1 (zai + V accomplishment) and showed native-like indeterminacy on Type B-2 (V accomplishment -zhe P). The findings suggest that the additional semantic restrictions are difficult to learn but ultimately acquirable, which is in line with the FRH.
At this point, we need to explain why the beginners showed indeterminacy on Type A-1 (zai [+durative] + V achievement) and Type A-2 (V achievement -zhe P [+atelic]) rather than incorrectly accepting the two types, given that the association of the imperfective marker with achievements is allowed in their L1 English. The beginners' pattern seems unexpected because, if we assume that full transfer holds for all the features of lexical items, the grammar should be transferred from English into their L2 Chinese at the initial stage. Some previous studies have also tested the two types of sentence and found a similar learner pattern. For example, Jin (2009) reports that low-intermediate learners in her study rejected both types of sentence, which supports the prediction of the Aspect Hypothesis (Bardovi-Harlig, 1999) that learners do not associate imperfective markers with achievements at the very beginning of acquisition. We argue that it might be too hasty to rule out the transfer effect here. We further examined the beginners' data, and a Pearson correlation revealed a significant relationship between the AJT scores and the proficiency scores (p = .031). Learners with lower proficiency scores were more likely to judge the two types of ungrammatical sentences as acceptable. This suggests that the beginners' proficiency affected the results and that L1 transfer did take place. Some of the beginners were more advanced than the others and had acquired some knowledge of the selectional restrictions of zai and -zhe P, and thus they could not represent the initial stage.
We now turn to explain why learners can acquire the additional semantic features quickly at early stages and why the learning difficulty does not persist to an advanced stage, even though a feature reassembly process is involved. We propose that this can be accounted for by the similarity between the L1 and L2 aspectual operations. In English, the form be + -ing is very odd with some achievements, as in (5).
(5) ? Mary was finding her watch. (Smith, 1997: 172)
For this particular type of achievement, it is difficult to think of the events expressed by the verbs, such as finding, as having preliminary stages (Smith, 1997: 172). Events whose substages before the endpoint are difficult to perceive are dispreferred with imperfective marking in English and are completely rejected by Chinese imperfectives due to a semantic clash. The constraint is presumably easy to acquire because it already exists in the learner's L1. More importantly, the nature of the additional features can account for the findings. The reassembly of [+durative] and [+atelic] is slightly different from the learning situation exemplified in Lardiere (2009). [+durative] and [+atelic] are semantic aspectual properties of verb phrases in English, classified as lexical/situation aspect properties. The markers zai and -zhe P mainly express grammatical aspect (i.e. imperfective), an aspectual category orthogonal to lexical aspect. The two to-be-added features already exist in the L1 English and are attached to a form that is also associated with the umbrella category Aspect. When learning the additional semantic constraints, English speakers are reconfiguring features within the functional category Aspect. Our findings suggest that feature reassembly within the same functional category does not impose immense difficulties on L2 learners.

Feature discarding: an arduous process
The sentence-picture matching task was designed to address Question 3, investigating whether English speakers are able to discard the [+resultant-stative] feature initially transferred into zai and the [+progressive] feature initially transferred into -zhe R, and to differentiate zai and -zhe R in terms of aspectual meanings. The present study has found that the L2 learners had no problem accepting both the progressive zai + V MTS and the resultant-stative V MTS -zhe R sentences in the AJT but showed asymmetric patterns in the SPMT: they behaved in a more native-like way on V MTS -zhe R than on zai + V MTS. Even at the advanced level, the L2 learners remained indeterminate on the reading of zai. On the other hand, their interpretations of the resultant-stative -zhe R sentences were similar to those of the natives. The asymmetric results indicate that learners at higher proficiency levels still have difficulty differentiating zai and -zhe R in terms of aspectual meanings.
The findings lead to a discussion of the difficulties brought about by different feature reassembly mechanisms, especially by feature discarding. In Chinese, zai sentences only receive a progressive reading, while the interpretation of -zhe sentences depends on the verb type, with the latter case being similar to the aspectual formulation in English. When learning zai, English speakers should completely discard the [+resultant-stative] feature that is transferred from the feature set of be + -ing. For -zhe, however, they do not need to relinquish any transferred viewpoint features but only need to acquire the interactions between the aspect marker and the verb type (i.e. -zhe P working with common action verbs and -zhe R with verbs involving a certain attachment meaning), which seems congruent with their L1 imperfective formulations. We should then ask: is this entirely a result of L1 effects? The answer is no. In the SPMT, we strictly controlled the verb type (MTS verbs such as chuan 'to wear/to put on') and ensured through a vocabulary check that the participants had fully acquired the meaning of the Chinese MTS verbs. The markers zai and -zhe R appear with the same verb, and the reading of the sentence depends entirely on the aspect marker. Semantic properties of the verb in this case could not help the learner differentiate the aspectual meanings. L1 transfer is insufficient to explain why the learners had difficulty in interpreting zai.
We argue that the asymmetric result is attributable to the different learning situations involved. Disassociating a meaning from a certain form may constitute a major difficulty for learners. For related L1 and L2 feature sets, if the learning direction is from a superset to a subset, 10 it is very difficult for learners to disconfirm the L1-transferred incorrect reading, as learners have to rely heavily on negative evidence in the input. Our finding is echoed by some other empirical studies on L2 imperfectives. For example, Ryu et al. (2015) find that Japanese speakers are more successful in acquiring the Korean imperfective marker -ko iss- (for progressive and resultant-stative) than in acquiring the other imperfective -a iss- (for resultant-stative only). In the learners' L1 Japanese, the imperfective marker -te i- can express either progressive or resultant-stative, depending on the situation type. Japanese speakers need to disassociate the L1-transferred progressive reading from -a iss- in their L2 Korean, which proved the most difficult. The learning situation and their findings are in line with our argument that discarding a semantic feature from a certain form is arduous.

L1 transfer effects
Following the FTFA model, we assume that all the features on the L1 imperfective marker are transferred into the L2 at the initial stage. A critical question to ask is what kinds of features are initially bundled in the L1 form. Apart from aspectual values, tense and number features can be attached to the copular verb be in English. More specifically, having detected that zai and -zhe are the counterparts of be + -ing, do English speakers transfer not only aspect features but also tense features into zai and -zhe P, given that the copular be is part of the aspectual marking and always inflected for tense? The sentence completeness judgment task investigating the acquisition of the tense-anchoring feature [+T] of zai reveals that the learners tended to find both the zai and -zhe P sentences complete at the initial stage and remained indeterminate in judging the -zhe P sentences even at the advanced level. We argue that the beginners' insensitivity to the incompleteness effects and the apparent asymmetry between the acquisition patterns of zai and -zhe P at later stages are attributable to L1 transfer and the learning situation involved.
We believe that the learners initially transferred the tense property attached to the inflection on the copular verb into zai and -zhe P , although they may not pin down a specific tense feature as Chinese is known as a 'tenseless' language. The transferred tense property functions similarly to the tense-anchoring feature [+T] in Chinese and plays a facilitative role in the acquisition of [+T] on zai at early levels. However, due to the essential but subtle differences between the L1 tense features (grammaticalization of location in time) and the L2 tense-anchoring feature (tense-anchoring unbounded events), it is very difficult for L2 learners to differentiate the two functions and fully acquire the function of the [+T] feature. Moreover, the learning situation involved here is deleting the transferred tense feature from the feature set of -zhe P , which is also a feature discarding process and predicted to be very hard. There may not be enough informative evidence in the input for learners to disassociate the tense feature transferred from the L1.
The two factors above together account for the result that the advanced learners still showed indeterminacy on the -zhe P sentences, although they correctly judged the zai sentences as complete. We do not consider this finding as an indication of a permanent deficiency, because the advanced group cannot represent the end-state grammar and both the judgment and the RT data showed that they had some sensitivity to the difference between zai and -zhe P regarding the tense-anchoring function.

VIII Conclusions
The Chinese imperfective system is more complex than the English system, in terms of both features (meanings) and aspect markers (forms). The present study tests the prediction of the Feature Reassembly Hypothesis by investigating how English speakers reconfigure the feature sets of the imperfective markers zai, -zhe P and -zhe R in their L2 Chinese. We have systematically compared the features of the imperfective markers in English and Chinese, and differentiated specific feature reassembly mechanisms that cause varying degrees of difficulty. A grammaticality judgment task, an interpretation task and a sentence completeness judgment task were employed to explore whether L2 learners are successful in feature reconfiguration by adding new features or discarding L1-transferred features.
Compared to the English be + -ing, the Chinese imperfective markers carry more semantic features (i.e. the [+durative] feature of zai and the [+atelic] feature of -zhe P) and impose more restrictions on the verb types that they can combine with, which requires a feature-adding process. The successful acquisition of the two features implies that the functional category to which the additional feature originally belongs in the L1 plays an important role in L2 acquisition. When the to-be-added feature is already associated in the L1 with the target L2 functional category, the feature reassembly process is relatively easy for learners.
We have also observed an asymmetry in the acquisition of the interpretations of zai and -zhe R and in the acquisition of the incompleteness effect. Most of the advanced learners could not differentiate between zai and -zhe R in terms of their aspectual meanings and were not sensitive to the tense-anchoring function. It is very hard for learners to completely delete the L1-transferred [+resultant-stative] feature from zai and the transferred tense feature from -zhe P. Our findings suggest that feature discarding constitutes a major difficulty for L2 learners, as learners have to rely heavily on negative evidence in the input.
Moreover, we have also found some evidence of L1 influence. Both the temporal and aspectual features carried by be + -ing are transferred into the learners' L2 at the initial stage. The L1-transferred tense features hinder the learners from perceiving the difference between zai and -zhe P in terms of the incompleteness effect. The L1 effect persists even at an advanced stage.
Our feature-based study is one of the very first attempts to investigate the development of the L2 Chinese imperfective system, which involves different detailed feature reassembly mechanisms. Further studies with diverse methods are needed to explore which learning situation poses the most problems in bilingual development, and for what reasons.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author would like to gratefully acknowledge the support of the University of Cambridge-Chinese University of Hong Kong Joint Laboratory for Bilingualism (JLB) and the AHRC funded research project, Multilingualism: Empowering Individuals, Transforming Societies, under the Open World Research Initiative.

ORCID iD
Yanyu Guo https://orcid.org/0000-0003-3544-2338

Notes
1. Habitual aspect describes a situation which is characteristic of a whole period (Comrie, 1976) and can be expressed by the Simple Present (e.g. he eats apples), by the auxiliary verb would in a past tense sentence (e.g. last year, we would go there), or by the phrase used to (e.g. we used to go there frequently). Since the habitual aspect is irrelevant to the Chinese imperfective markers, it will not be discussed in this study.
2. The resultant-stative viewpoint is dubbed 'resultative' in Smith (1997). Since in most accounts resultativity is related to telicity and perfectivity, we adopted Sohn's (2019) term for this aspectual meaning.
3. The status and function of the particle zai have been controversial: it is treated as a verb (e.g. Chao, 1968), a preposition (e.g. Tai, 1973), an adverb (e.g. Dai, 1997), or conflated with -zhe (e.g. Li and Thompson, 1989). The particle zai has had a long historical development, appearing first as a locative verb, gradually evolving into a locative preposition and finally into a progressive marker in modern Chinese (Klein et al., 2000). We treat the zai immediately followed by a VP as an aspect marker.
4. According to Smith (1997), situations are either durative or instantaneous. Achievements have an inherent endpoint and are instantaneous in nature. Semelfactives (Smith, 1997: 29) such as cough and blink, which refer to single-stage events with no results, are conceptualized as instantaneous as well. Semelfactives do not accept the progressive in English or in Chinese. Sentences like ta zai ke-sou ('he was coughing') cannot be taken to refer to the ongoingness of a single cough. However, on the surface, because semelfactives can be construed iteratively, they can combine with the progressive markers to indicate that a derived, multiple-action activity is ongoing, which is a shifted iterative interpretation.
5. For more details about the three-layered Aspect Phrase model proposed and the difference between zai and zhe in terms of syntactic positions and generative process, see Tsai (2008: 682-84).
6. Tsai (2008) puts -zhe R on the lowest layer and argues that only the cluster [V-zhe R] may appear in an imperative sentence. We do not consider the ability to constitute an imperative sentence as evidence that the aspectual projection of -zhe R is within VP. Chinese imperatives allow phrases that have a resultative meaning, such as a resultative compound (e.g. chi-wan 'to eat up'). The semantic feature of -zhe R enables the cluster [V-zhe R] to be an imperative sentence. Moreover, syntactically, -zhe R and -zhe P can work with the same predicate (such as zhuang qian 'to load money into', hua zhuang 'to put on make-up'). It is more reasonable to argue that the two -zhes differ from each other in semantics but not syntax.
7. This is inspired by Domínguez et al. (2017), who summarize two processes in language acquisition: language acquisition = feature selection + feature assembly.
8. Type C-1 (zai [+progressive] + V MTS) and Type C-2 (V MTS -zhe R [+resultant-stative]) were also tested in the AJT, which served as a screening test for the SPMT. The participants who did not choose 'completely acceptable' or 'probably acceptable' in four out of the four tokens on the target sentences were excluded (five beginners and intermediate learners).
9. Following Li (1999), Lardiere (2009) assumes that if a Chinese noun is plural-marked, it must also be definite and human, and the Chinese plural suffix -men is tightly associated with the features [+definite] and [+human].
10. The superset/subset relationship here only applies to comparing the feature set of the L1 form and that of the corresponding L2 form and does not refer to the whole grammars.