Working Toward Cross-Cultural Adaptation: Preliminary Psychometric Evaluation of the Affect Knowledge Test in Japanese Preschoolers

To facilitate preschoolers’ emotional development, it is useful to have a developmentally and culturally appropriate measure of emotion knowledge. The Affect Knowledge Test (AKT), a widely used measure of emotion knowledge, has been previously used with diverse cultural groups, including Japanese preschoolers, despite scarce reliability and validity information. Thus, the purpose of the present study was to conduct field tests of the Japanese-translated version of the AKT and a preliminary psychometric evaluation of the measure with Japanese preschoolers. Initial analyses of the Japanese-translated version of the AKT showed that the emotion recognition scale had a low internal consistency and subscales were hardly correlated with each other. After emotion faces used in the AKT were modified based on the interdependent cultures’ attention bias in emotion decoding, both reliability and construct-related validity were improved to satisfactory levels. These findings highlight the importance of cross-cultural adaptation of measures and demonstrate preliminary validity evidence for future adaptation of the AKT with Japanese preschoolers.

Emotional development occurs throughout the life span. As children grow up, they begin to experience a variety of emotions. Interacting with others provides learning opportunities for culturally appropriate ways of expressing and regulating emotions. To develop effective interpersonal skills, children must learn to understand their own and others' emotions, as well as the causes and consequences of emotions. A growing body of research in Western cultures has focused on preschoolers' development of emotion knowledge and its relations to their social-emotional well-being and later school success. However, less information is available on preschoolers' development of emotion knowledge in Japanese culture. In this study, we (a) discuss the utility of having a developmentally and culturally appropriate measure of emotion knowledge for Japanese preschoolers, and (b) report the preliminary results of the psychometric evaluation of the Affect Knowledge Test (AKT; Denham, 1986; Denham et al., 2002) with Japanese preschoolers.

SAGE Open, research article, 2019. Watanabe et al. DOI: 10.1177/2158244019846688
Author affiliations: 1 George Mason University, Fairfax, VA, USA; 2 NTT Communication Science Laboratories, Kyoto, Japan

Significance of Emotion Knowledge Development in Early Childhood
Emotion knowledge plays an essential role in preschoolers' emotional development. Emotion knowledge comprises understanding one's own and others' emotions and the causes and consequences of emotion (Denham, Zinsser, Brown, & Domitrovich, 2012). Emotion knowledge is therefore an essential aspect of emotional competence along with emotion expression and emotion regulation (Kujawa et al., 2014; Saarni, 1999; Shao, Doucet, & Caruso, 2015). High emotion knowledge allows preschoolers to express emotion in culturally appropriate ways, correctly recognize emotions in others, and comprehend the causes of their own and others' emotions (Denham et al., 2002). In the preschool years, children start to understand emotion-eliciting situations, to infer various causes of emotions, and to predict possible consequences of emotions (Denham, 1986; Denham & Zoller, 1991; Dunn, Brown, Slomkowski, Tesla, & Youngblade, 1991). Moreover, emotion knowledge is a vital tool for regulating one's own emotions in interpersonal conflicts (Kopp, 1989). Being able to identify one's own emotions is the first step toward monitoring and modifying one's emotions, which aids in relationship maintenance.
In addition to enriching everyday interpersonal interactions, acquisition of emotion knowledge aids school readiness. Children who can accurately interpret others' emotions are more likely to have a successful transition from preschool to school (e.g., Denham, Bassett et al., 2012;Raver & Knitzer, 2002). They can get along with peers in a large group setting and more easily adjust to their new school environment. In contrast, preschoolers' emotion knowledge deficits are linked to negative outcomes, such as a higher risk of aggression (Denham et al., 2002;Valiente, Swanson, & Eisenberg, 2012), poor academic achievement, internalizing and externalizing problems (Kujawa et al., 2014), and lower teacher ratings on social functioning (Izard et al., 2001). Because of emotion knowledge's contribution to not only social functioning but also school success, supporting young children's emotion knowledge development is critical, especially during the preschool period.

Call for Supporting the Development of Emotion Knowledge in Japan
Compared with Western cultures, relatively little research has been conducted on Japanese preschoolers' emotion knowledge development. This may be because many Japanese people believe that children should acquire emotion knowledge without explicit instruction. However, a shift has occurred over the last decade as Japanese society has started paying more attention to emotional competence due to growing problems related to Japanese children's interpersonal skills. For instance, futoko (school refusal), being absent 30 days or more per year for reasons not having to do with finances or illness (Aruga, Suzuki, & Tagaya, 2012), has increased in Japan, along with school bullying and violence. According to the Japan Ministry of Education, Culture, Sports, Science and Technology (2017), the number of futoko students reached 67,798 (0.5% of all students) in 2016, the largest number ever recorded. Many of those children experience anxiety associated with peer relations. These problems with school refusal persist despite intervention programs aimed at addressing children's interpersonal problems and facilitating their social skills (e.g., Kobayashi & Aikawa, 1999; Kokubun, 2000). The programs yielded only temporary benefits, failing to show long-term effects and successful generalization to everyday life (Yamada, 2008). The focus is therefore shifting to younger children as researchers point out that to learn and develop better interpersonal skills, young children should have a solid foundation of emotion understanding (e.g., Watanabe, 2016; Yamada, 2008). Thus, there is a strong need for supporting preschoolers' development of emotion knowledge in Japan.

Assessment of Preschoolers' Emotion Knowledge in Japan
To support Japanese preschoolers' development of emotion knowledge, researchers need an instrument that not only measures baseline emotion knowledge but also is sensitive enough to measure subsequent increases in emotion knowledge. This need requires a developmentally and culturally appropriate measure of emotion knowledge. Although several measures of preschoolers' emotion knowledge have been developed and used in Japan (e.g., Hoshino, 1969;Sakuraba & Imaizumi, 2001), psychometric properties of those measures have not been reported, and there is no measure used across studies with Japanese samples. Thus, the present study seeks to convert a well-researched Western emotion knowledge measure, the AKT, to a Japanese version that can be used with Japanese preschoolers.
The AKT (Denham, 1986; Denham et al., 2002) was chosen to assess Japanese preschoolers' emotion knowledge because it is one of the most widely used measures of emotion knowledge (Morgan, Izard, & King, 2010) and has been administered in various cultures (Bassett et al., 2011; Machado, Verissimo, & Denham, 2012). The AKT was developed as a developmentally appropriate and contextually valid measure of preschoolers' emotion knowledge, by utilizing emotion faces, puppets, and authentic situations that preschoolers might encounter in daily life (Denham, 1986). To examine developmental changes in preschoolers' emotion knowledge, the AKT consists of two subtests. The first subtest was designed to assess preschoolers' recognition of four basic emotions (i.e., happy, sad, angry, afraid). It calls upon preschoolers to name the emotions shown on hand-drawn emotion faces (expressive recognition) and then to identify the emotions on the emotion faces by pointing (receptive recognition). The second subtest measures whether children identify the emotional consequence of emotion-eliciting situations enacted by puppets. Children listen to short vignettes about what has happened to the puppet, such as receiving ice cream, and then must identify what emotion the puppet is likely to feel (in this case, happy). The second subtest allows researchers to assess preschoolers' emotion situation knowledge by examining whether preschoolers successfully identify the puppet's feeling in a given situation.
The AKT has undergone extensive psychometric evaluation in the West. Previous research showed good internal consistency and test-retest reliabilities for both emotion recognition and emotion situation understanding (Denham et al., 2002). Various studies demonstrated the measure's validity (e.g., Bassett, Denham, Mincic, & Graling, 2012; Denham et al., 2003). For instance, Bassett and colleagues (2012) conducted a confirmatory factor analysis (CFA) and confirmed the AKT's two-factor structure. Also, children's performance on the AKT was associated with their social competence as rated by their peers and teachers (Denham et al., 2003; Denham, McKinley, Couchoud, & Holt, 1990). In addition, it is important to note that because the AKT assesses preschoolers' basic emotion knowledge, a ceiling effect on their performance around 4½ years old has been reported in the United States (Denham, 2006).
As Denham (2006) summarized, this measure has additional advantages. Because the AKT is embedded within play and requires little verbal ability, it demonstrates ecological validity. Preschoolers often enjoy it like a game. Another advantage is that for researchers, it is easily learned and only takes about 20 min to administer. In addition, Hoshino (1969) reviewed studies that used similar methods using drawings of emotion faces and photographs of emotion expressions and suggested that simplified drawings of emotion faces, like the ones used in the AKT, are most appropriate for preschoolers because they are free from gender bias and do not contain extraneous stimuli that may influence children's judgments.
Although the AKT attracted Japanese researchers and has been used in Japan (e.g., Hirabayashi, Ohno, Karasawa, & Tardif, 2007;Kazama, Hirabayashi, Karasawa, Tardif, & Olson, 2013), information regarding its psychometric properties with Japanese children has not been reported. Considering cultural differences regarding emotion between interdependent and independent cultures (Kitayama, 2001;Masuda et al., 2008;Trommsdorff & Heikamp, 2013), it is critical to evaluate cross-cultural adaptation of the measure for further usage with Japanese children. The present study is a preliminary step toward cross-cultural adaptation of the AKT. We conducted two small-scale studies to preliminarily evaluate the measure's psychometric properties when used with Japanese preschoolers.

Cross-Cultural Adaptation and Psychometric Evaluation of Measures
Conducting a thorough psychometric evaluation of a social-emotional measure is critical when adapting measures across cultures (Alonso-Alberca, Vergara, Fernández-Berrocal, Johnson, & Izard, 2012; Heo & Squires, 2012). The AKT has been used outside of the United States, and at each juncture it has undergone psychometric evaluations to ensure valid inferences can be drawn in the new setting. Establishing strong validity evidence in a measure's country of origin is a crucial first step but is insufficient in and of itself when using a measure in a new cultural setting. Satisfactory psychometric evaluation results must be established in the new setting to reliably draw cross-cultural comparisons (Elosua & López-Jáuregui, 2008).
When conducted thoroughly, psychometric evaluation is a rigorous process that analyzes a number of aspects of a measure to accumulate evidence to support a validity argument (Cook & Beckman, 2006; Royal, 2017). When psychometrically evaluating a measure, one ultimately needs to ascertain the degree to which valid conclusions can be drawn from the results of the measure. Replication of a study, along with psychometric evaluation, should be conducted periodically to ensure the ongoing validity of research conclusions. Cook and Beckman (2006) cogently define validity as "how well one can legitimately trust the results of a test as interpreted for a specific purpose" (p. 166.e8).
In recent decades, there has been a movement toward forming a unified approach to validity. In the unified approach, construct validity is a central component, and reliability is recognized as an essential element of validity (e.g., American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014; Cook & Beckman, 2006; Messick, 1998). The unified approach to validity holds that all types of validity (i.e., construct, content, and criterion validity) should be encompassed within one comprehensive framework: construct validity. Construct validity is all-encompassing because conclusions drawn using an instrument are only accurate to the degree the instrument actually measures what it purports to measure (Cook & Beckman, 2006). As Messick (1998) writes, in accord with Cook and Beckman, constructs are attempts at capturing the essence of traits; although construct validity is important, it can never be proven, only established to varying degrees of accuracy.
The current study focused on establishing two types of validity evidence as listed by Sousa and Rojjanasrirat (2011): (a) internal consistency reliability and (b) construct-related validity (e.g., convergent validity, divergent validity). There are various ways to examine these different types of evidence. For instance, demonstrating relations among scales of the adapted measure is a way to evaluate construct-related validity (Hong et al., 2006).

Guideline for Cross-Cultural Adaptation of a Measure
In addition to the requirements of general psychometric evaluation described above, cross-cultural adaptation of a measure involves four major steps, which make the adaptation process complex (Byrne, 2016; Sousa & Rojjanasrirat, 2011). The first step is translation of the measure from the source language (SL) to the target language (TL). This step includes forward translation (from SL to TL) and back translation (from TL to SL), followed by examination by subject-matter experts of the three versions of the measure: the original, the forward translation, and the back translation. During the examination, any discrepancies as well as cultural relevance of the content are reviewed, and then changes are made as needed. The second step is pilot and/or field testing of the adapted measure in the target culture. By administering the measure to a representative sample of the target population and conducting a preliminary psychometric evaluation, researchers identify poorly operating or culturally irrelevant items and then carry out modifications accordingly. This step can be done with a small sample (e.g., n = 10-40). The third step is full psychometric evaluation with a large sample, in which researchers collect multiple pieces of validity evidence. The final step involves establishing norms of the adapted measure in the target culture (Byrne, 2016). Determining the standard score allows the users of the measure to interpret how well or poorly an individual performed on the measure.
To evaluate their cross-cultural adaptation of the AKT, Machado et al. (2012) administered the Portuguese version of the AKT to 160 Portuguese preschoolers (3-5 years). The authors examined two types of validity evidence (Sousa & Rojjanasrirat, 2011): (a) the internal consistency of the measure for its reliability and (b) a CFA for its validity. The results showed that naming "afraid" contributed to a low internal consistency and had a very low factorial weight. Thus, they decided to eliminate the item, along with two emotion-eliciting scenarios (one was ethically not applicable for the Portuguese culture; another one had a low response rate on mothers' preassessment questionnaire). Then, the results of another CFA yielded a good fit, and the internal consistencies of the AKT subtests and the total scale were moderate to high. The results also showed that identifying emotions is easier than naming emotions for Portuguese preschoolers, which aligns with the developmental trajectory found in American preschoolers.

Cultural Factors to Consider for Cross-Cultural Adaptation
Taking previous cross-cultural adaptations of the AKT into consideration, what potential factors should be considered in a preliminary psychometric evaluation of the AKT with Japanese preschoolers? There is a substantive difference between American and Japanese cultures. American culture is represented as an independent culture where individual autonomy is valued, whereas Japanese culture is represented as an interdependent culture where group harmony is valued (Markus & Kitayama, 1991;Trommsdorff & Heikamp, 2013). Independent cultures encourage people to express emotions openly and directly because such expressions promote autonomy through assertion of personal feelings (Masuda et al., 2008). In contrast, interdependent cultural norms lead people to suppress emotion expressions, especially negative ones, because they may become destructive to group harmony (Kitayama, 2001;Trommsdorff & Heikamp, 2013). Hence, Japanese people are encouraged to cover up negative facial expressions with smiles and show fewer facial expressions in general than people in independent cultures (Aune & Aune, 1996;Gudykunst, Ting-Toomey, & Nishida, 1996).
Differing cultural emphases on emotion expression are reflected in where individuals look when interpreting others' emotions. Yuki, Maddux, and Masuda (2007) found that Japanese people tend to focus on the eyes as emotional cues, whereas American people are more likely to focus on the mouth. The authors explained that the eyes are more difficult to control than the mouth and are therefore a more reliable emotional cue in Japanese culture, where emotion suppression is encouraged. When examining the reliability and validity of the AKT with Japanese preschoolers, it is important to take into consideration these cultural differences in emotion expressions and emotion decoding, which may influence preschoolers' interpretations of emotion faces in the test.

Focus of the Present Study
Following the steps of cross-cultural adaptation of a measure (Byrne, 2016; Sousa & Rojjanasrirat, 2011) outlined in the previous section, we carried out the second step by conducting field tests of the Japanese-translated version of the AKT (Fujioka, 2008) and a preliminary psychometric evaluation of the measure to prepare for the third step: a future full psychometric evaluation. Specifically, the present study consists of two studies. The first study examined the internal consistencies of the scales and of the total emotion knowledge aggregate, examined correlations among the subscales, and then identified poorly functioning items. In the second study, after making modifications by taking into account the cultural differences in emotion expressions and emotion decoding, we conducted another field test and evaluated psychometric properties of the modified AKT. As a standard of comparison, information on the psychometric properties of the original AKT with American children was used.

General Method
As briefly introduced in the previous section, the AKT (Denham, 1986; Denham et al., 2002) is a widely used measure assessing preschoolers' emotion knowledge using puppets with detachable emotion faces made of felt. The procedures for administering the two subtests and the scoring system are as follows.

Subtests
Emotion recognition. First, expressive (verbal) recognition is measured. Four emotion faces (i.e., happy, sad, angry, afraid) are presented in a row in front of the child. The researcher points to each of the emotion faces and asks the child, "How does she or he feel?" Second, receptive (nonverbal) recognition is measured. The researcher shuffles the emotion faces and presents them again in a row. The researcher tells the child, "Point to the Happy/Sad/Angry/Afraid face" for each emotion. Because four emotions are measured, each of these two tasks therefore consists of four items. Third, after completing the emotion recognition tasks, a teaching phase is held to clarify any possible misunderstanding and prepare the child for the situation knowledge task. The researcher picks up each emotion face and demonstrates the emotion with vocal and facial expression and body language (e.g., says "This is Sad. Sad." with a sad voice, head downcast, sad eyes, corners of mouth down, and shoulders dropped).
Situation knowledge. In the situation knowledge task, four emotion faces are placed in front of the child. The researcher uses puppets to act out short scenarios that preschoolers may encounter in everyday life and then asks the child how the main character feels by saying, "How does she or he feel?" Following the prompt, the researcher says, "Give [the main character's name] a face" and asks the child to choose a face from the four emotion faces, which reduces language requirements. In stereotypical scenarios, the puppet's emotional responses represent emotions that children commonly experience (e.g., feel afraid when having a nightmare). There are eight stereotypical scenarios, two for each basic emotion. In contrast, nonstereotypical scenarios depict situations where emotional responses vary among individual children (e.g., happy or sad about being dropped off at preschool). Using the information collected from their parents before the assessment, the researcher shows emotional responses that are different from emotions the child would usually experience (e.g., shows sad for a scenario where the child would usually feel angry) so that the child's perspective-taking abilities can also be assessed. The 12 nonstereotypical scenarios are of two types: (a) positive-negative scenarios, in which the puppet's emotional response could be either positive or negative (e.g., happy or afraid about going to a swimming pool), and (b) negative-negative scenarios, which have two possible negative emotional responses (e.g., sad or afraid about being scolded by mother).

Scoring
Following the original AKT scoring system (Denham, 1986;Denham et al., 2002), children's performance on all AKT items is scored from 0 to 2, with 0 for a wrong response (e.g., happy for sad situation), 1 for correct positive/negative valence including a behavioral description for emotional state (e.g., angry for sad situation, "crying" for sad situation), and 2 for a correct response. No response and refusal to answer (e.g., "No"; "I don't know") are treated as missing data.
For analyses in the present study, mean item scores for expressive recognition, receptive recognition, stereotypical situation knowledge, and nonstereotypical situation knowledge were calculated to make aggregates for the four subscales. Next, we created the emotion recognition scale by combining expressive and receptive recognition subscales and the situation knowledge scale by combining stereotypical and nonstereotypical situation knowledge subscales. Finally, we calculated the mean z scores of all items to create the total emotion knowledge aggregate.
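The scoring and aggregation steps described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' scoring script: the item identifiers, the use of `None` for missing data, and the use of the population standard deviation when forming z scores are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev

# Hypothetical item identifiers: 4 + 4 recognition items and 8 + 12 situation items.
EMOTIONS = ["happy", "sad", "angry", "afraid"]
SUBSCALES = {
    "expressive": [f"exp_{e}" for e in EMOTIONS],
    "receptive": [f"rec_{e}" for e in EMOTIONS],
    "stereotypical": [f"st_{i}" for i in range(1, 9)],
    "nonstereotypical": [f"ns_{i}" for i in range(1, 13)],
}
ALL_ITEMS = [item for items in SUBSCALES.values() for item in items]


def subscale_means(responses):
    """Mean 0-2 item score per subscale; None (no response/refusal) is skipped."""
    result = {}
    for name, items in SUBSCALES.items():
        scores = [responses[i] for i in items if responses.get(i) is not None]
        result[name] = mean(scores) if scores else None
    return result


def total_emotion_knowledge(sample):
    """Per-child mean of item z scores, standardizing each item across the sample."""
    stats = {}
    for item in ALL_ITEMS:
        values = [c[item] for c in sample if c.get(item) is not None]
        # Assumption: population SD; an item with no variance contributes nothing.
        stats[item] = (mean(values), pstdev(values)) if values else (0.0, 0.0)
    totals = []
    for child in sample:
        zs = []
        for item in ALL_ITEMS:
            m, sd = stats[item]
            score = child.get(item)
            if score is not None and sd > 0:
                zs.append((score - m) / sd)
        totals.append(mean(zs) if zs else None)
    return totals


# Toy sample: three children with uniformly correct, incorrect, and valence-only answers.
child_a = {item: 2 for item in ALL_ITEMS}
child_b = {item: 0 for item in ALL_ITEMS}
child_c = {item: 1 for item in ALL_ITEMS}
child_c["exp_afraid"] = None  # treated as missing data
sample = [child_a, child_b, child_c]
totals = total_emotion_knowledge(sample)
print(subscale_means(child_c))
print(totals)
```

Standardizing each item before averaging keeps items with different difficulty from dominating the total, which is one plausible reading of "mean z scores of all items."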

Study 1
To examine the reliability and construct validity of the AKT in Japanese culture, we administered the Japanese-translated version (Fujioka, 2008) of the AKT to Japanese preschoolers. The first step toward cross-cultural adaptation of a measure (i.e., forward and back translation) was performed by Fujioka (2008). First, a translator who is fluent in both Japanese and English translated the original AKT instructions and scenarios into Japanese. Then, the translation was back-translated into English by another translator. The back translation was compared with the original English instructions and scenarios and was demonstrated to have semantic equivalence to the original. In addition, the contents of the scenarios were checked for cultural relevance, and minor changes were made (e.g., the tiger that appears in the nightmare scenario was changed to a lion, which was more familiar to Japanese preschoolers). In Study 1, using the Japanese-translated version of the AKT, we carried out the second step of cross-cultural adaptation of a measure: field-testing of the adapted measure and conducting a preliminary psychometric evaluation.

Method
Participants were fifty 3- and 4-year-old Japanese preschoolers (22 boys, 28 girls; M age = 44.38 months, SD = 4.77) living in the central region of Japan. As for preschool/nursery school enrollment, 80.4% were enrolled, 3.9% had been enrolled before, and 15.7% had never been enrolled.
We determined the sample size following the sample size recommendation for pilot or field testing of a culturally adapted measure in the target culture (Sousa & Rojjanasrirat, 2011). To check whether the sample size was sufficient, we conducted a post hoc power analysis using G*Power 3.1 (Faul, Erdfelder, Lang, & Buchner, 2007). With α = .05, two-tailed, and effect size = .47, which was determined based on the average r = .22 from the intersubscale correlation analyses reported in the following section, the power analysis revealed that the statistical power for the study was .94, showing sufficient power to detect significant correlations.
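The power figure reported above can be approximated by hand. The sketch below uses the Fisher z approximation for a two-tailed test of a correlation; this is a swapped-in approximation and not the routine that G*Power runs, so small differences from the reported value are to be expected. The function name `correlation_power` is hypothetical, and the inputs (effect size .47, n = 50, α = .05) are taken from the text.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(rho, n, alpha=0.05):
    """Approximate two-tailed power to detect a population correlation rho
    with n cases, using the Fisher z transformation (an approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)   # critical value for the two-tailed test
    shift = atanh(rho) * sqrt(n - 3)       # expected test statistic under rho
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


power = correlation_power(rho=0.47, n=50)
print(round(power, 2))
```

Under these assumptions the approximation lands close to the reported power of .94.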
The region was selected for the study because it reflects a representative sample of the national population in terms of socioeconomic status. The annual income per capita in two prefectures that comprise the region (2,974,000 yen and 2,530,000 yen) was fairly close to the national income per capita (2,790,000 yen, SD = 311,000 excluding Tokyo as an extreme outlier) (Statistics Bureau, Ministry of Internal Affairs and Communications, 2017). In addition, Japan is ethnically homogeneous, with approximately 98% of the population being ethnic Japanese (Statistics Bureau, Ministry of Internal Affairs and Communications, 2018). According to the report of the 2010 Japanese population census (Statistics Bureau, Ministry of Internal Affairs and Communications, 2011), relatively equal numbers of 3-and 4-year-old boys and girls comprise the Japanese population (boys 51%, girls 49%), and about 90% of them were enrolled in preschool/nursery school (Cabinet Office, 2013). Therefore, because of the current study's gender ratio, high preschool enrollment, and identified family income levels, the demographics of our sample match the general Japanese population reasonably well.
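To make "fairly close" concrete, the two prefectural incomes can be expressed as distances from the national figure in standard deviation units. This is only an illustrative calculation using the numbers quoted above; the prefecture labels are placeholders because the text does not name the prefectures.

```python
# Figures quoted in the text (yen per capita); Tokyo excluded from the SD.
NATIONAL_INCOME = 2_790_000
NATIONAL_SD = 311_000

# Placeholder labels: the source does not name the two prefectures.
prefecture_income = {"prefecture_1": 2_974_000, "prefecture_2": 2_530_000}


def standardized_distance(income):
    """Distance from the national income per capita, in SD units."""
    return (income - NATIONAL_INCOME) / NATIONAL_SD


distances = {name: standardized_distance(v) for name, v in prefecture_income.items()}
for name, z in distances.items():
    print(f"{name}: {z:+.2f} SD")
```

Both values fall within one standard deviation of the national figure, which is consistent with the claim in the text.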
Families were recruited through preschools and community centers. The study was reviewed and approved by NTT Communication Science Laboratories' Ethics Committee (approval number: H22-005). The data collection took place in a playroom at a laboratory in 2011. After obtaining informed consent from parents, a trained researcher administered the AKT to preschoolers. We used the puppets and emotion faces originally created for American children and the Japanese-translated instructions and scenarios from a previous study (Fujioka, 2008). Also, as a comparison group, the data of three hundred twelve 3- and 4-year-old American preschoolers (156 boys, 156 girls; M age = 49.69 months, SD = 6.52) from a large cross-cultural study (Bassett et al., 2011) were used in the analyses.

Results and Discussion
As a preliminary psychometric evaluation of the Japanese-translated version of the AKT (Fujioka, 2008), we examined internal consistency for reliability and performed interscale correlation analyses for construct-related validity, following recommendations on analytic strategies for cultural adaptation of a measure (e.g., Hong et al., 2006; Sousa & Rojjanasrirat, 2011). In addition, we report the results of descriptive statistics, including mean scores and frequencies of children's responses to items, to show whether Japanese children understood emotions in a way similar to American children. The results of both Studies 1 and 2 are presented in Tables 1 to 4.
Reliability. Descriptive statistics showed that there was no extreme outlier or missing data that could be problematic for the following analyses. Reliability analyses showed that, with American children, the total emotion knowledge (28 items) and both scales of emotion recognition (eight items including expressive and receptive recognition) and situation knowledge (20 items including stereotypical and nonstereotypical situations) had good internal consistencies (α = .88, .72, .87). Although Japanese children's responses also showed good internal consistencies for the total and situation knowledge (α = .80 and .85), the recognition scale showed a low internal consistency (α = .52). Furthermore, as another source of internal consistency evidence, the mean interitem correlation of each of the total emotion knowledge, emotion recognition, and situation knowledge was .13, .16, and .22, respectively. Given that a mean interitem correlation between .15 and .50 is recommended (Clark & Watson, 1995), the mean interitem correlation of the total emotion knowledge was not adequate.
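The two internal consistency indices used above, Cronbach's alpha and the mean interitem correlation, can be computed as in the following sketch. It assumes complete data (one row per child, one column per item) and uses a small made-up score matrix; it is not the authors' analysis code, and the helper names are hypothetical. The .15 to .50 range is the Clark and Watson (1995) recommendation cited in the text.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows[i][j] is child i's score on item j (no missing data)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_interitem_r(rows):
    """Average Pearson correlation over all pairs of items."""
    columns = list(zip(*rows))
    return mean([pearson_r(a, b) for a, b in combinations(columns, 2)])


def in_recommended_range(r, low=0.15, high=0.50):
    """Clark and Watson (1995) range for the mean interitem correlation."""
    return low <= r <= high


# Made-up 0-2 scores for five children on four items (illustration only).
toy_scores = [
    [2, 2, 2, 1],
    [1, 2, 1, 1],
    [0, 1, 0, 0],
    [2, 1, 2, 2],
    [1, 0, 1, 0],
]
print(cronbach_alpha(toy_scores), mean_interitem_r(toy_scores))

# Values reported in the text: total, recognition, situation knowledge.
print([in_recommended_range(r) for r in (0.13, 0.16, 0.22)])
```

Applied to the reported values, only the total emotion knowledge aggregate (.13) falls outside the recommended range, matching the conclusion drawn in the text.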
Construct-related validity. Both emotion recognition and situation knowledge tasks should be assessing preschoolers' emotion knowledge, the construct the measure is intended to assess; therefore, there should be significant correlations among those tasks. As a means to evaluate construct-related validity (Hong et al., 2006), we conducted interscale correlation analyses. The results indicated that there was a significant positive relation between American children's performance on the recognition and situation tasks (r = .50, p < .001). In contrast, there was no association between Japanese children's performance on those tasks (r = .01, p > .05).
Note. The U.S. sample is from a large cross-cultural study (Bassett et al., 2011). JPN1 and JPN2 show the results of descriptive analyses from Studies 1 and 2. Children received 0 for an incorrect response (e.g., happy for sad situation), 1 for correct positive/negative valence including a behavioral description for emotional state (e.g., angry for sad situation, "crying" for sad situation), and 2 for a correct response. No response and refusal to answer (e.g., "No"; "I don't know") were treated as missing data.
To further evaluate construct-related validity and investigate what was contributing to the lack of association, intersubscale correlation analyses among four subscales (i.e., expressive recognition, receptive recognition, stereotypical situation knowledge, and nonstereotypical situation knowledge) were conducted. Although there were significant positive relations among all subscales with American children, Japanese children only had a significant correlation between stereotypical and nonstereotypical situation knowledge subscales (see Table 1). In particular, the expressive recognition subscale was not related to other subscales (see Table 1). It is important to note that, given that the squared correlation coefficient (r²) is an indicator of effect size and can be affected by a small sample (e.g., greater impact of having an outlier, dissimilar distribution; Goodwin & Leech, 2006), we should take into account the sample size difference between the American and Japanese samples. That is, with the American sample, r = .34 is highly significant, whereas r = .26 is only marginally significant with the much smaller Japanese sample. Nevertheless, with the Japanese sample, compared with the situation knowledge subscales that were correlated significantly positively with each other, the emotion recognition subscales were related neither to each other nor to the situation knowledge subscales. These results suggest that, with the Japanese sample, the recognition task may not be measuring the construct that it was intended to measure.
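The point about sample size can be illustrated numerically: the same correlation yields very different p values at n = 312 and n = 50. The sketch below uses the Fisher z normal approximation, which is an assumption of this example and not necessarily the test the authors ran, and it takes the full sample sizes from the Method section even though the pairwise n for a given correlation may have differed.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-tailed p value for H0: rho = 0 via the Fisher z transformation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Sample sizes from the Method section; correlations quoted in the text.
p_american = approx_p_value(r=0.34, n=312)
p_japanese = approx_p_value(r=0.26, n=50)
print(f"American sample: p = {p_american:.2g}")
print(f"Japanese sample: p = {p_japanese:.3f}")
```

With these inputs the larger sample gives a p value far below .001, while the smaller sample gives a value between .05 and .10, in line with the "marginally significant" description.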
Given these findings on the recognition construct, we looked more closely at whether Japanese preschoolers interpret the four basic emotions in a similar way to their American counterparts. First, as shown in Table 2, the mean scores of the subscales illustrate that Japanese preschoolers performed as well as American preschoolers on the receptive recognition (with almost identical means), stereotypical, and nonstereotypical tasks, but not on the expressive recognition task. A t test confirmed that the Japanese preschoolers' expressive recognition scores were significantly lower than the American preschoolers' scores, t(352) = 7.18, p < .001. Next, in checking how many preschoolers responded to items correctly (see Table 3), very few Japanese preschoolers gave the correct response on expressive recognition for happy and sad, whereas the majority of American preschoolers responded correctly. Both Japanese and American preschoolers performed well on naming angry.
Finally, although afraid is the most difficult emotion among the four basic emotions for both Japanese and American preschoolers, Japanese preschoolers struggled more than American preschoolers. These findings revealed that within the emotion recognition task, Japanese preschoolers had particular difficulty with expressive recognition. In further analyzing Japanese preschoolers' expressive recognition responses based on findings in Tables 2 and 3, it became clear that the majority of them recognized the afraid face as surprise. On the receptive recognition task, about 42% of them also struggled with pointing to the correct face when they were asked to find the afraid face, whereas they performed well on identifying the remaining emotion faces (see Table 4). In addition, many of them also attended to the tear on the sad face on the expressive recognition task and answered, "crying," which was scored as a correct valence.
Considering that Japanese people often mask negative facial expressions and show fewer facial expressions than people in independent cultures (Aune & Aune, 1996;Gudykunst et al., 1996), the original AKT's afraid face created in the United States may have been overly expressive and looked different from afraid expressions that the Japanese preschoolers were familiar with. Also, because of the cultural display rule, Japanese people attend more to the eyes than the mouth when they interpret facial expressions (Yuki et al., 2007). The Japanese preschoolers in this study may have focused on the eyes of the AKT's emotion faces, and the tear by the eyes on the sad face may have distracted them, which led them to describe a behavior (i.e., crying) instead of an emotion (i.e., sad).
Cumulatively, these findings led to the conclusion that cultural differences in display rules and emotion decoding might be affecting Japanese preschoolers' performance on the AKT emotion recognition task. Therefore, we decided to modify the AKT's emotion faces to adapt the AKT better to Japanese culture, improving its reliability and construct-related validity.

Modifying Emotion Faces for Japanese Preschoolers
Previous research shows that cultural differences in emotion recognition exist due to the ways people decode facial expressions. Following the finding that Japanese people attend more to the eyes than the mouth as emotional cues, Jack, Blais, Scheepers, Schyns, and Caldara (2009) further examined the cultural differences between East Asian and Caucasian observers in emotion decoding of universal facial expressions. The results revealed that because East Asian observers, including Japanese observers, focused persistently on the eye region and less on other regions, they had more difficulty than Caucasian observers in distinguishing negative facial expressions, which share similar expressions of the eyes. The authors of the study argue that although certain emotions are universal, expression of those emotions might not be universally identical. There might be subtle culture-specific emotion signals, especially for negative emotions, due to cultural differences in emotion display rules and emotion decoding. These findings suggested that, when modifying the AKT's emotion faces, (a) a focal point of the modification should be the eyes, and (b) the eyes of each emotion face alone should be distinguishable for Japanese preschoolers.
First, to modify the AKT's emotion faces, we contacted a well-known Japanese illustrator, who has been designing characters for children's educational TV programs, and asked him to draw several emotion faces for each emotion (i.e., happy, sad, angry, afraid), which depict typical emotion expressions familiar to Japanese children. From the emotion faces, we selected a couple of faces for further analyses, specifically those whose eyes differed in illustration from the eyes of the original AKT faces. The mouths of the faces were also slightly modified to make the faces similar to common Japanese illustrations. Then, we made a group of faces for each emotion, including the original emotion faces created in the United States.
Next, to validate whether the modified emotion faces are appropriate to Japanese culture, we asked 30 Japanese adults (12 males, 16 females, two with missing responses) to rank the faces in each group according to how well they depict the emotion. Their responses were analyzed, and the faces that were most often ranked as the first choice were selected as the modified emotion faces for Japanese preschoolers (see Figure 1).
To examine whether the modification of emotion faces would improve the reliability and construct validity of the AKT in Japanese preschoolers, we administered the AKT with the modified emotion faces to a second sample of Japanese preschoolers.

Method
Participants were fifty-one 3- and 4-year-old Japanese preschoolers (29 boys, 22 girls; M age/month = 46.53, SD = 6.96) living in the central region of Japan. Regarding preschool/nursery school enrollment, 74.6% were enrolled, 7.8% had been enrolled previously, and 17.6% had never been enrolled. A post hoc power analysis using G*Power 3.1 (Faul et al., 2007) with α = .05, two-tailed, and effect size = .69, which was determined based on the average r = .47 from the intersubscale correlation analyses (Table 1), indicated that the statistical power of Study 2 was 1.00. As in Study 1, the participants were recruited through preschools and community centers, and we followed the same procedure to collect data. A trained researcher administered the AKT with the modified emotion faces to preschoolers in a playroom at a laboratory.
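As a rough cross-check of the reported power, the following sketch approximates the power of a two-tailed test of a correlation using the Fisher z transform. This is not G*Power's own computation, and it treats the reported effect size of .69 as a population correlation, which is an assumption about how that value was entered.

```python
import math
from statistics import NormalDist


def correlation_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power for testing H0: rho = 0 (Fisher z method)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = math.atanh(rho) * math.sqrt(n - 3)  # expected test statistic
    # Probability of landing in either rejection region.
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


power = correlation_power(rho=0.69, n=51)
print(f"approximate power = {power:.2f}")
```

With these inputs the approximation rounds to 1.00, which is consistent with the value reported above.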

Results and Discussion
Reliability. There were no extreme outliers or missing data that could be problematic for the analyses. The results of the reliability analysis showed that the internal consistencies for all scales and the total emotion knowledge were improved. Cronbach's alphas were .90 for the total emotion knowledge and .89 for the situation knowledge task. The most noticeable increase in Cronbach's alpha value was found for the recognition task (α = .72). In addition, the mean interitem correlations for the total emotion knowledge, emotion recognition task, and situation knowledge task were .22, .25, and .27, respectively, which are within the recommended range of .15 to .50 (Clark & Watson, 1995). These findings present evidence of psychometric properties of the AKT with the modified faces for Japanese preschoolers.
Construct-related validity. First, interscale correlation analysis indicated that there was a significant positive relation between Japanese children's performance on the recognition and situation tasks (r = .61, p < .001), an improvement from Study 1. Furthermore, intersubscale correlations among the four subscales showed significant positive relations among most of the subscales (see Table 1).
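For readers less familiar with these reliability indices, the sketch below shows how Cronbach's alpha and the mean interitem correlation can be computed from a matrix of item scores. The data are invented 0/1/2 scores for six hypothetical children on four hypothetical items, not AKT data, and the sketch assumes complete cases with no missing responses.

```python
from itertools import combinations
from statistics import mean, variance


def cronbach_alpha(scores):
    """scores: list of rows (one per child), each a list of item scores."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def mean_interitem_r(scores):
    """Average Pearson correlation over all pairs of items."""
    k = len(scores[0])
    cols = [[row[i] for row in scores] for i in range(k)]
    return mean(pearson_r(cols[i], cols[j]) for i, j in combinations(range(k), 2))


# Invented toy data: rows are children, columns are items scored 0, 1, or 2.
toy = [
    [2, 2, 1, 2],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [2, 2, 2, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
]
print(f"alpha = {cronbach_alpha(toy):.2f}")
print(f"mean interitem r = {mean_interitem_r(toy):.2f}")
```

Alpha rises with both the number of items and the average interitem correlation, which is why the mean interitem correlation is reported alongside it as a check that is less sensitive to scale length.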
The mean scores of the subscales demonstrate a slight improvement in Japanese preschoolers' performance on the expressive recognition task, although the difference was not significant, t(82) = −0.89, p > .05 (see Table 2). However, their performance on the receptive recognition task declined from Study 1, t(98) = 2.32, p < .05. There was no difference between Studies 1 and 2 in the mean scores of stereotypical situation knowledge, t(96.00) = −0.44, p > .05, or nonstereotypical situation knowledge, t(92) = −0.20, p > .05. In looking at how many of them responded to items correctly on the expressive task (see Table 3), more preschoolers gave the correct response to sad and the correct or correct-valence response to afraid than the preschoolers who participated in Study 1. However, during the receptive task, perhaps because the tear (a common emotional cue for sad) was taken out from the sad face, it seemed that some children were not sure which face (sad or afraid) was the correct one for sad. Thus, their performance on the receptive "sad" item declined from Study 1, which contributed to the lower mean score for the receptive task. Overall, although there are things to be considered for further improvement, these findings demonstrate preliminary evidence of construct-related validity of the AKT with the modified faces. Future research should conduct a full psychometric evaluation with a larger sample.
Sensitivity to developmental changes. In addition to validity evidence, to examine whether the AKT with the modified faces can assess developmental changes in Japanese preschoolers' emotion knowledge, correlation analyses between children's age and AKT scores were performed. The results showed that there were significant positive correlations between children's age and the emotion recognition scores, the situation knowledge scores, and the total AKT scores (rs = .46, .41, and .48, respectively; p < .01). That is, the AKT with the modified faces demonstrated sensitivity to the developmental trajectory of emotion knowledge in Japanese preschoolers via the correlations with children's age.
Although studies with American preschoolers reported a ceiling effect on their AKT performance around 4½ years old (Denham, 2006), a ceiling effect on Japanese preschoolers' performance was not found in the present study. As shown in Tables 2 and 3, it appears that low scores on the expressive recognition task prevented a ceiling effect. It is possible that as open and overt emotion expressions (both verbal and nonverbal) are discouraged in Japanese culture (Aune & Aune, 1996;Gudykunst et al., 1996), young children may have a good understanding of positive and negative valences, but may not be good at clearly naming and distinguishing emotions within positive and negative valences. Thus, it may take longer for Japanese preschoolers to learn to correctly name emotions on the expressive task. In future research, a ceiling effect may possibly emerge with a sample including older preschoolers (i.e., 5-year-olds).

Summary and Concluding Discussion
The present study conducted a preliminary psychometric evaluation of the AKT in Japanese preschoolers and demonstrated validity evidence through two studies. In Study 1, results of reliability analyses with the Japanese-translated version of the AKT showed a low internal consistency for the emotion recognition scale, and the mean interitem correlation of the total emotion knowledge was not adequate. As for construct-related validity, the AKT's subscales were not well correlated with each other, and in particular, Japanese preschoolers appeared to struggle with the expressive recognition task. Thus, these findings indicated that the version of the AKT that was simply translated into Japanese was not, at that point, a reliable and valid measure for Japanese preschoolers.
Further analyses revealed the possibility that cultural differences in emotion display rules and emotion decoding might be affecting Japanese preschoolers' performance on the emotion recognition task. Given that Japanese people attend more to the eyes than the mouth as emotional cues (Jack et al., 2009;Yuki et al., 2007), we modified the AKT's emotion faces, focusing on the eyes. In Study 2, the AKT with the modified faces was administered to another sample of Japanese preschoolers. The results indicated improvements in the internal consistencies for all scales, with the most noticeable Cronbach's alpha increase in the recognition task, as well as good mean interitem correlation for all scales. The preliminary construct-related validity evidence was also observed via significant intersubscale correlations. Moreover, the correlations between children's age and the AKT scores indicated sensitivity to the developmental trajectory of emotion knowledge in Japanese preschoolers. Overall, these findings demonstrated preliminary psychometric evidence for future cross-cultural adaptation of the AKT with Japanese preschoolers.
The present study also pointed out considerations for a future full psychometric evaluation to flesh out preliminarily identified cultural differences. The mean scores of the four AKT tasks showed that, among all of the tasks, the expressive recognition task was most difficult for both the American and Japanese preschoolers. The same trend was found in Portuguese preschoolers as well (Machado et al., 2012). The expressive task is the only task that requires verbal response, so it is understandable that it may be challenging for some preschoolers, especially for those with lower language ability. However, it is notable that, even after making modifications to the AKT emotion faces, the Japanese preschoolers performed much worse than the American preschoolers on the expressive task, whereas they performed as well as the American preschoolers on the other tasks. That is, it seems that the Japanese preschoolers grasped the concept of emotion, could identify the four basic emotions, and understood emotion-eliciting situations, but were unable to name emotions correctly.
We wondered why the Japanese preschoolers had greater difficulty with the expressive task compared with American children. In reviewing their responses on the task, we noticed that many of the Japanese preschoolers in both Studies 1 and 2 gave correct valence responses to the happy and sad items, and most of the responses were behavioral expressions of emotion (e.g., smiling, crying). However, angry was the easiest emotion to name for the Japanese preschoolers, perhaps because the Japanese word for angry (okotta) is also used as behavioral expression of anger. Given that Japanese people tend to suppress emotion expressions (Aune & Aune, 1996;Gudykunst et al., 1996), Japanese preschoolers' emotional communication may be expressed indirectly and often involve behavioral expressions of emotions (e.g., "She is smiling"), rather than a directly expressed emotional state (e.g., "She is happy").
As Hayashi, Karasawa, and Tobin (2009) write, the goal of Japanese preschool is for young children to have opportunities to develop empathy and become aware of others' nonverbal emotion expressions, and these are often taught implicitly through interpersonal interactions. In contrast, American early childhood educators are encouraged to explicitly teach social-emotional skills. Preschools often adopt a specific Social-Emotional Learning (SEL) curriculum, and teachers often aim to intentionally teach emotion skills through books, activities, teachable moments, and scaffolding in the form of providing cues and feedback (Ho & Funk, 2018). American preschool teachers, as well as parents, often encourage children to name the emotions that they experience, but such practices are not especially common in Japanese culture; thus, Japanese preschoolers in general may not be used to naming their own and others' emotional states. The differences in educational and family practices may explain why a sizable number of the Japanese preschoolers who participated in the present study gave no response on the expressive recognition task, which was treated as missing data in the analyses. Perhaps they were not familiar with how to respond to the question, "How does she or he feel?" Because emotions are often implicitly taught in Japan (Hayashi et al., 2009), it may take longer for Japanese preschoolers to correctly name emotions. A future full psychometric evaluation of the AKT should include older preschoolers to see how they perform on the expressive emotion recognition task.
In regard to the higher percentage of missing data in the current study with Japanese preschoolers in contrast to the low percentage of missing data with American preschoolers, the current scoring system, as well as cultural differences, offers an explanation. Because instances of nonresponse were very low with American preschoolers, the current scoring system of coding a nonresponse or refusal to answer as missing data did not raise a problem. However, these instances were more frequent for Japanese children, likely reflecting the Japanese cultural norm of not explicitly discussing emotion. For the present study, because the current scoring system is not able to differentiate no response from regular missing data, it is difficult to point out where children struggle with naming emotions. Going forward, it would likely make more sense to code such responses as a 0, like other incorrect responses. This will increase the utility of the AKT for progress monitoring emotion knowledge development, as the main aim of SEL would be for preschoolers to verbalize their emotions. Being able to track a child's progress from a nonresponse (0) to a correct response (2) would demonstrate emotion knowledge growth. This underscores the potential usefulness of the AKT to early childhood educators to be able to assess their students' emotion knowledge.
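The difference between the current and the proposed handling of nonresponses can be sketched as follows. The function name, the `nonresponse_as_zero` flag, and the small lookup of behavioral descriptions are hypothetical and serve only to illustrate the 0/1/2 scoring rule described in this article; they are not part of the AKT materials.

```python
from typing import Optional

# Valence of the four basic emotions assessed by the AKT.
VALENCE = {
    "happy": "positive",
    "sad": "negative",
    "angry": "negative",
    "afraid": "negative",
}
# Hypothetical lookup of behavioral descriptions credited as correct valence.
BEHAVIORAL = {"smiling": "positive", "crying": "negative"}


def score_response(
    target: str,
    response: Optional[str],
    nonresponse_as_zero: bool = False,
) -> Optional[int]:
    """Score one item: 2 = correct, 1 = correct valence, 0 = incorrect."""
    if response is None:  # no response or refusal to answer
        # Current rule: missing (None). Proposed rule: score as 0.
        return 0 if nonresponse_as_zero else None
    if response == target:
        return 2
    response_valence = VALENCE.get(response) or BEHAVIORAL.get(response)
    if response_valence == VALENCE[target]:
        return 1
    return 0


# A nonresponse is missing under the current rule and 0 under the proposed one.
print(score_response("sad", None))
print(score_response("sad", None, nonresponse_as_zero=True))
print(score_response("sad", "crying"))
```

Keeping the choice behind a single flag makes it straightforward to rescore the same responses both ways and compare how much the handling of nonresponses affects scale scores.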
In addition to supplying preliminary evidence for the developmental and cultural appropriateness of the modified AKT for Japanese preschoolers, the present study highlighted the importance of cross-culturally adapting measures. As research on children's emotional development expands worldwide (e.g., Raval, Raval, Salvina, Wilson, & Writer, 2013; Wang, 2001), more measures will be used among different cultures. When using a measure developed in a different culture, it is critical to achieve the cross-cultural adaptation of the measure through not only language translation of the measure but also adaptation to the culture (Beaton, Bombardier, Guillemin, & Ferraz, 2000). Even if the measure has been translated into a language appropriate for the sample audience, construct validity should not be assumed (Hilton & Skrutkowski, 2002). Culture with all of its linguistic and other nuances has to be taken into account (Van de Vijver & Hambleton, 1996). If an adapted measure produces scores that are less reliable and valid than those of the original measure (in this case the Japanese AKT and the English AKT, respectively), comparisons between these samples can be difficult, research findings may be seen as unreliable, and study conclusions may not be sufficiently valid.
For decades, research has centered on the extent to which emotion knowledge is universal (Shao et al., 2015). The consensus seems to have broken away from an absolute acceptance of the universality of emotion knowledge. The social constructivist theory of emotion stipulates that although some components of even basic emotions (happiness, anger, sadness, fear) are biologically based and therefore universal, these emotions are enacted differently due to enculturation, and therefore, there are noticeable differences (Shao et al., 2015). Kitayama, Markus, and Kurokawa (2000) argue for consideration of a weak view of the social constructivist theory of emotions. Our findings lend support to the idea that basic emotions are indeed universal, but that some cultural sensitivity is necessary to fully understand the intricacies in emotion knowledge across cultures. For instance, our study findings align with previous research that young children learn to identify happy, angry, and sad before they can reliably identify afraid (Widen & Russell, 2003). However, there were fine-grained differences in Japanese preschoolers' expressive emotion knowledge compared with Western populations. Therefore, to examine both universal and culture-specific emotion knowledge, cross-cultural adaptation of the AKT with Japanese preschoolers is critical.
Although the present study demonstrated preliminary psychometric evidence of the AKT with the modified faces in Japanese preschoolers, we had small samples that were just enough to conduct field tests. To complete a full psychometric evaluation of the measure and achieve the cross-cultural adaptation of the AKT with Japanese preschoolers, a larger sample (n = 100+) is necessary. As investigated in the AKT validation study with Portuguese preschoolers (Machado et al., 2012), future research should conduct CFAs examining factor structures of the AKT with modified faces. In addition, because the AKT's scoring system is polytomous (i.e., 0, 1, 2), the Factor program developed by Lorenzo-Seva and Ferrando (2017) should also be considered when estimating reliability in future evaluations. Administering a previously established measure of emotion knowledge, such as the Puppet Causes Task (Denham & Zoller, 1991), which is an open-ended interview task asking children to talk about the causes of emotions, and examining whether children's performance on the task is correlated with the AKT would provide validity evidence. In addition to the sources of validity evidence demonstrated in the present study, it is also useful to examine criterion-related validity by examining the relation between the AKT and other indices of children's social-emotional competence, such as social competence and prosocial behaviors, which have been found to be associated with emotion knowledge (e.g., Denham et al., 2003;Ito, 1997;Izard et al., 2001). To fully evaluate validity, future research needs to collect cumulative evidence (Cook & Beckman, 2006;Royal, 2017).
In sum, as a critical step toward cross-cultural adaptation of the AKT in Japan, the present study conducted field tests with the Japanese-translated version of the AKT and demonstrated the preliminary validity evidence of the AKT using the modified faces with Japanese preschoolers, as well as its developmental appropriateness. The next logical step would be a more comprehensive evaluation of its validity. Once the measure is fully validated with Japanese preschoolers, it would be very useful to assess Japanese preschoolers' baseline emotion knowledge, development of that knowledge, and any improvements after implementing an intervention program facilitating emotion knowledge development. In addition to its practical implications in being a potential tool to identify and monitor children in need of emotion knowledge intervention, continued study of the cross-cultural adaptation of the AKT would be beneficial for expanding