A Reliability Generalization of the Suinn-Lew Asian Self-Identity Acculturation Scale

A reliability generalization was conducted on studies that reported use of the Suinn-Lew Asian Self-Identity Acculturation Scale (SL-ASIA), published between 1987 and 2013. For inclusion in this meta-analysis, each study had to have reported a Cronbach’s alpha reliability coefficient for its sample. Data from 83 Cronbach’s alpha coefficients representing 12,992 participants were analyzed; only 67 out of 193 published studies (43.52%) reported reliability scores for their sample. The reliability scores produced by the SL-ASIA ranged from .62 to .96 with an average of .91 (SD = 0.07); therefore, all of the reported reliability scores for this instrument were in the acceptable to excellent range. Our results demonstrate that the SL-ASIA continues to be an instrument with strong psychometric properties when used with diverse populations, and it is therefore appropriate for continued use in studies on acculturation.


Introduction
Information concerning acculturation is useful in both research and practice as the immigration of ethnic minority populations has significantly increased in past years (U.S. Census Bureau, 2010). The U.S. Department of Commerce, Bureau of Census reported that 12.5% of the total population consists of immigrants (Bhaskar, Arenas-Germosen, & Dick, 2013). Although this figure may appear small, the U.S. Census Bureau estimates that this number is growing approximately 1.5% per year (Bhaskar et al., 2013). Within ethnic minority groups, the population of Asian immigrants grew faster than any other ethnic group between 2000 and 2010 (Hoeffel, Rastogi, Kim, & Shahid, 2012). Over the past 10 years, approximately 46% of the increase in the U.S. immigrant population has been among Asian Americans (Hoeffel et al., 2012). According to the 2010 U.S. Census, the Asian populations indicated their ethnicity as "Asian, Asian Indian, Chinese, Filipino, Korean, Japanese, Vietnamese, or provided other detailed Asian responses" (Hoeffel et al., 2012, p. 2). With the influx of Asian populations immigrating to the United States, acculturation needs to receive renewed focus in research as well as practice to better serve this population. Some of the acculturation challenges encountered by Asian American immigrants involve gender and familial expectations, such as economic needs that may require a female to work outside of the home (R. H. Chung, 2001; Kibria, 1993). Asian Americans may also experience tension between individualism and collectivism, and this tension may be exacerbated by differences in generation level among children, parents, and grandparents (Zhou & Bankston, 1998).

Measuring Acculturation
Acculturation has been described as "the process by which individuals adopt the attitudes, values, customs, beliefs, and behaviors of another culture" (Abraido-Lanza, Armbrister, Florez, & Aguirre, 2006, p. 52), and it has been linked to conflict, clinical symptoms and disorders, treatment withdrawal, and the use of medical and psychological resources (Sánchez et al., 2014). Extensive research on acculturation has shown that individuals who integrate their original culture with the majority culture have the best psychosocial and health outcomes (S. Lee, Chen, He, Miller, & Juon, 2013; Liebkind, 2001). Acculturation has also been associated with educational and occupational achievement relative to language fluency within a specific region (Nekby, Rödin, & Özcan, 2009). The significance of research related to levels of acculturation has only become more important as the number of first-, second-, and third-generation individuals rises in the United States (Murray et al., 2013).
SAGE Open, research article, 2016. DOI: 10.1177/2158244016661748. Phillips et al. Affiliation 1: Alliant International University, Fresno, CA, USA.
In 1987, Suinn, Rickard-Figueroa, Lew, and Vigil devised the Suinn-Lew Asian Self-Identity Acculturation Scale (SL-ASIA) as a response to "great interest in Asian-Americans in the research and treatment literature [as] there [were] no objective measures of acculturation" (p. 401). This instrument seeks to recognize the multitude and dimensions of acculturation, examine bicultural growth, and assess thoughts, behaviors, and attitudes as related to acculturation (Suinn, Rickard-Figueroa, Lew, & Vigil, 1987). The authors of the SL-ASIA (Suinn et al., 1987) intended for the scale to measure the following attributes of acculturation: "language (4 questions), identity (4 questions), friendship choice (4 questions), behaviors (5 questions), generation/geographic history (3 questions), attitudes (1 question)" (p. 402). The SL-ASIA consists of 21 items, with a 5-point rating scale for each item. Higher scores obtained from this instrument indicate higher levels of acculturation (i.e., greater adherence to Western values), while lower scores indicate lower acculturation (Suinn et al., 1987). Scores obtained from the participants in the norming study produced a satisfactory reliability score of .88 (Suinn et al., 1987).
Although the SL-ASIA consists of 21 items, researchers have created and used adapted versions of the scale by adding and removing items as well as by altering item wording (Hofstetter et al., 2007). For instance, Edrington et al. (2010) used two separate versions of the SL-ASIA in a single study: a shortened seven-item version of the original SL-ASIA and a Chinese translation of the SL-ASIA. Furthermore, J. Lee (2007) omitted three items of the 21-item SL-ASIA; they "were considered not relevant for the current study" (p. 71). As various adaptations of the SL-ASIA are currently being administered to diverse Asian American populations, it is important to investigate whether total score reliability can be generalized across studies (Vacha-Haase, 1998). To achieve this goal, a meta-analytic review of all published studies that reported use of any version of the SL-ASIA was conducted using the reliability generalization (RG) technique first described by Vacha-Haase in 1998.

RG
The term reliability is often misused when speaking of psychometric properties (Thompson, 1992). It is common for researchers, educators, and clinical practitioners to discuss reliability as a property of an assessment measure itself. However, reliability is not a direct property of a test or measure; rather, it is a property of the scores obtained on an assessment measure (Thompson, 1995; Thompson & Daniel, 1996). Reliability has further been described as a "characteristic of data" (Eason, 1991, p. 84) that may be influenced by factors other than measurement characteristics, such as the study participants (Dawis, 1987). Therefore, a number of variables and characteristics of study participants can influence scores and thereby affect reliability (Thompson, 2003). Vacha-Haase (1998) noted, "Given the diversity of participants across studies, simple logic would dictate that authors of every study should provide reliability coefficients of the scores for the data being analyzed, even in nonmeasurement substantive inquiries" (p. 8). Numerous factors can affect total reliability scores of a measure; consequently, authors should discuss reliability and psychometric properties in their research and publications.
Score reliability is influenced not only by factors concerning an assessment measure itself but also by variations within a sample (Rexrode, Petersen, & O'Toole, 2008). RG, proposed by Vacha-Haase (1998), "characterizes the typical reliability of scores for a given test across studies, the amount of variability in reliability coefficients for given measures, and the source of variability in reliability coefficients" (p. 6). Although authors sometimes cite the reliability coefficient reported for an instrument in a previous study, this practice is insufficient for describing the reliability of scores from a present group of study participants. Reporting reliability coefficients for study participants provides important information on effect size (Reinhardt, 1996) and statistical power (Onwuegbuzie & Daniel, 2002; Rexrode et al., 2008; Vacha-Haase, Henson, & Caruso, 2002). "Several dozen" RG meta-analytic studies have been published since Vacha-Haase (1998) first proposed this method for investigating and summarizing the moderating factors for reliability scores produced by a measure (Vacha-Haase & Thompson, 2011, p. 160). Many of the RG studies that have been published are widely cited, including the RG studies of the Spielberger State-Trait Anxiety Inventory (Barnes, Harp, & Jung, 2002) and the Beck Depression Inventory (Yin & Fan, 2000). RG studies provide an important contribution to the literature of psychological measurement as this type of meta-analysis allows researchers and authors "to have a better understanding of the measurement reliability of an instrument's scores across various applications" (Yin & Fan, 2000, p. 207).
The importance of measuring and recording psychometric properties of scores obtained from an assessment measure when administered to study participants has been noted throughout the literature (Crocker & Algina, 1986; Vacha-Haase, 1998; Vacha-Haase & Thompson, 2011). For example, Q. L. Huynh, Howell, and Benet-Martínez (2009) conducted a meta-analytic RG study of three instruments measuring acculturation; these authors noted the importance of examining reliability in acculturation measures, stating, "Without understanding aggregate score reliability, researchers may misinterpret important results in acculturation research, and this in turn may influence public policy and the distribution of resources for outreach and treatment programs for ethnicity minorities" (p. 257). In the present analysis, a published study that administered the SL-ASIA to a group of participants was excluded from analyses if corresponding reliability coefficients were not provided. Henson and Thompson (2002) described why studies that do not report reliability coefficients for their sample cannot be included in RG studies: "reliability coefficients . . . typically become the dependent variables in RG studies" (p. 116). It has also been observed that underreporting reliability in published research continues to be problematic (Wilkinson & American Psychological Association [APA] Task Force on Statistical Inference, 1999). Failing to report reliability coefficients for each study's subsample of participants is problematic for various reasons but mainly because "interpreting the size of observed effects requires an assessment of the reliability of the scores" (Wilkinson & APA Task Force on Statistical Inference, 1999, p. 596). Crocker and Algina (1986) indicated that authors sometimes report reliability coefficients from previous samples.
However, while it is helpful to provide reliability coefficients from previous studies with a similar sample, this practice is not sufficient as there could be variation between the previously studied population and the sample the researcher is using. Furthermore, researchers and practitioners can only be aware of the usefulness of a measure for a specific population (other than the population for which an instrument was normed) if reliability is calculated and reported for various subsamples of study participants.
The goals of the present research were (a) to determine the frequency with which authors report reliability coefficients for their current sample, (b) to assess the average and variation in reliability scores, and (c) to determine whether sample and study characteristics affect total reliability scores on the SL-ASIA. This instrument was selected for review in the present meta-analysis as the SL-ASIA is the most widely cited scale for measuring acculturation among Asians (Dao, Teten, & Nguyen, 2011). Henson and Thompson (2002) provided a review of recommendations for conducting an RG meta-analysis, including selection of a scale. These authors indicated that "the test must enjoy enough use in the literature to allow a meta-analytic synthesis" (Henson & Thompson, 2002, p. 116). An example of how to report reliability information for any Likert-type psychological instrument has been included in the "Discussion" section for easy access by readers who would like such a model.
Note. RG = reliability generalization; SL-ASIA = Suinn-Lew Asian Self-Identity Acculturation Scale.

Method
The term Suinn-Lew Asian  Figure 1) displays the process of identifying the studies that were included in the analyses for this study. Of the initial 204 possible sources, 11 were not included in the present study. Of the references that were excluded, three were not research studies, six could not be located, and two were not written in English. A total of 193 studies that reported the use of the SL-ASIA were included in the present study.
The 193 articles were each assessed by a first coder and sorted into four categories. The first category consisted of articles that used the SL-ASIA but failed to mention the psychometric property of reliability in any form (n = 20, 10.4%). The second category included studies that mentioned that the instrument was reliable and/or presented the issue of reliability in some form yet did not provide a reliability estimate from either the authors' own data or from previous sources that used the SL-ASIA (n = 3, 1.6%). The third category held articles that used the SL-ASIA but only presented reliability coefficients as reported for samples in previous studies (n = 86, 44.6%). The fourth category included studies that used the SL-ASIA and reported a reliability coefficient for the data collected in that study (n = 84, 43.5%). For the purpose of the present RG study, only articles in this final category were applicable for inclusion in this meta-analytic review of reliability coefficients for the SL-ASIA.
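The category percentages above follow directly from the reported counts. As a minimal sketch of that arithmetic (the dictionary labels are shorthand introduced here for illustration, not the authors' terminology):

```python
# Category counts as reported in the text (193 studies in total).
category_counts = {
    "no mention of reliability": 20,
    "reliability mentioned, no estimate": 3,
    "coefficient cited from a previous study": 86,
    "coefficient reported for own data": 84,
}


def category_percentages(counts):
    """Return each category's share of the total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = category_percentages(category_counts)
for name, pct in percentages.items():
    print(f"{name}: n = {category_counts[name]}, {pct}%")
```

Only the last category can contribute dependent-variable data to an RG analysis, which is why the share of studies in that category is the figure of interest.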
Different types of reliability coefficients may be used to estimate the reliability of scores in a study, such as split-half reliability estimates and test-retest reliability coefficients (Yin & Fan, 2000). Cronbach's alpha is commonly used as a measure of the strength of association between all possible combinations of items on an instrument (Zedeck, 2014). For the purpose of this study, reliability was investigated through the examination of Cronbach's alpha reliability coefficients.
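For readers who want to see how such a coefficient is obtained from raw item responses, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical (five respondents, four items rated 1 to 5) and is not drawn from any study reviewed here; the function name is likewise our own.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows of item scores.

    Uses sample variances (n - 1 in the denominator) for both the
    individual items and the summed total score.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data for illustration only: rows are respondents,
# columns are items on a 1-5 rating scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Because the coefficient is computed from the responses of a particular sample, the same instrument can yield different alpha values in different samples, which is the premise of an RG study.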

Coding Method
A coding sheet was created to summarize relevant information and variables of interest for each of the included articles. Continuous variables included publication year, total reliability score/coefficient, sex of study participants (coded as percentage female in each sample), mean age, standard deviation of age, mean for SL-ASIA scores, and standard deviation for SL-ASIA scores. Other variables were coded as follows:
1. Publication type: 0 = journal article, 1 = dissertation
2. Country of birth: 0 = within the U.S., 1 = outside of the U.S.
3. Language of SL-ASIA: 0 = English, 1 = translation
We were unable to include a number of other variables in the analysis. Although the following variables were assessed during the coding stage of this meta-analysis, they could not be included due to insufficient reporting or variation: Asian subgroup (e.g., Korean American, Chinese American), generation in the United States (e.g., first generation, second generation), and marital status of participants. Some of the variables had little to no variability and, therefore, were not appropriate for use in the analysis. For instance, most of the samples were from within the United States (n = 78, 92.9%), thus limiting the usefulness of these predictor variables in accounting for variability in the reliability coefficients. The measurement characteristics of mean score and standard deviation could not be examined as the method for calculation and the number of items used varied across the studies, rendering these variables unusable. One method of calculating the mean score in the SL-ASIA is detailed by Vang (2010):
According to Suinn et al. (1987), scoring the SL-ASIA is done by summing all of the answers for the 21 items, with possible scores ranging from 21 to 105. The total scores then are divided by 21 or the total numbers of items are divided by the total scores to get the final acculturation scores. The acculturation scores can range between 1.00 (low acculturation) and 5.00 (high acculturation). (p. 52)
However, Vang (2010) reported the mean and standard deviation for the SL-ASIA (M = 51.04, SD = 12.99) for the overall sample instead of the mean value of the responses on each test item. In contrast, Lee (2006) reported the mean score and standard deviation on the SL-ASIA for second-generation Korean American college students as the mean score for each item, not the total scale (M = 3.3, SD = 0.4). It is not possible to compare the means of test items with the means on the overall test. Therefore, the variables of mean score and standard deviation of score could not be used as predictor variables in this meta-analytic study.
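The two reporting formats differ only by a division when all 21 items are administered and summed. The sketch below illustrates that conversion; it assumes the full 21-item scale with 1 to 5 ratings, which is exactly the assumption that does not hold across the reviewed studies, so it is an illustration of the scoring rule rather than a way to rescue the excluded variables. The function name is hypothetical.

```python
N_ITEMS = 21                  # items in the original SL-ASIA
MIN_RATING, MAX_RATING = 1, 5  # 5-point rating scale


def mean_item_score(total_score, n_items=N_ITEMS):
    """Convert a summed total score to the mean rating per item."""
    lowest, highest = n_items * MIN_RATING, n_items * MAX_RATING
    if not lowest <= total_score <= highest:
        raise ValueError(f"total must lie between {lowest} and {highest}")
    return total_score / n_items


# The endpoints of the total-score range map onto the 1.00-5.00 range.
print(mean_item_score(21), mean_item_score(105))

# Illustration only: a total-scale mean of 51.04 would correspond to this
# per-item mean IF all 21 items were used and summed (an assumption).
per_item_mean = round(mean_item_score(51.04), 2)
print(per_item_mean)
```

When a study drops or adds items, the divisor changes, and a reported mean cannot be placed on a common metric without knowing both the item count and the scoring method.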

Analyses
In the present RG study, SPSS Version 19 (Statistical Package for the Social Sciences) was used to conduct Pearson's bivariate correlations and independent samples t tests. Previous RG studies have used these types of statistical analyses, including the use of Pearson's bivariate correlations in an RG of the Working Alliance Inventory (Hanson, Curry, & Bandalos, 2002) and the inclusion of independent samples t tests in the RG of the Spielberger State-Trait Anxiety Inventory (Barnes et al., 2002). The present study is meta-analytic in nature as we sought to review all published studies that administered the SL-ASIA to their sample and reported a reliability coefficient for their data. Vacha-Haase (1998), who pioneered this type of meta-analytic study and presented the first RG study on the Bem Sex-Role Inventory (BSRI; Bem, 1981), described RG as a type of study "that can be used in a meta-analysis application similar to validity generalization" (p. 6). The primary inclusion criterion in this meta-analysis was for a study to report a reliability coefficient for the SL-ASIA for the data collected in its own sample (i.e., not a report of a reliability coefficient from a previous source).
RG studies use meta-analytic techniques to summarize the sources for variability in reliability estimates for psychometric instruments used in published studies (Beretvas & Pastor, 2003). The different types of statistical methods used in RG studies are broad and complex. Beretvas and Pastor (2003) summarized the types of statistics that have been used to investigate the relationship between study characteristics and variability in reliability estimates. The level of sophistication of statistics varies widely, partially due to differences in sample sizes, and includes everything from descriptive statistics to canonical correlation. However, bivariate correlations, multiple regressions, and ANOVAs are among the more commonly used statistics in RG studies (Sánchez-Meca, López-López, & López-Pina, 2013).
The first goal of this RG study was to assess the frequency of studies that used the SL-ASIA and reported a reliability coefficient for their sample. This frequency was assessed by categorizing studies into the following groups based on their inclusion (or exclusion) of reliability coefficients for the SL-ASIA as used in their study: (a) reported reliability for the SL-ASIA in their sample, (b) did not report reliability for the SL-ASIA in their sample. Second, the present study assessed the average and variation in reliability scores reported for the SL-ASIA through a summary of the range, mean value, and standard deviation for the 84 alpha coefficients that are reported for this scale in published studies. Third, our study used independent samples t tests to assess the impact of predictor variables on reported reliability scores.

Results
Data from 67 studies (33 journal articles, 34 dissertations) representing 12,992 participants were analyzed. Some studies reported more than one coefficient alpha, resulting in a total of 83 Cronbach's alpha values analyzed (39 from journal articles, 44 from dissertations). As shown in Table 1, the SL-ASIA displayed acceptable internal consistency, with Cronbach's alpha coefficients ranging from .62 to .97, a mean value of 0.85, and a standard deviation of 0.07 across the 83 alpha coefficients reported. Consistent with previous RG studies (Barnes et al., 2002), Table 1 presents information relevant to RG analyses, including publication type, sample size, and instrument version.
As previously noted, 83 Cronbach's alpha coefficients were reported from 67 studies, and all 83 Cronbach's alphas were included in our analyses. Table 1 identifies the studies that provided more than one reliability coefficient for their sample, using notation such as S1 (Sample 1) and S2 (Sample 2). Rexrode et al. (2008) recommended that studies report reliability coefficients for subsamples (such as age, ethnicity, or gender) and describe relevant identifying information more adequately. Because published studies may include more than one study group of participants, it is good practice to assess internal consistency for each separate study group; in these cases, more than one Cronbach's alpha coefficient is reported, and each is included in the analysis separately. Groups of study participants may differ in age, gender, ethnicity, sex, education, and so forth, and these differences are important to capture in RG data analysis. The separate calculation of internal consistency for each group of study participants is in alignment with the recommendations set forth by the APA Task Force on Statistical Inference (Wilkinson & APA Task Force on Statistical Inference, 1999).
The variable of "country of birth" for participants (i.e., within the United States vs. outside of the United States) is also documented in Table 1. As previously noted, this information was not available for all 67 studies included in the present meta-analysis; as such, the "country of birth" variable could only be meaningfully reported in terms of whether the majority of study participants were born inside or outside the United States. Fewer than half of the studies using the SL-ASIA reported this information. In future studies, reporting it may help researchers further understand the acculturation issues faced by individuals and may augment the data collected from study participants.
As a reference point in interpreting the Cronbach's alphas in Table 1, Nunnally (1978) defined a marginal level of internal consistency as values above .70. Nunnally (1978) and Thompson (2003) defined and affirmed the acceptable level of internal consistency as equal to or exceeding .80. Only 2% of the reliability coefficients we reviewed were below .70 (Cronbach's α = .62, Lim, 2001; Cronbach's α = .68, Lei, 1998), and both of these values came from dissertation studies. These two reliability coefficient values were not removed from analyses as the purpose of the current meta-analysis was to summarize the reliability of the SL-ASIA as reported for all previous studies that have used this instrument. In a review of whether lower data values should be removed from RG studies, Zijlstra, van der Ark, and Sijtsma (2011) concluded that "only in simulated data does one know for certain whether an observation is a contaminant" (p. 209). These two reliability coefficients with relatively lower values were therefore included in the present study as they represented performance of the SL-ASIA as administered to those particular study participants. Finally, only 17% of reliability coefficients were below .80. Therefore, the SL-ASIA overall has produced internal consistency scores in the acceptable range.
Table 1 note. Data that were not reported in a cited study have been intentionally left blank in the table. SL-ASIA = Suinn-Lew Asian Self-Identity Acculturation Scale; SL-ASIA total α = Cronbach's alpha reliability score reported for the total scale; Language = instrument language, coded dichotomously as "English" and "non-English translation" (translation); English & Trans. = the study was conducted in both English and a translated version; Birth country = country of reported birth by majority of study participants, as reported by studies; U.S. = born in the United States; Outside U.S. = born in another country outside the United States; % female = the sex ratio in the sample, presented as the reported percentage of female participants; S1 = subsample, Group 1; S2 = subsample, Group 2; S3 = subsample, Group 3; S4 = subsample, Group 4.
One of the two significant predictors of reliability was type of publication. An independent samples t test was performed to assess the impact of publication type on reliability scores. We found that dissertations produced significantly lower reliability scores (M = 0.84, SD = 0.07) than journal articles (M = 0.87, SD = 0.05). This implies that respondents in dissertation studies answered items on the SL-ASIA less consistently than respondents who were administered the SL-ASIA as part of research for scholarly publication. Next, an independent samples t test was performed to assess the impact of the language (i.e., English vs. all non-English translations) of the SL-ASIA on reliability scores. There was not a significant difference between English versions (M = 0.85, SD = 0.06) and non-English translations (M = 0.84, SD = 0.07). However, only a limited number of coefficients came from a non-English version of the SL-ASIA: seven from a translation versus 76 from the English version. Finally, a third independent samples t test was performed to assess the impact of birth location on reliability scores. There was not a significant difference between participants born in the United States (M = 0.84, SD = 0.05, n = 12) and participants born outside the United States (M = 0.84, SD = 0.08, n = 21). This implies the SL-ASIA produces equally reliable scores regardless of the participant's country of origin.
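The publication-type comparison can be roughly reconstructed from the reported summary statistics. The sketch below computes a pooled-variance (Student's) independent samples t statistic, assuming group sizes of 39 journal-article coefficients and 44 dissertation coefficients as given in the Results; the article does not report the t value itself, the means are rounded to two decimals, and it is not stated whether equal variances were assumed, so the result is an approximation under those assumptions. The function name is hypothetical.

```python
from math import sqrt


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's independent samples t from means, SDs, and group sizes.

    Assumes equal population variances (pooled variance estimate).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


# Journal articles: M = 0.87, SD = 0.05, assumed n = 39 coefficients.
# Dissertations:    M = 0.84, SD = 0.07, assumed n = 44 coefficients.
t_stat, df = pooled_t_from_summary(0.87, 0.05, 39, 0.84, 0.07, 44)
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions the statistic exceeds the two-tailed .05 critical value of roughly 1.99 for 81 degrees of freedom, which is consistent with the significant difference reported above.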
Bivariate correlations were performed to assess the relationship between internal consistency and the remaining variables: percentage female, publication year, mean age, and standard deviation of age. Standard deviation of age was the only variable that showed a statistically significant correlation with reliability scores (r = .30, n = 46, p = .04). This positive correlation indicates that the more heterogeneous a sample was in age, the higher the reliability score produced by that sample. In RG analyses, nonsignificant results can still have important implications. For instance, mean age and the percentage of the sample that was female did not significantly relate to reliability scores, indicating that the SL-ASIA appears to be appropriate for use with samples diverse in sex and age. Finally, year of publication did not significantly relate to total reliability; thus, the SL-ASIA appears to have remained a relevant and useful measure since its introduction in 1987. The results from this meta-analysis suggest that the demographic variables of biological sex of participants and participant mean age do not seem to influence the total reliability scores produced by this measure.
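The significance of the age-heterogeneity correlation can be checked from r and n alone, using the conversion t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below does this with the standard library only; the p value is obtained by numerically integrating the Student's t density with Simpson's rule, which is our own choice for illustration rather than the procedure used in SPSS. Because r is reported to two decimals, the result is approximate.

```python
from math import exp, lgamma, log, pi, sqrt


def t_from_r(r, n):
    """Convert a Pearson r from n pairs into a t statistic with n - 2 df."""
    df = n - 2
    return r * sqrt(df) / sqrt(1 - r ** 2), df


def t_pdf(x, df):
    """Density of Student's t distribution, computed on the log scale."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central_area)


t_stat, df = t_from_r(0.30, 46)
p_value = two_tailed_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, two-tailed p = {p_value:.3f}")
```

The resulting p value falls just under .05, in line with the reported p = .04, and shows how close to the conventional threshold this finding sits.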

Discussion
The purpose of the current study was to identify (a) the frequency with which authors report reliability coefficients for their current sample, (b) the variability in Cronbach's reliability coefficients, and (c) the instrument and sample-related characteristics that are related to reliability scores on the SL-ASIA. Of the 193 published studies (journal articles and dissertations) reviewed, only 44% reported any type of reliability coefficient for the scores produced by their sample. Yet previous RG studies of instruments with even greater name recognition and more widespread use than the SL-ASIA indicate that rates of reliability reporting can be even lower than those for the SL-ASIA. For instance, previous RG studies indicated that only 13.8% of publications using the BSRI (Bem, 1981) reported reliability (Vacha-Haase, 1998), and only 7.5% of publications reported reliability for the Beck Depression Inventory (Yin & Fan, 2000). Nearly half of the studies we reviewed (45%) reported a reliability coefficient for the SL-ASIA from a previous study, while 2% of studies simply attributed reliability to the instrument. Finally, 10% of studies did not make any reference to reliability. However, those authors who report reliability from a previous sample or who regard reliability as an immutable property of the instrument (e.g., "it is reliable") do not provide readers with any indication of how well the instrument captured a phenomenon such as acculturation for their particular sample (Vacha-Haase, 1998; Vacha-Haase, Kogan, & Thompson, 2000; Vacha-Haase, Ness, Nilsson, & Reetz, 1999; Whittington, 1998). The authors of the current study support the idea that researchers and clinicians should become aware of the importance of reporting reliability for their study participants, which would mean reporting a Cronbach's alpha reliability coefficient each time a Likert-type survey instrument is used.
The standardization of reporting results in research may be helpful for increasing the level of reliability reporting (Fan & Thompson, 2001). One challenge of increasing psychometric reporting is that doctoral programs in psychology and allied health sciences do not emphasize psychometrics and testing (Aiken et al., 1990) despite widespread use of such instruments in clinical and research settings. If the reliability of instruments used in dissertations is not reported, it is reasonable to assume that the same omission will continue once those graduates publish studies as researchers and clinicians. For readers who would like a model, we have created an example of how an author may present reliability information for a sample. In the following paragraph, the final sentence represents where an author would present data for their own study. This example illustrates one method for presenting (a) basic information about a scale, such as the number of items and original year of publication; (b) the rating point system for the scale, with a brief description of what a higher or lower score on the scale represents; (c) a description of the normative sample, with a report of reliability coefficients if these were provided; (d) a report of a reliability coefficient from a previous study with study participants who are similar to the current study's participants; and (e) reliability information for a present study, which may include a report of Cronbach's alpha for subgroups within the study and different time points within the same study.
The Suinn-Lew Asian Self-Identity Acculturation Scale (SL-ASIA; Suinn, Rickard-Figueroa, Lew, & Vigil, 1987) was used to measure level of acculturation. The original 21-item, self-report questionnaire was developed in 1987, and the SL-ASIA remains the most widely used measure of acculturation among Asian Americans (Abe-Kim, Okazaki, & Goto, 2001). Each item is rated on a 5-point Likert scale, with a response of 1 indicating a low level of acculturation (Asian identified) and a response of 5 indicating a high level of acculturation (Western identified). In the norming sample of 82 Asian undergraduate students, Suinn et al. (1987) reported a range in Cronbach's alpha from .83 to .91. Ha (2000) used a convenience sample of Vietnamese immigrants and reported a Cronbach's alpha of .88. In the present study, Cronbach's alpha was .90.
In the example, reliability coefficients were reported from (a) the original norming sample of the scale, (b) a previous study with a sample whose composition is similar to that of the fictitious study, and (c) the present study. As Pedhazur and Schmelkin (1991) noted, the reporting of reliability coefficients from previous studies "may be useful for comparative purpose, but it is imperative that the relevant reliability estimate is the one obtained for the sample used in the study under consideration" (p. 86).

Characteristics Related to Reliability
Publication type (e.g., journal article vs. dissertation) and standard deviation of age were the only variables in the current study that were statistically significant in explaining a portion of the variation in Cronbach's alpha reliability coefficients. An independent samples t test indicated that dissertation studies reported lower reliability coefficients than journal articles. This may reflect the different levels of vetting that dissertation studies and journal articles receive. Timmons and Park (2008) reported that many graduate students do not prepare dissertations for broad dissemination because it is not a priority and because they are short of time. Therefore, doctoral students may rely more heavily on convenience samples, such as undergraduate students, whose members may not be particularly motivated to respond to a self-report questionnaire consistently.
Another factor that may lead journal articles to report higher reliability coefficients than dissertation studies is that articles with statistically significant results are more likely to be published than articles whose results are not statistically significant (Borenstein, Hedges, Higgins, & Rothstein, 2009; Dickersin, Min, & Meinert, 1992). Accordingly, studies submitted to a journal without statistically significant results, whose reliability coefficients could have been lower, may have been systematically screened out. Although there is a significant difference in the reliability coefficients for dissertations versus journal articles, it should be noted that the average level of reliability for both is in an acceptable range (.80-.90).
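The comparison of publication types described above can be sketched as follows. This is only an illustration, under the assumption of a pooled-variance (Student) independent samples t test; the variant of the test actually used is not specified here. The function name `independent_t` and the alpha coefficients below are hypothetical and are not the coefficients analyzed in this study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return (t, df) for a pooled-variance independent samples t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_variance = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical alpha coefficients, invented for illustration only.
journal_alphas = [0.91, 0.88, 0.93, 0.90, 0.89]
dissertation_alphas = [0.85, 0.82, 0.88, 0.80, 0.86]

t_stat, df = independent_t(journal_alphas, dissertation_alphas)
print(f"t({df}) = {t_stat:.2f}")
```

The sketch returns only the test statistic and degrees of freedom; obtaining a p value requires the t distribution, for which a statistics package (for example, `scipy.stats.ttest_ind`) could be used instead.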
A positive correlation between standard deviation of age and reliability scores was the second statistically significant result in our meta-analysis. This follows the classical test theory assertion that more diverse and heterogeneous samples will typically produce higher reliability scores (Crocker & Algina, 1986). In this review, studies with greater variation in sample age produced higher reliability scores on the SL-ASIA.
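The classical test theory reasoning behind this result can be written out briefly. The sketch below assumes that error variance remains roughly constant across samples, which is a simplifying assumption for illustration; the variance values in the worked lines are hypothetical and are not estimates from this meta-analysis.

```latex
% Reliability as the ratio of true-score variance to observed-score variance
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}

% Hypothetical values with error variance held at \sigma_E^2 = 1:
% a more homogeneous sample,   \sigma_T^2 = 4:  \rho_{XX'} = 4 / (4 + 1) = .80
% a more heterogeneous sample, \sigma_T^2 = 9:  \rho_{XX'} = 9 / (9 + 1) = .90
```

With error variance fixed, a larger true-score variance yields a larger reliability coefficient, which is consistent with the higher coefficients observed in samples with greater variation in age.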
A lack of significant results in an RG study can still be an important finding, as it indicates that an instrument or sample characteristic does not appear to influence the reliability of scores produced by the measure for a particular group of study participants. For instance, in our study, language of the SL-ASIA (i.e., English vs. not English) did not significantly affect reliability coefficients. The English version and translated versions produced similar reliability scores in an acceptable range. This suggests that translations of the SL-ASIA, such as a Chinese version used by Lei (1998), are likely to hold up well for cross-cultural use both within and outside of the United States.

Recommendations for Use of the SL-ASIA
We offer several recommendations for authors seeking to publish a study using the SL-ASIA or other self-report instruments that measure level of acculturation. First, a reliability coefficient should be reported for the sample of interest. Multiple reliability coefficients can be reported within the same study for separate groups, such as separate coefficients for sex or age group. For samples using the SL-ASIA to investigate level of acculturation among Asian Americans, it would be helpful to report reliability coefficients for subgroups based on ethnicity (e.g., Hmong, Laotian, Malaysian). We also recommend reporting on the country of origin for respondents in a study, such as international students who identify as Asian versus Asian Americans of varying generation levels (e.g., first-, second-, or third-generation). Rexrode et al. (2008) noted that calculating separate reliability coefficients for different groups of participants within a study "will make considerable contribution to the literature" (p. 273). Second, sample characteristics should be adequately described. This may include information on mean and standard deviation of age, level of education, marital status, and ethnicity, among others.

Limitations of the Study and Directions for Future Research
The first limitation to be considered in any RG study is the number of studies that could not be included in analyses due to their failure to report reliability information for their present sample. In this RG of the SL-ASIA, only 44% of published journal articles and dissertations provided reliability coefficients for their sample. Although the APA's Task Force on Statistical Inference stated that "interpreting the size of observed effects requires an assessment of the reliability of the scores" (Wilkinson & APA Task Force on Statistical Inference, 1999, p. 596), numerous RG studies indicate that reporting of this important psychometric is not yet common practice in the literature (Vacha-Haase & Thompson, 2011).
An additional limitation is the lack of statistical power for many sample variables due to the underreporting of reliability coefficients in published research. Shields and Caruso (2004) proposed that the underreporting of variables may be due to a "file drawer" problem in which studies with low reliabilities may have a decreased chance of publication (p. 410). Consequently, studies with samples that produced low reliability coefficients may not have been published, or they may be represented among the 110 studies that used the SL-ASIA but did not report reliability information for their sample.
In summary, our results suggest reliability of the SL-ASIA is not affected by certain sample characteristics including country of birth, language of instrument, mean age of sample, and percentage of females in sample. The only significant predictors of reliability were publication type, with dissertations producing scores with slightly less reliability than journal articles, and the standard deviation of age, with more heterogeneous samples producing higher reliability coefficients.
A recent criticism of the SL-ASIA's conceptualization and operationalization of acculturation is that it "tends to imply that acculturation is a linear and unidirectional experience" (Abe-Kim et al., 2001, p. 234). These authors further proposed that the unidimensional approach of the SL-ASIA is insufficient for capturing complex relationships between acculturation and other cultural variables such as negative impression management (Abe-Kim et al., 2001). Despite these flaws in the original 1987 scale, the SL-ASIA has gathered more data than any other acculturation instrument for Asian Americans. To our knowledge, a scale comparable to the SL-ASIA has not yet been created to measure acculturation among Asian Americans. Recent attempts at creating an improved acculturation scale, such as the Short Acculturation Scale for Korean Immigrants (S. E. Choi & Reed, 2011), present their own challenges, such as a focus on a single Asian subgroup. For instance, the Short Acculturation Scale for Korean Immigrants has not yet been cited in the literature. We recommend that future research on the acculturation of Asian Americans continue to use the SL-ASIA, as our meta-analysis indicates that it continues to produce reliable scores with diverse populations.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research and/or authorship of this article.