Findings and discussions concerning cultural bias in testing have been far from unanimous. Nevertheless, this area of inquiry may hold meaningful implications for educators of any subject. In this review of literature, I describe the issues, research, and arguments surrounding cultural bias in testing and discuss implications for the field of music education. For the purposes of this article, a working description of cultural bias in testing involves (a) significantly different results for definable subgroups of apparently similar ability levels and (b) problems with the fair and equitable interpretation and use of test results. Applications of general education scholarship to music education settings include investigations and perceptions of cultural bias, as well as suggestions for improving fairness: addressing group differences, offering diverse ways to perform, discouraging misuse, and accommodating differences.
