Above-level testing (also called above-grade testing, out-of-level testing, and off-level testing) is the practice of administering to a child a test that is designed for an examinee population that is older or in a more advanced grade. Above-level testing is frequently used to help educators design educational interventions for gifted children, especially those who may be candidates for grade skipping or Talent Search programs. However, little research has been conducted on how test items function when administered to a younger population, despite professional standards that require examiners to gather validity evidence when administering a test to a new population. In this article, we describe two studies in which we compared item functioning across two populations of examinees: gifted middle school students and the older examinees for whom the tests were designed. Results from Study 1 indicated a high correlation between the two groups' item difficulty statistics on the Iowa Tests of Basic Skills. Results from Study 2, a mixed-methods study, showed that even though the two groups were similar in ability (as measured by the Reynolds Intellectual Assessment Scales), the high school students completed SAT-M test items more quickly and demonstrated more familiarity with the test content. In both studies, test items generally operated similarly for the two age groups. However, local curriculum and individual educational history may cause some items to operate differently when administered above level.
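
As a rough illustration of the kind of comparison reported for Study 1, the sketch below computes a classical item difficulty statistic for each group and correlates the two sets of values. The abstract does not say which difficulty statistic or correlation coefficient was used, so the proportion-correct index and the Pearson correlation here are assumptions, and the response data and function names are hypothetical.

```python
from statistics import mean


def item_difficulty(responses):
    """Return the proportion of examinees answering each item correctly.

    `responses` is a list of rows, one per examinee, with 1 = correct
    and 0 = incorrect. This is the classical "p-value" of an item
    (an assumed statistic, not necessarily the one used in the study).
    """
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scored responses (4 examinees x 4 items per group).
younger_gifted = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
older_on_level = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]

p_younger = item_difficulty(younger_gifted)
p_older = item_difficulty(older_on_level)
r = pearson_r(p_younger, p_older)

print("Item difficulty (younger):", p_younger)
print("Item difficulty (older):  ", p_older)
print("Correlation of item difficulties:", round(r, 3))
```

A high correlation in such an analysis would suggest that the items are ordered similarly by difficulty in both groups. It would not by itself rule out individual items that behave differently for the younger examinees.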
