Abstract
Above-level testing (also called above-grade testing, out-of-level testing, and off-level testing) is the practice of administering to a child a test that was designed for an examinee population that is older or in a more advanced grade. Above-level testing is frequently used to help educators design educational interventions for gifted children, especially those who may be candidates for grade skipping or Talent Search programs. However, little research has been conducted on how test items function when administered to a younger population, even though professional standards require examiners to gather validity evidence when administering a test to a new population. In this article, we describe two studies in which we compared item functioning across two populations of examinees: gifted middle school students and the older examinees for whom the tests were designed. Results from Study 1 indicated a high correlation between the two groups' item difficulty statistics on the Iowa Tests of Basic Skills. Results from Study 2, a mixed-methods study, showed that even though the two groups were similar in ability (as measured by the Reynolds Intellectual Assessment Scales), the high school students completed SAT-M test items more quickly and demonstrated more familiarity with the test content. In both studies, test items generally operated similarly for the two age groups. However, differences in local curriculum and individual educational history may cause some items to operate differently when administered above level.
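The Study 1 comparison of item difficulty across groups can be illustrated with a minimal sketch. It assumes that item difficulty means the classical proportion-correct statistic and that the groups are compared with a Pearson correlation; the abstract does not specify either choice. The response matrices, the function names `item_difficulty` and `pearson_r`, and all values are hypothetical and are not data from the studies.

```python
from math import sqrt


def item_difficulty(responses):
    """Return the proportion of examinees answering each item correctly.

    `responses` is a list of rows, one per examinee, with 1 = correct
    and 0 = incorrect for each item (classical item difficulty).
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_examinees
            for j in range(n_items)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scored responses: 4 examinees x 4 items per group.
younger = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
older = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]

p_younger = item_difficulty(younger)
p_older = item_difficulty(older)
r = pearson_r(p_younger, p_older)

print("Younger group item difficulties:", p_younger)
print("Older group item difficulties:", p_older)
print("Correlation of item difficulties:", round(r, 3))
```

A high correlation in a sketch like this means that the items are ordered similarly by difficulty in both groups, even if one group finds every item somewhat harder. It does not by itself show that individual items function identically.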

