Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This article introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered categories alone. These methods hold two practical advantages over alternative achievement gap metrics. First, they require only categorical proficiency data, which are often available where means and standard deviations are not. Second, they result in gap estimates that are invariant to score scale transformations, providing a stronger basis for achievement gap comparisons over time and across jurisdictions. The authors find three candidate estimation methods that recover full-distribution gap estimates well when only censored data are available.
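As a rough illustration of the idea, the sketch below estimates a gap from category counts alone. It is a simplified stand-in and is not claimed to be one of the article's three candidate methods. It assumes the gap is expressed as V = sqrt(2) * Phi^{-1}(P(Y > X)), which equals the mean difference divided by the root-mean-square of the two group standard deviations when both distributions are normal, and it assumes that students tied within a category count half toward P(Y > X). The function name `gap_from_categories` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def gap_from_categories(counts_a, counts_b):
    """Estimate the gap of group B over group A in standard-deviation-like units.

    Both arguments list the number of students per category, ordered from
    the lowest category to the highest. Only ordinal information is used.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both groups need the same categories")
    n_a, n_b = sum(counts_a), sum(counts_b)

    # Approximate P(B > A) + 0.5 * P(B and A in the same category).
    # Assumption: within-category ties are split evenly, which amounts to
    # connecting the observed cumulative proportions with straight lines.
    prob_b_higher = 0.0
    cum_a = 0.0  # share of group A in categories strictly below the current one
    for c_a, c_b in zip(counts_a, counts_b):
        p_a, p_b = c_a / n_a, c_b / n_b
        prob_b_higher += p_b * (cum_a + 0.5 * p_a)
        cum_a += p_a

    # Map the probability onto a standard-deviation-unit scale.
    # inv_cdf raises StatisticsError if the probability is exactly 0 or 1.
    return sqrt(2) * NormalDist().inv_cdf(prob_b_higher)


# Hypothetical counts in four ordered proficiency categories.
group_a = [40, 30, 20, 10]
group_b = [10, 20, 30, 40]
print(gap_from_categories(group_a, group_b))
```

Because the function sees only which category each student falls in, any order-preserving transformation of the underlying score scale (applied to scores and cut scores alike) leaves the counts, and therefore the estimate, unchanged. Splitting ties evenly is a crude choice when there are few categories; it is used here only to keep the sketch short.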
