Abstract
Does reduced class size cause higher academic achievement for both Black and other students in reading, mathematics, listening, and word recognition skills? Do Black students benefit more than other students from reduced class size? Does the magnitude of the minority advantages vary significantly across schools? This article addresses these causal questions by analyzing experimental data from Tennessee’s Student/Teacher Achievement Ratio study, in which students and teachers were randomly assigned to small or regular class types. Causal inference is based on a three-level multivariate simultaneous equation model (SM) in which class type, as an instrumental variable (IV), and class size, as an endogenous regressor, each interact with a Black student indicator. The randomized IV causes class size to vary, which, by hypothesis, influences academic achievement overall and moderates the disparity in academic achievement between Black and other students. Within each subpopulation defined by ethnicity, the effect of reduced class size on academic achievement is the average causal effect. The difference in average causal effects between the racial/ethnic groups yields the causal disparity in academic achievement. The SM efficiently handles ignorable missing data with a general missing pattern and is estimated by maximum likelihood. This approach extends Rubin’s causal model to a three-level SM with cross-level causal interaction effects, requiring intact schools and no interference between classrooms as a modified Stable Unit Treatment Value Assumption. The results show that, for Black students, reduced class size causes higher academic achievement in all four domains each year from kindergarten to third grade, whereas for other students it improves the four outcomes, except first-grade listening, in kindergarten and first grade only. The evidence shows that Black students benefit more than others from reduced class size in first-, second-, and third-grade academic achievement. This article finds no evidence that the causal minority disparities are heterogeneous across schools in any given year.
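The identification logic described above can be illustrated with a minimal sketch on simulated data. This is not the article's three-level multivariate maximum likelihood estimator: it ignores classroom and school clustering, multiple outcomes, and missing data, and every number in it (sample size, class sizes, effect sizes, the confounder `u`) is a hypothetical assumption chosen for illustration. It shows only that, with a randomized binary instrument (class type) interacted with a Black student indicator, the effect of class size within each group can be recovered by a simple Wald ratio, and that the difference between the two ratios estimates the causal disparity.

```python
import random
from statistics import fmean


def simulate(n=20000, seed=0):
    """Simulate hypothetical students: (black, small, class_size, score)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        black = 1 if rng.random() < 0.35 else 0
        small = 1 if rng.random() < 0.5 else 0      # randomized instrument
        u = rng.gauss(0, 1)                         # unobserved confounder
        class_size = 22 - 7 * small + u + rng.gauss(0, 1)
        # Assumed true effects per additional student: -0.5 (other), -1.0 (Black)
        effect = -0.5 - 0.5 * black
        score = 50 - 3 * black + effect * class_size + 2 * u + rng.gauss(0, 2)
        rows.append((black, small, class_size, score))
    return rows


def wald(rows, group):
    """Wald IV estimate of the class-size effect within one group."""
    sub = [r for r in rows if r[0] == group]
    size1 = fmean(r[2] for r in sub if r[1] == 1)
    size0 = fmean(r[2] for r in sub if r[1] == 0)
    score1 = fmean(r[3] for r in sub if r[1] == 1)
    score0 = fmean(r[3] for r in sub if r[1] == 0)
    # Reduced-form difference divided by first-stage difference
    return (score1 - score0) / (size1 - size0)


rows = simulate()
effect_other = wald(rows, 0)
effect_black = wald(rows, 1)
disparity = effect_black - effect_other  # negative: Black students gain more

print(f"other: {effect_other:.3f}  Black: {effect_black:.3f}  "
      f"disparity: {disparity:.3f}")
```

Because the simulated confounder `u` affects both class size and the score, a naive regression of score on class size would be biased here, whereas the Wald ratios rely only on the randomized assignment.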

