Recent lawsuits reveal common mistakes plaguing current teacher evaluation systems. Drawing on arguments in court documents for prominent cases, the authors find that evaluation systems using value-added measures (VAM) suffer from a) inconsistent and unreliable teacher ratings, b) bias toward and against teachers of certain types of students, c) easy opportunities for administrators to game the system, and d) a lack of transparency. They urge others to engage with these (and other) arguments to design better, more valid, more useful, and ultimately more defensible teacher evaluation systems.
