Abstract
This study used generalizability theory to estimate the reliability of the Recognizing Effective Special Education Teachers (RESET) observation tool, which is designed to evaluate special education teacher effectiveness. At the time of this study, the RESET tool included three evidence-based instructional practices (direct, explicit instruction; whole-group instruction; and discrete trial teaching) as the basis for special education teacher evaluation. Five raters participated in two sessions to evaluate special education classroom instruction recorded over two school years with the Teachscape 360-degree video system. Data collected from raters were analyzed in a two-facet “partially” nested design where occasions (o) were nested within teachers (t), o:t, and crossed with raters (r), {o:t} x r. The results align with similar studies, which found that multiple observations and multiple raters are critical for ensuring acceptable levels of score reliability. Future reliability and validity studies on the RESET tool should take feasibility in practice into account, and further work is needed to address the lack of research on rater reliability within special education teacher evaluation.
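To illustrate why the number of observations and raters matters in a design with occasions nested within teachers and crossed with raters, the following is a minimal sketch of the standard decision-study formulas for such a design. It is not the authors' analysis code. The class and function names are hypothetical, and the variance components are invented placeholder values, not estimates from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """Variance components for occasions nested in teachers, crossed with raters."""
    teacher: float              # sigma^2_t, the object of measurement
    rater: float                # sigma^2_r
    occasion_in_teacher: float  # sigma^2_{o:t}
    teacher_by_rater: float     # sigma^2_{tr}
    residual: float             # sigma^2_{(o:t)r,e}, interaction confounded with error


def d_study(vc: VarianceComponents, n_occasions: int, n_raters: int):
    """Return (generalizability coefficient, dependability coefficient)."""
    # Relative error: every component that can change the ranking of teachers.
    relative_error = (
        vc.occasion_in_teacher / n_occasions
        + vc.teacher_by_rater / n_raters
        + vc.residual / (n_occasions * n_raters)
    )
    # Absolute error also includes the rater main effect (rater severity).
    absolute_error = relative_error + vc.rater / n_raters
    g_coefficient = vc.teacher / (vc.teacher + relative_error)
    phi_coefficient = vc.teacher / (vc.teacher + absolute_error)
    return g_coefficient, phi_coefficient


# Placeholder values for illustration only; NOT results from the RESET study.
HYPOTHETICAL = VarianceComponents(
    teacher=0.30,
    rater=0.05,
    occasion_in_teacher=0.20,
    teacher_by_rater=0.10,
    residual=0.35,
)

if __name__ == "__main__":
    for n_o, n_r in [(1, 1), (2, 2), (4, 5)]:
        g, phi = d_study(HYPOTHETICAL, n_o, n_r)
        print(f"occasions={n_o} raters={n_r} G={g:.3f} Phi={phi:.3f}")
```

Under these assumed components, both coefficients rise as occasions and raters are added, because each error term is divided by the number of conditions sampled for it.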

