Abstract
One consideration when selecting progress monitoring tools is the reliability with which an instrument measures student response to instruction. Researchers and vendors establish the reliability of growth estimates using two analytic methods: (a) calculating a slope from the even observations and a slope from the odd observations for each student and correlating the resulting slopes (split-half), and (b) estimating a multilevel model and computing the ratio of true to observed variance in growth estimates. It is unclear whether reliability estimates from these two methods systematically differ, and therefore whether recommendations about how long data must be collected to achieve sufficient reliability would vary with the method used to calculate coefficients. Results from this study indicate that the multilevel method yielded systematically higher reliability estimates than the split-half method; bootstrapping confirmed that the differences were statistically significant. Researchers and practitioners should be aware that split-half and multilevel reliability estimates are not interchangeable, and that recommendations regarding the frequency and duration of data collection may depend on how reliability is estimated.
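The two estimation approaches described above can be illustrated with a minimal sketch on simulated data. All quantities here are hypothetical (number of students, observation schedule, growth and error parameters are invented for illustration), and the variance-ratio step is shown directly from the simulated components rather than from a fitted multilevel model:

```python
# Sketch (hypothetical data): two ways to estimate the reliability of
# progress monitoring slopes, as described in the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_obs = 500, 12
weeks = np.arange(n_obs, dtype=float)

# Simulate: each student has a true weekly growth rate plus occasion noise.
true_slopes = rng.normal(1.5, 0.5, n_students)            # true slope SD = 0.5
intercepts = rng.normal(50.0, 10.0, n_students)
scores = (intercepts[:, None] + true_slopes[:, None] * weeks
          + rng.normal(0.0, 8.0, (n_students, n_obs)))    # residual SD = 8

def ols_slope(x, y):
    """Least-squares slope of each row of y regressed on x."""
    xc = x - x.mean()
    return (y - y.mean(axis=1, keepdims=True)) @ xc / (xc @ xc)

# (a) Split-half: slope from odd vs. even occasions, correlate the two,
#     then apply the Spearman-Brown correction for the full-length series.
odd_weeks, even_weeks = weeks[::2], weeks[1::2]
r_half = np.corrcoef(ols_slope(odd_weeks, scores[:, ::2]),
                     ols_slope(even_weeks, scores[:, 1::2]))[0, 1]
split_half = 2 * r_half / (1 + r_half)

# (b) Variance ratio: true slope variance over observed slope variance.
#     In practice both components come from a fitted multilevel model;
#     here the true slopes are known because the data are simulated.
obs_slopes = ols_slope(weeks, scores)
variance_ratio = true_slopes.var(ddof=1) / obs_slopes.var(ddof=1)

print(f"split-half: {split_half:.2f}, variance ratio: {variance_ratio:.2f}")
```

With balanced simulated data like this the two coefficients land in the same neighborhood; the study's point is that on real progress monitoring data they need not, so the two methods should not be treated as interchangeable.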