Abstract
To date, rater accuracy when using Direct Behavior Rating (DBR) has been evaluated by comparing DBR-derived data to scores yielded through systematic direct observation. The purpose of this study was to evaluate an alternative method for establishing comparison scores: expert-completed DBR combined with best practices in consensus-building exercises. Standard procedures for obtaining expert data were established and implemented across two sites, and agreement indices and comparison scores were derived. Findings indicate that the expert consensus-building sessions resulted in high agreement between expert raters, lending support to this alternative method for identifying comparison scores for behavioral data.

