School Psychology, Vol 41(1), Jan 2026, 5-14; doi:10.1037/spq0000698
The purpose of our study was to compare the effectiveness of rater training and statistical adjustment in mitigating rater effects and improving the accuracy of Direct Behavior Rating–Multi-Item Scales (DBR-MIS) scores targeting academic engagement and disruptive behavior. Results from a many-facet Rasch measurement analysis, based on video clips of 15 middle school students rated by 10 graduate students, indicated that raters differed significantly in their tendencies toward severity/leniency and that rater training reduced between-rater differences only for disruptive behavior. Neither rater training nor statistical adjustment improved overall DBR-MIS score accuracy relative to direct observation, although improved accuracy was noted for ratings on a single DBR-MIS disruptive behavior item (i.e., “noisy”). School personnel may wish to consider rater training when using DBR-MIS to assess disruptive behavior, and future research should explore the use of statistical modeling to develop customized rater training that incorporates the idiosyncrasies of individual raters.
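For readers unfamiliar with the analytic approach, the many-facet Rasch model underlying both the rater-effect estimates and the statistical adjustment is conventionally written (following Linacre's rating-scale formulation; this equation is standard background, not reproduced from the study itself) as:

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_{n} - \delta_{i} - \lambda_{j} - \tau_{k}
```

where \(P_{nijk}\) is the probability that student \(n\) receives category \(k\) rather than \(k-1\) on item \(i\) from rater \(j\); \(\theta_{n}\) is the student's behavior level, \(\delta_{i}\) the item difficulty, \(\lambda_{j}\) the rater's severity, and \(\tau_{k}\) the category threshold. Statistical adjustment of DBR-MIS scores amounts to removing each rater's estimated \(\lambda_{j}\) from that rater's observed ratings.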