The reliable assessment of competence is crucial for promoting the professional development of therapists. However, competence assessments are rarely included in training and research because they are resource-intensive and costly, typically relying on independent raters with high levels of expertise and extensive training. This study aimed to compare the interrater reliability (IRR) of raters with different levels of expertise. We also examined the impact of different camera perspectives on IRR.
We examined the IRR of six independent raters based on competence assessments in a standardized setting. Two raters were experienced psychotherapists (experts), and four were psychology students (novices; with or without supervision). All raters evaluated N = 359 videos of students performing role plays with standardized patients who simulated depressive symptoms and behavior. For each video, the raters independently assessed basic communication skills (Clinical Communication Skills Scale–Short Form; CCSS-S), psychotherapeutic competence (Cognitive Therapy Scale; CTS), empathy (Empathy Scale; ES), and therapeutic alliance (Helping Alliance Questionnaire; HAQ).
IRR varied depending on rater expertise and assessment measure, with the lowest intraclass correlation coefficients (ICCs) for empathy (ES; ICCs = 0.39–0.67) and the highest for psychotherapeutic competence (CTS; ICCs = 0.66–0.78). The concordance between expert raters and supervised novice raters was good (ICCs = 0.71–0.86). The camera perspective did not influence the reliability of the ratings.
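As a minimal, illustrative sketch of the reliability index used above (not the study's actual analysis code, and using made-up data), a two-way random-effects ICC for absolute agreement of a single rater, often labeled ICC(2,1), can be computed from a subjects × raters score matrix via the standard ANOVA mean squares:

```python
import numpy as np

def icc_2_1(X):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    X is an (n_subjects, k_raters) matrix of scores; the formula follows
    Shrout & Fleiss (1979).
    """
    n, k = X.shape
    grand = X.mean()
    row_means = X.mean(axis=1)   # per-subject means
    col_means = X.mean(axis=0)   # per-rater means
    ss_total = np.sum((X - grand) ** 2)
    ssr = k * np.sum((row_means - grand) ** 2)   # between-subjects
    ssc = n * np.sum((col_means - grand) ** 2)   # between-raters
    msr = ssr / (n - 1)
    msc = ssc / (k - 1)
    mse = (ss_total - ssr - ssc) / ((n - 1) * (k - 1))  # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: two raters agree up to a constant offset of 0.1,
# so agreement is near-perfect and the ICC is close to 1.
scores = np.array([[1.0, 1.1],
                   [2.0, 2.1],
                   [3.0, 3.1],
                   [4.0, 4.1],
                   [5.0, 5.1]])
print(round(icc_2_1(scores), 3))  # → 0.998
```

In practice such coefficients are usually obtained from a statistics package rather than computed by hand; the sketch only makes explicit how subject variance, rater variance, and residual error enter the coefficient.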
With appropriate training and regular supervision, the novices assessed therapeutic behavior in standardized role plays with a reliability comparable to that of the experts. Further research is needed on the reliable assessment of more complex therapy situations.