Studies of depression screening tool accuracy should have samples large enough to generate precise accuracy estimates. We assessed the proportion of recently published depression screening tool diagnostic accuracy studies that reported sample size calculations; the proportion that provided confidence intervals (CIs); and precision, based on the width and lower bounds of 95% CIs for sensitivity and specificity. In addition, we assessed whether these results had improved since a previous review of studies published in 2013–2015.
MEDLINE was searched from January 1, 2018, through May 21, 2021.
Twelve of 106 primary studies (11%) described a viable sample size calculation, an improvement of 8 percentage points since the previous review. Thirty-six studies (34%) provided reasonably accurate CIs. Of 103 studies where 95% CIs were provided or could be calculated, seven (7%) had sensitivity CI widths of ≤10%, whereas 58 (56%) had widths of ≥21%. The lower bound of the 95% CI was <80% for sensitivity in 84 studies (82%) and for specificity in 77 studies (75%). These results were similar to those reported previously.
Few studies reported sample size calculations, and the number of included individuals in most studies was too small to generate reasonably precise accuracy estimates.
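To illustrate how the number of included cases drives the precision described above, the following is a minimal sketch in Python. It assumes the Wilson score method for a 95% CI around a proportion; the studies reviewed may have used other interval methods, and the function names (wilson_ci, min_cases_for_width) and example inputs are hypothetical, chosen only for illustration.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wilson_ci(p, n, z=Z_95):
    """Wilson score interval for a proportion p observed among n participants.

    For sensitivity, n is the number of participants with depression (cases);
    for specificity, n is the number without depression (non-cases).
    Assumption: Wilson is used here for illustration only.
    """
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("n must be positive and p must lie in [0, 1]")
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


def ci_width(p, n, z=Z_95):
    """Total width (upper bound minus lower bound) of the Wilson interval."""
    lower, upper = wilson_ci(p, n, z)
    return upper - lower


def min_cases_for_width(expected_p, target_width, max_n=100_000):
    """Scan upward for the first n whose Wilson interval is narrow enough.

    expected_p is an assumed (anticipated) sensitivity or specificity.
    """
    for n in range(1, max_n + 1):
        if ci_width(expected_p, n) <= target_width:
            return n
    raise ValueError("target width not reached within max_n")


if __name__ == "__main__":
    # Hypothetical study: 40 of 50 depression cases screen positive.
    lower, upper = wilson_ci(40 / 50, 50)
    print(f"sensitivity 0.80, 95% CI {lower:.3f} to {upper:.3f}")

    # Cases needed for a CI width of at most 10 percentage points,
    # assuming an anticipated sensitivity of 0.85.
    print("cases needed:", min_cases_for_width(0.85, 0.10))
```

Under these assumptions, a study with 50 cases yields a sensitivity interval roughly 20 percentage points wide, whereas a width of 10 percentage points or less requires on the order of a couple of hundred cases. Because sensitivity is estimated only among participants with depression, the total sample needed also depends on depression prevalence in the study population.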