A critical component that influences the measurement properties of a patient-reported outcome (PRO) instrument is the rating scale. Yet there is a lack of general consensus regarding optimal rating scale format, including question structure and the number and labelling of response categories. This study aims to explore the characteristics of rating scales that function well and those that do not, and thereby to develop guidelines for formulating rating scales.
Seventeen existing PROs designed to measure vision-related quality of life dimensions were mailed for self-administration, in sets of 10, to patients who were on a waiting list for cataract extraction. These PROs included questions with ratings of difficulty, frequency, severity, and global ratings. Using Rasch analysis, the performance of the rating scales was assessed by examining hierarchical ordering (indicating that categories are distinct from each other and follow a logical transition from lower to higher value), evenness (indicating relative utilization of categories), and range (indicating coverage of the attribute by the rating scale).
Rating scales with a complicated question format, a large number of response categories, or unlabelled categories tended to be dysfunctional, whereas those with five or fewer response categories tended to be functional. Most of the rating scales measuring difficulty performed well. The rating scales measuring frequency and severity demonstrated hierarchical ordering, but their categories lacked even utilization.
Developers of PRO instruments should use a simple question format and fewer (four to five) labelled response categories.