Educational and Psychological Measurement, Ahead of Print.
Educational and psychological tests with an ordered item structure enable efficient test administration procedures and allow for intuitive score interpretation and monitoring. The effectiveness of the measurement instrument relies to a large extent on the validated strength of its ordering structure. We define three increasingly strict types of ordering for the ordering structure of a measurement instrument with clustered items: a weak and a strong invariant cluster ordering and a clustered invariant item ordering. Following a nonparametric item response theory (IRT) approach, we proposed a procedure to evaluate the ordering structure of a clustered item set along this three-fold continuum of order invariance. The basis of the procedure is (a) the local assessment of pairwise conditional expectations at both cluster and item level and (b) the global assessment of the number of Guttman errors through new generalizations of the H-coefficient for this item-cluster context. The procedure, readily implemented in R, is illustrated and applied to an empirical example. Suggestions for test practice, further methodological developments, and future research are discussed.