Educational and Psychological Measurement, Ahead of Print.
Identifying items that exhibit differential item functioning (DIF) in an assessment is a crucial step toward equitable measurement. One critical issue that existing studies have not fully addressed is how to detect DIF items when the data are multilevel. In the present study, we introduced a procedure based on Lord’s Wald χ² test for detecting both uniform and non-uniform DIF in polytomous items in the presence of the ubiquitous multilevel data structure. The proposed approach is a multilevel extension of a two-stage procedure that identifies anchor items in its first stage and formally evaluates candidate items in its second stage. We applied the Metropolis–Hastings Robbins–Monro (MH-RM) algorithm to estimate multilevel polytomous item response theory (IRT) models and to obtain accurate covariance matrices. To evaluate the performance of the proposed approach, we conducted a preliminary simulation study covering a range of conditions designed to mimic real-world scenarios. The simulation results indicated that the proposed approach has high power for identifying DIF items and controls the Type I error rate well. Limitations and future research directions are also discussed.