Educational and Psychological Measurement, Ahead of Print.
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased parameter estimates for the test items in the VLE. When items have not been formally piloted, the VLE log file data can be expected to contain a large amount of nonignorable missing data. This missingness is expected to reduce the accuracy of IRT item parameter estimation, which in turn degrades any future ability estimates used in the VLE. In the psychometric literature, methods for handling missing data have mostly been studied under conditions in which both the data set and the amount of missing data are smaller than those that come from VLEs. In this article, we introduce a semisupervised learning method for handling the large proportion of missing data in VLE data from which unbiased item parameter estimates must be obtained. First, we explored the factors related to the missing data. Then, we implemented a semisupervised learning method under the two-parameter logistic IRT model to estimate the latent abilities of students. Last, we applied two adjustment methods designed to reduce bias in the item parameter estimates. The proposed framework showed potential for obtaining unbiased item parameter estimates that can then be fixed in the VLE to obtain ongoing ability estimates for operational purposes.
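
For readers unfamiliar with the two-parameter logistic (2PL) model named above, the following is a minimal sketch of its standard item response function and of a per-student log-likelihood over observed responses only. It is not the semisupervised method or either adjustment method described in this article. The function names, the use of `None` to mark an unobserved response, and the example parameter values are all assumptions made for illustration.

```python
import math


def p_correct(theta, a, b):
    """Standard 2PL item response function.

    theta: student ability, a: item discrimination, b: item difficulty.
    Returns P(correct response | theta) = 1 / (1 + exp(-a * (theta - b))).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, items, responses):
    """Log-likelihood of one student's responses under the 2PL model.

    items: list of (a, b) tuples, one per item.
    responses: list of 1 (correct), 0 (incorrect), or None (not observed).
    Unobserved responses are skipped. This treats the missingness as
    ignorable, which is a simplifying assumption of this sketch and is
    exactly the assumption that nonignorable missing data violate.
    """
    total = 0.0
    for (a, b), x in zip(items, responses):
        if x is None:
            continue  # hypothetical convention: None marks a missing response
        p = p_correct(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


# Illustrative values only; these are not estimates from the article.
example_items = [(1.0, 0.0), (1.5, 0.5), (0.8, -1.0)]
example_responses = [1, None, 0]
example_ll = log_likelihood(0.0, example_items, example_responses)
```

Because unobserved items contribute nothing to this likelihood, estimates based on it can be biased when the probability that a response is missing depends on ability or on the response itself, which is the situation the abstract describes for unpiloted VLE items.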