On the Relationship Between Differential Item Functioning and Item Difficulty: An Issue of Methods? Item Response Theory Approach to Differential Item Functioning

Previous studies of the SAT found a relationship between differential item functioning (DIF) and item difficulty in which more difficult items tended to exhibit DIF in favor of the focal group (usually minority groups). These results were reported by Kulick and Hu and by Freedle, and they have been enthusiastically discussed in more recent literature. Examining the validity of the original reports of this systematic relationship is important so that we can move on to investigating more effectively its causes and the consequences associated with test score use. This article explores the hypothesis that the relationship between DIF and item difficulty observed in the SAT could be explained by (a) the confounding of DIF and impact caused by shortcomings of the standardization approach and/or (b) random guessing. The relationship is examined using item response theory, which separates impact from DIF better than the standardization approach does and also allows us to test the importance of guessing. The results generally support the relationship between item difficulty and DIF, suggesting that the phenomenon reported by earlier research is not a mere artifact of the statistical methodologies used to study DIF.
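
The distinction between DIF and impact, and the role of a guessing parameter, can be illustrated with a minimal sketch. It assumes a three-parameter logistic (3PL) item response function, which is one common item response theory model and not necessarily the one fitted in this article. All parameter values, and the names `p_correct`, `dif_shift`, `dif_gap`, and `impact_gap`, are hypothetical and chosen only for illustration.

```python
import math


def p_correct(theta, a, b, c=0.0):
    """3PL item response function: probability of a correct response.

    theta: examinee ability, a: discrimination, b: difficulty,
    c: lower asymptote (pseudo-guessing); c = 0 gives the 2PL model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical parameters for one difficult item (not taken from the study).
a, b_ref, c = 1.0, 1.5, 0.2
dif_shift = -0.3              # assumed: item is 0.3 logits easier for the focal group
b_focal = b_ref + dif_shift

# DIF: groups are compared at the SAME ability, but the item behaves differently.
theta = 0.0
dif_gap = p_correct(theta, a, b_focal, c) - p_correct(theta, a, b_ref, c)

# Impact: item parameters are identical, but the groups differ in ability.
impact_gap = p_correct(-0.5, a, b_ref, c) - p_correct(0.5, a, b_ref, c)

print(f"DIF gap at equal ability:      {dif_gap:+.3f}")
print(f"Impact gap, no DIF in the item: {impact_gap:+.3f}")
```

In this sketch a positive `dif_gap` means the item favors the focal group among examinees of equal ability, whereas a nonzero `impact_gap` arises purely from the ability difference between groups. Setting `c` to zero and comparing the results is one simple way to see how much the lower asymptote, which models guessing, changes the picture for difficult items.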