The identification of differential item functioning (DIF) is often performed with statistical approaches that treat raw scores as proxies for the ability level. One of the most popular approaches, the Mantel–Haenszel (MH) method, belongs to this category. However, replacing the ability level with the raw score is a potential source of Type I error inflation, not only when DIF is present in other items but also when DIF is absent and impact (a true difference in ability between the groups) is present. The purpose of this article is to present an alternative statistical inference approach that is based on the same measure of DIF but prevents the Type I error inflation. The key notion is that, for DIF items, the measure takes an outlying value that can be identified as such with inference tools from robust statistics. The MH log odds ratio is retained as the statistic; only the inference differs. A simulation study is performed to compare the robust statistical inference with the classical inference method, both based on the MH statistic. As expected, the Type I error rate inflation is avoided with the robust approach, while the power of the two methods is similar.
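The general idea can be illustrated with a minimal sketch: compute the MH log odds ratio for each item, stratifying on the raw score, and then flag items whose statistic is an outlier relative to the other items. The function names, the 0/1 coding of the reference and focal groups, the median/MAD standardization, and the 1.96 cutoff are all assumptions made for illustration; they are not necessarily the robust estimators or critical values used in the article.

```python
import numpy as np

def mh_log_odds_ratio(item, total, group):
    """MH common log odds ratio for one binary item.

    item  : 0/1 responses to the studied item
    total : raw scores used as the matching (stratifying) variable
    group : 0 = reference group, 1 = focal group (assumed coding)
    """
    item, total, group = map(np.asarray, (item, total, group))
    num = 0.0
    den = 0.0
    for s in np.unique(total):
        in_stratum = total == s
        ref = in_stratum & (group == 0)
        foc = in_stratum & (group == 1)
        a = np.sum(item[ref] == 1)  # reference, correct
        b = np.sum(item[ref] == 0)  # reference, incorrect
        c = np.sum(item[foc] == 1)  # focal, correct
        d = np.sum(item[foc] == 0)  # focal, incorrect
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if num <= 0 or den <= 0:
        return float("nan")  # statistic undefined for this item
    return float(np.log(num / den))

def flag_outliers(stats, threshold=1.96):
    """Flag outlying statistics using a median/MAD standardization.

    This is an illustrative robust rule (assumption), not the article's
    exact procedure. It breaks down if the MAD is zero.
    """
    stats = np.asarray(stats, dtype=float)
    center = np.nanmedian(stats)
    # 1.4826 makes the MAD consistent with the SD under normality.
    scale = 1.4826 * np.nanmedian(np.abs(stats - center))
    z = (stats - center) / scale
    return np.abs(z) > threshold, z
```

In use, `mh_log_odds_ratio` would be applied to every item in the test, and the resulting vector passed to `flag_outliers`. Because the center and scale are estimated robustly from the items themselves, a common shift in the statistics does not by itself cause items to be flagged under this rule.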