Abstract
Survey data harmonization can greatly improve the analytical potential of survey data by making divergent measurements of the same construct comparable. Although an optimal harmonization strategy can often be identified (e.g., imputation or equipercentile equating), such approaches are not always feasible, for example because of data-based restrictions or the lack of a suitable reference sample. Researchers are then sometimes left with a set of second-best strategies, such as linear equating or linear stretching, and no clear guidance on which of these to prefer. The present paper takes this situation as its starting point and investigates the substantive consequences of the different alternatives for a typical harmonization scenario using real-world survey data. Divergent substantive consequences are understood as differences in the patterns of analytical results obtained: do the actual scientific conclusions differ when different harmonization approaches are employed? Results based on a variety of regression scenarios indicate that most substantive conclusions hold regardless of the harmonization approach chosen, but that harmonization procedures that entail a loss of information by recoding variables into smaller sets of categories should be avoided. Overall, this suggests that research based on sub-optimally harmonized scales can also yield valid scientific insights.