Disaggregating data by race and ethnicity is a critical method for shining light on racialized systems of privilege and oppression. Imputation is a powerful tool for disaggregating data by generating racial and ethnic identifiers for datasets lacking this information. But if used without a proactive focus on equity, it can harm Black people, Indigenous people, and other people of color.
In this report, we describe lessons we learned from a case study in which we proactively incorporated equity in imputing race and ethnicity onto a nationally representative sample of credit bureau data. We organize these lessons around the following three “ethics checkpoints” to identify and address potential racial bias and inaccuracy:
checkpoint 1: before imputation, audit input data for bias
checkpoint 2: during imputation, examine where bias could be introduced at each step
checkpoint 3: after imputation, assess whether imputed race/ethnicity data are accurate enough to be used ethically for your analytic purpose
At each checkpoint, we share how we approached mitigating bias where possible and transparently communicating any bias that could not be mitigated; we also discuss how to determine when the unmitigated risk is unacceptably high and therefore warrants terminating the production or use of the imputed data. Our experience navigating these checkpoints surfaced four key lessons: (1) consider equity in every decision, (2) examine differential outcomes by race and ethnicity, (3) transparently communicate analytic decisions and data limitations, and (4) examine the fitness for purpose of imputed data in light of their intended use. Researchers and analysts imputing race and ethnicity onto datasets can use the ethics checkpoints and lessons in this report to ensure imputation is an effective, ethical, and empathetic tool for addressing critical gaps in race and ethnicity data. This report also includes a detailed description of our imputation methodology.
This report is a product of Urban’s Racial Equity Analytics Lab, which seeks to equip today’s change agents with data and analyses to advance social and economic policies that help remedy persistent structural racism.