Various approaches have been used in the literature to handle missing vital status data in cancer registries. We aimed to compare these approaches to determine which yields the least biased estimates in analytic tasks typical of cancer registries.
Methods
A simulation study was performed using data from the Swiss National Agency for Cancer Registration for six tumor types. First, 5%, 10% and 15% missingness in vital status was introduced artificially into the complete data. Second, the missing vital status data were handled with no imputation, single imputation or multiple imputation. Five-year overall survival, relative survival or standardized incidence ratio estimates were then computed and compared with the true values obtained from the complete data.
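The first step and the final comparison can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the study: the data are synthetic rather than registry records, vital status is reduced to a binary five-year death indicator, missingness is introduced completely at random, and only the no-imputation (complete-case) estimate is shown. All function names are hypothetical.

```python
import random


def introduce_missingness(vital_status, fraction, rng):
    """Return a copy with `fraction` of the entries set to None.

    Assumption: records are selected completely at random.
    """
    n_missing = round(fraction * len(vital_status))
    missing_idx = set(rng.sample(range(len(vital_status)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(vital_status)]


def proportion_dead(vital_status):
    """Complete-case estimate: records with unknown status are ignored."""
    observed = [v for v in vital_status if v is not None]
    return sum(observed) / len(observed)


def bias(estimate, truth):
    """Signed difference between an estimate and the true value."""
    return estimate - truth


rng = random.Random(42)

# Hypothetical complete data: 1 = dead within five years, 0 = alive.
complete = [1 if rng.random() < 0.4 else 0 for _ in range(10_000)]
truth = proportion_dead(complete)

for fraction in (0.05, 0.10, 0.15):
    incomplete = introduce_missingness(complete, fraction, rng)
    estimate = proportion_dead(incomplete)
    print(f"{fraction:.0%} missing: bias = {bias(estimate, truth):+.4f}")
```

Under this random mechanism the complete-case bias stays close to zero. Missingness that depends on the outcome, for example deaths that go unrecorded, would be expected to shift the estimate, and that is the situation in which the choice of imputation approach matters.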
Results
For colorectal cancer, standardized incidence ratio estimates obtained with multiple imputation were the least biased (−0.06 to −0.04) but had the widest confidence intervals. Single imputation was more biased (−0.32) than no imputation at all (−0.21). A similar pattern was observed for overall survival and relative survival.
Conclusion
This simulation study indicates that commonly used single imputation techniques (sometimes referred to as simulating follow-up times) for filling in missing vital status data are likely too biased to be useful in practice. Multiple imputation yielded standardized incidence ratio, overall survival and relative survival estimates with the least bias, indicating reasonable performance that is likely to generalize to other settings.