Abstract
Objective
We expanded a previous assessment of a composite mortality variable intended for real‐world evidence‐focused oncology research.
Data source
We used a nationwide electronic health record (EHR)‐derived de‐identified database.
Data collection
We included patients with at least 1 of 18 cancer types between January 1, 2011 and December 31, 2017. Patient‐level structured data (EHRs, obituaries, and Social Security Death Index) and unstructured EHR data (abstracted) were linked to generate a composite mortality variable.
Study design
We benchmarked the composite mortality variable against the National Death Index (NDI), reporting sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and ±15‐day agreement of death dates. Real‐world overall survival (rwOS) was estimated using the Kaplan‐Meier method. We performed sensitivity analyses in a smaller cohort of patients who underwent next‐generation sequencing testing.
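As an illustration of the benchmarking metrics named above, the following is a minimal sketch and not the study's actual code. It assumes each patient has an optional death date from the composite variable and from the reference standard (the NDI), with a missing date meaning no recorded death. It also assumes that ±15‐day agreement is computed only among patients with a death date in both sources; the abstract does not state the denominator. The function name `validation_metrics`, its parameters, and the toy dates are hypothetical.

```python
from datetime import date
from typing import Dict, Optional, Sequence


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def validation_metrics(
    composite: Sequence[Optional[date]],
    reference: Sequence[Optional[date]],
    window_days: int = 15,
) -> Dict[str, Optional[float]]:
    """Compare composite death dates with reference (gold standard) death dates.

    None means "no death recorded" in that source. Date agreement is computed
    among patients with a death date in both sources (an assumption).
    """
    if len(composite) != len(reference):
        raise ValueError("composite and reference must have the same length")

    tp = fp = fn = tn = agree = 0
    for comp_date, ref_date in zip(composite, reference):
        if comp_date is not None and ref_date is not None:
            tp += 1  # death captured by both sources
            if abs((comp_date - ref_date).days) <= window_days:
                agree += 1  # dates fall within the agreement window
        elif comp_date is not None:
            fp += 1  # death in composite only
        elif ref_date is not None:
            fn += 1  # death in reference only
        else:
            tn += 1  # no death in either source

    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "date_agreement": _ratio(agree, tp),
    }


if __name__ == "__main__":
    # Toy data for illustration only; these are not study data.
    composite_dates = [date(2015, 1, 10), date(2016, 5, 1), None, date(2017, 3, 3), None]
    reference_dates = [date(2015, 1, 20), date(2016, 7, 1), date(2016, 2, 2), None, None]
    for name, value in validation_metrics(composite_dates, reference_dates).items():
        print(name, value)
```

Returning `None` for an undefined ratio (for example, NPV when no patient is negative in the composite variable) avoids a division error and keeps an undefined metric distinguishable from a metric of zero.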
Principal findings
Compared with the NDI across 18 cancer types (overall N = 160 436), the composite mortality variable showed: sensitivity, 83.9%‐91.5% (17/18 cancer types had sensitivity ≥85.0%); specificity, 93.5%‐99.7%; PPV, 96.3%‐98.3%; NPV, 75.0%‐98.7%; and ±15‐day agreement, 95.6%‐97.6%. Median rwOS estimates were 2.8% to 12.7% greater than the corresponding NDI‐based estimates. Sensitivity analysis results (n = 17 540) were consistent with the main analysis.
Conclusions
Across all cancer types analyzed, this composite mortality variable showed high sensitivity, specificity, PPV, NPV, and ±15‐day agreement relative to the NDI, and yielded median rwOS estimates that were modestly higher than NDI‐based estimates.