The nicotine metabolite ratio and nicotine equivalents are measures of nicotine metabolism rate and intake, respectively. Genome-wide prediction of these nicotine biomarkers in multiethnic samples will enable tobacco-related biomarker, behavioral, and exposure research in studies without measured biomarkers.
We screened genetic variants genome-wide using marginal scans and applied statistical learning algorithms to top-ranked genetic variants, age, ethnicity, sex, and, in additional modeling, cigarettes per day (CPD) to build prediction models for the urinary nicotine metabolite ratio (uNMR) and creatinine-standardized total nicotine equivalents (TNE) in 2239 current cigarette smokers from five ethnic groups. We predicted these nicotine biomarkers using model ensembles and evaluated external validity using dependence measures in 1864 treatment-seeking smokers from two ethnic groups.
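The screen-then-learn-then-ensemble workflow described above can be illustrated with a minimal sketch. It is not the study's pipeline: the data are simulated, ridge regression stands in for the unspecified statistical learning algorithms, and every name (`marginal_scan`, `fit_ridge`, `ensemble_predict`, `simulate`), the number of retained variants, and the penalty values are assumptions chosen for illustration only.

```python
import numpy as np

def marginal_scan(G, y):
    """Absolute Pearson correlation of each variant column with the outcome."""
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Gc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # monomorphic variants get a score of 0
    return np.abs(Gc.T @ yc) / denom

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                           Xc.T @ (y - mu_y))
    return mu_y - mu_x @ beta, beta

def ensemble_predict(models, X):
    """Average the predictions of several fitted (intercept, beta) models."""
    return np.mean([b0 + X @ b for b0, b in models], axis=0)

def simulate(n=600, p=500, n_causal=5, seed=0):
    """Simulated genotypes (0/1/2), two covariates, and a biomarker-like outcome."""
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    covars = np.column_stack([rng.normal(50, 10, n),    # hypothetical age
                              rng.integers(0, 2, n)])   # hypothetical sex
    beta = np.zeros(p)
    beta[:n_causal] = 0.5
    y = G @ beta + 0.02 * covars[:, 0] + rng.normal(0, 1, n)
    return G, covars, y

G, covars, y = simulate()
train, test = slice(0, 400), slice(400, 600)

# Step 1: marginal scan in the training split only, keep top-ranked variants.
scores = marginal_scan(G[train], y[train])
top = np.argsort(scores)[::-1][:20]

# Step 2: fit several penalized models on top variants plus covariates.
X = np.column_stack([G[:, top], covars])
models = [fit_ridge(X[train], y[train], a) for a in (0.1, 1.0, 10.0)]

# Step 3: ensemble prediction and measured-vs-predicted correlation.
pred = ensemble_predict(models, X[test])
r = np.corrcoef(y[test], pred)[0, 1]
print(f"held-out correlation between measured and predicted: {r:.2f}")
```

Ranking variants within the training split only keeps the held-out correlation from being inflated by selection on the evaluation data.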
The genomic regions with the most selected and included variants for measured biomarkers were chr19q13.2 (uNMR, without and with CPD) and chr15q25.1 and chr10q25.3 (TNE, without and with CPD). We observed ensemble correlations between measured and predicted biomarker values without (with) CPD of 0.67 (0.68) for the uNMR and 0.65 (0.72) for TNE in the training sample. We observed inconsistency in penalized regression models of TNE (with CPD), with fewer variants at chr15q25.1 selected and included. In treatment-seeking smokers, predicted uNMR (without CPD) was significantly associated with CPD, and predicted TNE (without CPD) was significantly associated with CPD, time-to-first-cigarette, and Fagerström total score.
Using nicotine metabolites, genome-wide data, and statistical learning approaches, we developed novel, robust predictive models for urinary nicotine biomarkers in multiple ethnic groups. Predicted biomarker associations helped define genetically influenced components of nicotine dependence.
We demonstrate development of robust models and multiethnic prediction of the uNMR and TNE using statistical and machine learning approaches. Variants included in trained models for nicotine biomarkers include top-ranked variants in multiethnic genome-wide studies of smoking behavior, nicotine metabolites, and related disease. Association of the two predicted nicotine biomarkers with Fagerström Test for Nicotine Dependence items supports models of nicotine biomarkers as predictors of physical dependence and nicotine exposure. Predicted nicotine biomarkers may facilitate tobacco-related disease and treatment research in samples with genomic data and limited nicotine metabolite or tobacco exposure data.