Abstract
Background
Type 2 diabetes (T2D) is approximately twice as common among individuals with mental illness as in the background population, but may be prevented by early lifestyle, dietary, or pharmacological intervention. Such prevention relies on identifying those at elevated risk (prediction). The aim of this study was to develop and validate a machine learning model for predicting T2D among patients with mental illness.
Methods
The study was based on routine clinical data from electronic health records from the psychiatric services of the Central Denmark Region. A total of 74,880 patients with 1.59 million psychiatric service contacts were included in the analyses. We created 1343 potential predictors from 51 source variables, covering patient-level information on demographics, diagnoses, pharmacological treatment, and laboratory results. T2D was operationalised as HbA1c ≥48 mmol/mol, fasting plasma glucose ≥7.0 mmol/L, oral glucose tolerance test ≥11.1 mmol/L, or random plasma glucose ≥11.1 mmol/L. Two machine learning models (XGBoost and regularised logistic regression) were trained to predict T2D based on 85% of the included contacts. The predictive performance of the best-performing model was tested on the remaining 15% of the contacts.
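The outcome definition above can be illustrated with a minimal Python sketch. It is not the study's code: the function name `meets_t2d_criteria` and its parameter names are hypothetical, and the glucose thresholds are assumed to be in mmol/L, the conventional unit for plasma glucose. A missing measurement is treated as not meeting that criterion.

```python
from typing import Optional

# Thresholds taken from the Methods text. HbA1c is in mmol/mol;
# the glucose measures are ASSUMED to be in mmol/L.
HBA1C_THRESHOLD = 48.0
FASTING_GLUCOSE_THRESHOLD = 7.0
OGTT_THRESHOLD = 11.1
RANDOM_GLUCOSE_THRESHOLD = 11.1


def meets_t2d_criteria(
    hba1c: Optional[float] = None,
    fasting_glucose: Optional[float] = None,
    ogtt: Optional[float] = None,
    random_glucose: Optional[float] = None,
) -> bool:
    """Return True if any available measurement reaches its threshold.

    Hypothetical helper for illustration only; None means "not measured".
    """
    checks = [
        (hba1c, HBA1C_THRESHOLD),
        (fasting_glucose, FASTING_GLUCOSE_THRESHOLD),
        (ogtt, OGTT_THRESHOLD),
        (random_glucose, RANDOM_GLUCOSE_THRESHOLD),
    ]
    return any(
        value is not None and value >= threshold
        for value, threshold in checks
    )
```

For example, `meets_t2d_criteria(hba1c=50.0)` is true, while `meets_t2d_criteria(hba1c=45.0, fasting_glucose=6.2)` is false because neither value reaches its threshold.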
Results
The XGBoost model detected patients at high risk 2.7 years before T2D, achieving an area under the receiver operating characteristic curve of 0.84. Of the 996 patients who developed T2D in the test set, the model issued at least one positive prediction for 305 (31%).
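The reported proportion can be checked with a short calculation. The sketch below is illustrative only; the function name `patient_level_sensitivity` is hypothetical, and it simply divides the number of patients with at least one positive prediction by the number who developed T2D.

```python
def patient_level_sensitivity(flagged_cases: int, total_cases: int) -> float:
    """Share of true cases that received at least one positive prediction."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return flagged_cases / total_cases


# Counts from the Results: 305 of 996 patients who developed T2D.
sensitivity = patient_level_sensitivity(305, 996)
print(f"{sensitivity:.1%}")  # prints 30.6%, which rounds to the reported 31%
```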
Conclusion
A machine learning model can accurately predict development of T2D among patients with mental illness based on routine clinical data from electronic health records. A decision support system based on such a model may inform measures to prevent development of T2D in this high-risk population.