Abstract
Introduction and Aims
Infoveillance approaches (i.e. surveillance methods using online content) that leverage big data can provide new insights into infectious disease outbreaks and substance use disorders. We assessed social media messages about HIV, opioid use and injection drug use to understand how unstructured data can prepare public health practitioners to respond to future outbreaks.
Design and Methods
We conducted a retrospective analysis of Twitter messages posted during the 2015 Indiana HIV outbreak, using machine learning, statistical and geospatial analysis to examine the transition from prescription opioid abuse to heroin injection and, finally, to HIV transmission risk, and to test possible associations with disease burden and demographic variables in Indiana and Marion County. Tweets from October 2014 to June 2015 were compared with disease burden at the county level for Indiana, and census blocks in Marion County were classified by the presence of relevant messages. Marion County was chosen because it had the highest total count of Tweets.
Results
A total of 257 messages about substance abuse and HIV were significantly related to HIV rates (P < 0.001) and opioid‐related hospitalisations (P = 0.037). Using 157 characteristics from the American Community Survey, a linear classifier was fitted whose output showed an appreciable correlation (r = 0.49) with risk‐related social media messages from Marion County.
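As a purely illustrative sketch of the kind of analysis summarised in the Results, the following Python fragment fits a linear scoring model to block-level features and reports the Pearson correlation between the model scores and a binary "relevant message present" label. It is not the study's method or data: the three features, the synthetic labels, the least-squares fit by gradient descent and the names `fit_linear`, `linear_scores` and `pearson_r` are all assumptions made for the example. The study itself used 157 American Community Survey characteristics, and the abstract does not specify the estimator.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_linear(X, y, lr=0.05, epochs=500):
    """Fit y ~ b + w.x by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for row, target in zip(X, y):
            err = b + sum(wi * xi for wi, xi in zip(w, row)) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def linear_scores(X, w, b):
    """Linear risk score for each row of features."""
    return [b + sum(wi * xi for wi, xi in zip(w, row)) for row in X]


# Synthetic stand-in data (hypothetical): 200 "census blocks" with 3
# standardised features; the label marks blocks with a relevant message.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [
    1.0 if 0.8 * row[0] - 0.5 * row[1] + rng.gauss(0, 1) > 0.5 else 0.0
    for row in X
]

w, b = fit_linear(X, y)
r = pearson_r(linear_scores(X, w, b), y)
print(f"in-sample correlation between scores and labels: r = {r:.2f}")
```

The correlation here is computed on the same data used for fitting, so it is optimistic. With many features relative to the number of blocks, a held-out or cross-validated estimate would be the more cautious figure to report.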
Discussion and Conclusions
Communities appear to communicate online in response to disease burden. Classification produced an accurate equation to model census block risk based on census data, allowing for high‐dimensional estimation of risk for blocks with sparse populations.