Background: Web-based physician reviews are a valuable source of information that merits further investigation. Although many studies have explored the text of physician reviews, very few have focused on developing a systematic taxonomy of the topics embedded in them. The first step toward mining physician reviews is to determine the natural structure or dimensions embedded in the reviews. Therefore, it is relevant to develop the topic taxonomy rigorously and systematically.

Objective: This study aims to develop a hierarchical topic taxonomy to uncover the latent structure of physician reviews and to illustrate its application for mining patients' interests based on the proposed taxonomy and algorithm.

Methods: Data comprised 122,716 physician reviews covering 8501 doctors from a leading physician review website in China (haodf.com), collected between 2007 and 2015. Mixed methods, including a literature review, data-driven topic discovery, and human annotation, were used to develop the physician review topic taxonomy.

Results: The identified taxonomy included 3 domains or high-level categories and 9 subtopics or low-level categories. The physician-related domain included the categories of medical ethics, medical competence, communication skills, and medical advice and prescriptions. The patient-related domain included the categories of patient profile, symptoms, and diagnosis and pathogenesis. The system-related domain included the categories of financing and operation process. The F-measure of the proposed classification algorithm reached 0.816 on average. Symptoms (Cohen d=1.58, Δu=0.216, t=229.75, and P<.001) are more often mentioned by patients with acute diseases, whereas communication skills (Cohen d=−0.29, t=−42.01), financing, and diagnosis and pathogenesis are more often mentioned by patients with chronic diseases. Patients with mild diseases were more interested in medical ethics, operation process, patient profile, and symptoms. Meanwhile, patients with serious diseases were more interested in medical competence and in medical advice and prescriptions.

Conclusions: This mixed-methods approach, integrating literature reviews, data-driven topic discovery, and human annotation, is an effective and rigorous way to develop a physician review topic taxonomy. The proposed algorithm, based on labeled latent Dirichlet allocation, can achieve impressive classification results for mining patients' interests. Furthermore, the mining results reveal marked differences in patients' interests across different disease types, socioeconomic development levels, and hospital levels.
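The Results report group differences as Cohen d, Δu, and t. As a rough illustration of how such figures relate to one another, the sketch below computes them for two groups of per-review topic weights. It rests on assumptions that the abstract does not confirm: Δu is read here as the difference in mean topic weight between the two groups, Cohen d uses a pooled standard deviation, and t is Welch's statistic for unequal variances. The toy data and the function names `cohens_d` and `welch_t` are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def welch_t(a, b):
    """Welch's t statistic; the abstract does not say which t-test was used."""
    na, nb = len(a), len(b)
    standard_error = math.sqrt(variance(a) / na + variance(b) / nb)
    return (mean(a) - mean(b)) / standard_error


# Hypothetical toy data: weight of the "symptoms" topic in each review,
# split by disease type. These numbers are invented for illustration.
acute = [0.40, 0.35, 0.50, 0.45]
chronic = [0.20, 0.25, 0.15, 0.30]

delta_u = mean(acute) - mean(chronic)  # assumed meaning of Δu
d = cohens_d(acute, chronic)
t = welch_t(acute, chronic)

print(f"delta_u={delta_u:.3f}, d={d:.2f}, t={t:.2f}")
```

With samples as large as the one in the study, even a small mean difference can produce a very large t, which is presumably why the abstract reports the effect size alongside it.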
This is the abstract only. Read the full article on the JMIR site. JMIR is the leading open access journal for eHealth and healthcare in the Internet age.