A data analytics approach to building a clinical decision support system for diabetic retinopathy: Developing and deploying a model ensemble

Authors:

Highlights:

• A clinical decision support system (CDSS) for diabetic retinopathy (DR) is developed.

• DR is predicted using only patient demographics and a small set of diabetic lab data.

• A novel ensemble approach is developed and tested to show its effectiveness.

• Prediction of DR is improved by considering comorbid complications.

• Sensitivity analysis is used to identify variables with high predictive power for DR.

Abstract

Diabetes is a common chronic disease that may lead to several complications. Diabetic retinopathy (DR), one of the most serious of these complications, is the most common cause of vision loss among diabetic patients. In this paper, we analyzed data from more than 1.4 million diabetic patients and developed a clinical decision support system (CDSS) for predicting DR. While the existing diagnostic approach requires access to ophthalmologists and expensive equipment, our CDSS uses only demographic and lab data to detect patients' susceptibility to retinopathy with high accuracy. We illustrate how a combination of data preparation and modeling steps improved the performance of our CDSS. On the data preprocessing side, we aggregated the data at the patient level and incorporated comorbidity information into our models. On the modeling side, we built several predictive models and developed a novel “confidence margin” ensemble technique that outperformed existing ensemble models. Our results suggest that diabetic neuropathy, serum creatinine, blood urea nitrogen, serum/plasma glucose, and hematocrit are the most important variables in detecting DR. Our CDSS has several important practical implications, including identifying DR risk factors, facilitating the early diagnosis of DR, and addressing the problem of low compliance with annual retinopathy screenings.
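
The abstract names a “confidence margin” ensemble but does not define it. The sketch below shows one plausible reading, offered as an assumption rather than the authors' actual method: for each patient, take the prediction of whichever base model is most confident, where confidence is measured as the distance of its predicted DR probability from 0.5. The function name, the 0.5 decision threshold, and the toy probabilities are all hypothetical.

```python
import numpy as np

def confidence_margin_ensemble(probas):
    """Pick, per sample, the probability from the most confident model.

    probas: array-like of shape (n_models, n_samples), each entry a base
    model's predicted probability of DR. "Confidence margin" is ASSUMED
    here to mean |p - 0.5|; the paper may define it differently.
    """
    probas = np.asarray(probas, dtype=float)
    margins = np.abs(probas - 0.5)           # distance from the decision boundary
    best = margins.argmax(axis=0)            # index of most confident model per sample
    chosen = probas[best, np.arange(probas.shape[1])]
    return chosen, best

# Toy probabilities from three hypothetical base models for three patients.
probas = np.array([
    [0.55, 0.10, 0.48],
    [0.90, 0.40, 0.30],
    [0.60, 0.45, 0.52],
])
chosen, best = confidence_margin_ensemble(probas)
labels = (chosen >= 0.5).astype(int)         # final DR / no-DR decision
```

Unlike majority voting or probability averaging, this rule lets a single highly confident model override several uncertain ones, which is one way an ensemble could differ from the standard approaches the abstract compares against.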

Keywords: Diabetic retinopathy, Data analytics, Predictive modeling, Clinical decision support systems, Model ensembles, Variable importance

Review history: Received 8 September 2016, Revised 22 April 2017, Accepted 7 May 2017, Available online 15 May 2017, Version of Record 19 August 2017.

Paper URL: https://doi.org/10.1016/j.dss.2017.05.012