Author: David Wright
Co-authors: Giovanni Montesano, Katie Graham, Timos Naskas, Usha Chakravarthy, Bethany Higgins, Frank Kee, David Crabb, Ruth Hogg
Abstract
Purpose
To evaluate the extent to which visual function measures, analysed using interpretable machine learning techniques, can discriminate among eyes with age-related macular degeneration (AMD), diabetes, or neither condition.
To identify which variables among a panel of 22 visual function measures best discriminate among eyes, as potential functional markers for use in future clinical trials.
Setting/Venue
1122 participants (2244 eyes) from the first wave of the Northern Ireland Sensory Ageing study, UK. Median age was 65 and 59% were female. The sample comprised 310 individuals (440 eyes) with AMD, 304 individuals (584 eyes) with diabetes and the remainder with neither condition.
Methods
AMD stage (Beckman classification) and presence of diabetic retinopathy (DR) were determined using fundus photography and OCT imaging.
Three discriminant tasks were set:
1. Distinguishing different stages of AMD (Beckman classes 0-3).
2. Distinguishing eyes of those with diabetes but no DR from eyes of those without diabetes, to investigate whether early diabetic damage can be detected by measuring visual function.
3. Distinguishing eyes of those with DR from eyes of those with diabetes but no DR, to investigate whether the clinically relevant appearance of DR features can be detected by measuring visual function.
Appropriate subsets of the data were used for each task; AMD eyes were excluded from the diabetes analysis and vice versa. For each task, predictor variables were age, sex and the visual function measures. Missing measurements were imputed using chained equations.
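Multiple imputation by chained equations, as used here for the missing measurements, can be sketched as follows. This is a minimal illustration with a hypothetical toy matrix, using scikit-learn's `IterativeImputer` (a chained-equations-style imputer), not the study's actual pipeline:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical toy data: rows = eyes, columns = visual function measures,
# with some measurements missing at random (NaN).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[rng.random(X.shape) < 0.1] = np.nan  # ~10% missing

# Chained equations: each variable with missing values is regressed on
# the remaining variables, cycling through them for several rounds.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)

assert not np.isnan(X_imputed).any()  # every gap has been filled
```

In practice several imputed datasets are typically generated and analysed, with results pooled across them; the sketch shows a single imputation only.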
An ensemble of machine learning models was applied for each task, using the SuperLearner algorithm to find the optimum weighting of component models. Predictive performance of the ensemble was compared to standard statistical methods (multiple regression).
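The SuperLearner idea of weighting component models via cross-validation can be sketched with scikit-learn's `StackingClassifier`, a close analogue in which a meta-learner is fitted to out-of-fold predictions from the base models. The data and base learners below are hypothetical stand-ins, not the study's actual models:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data (eyes x visual function measures).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# SuperLearner-style ensemble: out-of-fold predictions from each base
# learner become inputs to a meta-learner, which finds the weighting of
# component models that best predicts the outcome.
ensemble = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # cross-validation folds used to generate meta-learner inputs
)
ensemble.fit(X, y)
accuracy = ensemble.score(X, y)
```

The cross-validation step is what distinguishes this from naive model averaging: the meta-learner never sees predictions made on a base model's own training folds, which guards against overweighting overfitted components.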
An interpretable machine learning approach (SHAP values) was used to identify variables with the greatest influence on the ensemble predictions, and to identify clusters of eyes where predictions were made for similar reasons.
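The SHAP approach attributes each prediction to the input variables. For a linear model with independent features, SHAP values have a closed form that illustrates the core property (local accuracy: attributions sum to the prediction minus the average prediction). The weights and data below are hypothetical; real analyses typically use the `shap` package with the fitted ensemble:

```python
import numpy as np

# For a linear model f(x) = w.x with independent features, the SHAP
# value of feature i for a sample x is phi_i = w_i * (x_i - E[x_i]).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))            # hypothetical standardised measures
w = np.array([0.8, -0.5, 0.3, 0.0])     # hypothetical model weights
preds = X @ w

shap_values = w * (X - X.mean(axis=0))  # one row of attributions per eye

# Local accuracy: the attributions explain each prediction exactly,
# relative to the mean prediction over the dataset.
assert np.allclose(shap_values.sum(axis=1), preds - preds.mean())
```

Clustering eyes by their SHAP attribution vectors, as described above, then groups together predictions made for similar reasons.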
Results
The ensemble machine learning approach correctly classified AMD stage in 88% of eyes. Model sensitivity was highest for AMD stage 0 (100% of predictions correct) and stage 3 (81% correct). Sensitivity was lower for stages 1 and 2, at 40% and 44% respectively. Most misclassifications were to AMD stage 0. Using multiple regression, only 71% of eyes were correctly classified, with sensitivities of 97%, 0%, 0% and 15% for stages 0-3, respectively. Features that drove predictions of AMD stages 1-3 included below average microperimetry sensitivity and above average Smith-Kettlewell low luminance near visual acuity.
The ensemble approach correctly distinguished those with diabetes but no DR from those with no diabetes in 99% of eyes, achieving 96% sensitivity for detecting diabetes but no DR. In contrast, multiple regression correctly classified 84% of eyes but only achieved 10% sensitivity. Features that drove predictions of diabetes included below average reading speed and below average microperimetry sensitivity.
For task 3, the ensemble approach correctly distinguished those with DR from those with diabetes but no DR in 98% of eyes, achieving 99% sensitivity for detecting DR. Multiple regression correctly classified 65% of eyes and only achieved 66% sensitivity. Below average reading speed was a prominent predictor of DR.
Conclusions
Ensemble machine learning using measures of visual function demonstrated good performance in distinguishing eyes with AMD stage 3 from those with AMD stage 0. Performance was poor in distinguishing stages 1 and 2 from stage 0, indicating that functional deficits are considerably less pronounced at these stages.
Machine learning demonstrated excellent performance in distinguishing a) those with diabetes but no DR from those without diabetes and b) those with DR from those with diabetes but no DR.
Interpretable machine learning techniques enabled us to identify variables that contributed most strongly towards each model prediction and to separate those predictions resulting from artefacts in the data (principally missing measurements) from those relying on true features. These approaches show potential for functional monitoring of AMD and diabetes.
Financial disclosures
David Wright, (None);
Giovanni Montesano, CenterVue (C), Ivantis (C), Omikron (C);
Katie Graham, (None);
Timos Naskas, (None);
Usha Chakravarthy, Bayer (F), Roche (E);
Bethany Higgins, (None);
Frank Kee, (None);
David Crabb, CenterVue (F, C), Apellis (C, F), Santen (F, S), Allergan (C, F), Thea (C), Medisoft (S), Roche (C);
Ruth Hogg, Optos Plc (F), Novartis (F)