Explainable AI for High-Dimensional Neuroimaging Data

Poster No:

1459 

Submission Type:

Abstract Submission 

Authors:

Julia Kropiunig1, Øystein Sørensen1

Institutions:

1Centre for Lifespan Changes in Brain and Cognition, Department of Psychology, University of Oslo, Oslo, Norway

First Author:

Julia Kropiunig  
Centre for Lifespan Changes in Brain and Cognition, Department of Psychology, University of Oslo
Oslo, Norway

Co-Author:

Øystein Sørensen  
Centre for Lifespan Changes in Brain and Cognition, Department of Psychology, University of Oslo
Oslo, Norway

Introduction:

The availability of large neuroimaging datasets has led to increased use of machine learning techniques for their analysis. These models are usually complex and therefore not easily interpretable by humans, yet explaining and interpreting such seemingly incomprehensible "black box" models is crucial for both scientific and regulatory reasons. Model interpretability has two aspects: explanations of individual predictions and descriptions of overall model behaviour, commonly referred to as local and global interpretability (Covert et al., 2021). Explainable AI methods are still at an early stage of development, particularly when the feature space is very high-dimensional and/or features are highly correlated. The present work discusses several interpretability methods, both local and global, in the context of correlated features in a high-dimensional feature space and demonstrates their application to tabular neuroimaging data.

Methods:

We used magnetic resonance imaging (MRI) data from the UK Biobank (https://www.ukbiobank.ac.uk), comprising 44164 participants aged 40 to 70 at recruitment (mean=55.09, standard deviation (SD)=7.56) who had both a T1-weighted MRI scan and a neuropsychological test score for fluid intelligence (mean=7.61, SD=2.06). Two machine learning models were trained with the tree-ensemble gradient boosting library XGBoost (Chen and Guestrin, 2016), the first predicting age and the second fluid intelligence, from subcortical volumes (ASEG) and parcellations of the white surface according to the Desikan-Killiany-Tourville atlas (285 features in total). Training used a 75:25 training-test split with a grid search over tree depth, L1 regularization, L2 regularization and learning rate. Global and local interpretability methods were then applied to both trained models and compared. For explanations of individual predictions, we used the well-established model-agnostic methods LIME (Ribeiro et al., 2016) and Shapley values (Lundberg and Lee, 2017; Shapley, 1953), as well as recent refinements of Shapley values that account for feature correlations (Aas et al., 2021; Jullum et al., 2021). The global interpretability methods included partial dependence plots (PDPs) (Greenwell, 2017) and SAGE, a global feature importance measure based on Shapley values (Covert et al., 2020).
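As an illustration of this pipeline, the sketch below shows how such a model could be set up in Python with xgboost and scikit-learn. The variable names, the synthetic placeholder data and the hyperparameter grid values are assumptions for demonstration only, not the configuration used in this work.

    # Sketch of the training pipeline: XGBoost regression with a grid search over
    # tree depth, L1/L2 regularization and learning rate on a 75:25 split.
    import numpy as np
    import xgboost as xgb
    from sklearn.model_selection import GridSearchCV, train_test_split

    # Placeholder data standing in for the 285 brain measures (X) and age (y).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 285))
    y = rng.normal(loc=55, scale=7.5, size=1000)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    param_grid = {                       # illustrative grid values
        "max_depth": [2, 4, 6],          # tree depth
        "reg_alpha": [0.0, 0.1, 1.0],    # L1 regularization
        "reg_lambda": [0.1, 1.0, 10.0],  # L2 regularization
        "learning_rate": [0.01, 0.1, 0.3],
    }

    search = GridSearchCV(
        xgb.XGBRegressor(objective="reg:squarederror", n_estimators=500),
        param_grid,
        scoring="neg_root_mean_squared_error",
        cv=5,
    )
    search.fit(X_train, y_train)
    model = search.best_estimator_

The same setup applies to the fluid-intelligence model, with y replaced by the test scores.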

Results:

The model predicting fluid intelligence established only a weak association between the response and the brain measures (RMSE = 2.00, MAE = 1.61 on the test set), whereas the model predicting brain age showed decent predictive power (RMSE = 4.49, MAE = 3.60). PDPs did not give discernible insights into global model behaviour, owing to small effects and high feature correlations, and were clearly surpassed by the feature importance measures. The supporting figure visualizes Shapley values for the age-prediction model, computed according to Aas et al. (2021) and Jullum et al. (2021) to account for correlations among features. Since the computational effort grows with the dimension of the feature space, the brain areas were grouped into 28 theoretically plausible groups. Shapley values quantify how much a feature, or group of features, contributes to a specific prediction. For the participant shown, the values of the Accumbens, Caudate and Amygdala groups add several years to the mean age, whereas the Thalamus and Putamen groups subtract years from the prediction. The prediction is recovered by adding all positive and negative contributions to the mean age.
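In formula form, the grouped Shapley decomposition with the conditional value function of Aas et al. (2021) and Jullum et al. (2021) can be written as follows (a standard statement of these quantities, included here for reference):

    \phi_k(x) = \sum_{S \subseteq \mathcal{G} \setminus \{k\}}
        \frac{|S|!\,(|\mathcal{G}| - |S| - 1)!}{|\mathcal{G}|!}
        \bigl( v(S \cup \{k\}) - v(S) \bigr),
    \qquad v(S) = \operatorname{E}\bigl[ f(x) \mid x_S \bigr],

    f(x) = \phi_0 + \sum_{k \in \mathcal{G}} \phi_k(x),
    \qquad \phi_0 = \operatorname{E}\bigl[ f(X) \bigr],

where \mathcal{G} denotes the 28 feature groups, f is the fitted model and \phi_0 is (approximately) the mean predicted age; the second line is the additivity property used to assemble the prediction in the figure.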
Supporting Image: age_pred.png
   Age prediction explanation for one participant from the test set
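The grouped explanations themselves were computed with the dependence-aware Shapley framework of Aas et al. (2021) and Jullum et al. (2021), implemented, for example, in the R package shapr. As a rough Python stand-in that illustrates only the grouping and additivity ideas (it ignores the correction for correlated features and is therefore not equivalent to the method used here), per-feature SHAP values from the shap library can be summed within anatomically defined groups; the group names and column assignments below are hypothetical.

    # Approximate group-level contributions by summing per-feature SHAP values
    # within predefined regional groups; 'model' and 'X_test' come from the
    # training sketch above.
    import shap

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_test)   # shape: (n_test, 285)
    base_value = explainer.expected_value         # mean prediction (phi_0)

    # Hypothetical grouping: group name -> column indices of its features.
    groups = {
        "Accumbens": [0, 1],
        "Caudate": [2, 3],
        "Thalamus": [4, 5],
        # ... remaining groups, together covering all 285 columns
    }

    i = 0   # explain a single test-set participant
    group_contrib = {name: shap_values[i, cols].sum()
                     for name, cols in groups.items()}

    # Additivity: base value plus all contributions reconstructs the prediction
    # (exactly only when the groups cover every feature).
    reconstructed = base_value + sum(group_contrib.values())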
 

Conclusions:

Shapley values appeared to give good local explanations, but they have a computational bottleneck that becomes apparent in a high-dimensional feature space, and even more so in combination with a very large sample size, leading to the well-known trade-off between accuracy and computation time.
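To put the bottleneck in concrete terms: exact Shapley values require evaluating the value function for every coalition of features, i.e. 2^d subsets for d features, so the grouping used here reduces the combinatorial burden substantially,

    2^{285} \approx 6.2 \times 10^{85} \quad \text{(individual features)}
    \qquad \text{vs.} \qquad
    2^{28} \approx 2.7 \times 10^{8} \quad \text{(feature groups)},

although sampling-based approximations are still needed in practice; these figures are simple coalition counts, not runtimes measured in this work.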

Modeling and Analysis Methods:

Classification and Predictive Modeling 1
Methods Development 2

Keywords:

Cognition
Data analysis
Machine Learning
Modeling
MRI
STRUCTURAL MRI

1|2 Indicates the priority used for review

References:

Aas, K., Jullum, M., and Løland, A. (2021). Explaining individual predictions when features are
dependent: More accurate approximations to Shapley values. Artificial Intelligence, 298:103502.
Chen, T. and Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of
the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages
785–794, San Francisco California USA. ACM.
Covert, I., Lundberg, S., and Lee, S.-I. (2020). Understanding Global Feature Contributions With
Additive Importance Measures. arXiv:2004.00668 [cs, stat].
Covert, I., Lundberg, S., and Lee, S.-I. (2021). Explaining by removing: A unified framework for model
explanation. Journal of Machine Learning Research, 22(209):1–90.
Greenwell, B. M. (2017). pdp: An R Package for Constructing Partial Dependence Plots. The R
Journal, 9(1):421.
Jullum, M., Redelmeier, A., and Aas, K. (2021). groupShapley: Efficient prediction explanation with
Shapley values for feature groups. arXiv preprint.
Lundberg, S. and Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions.
arXiv:1705.07874 [cs, stat].
Ribeiro, M. T., Singh, S., and Guestrin, C. (2016). "Why Should I Trust You?": Explaining the
Predictions of Any Classifier. arXiv:1602.04938 [cs, stat].
Shapley, L. S. (1953). A value for n-person games. Princeton University Press, Princeton.