Poster No:
1365
Submission Type:
Abstract Submission
Authors:
Kan Keeratimahat1, Thomas Nichols2, Jian Kang3
Institutions:
1University of Oxford, Oxford, Oxfordshire, 2University of Oxford, Oxford, United Kingdom, 3University of Michigan, Ann Arbor, MI
First Author:
Kan Keeratimahat
University of Oxford
Oxford, Oxfordshire
Co-Author(s):
Thomas Nichols
University of Oxford
Oxford, United Kingdom
Jian Kang
University of Michigan
Ann Arbor, MI
Introduction:
Scalar-on-image regression, which models a univariate outcome per subject from image data, is a core machine learning task in neuroimaging but remains challenging due to the high-dimensional, heterogeneous nature of brain image data. While such prediction tasks must combine information from across the brain, conventional approaches neglect the complex spatial dependence and the potential for nonlinear association. Spatial Bayesian models can capture this dependence, but they do not necessarily scale to the biobank-sized datasets needed to detect subtle associations. Here we introduce a fast Bayesian model designed to address these issues.
Methods:
We propose a Bayesian Gaussian Process-induced Neural Network (GPNN) that captures nonlinear associations while using spatial smoothness constraints [1] to establish a biologically plausible network structure. Specifically, we employ scalable approximate GPs with a modified squared exponential kernel, affording control over the smoothness, concentration, and anchor points of each GP and allowing long-range, inter-regional dependencies in the association. While Bayesian approaches can be computationally intensive, our approach leverages low-rank basis functions and stochastic learning to handle large imaging datasets. Additionally, we employ Bayesian conjugate updates, eliminating the need for costly parameter grid searches while allowing the parameter distributions to adapt to the data. We propose three main architectures, each comprising a one-layer, fully connected, spatially informed neural network: GPNN-GP-GP [Figure 1], GPNN-Linear-GP, and GPNN-GP-Linear. For instance, GPNN-Linear-GP denotes a GPNN with a linear connection in the input layer and a GP connection in the hidden layer.
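To illustrate the low-rank idea, the sketch below approximates a squared exponential GP kernel with random Fourier features, one standard low-rank basis construction. This is an illustrative sketch only, not the authors' implementation: the function names, the plain (unmodified) squared exponential kernel, and all parameter values are assumptions for exposition.

```python
import numpy as np

def rff_features(X, num_features=200, lengthscale=1.0, seed=0):
    """Random Fourier features approximating a squared exponential kernel.

    k(x, x') = exp(-||x - x'||^2 / (2 * lengthscale^2)) is approximated by
    phi(X) @ phi(X).T, so a GP prior over a weight map reduces to a finite
    linear model f(x) ~ phi(x) @ beta with beta ~ N(0, I).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The spectral density of the SE kernel is Gaussian with scale 1/lengthscale.
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0.0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

# Toy usage: 500 points with 3-D coordinates (e.g. voxel locations).
X = np.random.default_rng(1).normal(size=(500, 3))
Phi = rff_features(X)            # (500, 200) low-rank feature matrix
K_approx = Phi @ Phi.T           # rank-200 approximation of the 500x500 kernel
```

The payoff is computational: working with the 500 x 200 feature matrix costs far less than factorising the full kernel matrix, which is what makes biobank-scale training feasible.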
Additionally, we implement a scalable approximate posterior sampling method, Stochastic Gradient Langevin Dynamics (SGLD) [2], on the trained neural network to obtain the full posterior distribution of our parameters and the corresponding posterior predictive distributions. This equips our model with the ability to quantify the uncertainty of individual predictions, a desirable yet uncommon feature in standard machine learning techniques.
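The SGLD update of Welling and Tsai's reference [2] is simple to state: a gradient half-step plus Gaussian noise whose variance matches the step size. The sketch below runs it on a toy one-dimensional Gaussian target; the target, step size, and iteration counts are illustrative assumptions, not the settings used in the abstract.

```python
import numpy as np

def sgld_step(theta, grad_log_post, step_size, rng):
    """One SGLD update (Welling & Teh, 2011):
    theta' = theta + (eps / 2) * grad_log_post(theta) + N(0, eps),
    where the gradient may be a minibatch estimate."""
    noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
    return theta + 0.5 * step_size * grad_log_post(theta) + noise

# Toy target: posterior N(mean=2, var=1), whose log-density gradient
# is -(theta - 2). SGLD samples should recover this distribution.
rng = np.random.default_rng(0)
theta = np.zeros(1)
samples = []
for t in range(20000):
    theta = sgld_step(theta, lambda th: -(th - 2.0), 0.01, rng)
    if t >= 2000:                    # discard burn-in
        samples.append(theta.copy())
samples = np.concatenate(samples)    # draws concentrating around 2
```

Because each step needs only a stochastic gradient, the same loop that trains the network can be reused for posterior sampling, which is what makes the predictive intervals affordable at scale.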
We apply our method to predict brain age using T1-weighted structural images from the UK Biobank dataset (p >> 100,000 voxels; n = 4,000 subjects). In this study, a brain atlas comprising 12 grey-matter regions provides the spatial prior information.

Figure 1: Visualisation of the proposed GPNN-GP-GP architecture. The example diagram illustrates a GPNN-GP-GP architecture with three regions of interest.
Results:
All the GPNN models achieve superior predictive accuracy relative to the GP regression model, and in particular GPNN-Linear-GP outperforms the Ridge and LASSO models as well [Figure 2a]. Additionally, the saliency obtained from the GPNN reveals an interpretable model representation based on the incorporated spatial prior, distinguishing it from traditional methods. Specifically, the GPNN saliency maps are spatially smooth, unlike those of Ridge or LASSO, which lack such spatial structure [Figure 2b]. Furthermore, the adaptation of SGLD provides posterior predictive intervals that are accurate relative to Metropolis-Hastings [Figure 2a].

Figure 2a: Real-data optimisation performance and posterior inference performance (median). Figure 2b: Absolute saliency values (GPNN) and magnitudes of fitted coefficients (regressions).
Conclusions:
The proposed GPNN models demonstrate superior performance compared to traditional models, showcasing their potential for accurate and interpretable predictions. Notably, the GPNN-Linear-GP model was the top performer. A limitation of our work is the long computation time required to train the network; however, in this initial work all computations were performed on a CPU, and substantial speed-ups will be possible with a GPU implementation. In future work, we plan to extend the GPNN framework to incorporate multiple imaging modalities as well as non-imaging data.
Modeling and Analysis Methods:
Bayesian Modeling 1
Classification and Predictive Modeling 2
Methods Development
Keywords:
Machine Learning
Statistical Methods
STRUCTURAL MRI
Other - Bayesian Method
1|2 indicates the priority used for review
References:
Kang, J., Reich, B. J., & Staicu, A.-M. (2018). Scalar-on-image regression via the soft-thresholded Gaussian process. Biometrika, 105(1), 165–184.
Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on Machine Learning (ICML-11), 681–688.