Poster No:
205
Submission Type:
Abstract Submission
Authors:
Tamoghna Chattopadhyay1, Saket Ozarkar1, Ketaki Buwa1, Sophia Thomopoulos2, Paul Thompson3
Institutions:
1University of Southern California, Los Angeles, CA, 2USC, Marina del Rey, CA, 3USC, Marina Del Rey, CA
First Author:
Co-Author(s):
Ketaki Buwa
University of Southern California
Los Angeles, CA
Introduction:
According to the World Health Organization (WHO), around 55 million individuals worldwide suffer from dementia, of whom 60-70% are diagnosed with Alzheimer's disease (AD) [1]. The defining features of AD include abnormal buildup of beta-amyloid (Aβ) plaques and tau protein tangles in the brain [2]. Amyloid positivity (Aβ+) is commonly assessed through positron emission tomography (PET) or sampling of cerebrospinal fluid (CSF) via lumbar puncture, but these procedures are costly and invasive. Although Aβ accumulation slightly precedes atrophy on MRI [3], there is interest in how well standard anatomical MRI can detect Aβ-related brain changes, which include atrophy and structural alterations. Deep learning methods such as Vision Transformers (ViTs) can capture long-range spatial dependencies in images using self-attention mechanisms, and show great promise in computer vision. We examined the ViT architecture's performance in predicting Aβ+ status from T1-weighted scans and compared it to the widely used 3D DenseNet convolutional neural network (CNN) architecture [6].
Methods:
We analyzed 3D T1-weighted (T1w) brain MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (phases 2/GO and 3) from 1,841 participants (age: 74.04 ± 7.40 years; 860 F/981 M): 889 cognitively normal (CN) controls, 658 with mild cognitive impairment (MCI), and 294 with AD; 946 were Aβ+ and 895 Aβ-. All 3D T1w brain MRI volumes were pre-processed using the following steps: nonparametric intensity normalization (N4 bias field correction), 'skull stripping' for brain extraction, 6 degrees of freedom registration to a template, and isotropic voxel resampling to 2 mm. These images, of size 91x109x91, were scaled to take values between 0 and 1 using min-max scaling, and registered to a template created using T1w MRI from the UK Biobank dataset in MNI space [7,8]. They were then resized to dimensions of 64x64x64 and 128x128x128 to guarantee direct correspondence with the patch sizes used in ViT models, and divided into train, validation and test sets in the ratio 80:10:10. The DenseNet architecture, which includes four dense blocks and three transition layers, was used as a baseline for comparison. In ViT architectures [9], the input image is divided into fixed-size patches, and the patch embeddings are combined with learnable position embeddings and a class token. The resulting sequence of vectors is fed into a transformer encoder, which comprises alternating layers of multi-head attention and a multi-layer perceptron (MLP). We used two different ViTs, the neuroimage transformer (NiT) and the multiple instance NiT (MINiT) [9], in which a learned block embedding was introduced to retain the position, within the scan, of the block containing each patch. Hyperparameters were tuned, and model performance was assessed using test accuracy and ROC-AUC.
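As a rough illustration of the min-max scaling and patching steps described above, the following minimal Python sketch rescales a volume to [0, 1] and splits it into non-overlapping cubic patches. It is not the authors' pipeline: the function names, the synthetic volume and the patch size of 8 are assumptions chosen for illustration, and a real ViT would follow this step with a learned linear projection, position embeddings and a class token.

```python
import numpy as np


def min_max_scale(volume):
    """Rescale intensities linearly to the range [0, 1]."""
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(volume, dtype=np.float32)
    return ((volume - lo) / (hi - lo)).astype(np.float32)


def to_patches(volume, patch):
    """Split a 3D volume into non-overlapping cubic patches.

    Returns an array of shape (num_patches, patch**3), with one
    flattened patch per row.
    """
    d, h, w = volume.shape
    if d % patch or h % patch or w % patch:
        raise ValueError("each dimension must be divisible by the patch size")
    blocks = volume.reshape(d // patch, patch, h // patch, patch, w // patch, patch)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)  # move the three within-patch axes last
    return blocks.reshape(-1, patch ** 3)


# Synthetic stand-in for a resized 64x64x64 T1w volume (not real data).
rng = np.random.default_rng(0)
volume = rng.normal(loc=100.0, scale=20.0, size=(64, 64, 64))

scaled = min_max_scale(volume)
patches = to_patches(scaled, patch=8)  # patch size 8 is an assumed value
print(patches.shape)  # (512, 512)
```

With these assumed settings, a 64x64x64 volume yields 8 x 8 x 8 = 512 patches of 512 voxels each. Resizing to a cube whose side is a multiple of the patch size is what makes this exact tiling possible.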

·Model Architectures
Results:
Results are shown in Table 1. The best performance was achieved by the MINiT architecture with an image size of 64x64x64, giving a test accuracy of 0.791 and a test ROC-AUC of 0.857. Thus, the MINiT architecture improved upon both the 3D DenseNet and NiT architectures. Tuning the attention heads, the learning rate, the encoder layers and the weight decay helped to improve model performance. In our experiments, performance for the 64x64x64 downscaled images was better than that for the 128x128x128 upsampled images.
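For reference, the two reported metrics can be computed as in the minimal sketch below, which assumes binary labels (1 = Aβ+) and model scores in [0, 1]. The function names, the 0.5 decision threshold and the toy values are illustrative assumptions, not the evaluation code or data used in this study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented purely for illustration.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

print(accuracy(labels, scores))
print(roc_auc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas ROC-AUC summarizes how well the scores rank positive above negative cases across all thresholds, which is one reason to report both.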

·Results Table
Conclusions:
We evaluated the ability of Vision Transformer architectures to infer Aβ+ from T1w brain MRI, benchmarked against the DenseNet121 architecture. In initial experiments, the MINiT architecture performed better than the other two architectures considered. Our results are promising, in that less invasive scans may be useful for screening individuals prior to more intrusive Aβ+ detection procedures. This study has limitations, including that testing has so far been limited to the ADNI dataset. Performance may improve by increasing the size and diversity of the training data, by including multimodal brain MRI, and by adding cohorts.
Disorders of the Nervous System:
Neurodegenerative/ Late Life (eg. Parkinson’s, Alzheimer’s) 1
Modeling and Analysis Methods:
Classification and Predictive Modeling 2
Keywords:
Cognition
Computational Neuroscience
Machine Learning
MRI
STRUCTURAL MRI
1|2 Indicates the priority used for review
References:
[1] World Health Organization, “Dementia,” 2022. https://www.who.int/news-room/fact-sheets/detail/dementia.
[2] Jack, C. R., Jr et al. “NIA-AA Research Framework: Toward a biological definition of Alzheimer's disease.” Alzheimer's & Dementia: the Journal of the Alzheimer's Association vol. 14, 4 (2018): 535-562.
[3] Jack, C. R., Jr, et al. (2013). Tracking pathophysiological processes in Alzheimer's disease: an updated hypothetical model of dynamic biomarkers. The Lancet Neurology, 12(2), 207–216.
[4] Chattopadhyay, T., et al. (2023). Predicting Brain Amyloid Positivity from T1 weighted brain MRI and MRI-derived Gray Matter, White Matter and CSF maps using Transfer Learning on 3D CNNs. bioRxiv.
[5] Matsoukas, C., et al., “Is it Time to Replace CNNs with Transformers for Medical Images?” 2021, [Online]. Available: http://arxiv.org/abs/2108.09038
[6] Huang, G., et al., “Densely Connected Convolutional Networks,” CVPR, 4700–4708 (2017).
[7] Zavaliangos-Petropulu, A., et al., “Diffusion MRI Indices and Their Relation to Cognitive Impairment in Brain Aging: The Updated Multi-protocol Approach in ADNI3”, Front Neuroinformatics 13:2 (2019).
[8] Thomopoulos, S., et al., “Diffusion MRI Metrics and their relation to Dementia Severity: Effect of Harmonization Approaches,” medRxiv (2021).
[9] Singla, A., et al., “Multiple Instance Neuroimage Transformer,” in Predictive Intelligence in Medicine: 5th International Workshop, MICCAI PRIME 2022, vol. 1, pp. 36–48.
[10] Radwan, N. (2019). Leveraging sparse and dense features for reliable state estimation in urban environments.