Multi-domain and Uni-domain Fusion for domain-generalizable fMRI-based phenotypic prediction

Poster No:

1392 

Submission Type:

Abstract Submission 

Authors:

Pansheng Chen1, Lijun An1, Naren Wulan1, Chen Zhang1, Shaoshi Zhang1, Leon Ooi1, Ru Kong1, Jianxiao Wu2, Sidhant Chopra3, Danilo Bzdok4, Simon Eickhoff5, Avram Holmes6, B. T. Thomas Yeo1

Institutions:

1National University of Singapore, Singapore, Singapore, 2Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, Jülich, Germany, 3Yale University, New Haven, CT, 4McConnell Brain Imaging Centre (BIC), Montreal Neurological Institute (MNI), McGill University, Montreal, Quebec, 5Institute for Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf, Düsseldorf, North Rhine–Westphalia Land, 6Department of Psychiatry, Brain Health Institute, Rutgers University, Piscataway, NJ

First Author:

Pansheng Chen  
National University of Singapore
Singapore, Singapore

Co-Author(s):

Lijun An  
National University of Singapore
Singapore, Singapore
Naren Wulan  
National University of Singapore
Singapore, Singapore
Chen Zhang  
National University of Singapore
Singapore, Singapore
Shaoshi Zhang  
National University of Singapore
Singapore, Singapore
Leon Ooi  
National University of Singapore
Singapore, Singapore
Ru Kong  
National University of Singapore
Singapore, Singapore
Jianxiao Wu  
Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich
Jülich, Germany
Sidhant Chopra  
Yale University
New Haven, CT
Danilo Bzdok  
McConnell Brain Imaging Centre (BIC), Montreal Neurological Institute (MNI), McGill University
Montreal, Quebec
Simon Eickhoff  
Institute for Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf
Düsseldorf, North Rhine–Westphalia Land
Avram Holmes  
Department of Psychiatry, Brain Health Institute, Rutgers University
Piscataway, NJ
B. T. Thomas Yeo  
National University of Singapore
Singapore, Singapore

Introduction:

Resting-state functional connectivity (RSFC) is widely used to predict phenotypes in individuals[1,2]. Because small sample sizes are often unavoidable[3-5], recent work has sought to translate models trained on large neuroimaging datasets to predict phenotypes in small target datasets[5,6]. However, predictive models may fail to generalize to new datasets due to differences in population, data collection, and processing across datasets[7,8]. Here we propose a method named Multi-domain and Uni-domain Fusion (MUF) to enhance model generalizability, which outperformed 4 strong baselines on 6 target datasets.

Methods:

We used 419 × 419 RSFC matrices to predict phenotypes[5] in 7 datasets: UK Biobank[10,11], ABCD[12], GSP[13], HBN[14], eNKI[15], HCP-YA[16] and HCP-Aging[17]. We performed a leave-one-dataset-out test on every dataset except UK Biobank, which, as the largest dataset, was always kept for training. Each of the other 6 datasets was used in turn as the target dataset, with all remaining datasets serving as source datasets (Fig 1A). Predictive models trained on the source datasets were adapted to K participants (K-shot) in the target dataset to predict target phenotypes. The adapted models were evaluated on the remaining test participants. This procedure was repeated 100 times for stability. We compared MUF against 4 baselines (Fig 1B): classical Kernel Ridge Regression (KRR), meta-matching with stacking[5], and two methods from our previous work, meta-matching with dataset stacking and multilayer meta-matching[9].
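
The K-shot evaluation described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names (`k_shot_splits`, `pearson_r`, `evaluate_k_shot`), the random seed, and the generic `fit_predict` callback are all assumptions introduced here, and the sketch assumes each repetition draws a fresh random set of K participants.

```python
import numpy as np

def k_shot_splits(n_participants, k, n_repeats=100, seed=0):
    """Yield (k_shot_idx, test_idx) pairs for repeated random K-shot splits."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n_participants)
        # First k shuffled participants adapt the model; the rest are held out.
        yield perm[:k], perm[k:]

def pearson_r(y_true, y_pred):
    """Pearson's correlation between observed and predicted phenotype values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def evaluate_k_shot(fit_predict, X, y, k, n_repeats=100, seed=0):
    """Mean test accuracy over repeated K-shot splits of one target dataset.

    fit_predict(X_k, y_k, X_test) is a hypothetical callback that adapts a
    model on the K-shot participants and returns test predictions.
    """
    scores = []
    for k_idx, test_idx in k_shot_splits(len(y), k, n_repeats, seed):
        y_pred = fit_predict(X[k_idx], y[k_idx], X[test_idx])
        scores.append(pearson_r(y[test_idx], y_pred))
    return float(np.mean(scores))
```

Any of the compared methods could be plugged in as `fit_predict`, so that all methods see identical K-shot and test participants in each repetition.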

Classical KRR models were trained on K-shot to predict target phenotypes[5]. For meta-matching with stacking[5], a feedforward deep neural network (DNN) was trained on UK Biobank to predict 67 phenotypes. The base DNN was applied to K-shot and DNN predictions were used as features to train a KRR model on K-shot to predict target phenotypes (i.e. stacking). Meta-matching with dataset stacking[9] extends meta-matching with stacking by training separate KRR/DNN models for each source dataset and then performing stacking on K-shot to predict target phenotypes. To resolve the sample size imbalance across source datasets, multilayer meta-matching[9] gradually applied stacking from larger source datasets to smaller datasets to boost prediction accuracies. These predictive models then underwent another round of stacking using K-shot to predict target phenotypes.
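
The stacking step shared by the meta-matching variants can be illustrated with a short sketch. It assumes a linear kernel and a fixed regularization strength `alpha`, neither of which is specified in this abstract; `base_predict` is a hypothetical stand-in for a trained base model (for example, the DNN that outputs 67 UK Biobank phenotype predictions).

```python
import numpy as np

def krr_fit_predict(F_train, y_train, F_test, alpha=1.0):
    """Kernel ridge regression with a linear kernel (closed-form dual solution)."""
    # Centre features and target using K-shot statistics only.
    mu, y_mu = F_train.mean(axis=0), y_train.mean()
    A, B = F_train - mu, F_test - mu
    K = A @ A.T  # (K-shot x K-shot) kernel matrix
    dual = np.linalg.solve(K + alpha * np.eye(len(K)), y_train - y_mu)
    return (B @ A.T) @ dual + y_mu

def stack_on_k_shot(base_predict, X_k, y_k, X_test, alpha=1.0):
    """Use base-model outputs as features for a KRR fitted on the K-shot participants."""
    return krr_fit_predict(base_predict(X_k), y_k, base_predict(X_test), alpha)
```

The design point is that the base model is never retrained on the target data: only the small KRR on top of its outputs is fitted to the K participants.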

Of note, the above 4 baselines trained base models independently on each dataset. Fig 1C shows our newly proposed MUF method, which combines cross-domain and intra-domain training to learn both domain-general and domain-specific information. Building on multilayer meta-matching, we additionally employed a multi-domain strategy that trains a joint multi-task DNN on all source datasets together, enabling it to offset site differences to some extent and thus generalize better to unseen datasets. Predictions from this joint DNN were concatenated with predictions from the multilayer meta-matching method and then used as features of the stacking model to predict target phenotypes.
Supporting Image: figures-1.png
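
The fusion step of MUF described in the Methods, in which predictions from the multi-domain and uni-domain branches are concatenated before stacking, might look roughly like the sketch below. The function names, the linear-kernel KRR, and the `alpha` value are illustrative assumptions; the arrays stand in for the outputs of the joint DNN and of multilayer meta-matching, which are not reproduced here.

```python
import numpy as np

def fuse_predictions(multi_domain_pred, uni_domain_pred):
    """Concatenate per-participant predictions from the two branches column-wise."""
    return np.concatenate([multi_domain_pred, uni_domain_pred], axis=1)

def fit_stacking_krr(F_k, y_k, alpha=1.0):
    """Fit a linear-kernel KRR on fused K-shot features; return a predict function."""
    mu, y_mu = F_k.mean(axis=0), y_k.mean()
    A = F_k - mu
    dual = np.linalg.solve(A @ A.T + alpha * np.eye(len(A)), y_k - y_mu)
    # New participants are centred with the K-shot mean before prediction.
    return lambda F_new: ((F_new - mu) @ A.T) @ dual + y_mu
```

Because the stacking model sees both sets of predictions at once, it can weight domain-general and domain-specific features according to what is useful for the target phenotype.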
 

Results:

Fig 2 shows the prediction accuracy (Pearson's correlation) in 6 target datasets. All reported p values survived a false discovery rate of q < 0.05.

We found that the original meta-matching with stacking[5] was not better than classical KRR on some datasets (e.g., HBN), but our proposed methods improved on it by incorporating more diverse source datasets. In all test datasets, MUF consistently outperformed all baselines, indicating that MUF generalizes better from different source datasets and is more robust on new target datasets. Using the Wilcoxon signed-rank test, MUF was better than meta-matching with dataset stacking[9] (p < 0.02 for all K) and multilayer meta-matching[9] (p < 0.02 for K > 20).
Supporting Image: figures-2.png
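
For readers unfamiliar with the multiple-comparisons correction mentioned in the Results, the sketch below shows one common way to control the false discovery rate at q < 0.05. It assumes the Benjamini-Hochberg procedure, which this abstract does not name explicitly, and takes p values as given (for instance, from paired Wilcoxon signed-rank tests computed elsewhere).

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of p values that survive FDR control at level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p value is compared against q * i / m.
    thresholds = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    survive = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.nonzero(passed)[0].max()  # largest rank meeting its threshold
        survive[order[:cutoff + 1]] = True
    return survive
```

Note that every p value ranked at or below the cutoff survives, even if it individually exceeded its own threshold.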
 

Conclusions:

We propose a method that uses both multi-domain and uni-domain training to translate phenotypic prediction models from multiple source datasets to small target datasets. MUF performed best across all 6 test datasets.

Modeling and Analysis Methods:

Classification and Predictive Modeling 1
Connectivity (e.g., functional, effective, structural) 2
Methods Development

Keywords:

Computational Neuroscience
FUNCTIONAL MRI
Machine Learning
Modeling

1|2 Indicates the priority used for review

References:

1. Eickhoff, S. B. (2019). Neuroimaging-based prediction of mental traits: Road to utopia or Orwell?. PLoS biology, 17(11), e3000497
2. Varoquaux, G. (2019). Predictive models avoid excessive reductionism in cognitive neuroimaging. Current opinion in neurobiology, 55, 1-6
3. Arbabshirani, M. R. (2017). Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls. Neuroimage, 145, 137-165
4. Poldrack, R. A. (2020). Establishment of best practices for evidence for prediction: a review. JAMA psychiatry, 77(5), 534-540
5. He, T. (2022). Meta-matching as a simple framework to translate phenotypic predictive models from big to small data. Nature Neuroscience, 1-10
6. Lu, B. (2022). A practical Alzheimer’s disease classifier via brain imaging-based deep learning on 85,721 samples. Journal of Big Data, 9(1), 1-22
7. Kakarmath, S. (2020). Best practices for authors of healthcare-related artificial intelligence manuscripts. NPJ digital medicine, 3(1), 134
8. Abraham, A. (2017). Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example. NeuroImage, 147, 736-745
9. Chen, P. (2023). Multilayer meta-matching: translating phenotypic prediction models from multiple datasets to small data. In Prep
10. Sudlow, C. (2015). UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS medicine, 12(3), e1001779
11. Miller, K. L. (2016). Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nature neuroscience, 19(11), 1523-1536
12. Volkow, N. D. (2018). The conception of the ABCD study: From substance use to a broad NIH collaboration. Developmental cognitive neuroscience, 32, 4-7
13. Holmes, A. J. (2015). Brain Genomics Superstruct Project initial data release with structural, functional, and behavioral measures. Scientific data, 2(1), 1-16
14. Alexander, L. M. (2017). An open resource for transdiagnostic research in pediatric mental health and learning disorders. Scientific data, 4(1), 1-26
15. Nooner, K. B. (2012). The NKI-Rockland sample: a model for accelerating the pace of discovery science in psychiatry. Frontiers in neuroscience, 6, 152
16. Van Essen, D. C. (2013). The WU-Minn human connectome project: an overview. Neuroimage, 80, 62-79
17. Harms, M. P. (2018). Extending the Human Connectome Project across ages: Imaging protocols for the Lifespan Development and Aging projects. Neuroimage, 183, 972