Poster No:
1408
Submission Type:
Abstract Submission
Authors:
Jessica Dafflon1, Dustin Moraczewski1, Eric Earl1, Gabriel Loewinger1, Patrick McClure2, Adam Thomas1, Francisco Pereira1
Institutions:
1National Institute of Mental Health, Bethesda, MD, 2Naval Postgraduate School, Monterey, CA
First Author:
Jessica Dafflon
National Institute of Mental Health
Bethesda, MD
Co-Author(s):
Eric Earl
National Institute of Mental Health
Bethesda, MD
Adam Thomas
National Institute of Mental Health
Bethesda, MD
Introduction:
One of the ultimate objectives of neuroimaging research is to create predictive models that reveal the connection between patterns of functional connectivity and observable characteristics or behaviors, often called phenotypes. Previous studies have shown that models trained to predict behavioral measures from an individual's functional connectivity achieve modest to poor performance [1, 2, 5]. One possible reason is that brain-behavior models have focused on meticulously chosen individual phenotypes scrutinized in isolation. Prior work has shown that predicting latent phenotypes, i.e., low-dimensional representations of the phenotypes obtained with dimensionality reduction algorithms, can yield better performance than predicting individual phenotypes [5]. Those results, however, were mainly evaluated on a single dataset. To address this gap, we compare the predictive performance obtained from untransformed phenotypes and latent phenotypes, and assess the model's performance with and without regressing out covariates.
Methods:
We analyzed the reliability and predictability of phenotypes and latent phenotype components within two large neuroimaging datasets: the Human Connectome Project (HCP; N=964; development set N=856; test set N=108) [6] and the Philadelphia Neurodevelopmental Cohort (PNC; N=973; development set N=864; test set N=109) [4].
We chose phenotypes that span the behavioral domains of cognition and personality. We extracted singular value decomposition (SVD) representations of these phenotypes and assessed whether the latent phenotypes are more predictable than the original phenotype variables. For the prediction experiments, we used a Ridge regression model, estimating the variability of the results over 100 bootstrap samples. We also assessed the impact of regressing age, sex, and their squared interaction out of the phenotypes on prediction performance. Additionally, we explored how removing unreliable phenotypic information from the training targets changes the model's performance, and evaluated whether the change in predictive performance is significant using a critical difference analysis [3].
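The pipeline described above (covariate regression, SVD of the phenotypes, and Ridge prediction with bootstrap resampling) can be sketched as follows. This is a minimal illustration with simulated data; the variable names, dimensions, covariate terms, and the alpha value are assumptions for the sketch, not the study's actual code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated stand-ins for the real data: functional-connectivity
# features and a phenotype matrix (dimensions are illustrative).
n_subjects, n_fc, n_pheno = 500, 100, 20
fc = rng.normal(size=(n_subjects, n_fc))
phenotypes = rng.normal(size=(n_subjects, n_pheno))
age = rng.uniform(8, 22, size=n_subjects)
sex = rng.integers(0, 2, size=n_subjects).astype(float)

# Regress covariates out of each phenotype and keep the residuals
# as prediction targets (the exact covariate terms are an assumption).
covariates = np.column_stack([age, sex, age ** 2, age * sex])
residuals = phenotypes - LinearRegression().fit(covariates, phenotypes).predict(covariates)

# SVD of the standardized phenotype residuals: the columns of U scaled
# by the singular values give per-subject latent-phenotype loadings.
z = (residuals - residuals.mean(axis=0)) / residuals.std(axis=0)
U, s, Vt = np.linalg.svd(z, full_matrices=False)
latent = U * s  # subjects x components loading matrix

# Ridge prediction of the first latent phenotype, with bootstrap
# resampling of the development set to estimate result variability.
X_dev, X_test, y_dev, y_test = train_test_split(
    fc, latent[:, 0], test_size=0.1, random_state=0)
scores = []
for _ in range(100):
    idx = rng.integers(0, len(X_dev), len(X_dev))  # bootstrap sample
    model = Ridge(alpha=1.0).fit(X_dev[idx], y_dev[idx])
    scores.append(np.corrcoef(model.predict(X_test), y_test)[0, 1])

print(f"mean test correlation over 100 bootstraps: {np.mean(scores):.3f}")
```

With purely random features and targets, as here, the mean correlation hovers near zero; the structure of the loop (refit on each bootstrap sample, score on a held-out test set) is the point of the sketch.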
Results:
If not accounted for, covariates can inflate prediction results; this is particularly visible in the relation between the phenotypic variable "strength unadjusted" and sex in the HCP dataset (r=0.6 before adjustment; r=0.1 after). Even after accounting for the covariates, prediction performance is low in both datasets. To assess whether latent phenotypes were more predictable than individual phenotypes, we computed the SVD of the phenotypes in both datasets. Remarkably, only the loadings of the first 4-6 SVD components were reliably identified when the same experiment was repeated on different splits of the same dataset (Fig 1). Finally, we examined how removing unreliable phenotypic information (cutoff 30/100 in our reliability analysis) from the training targets changes the model's performance. Reconstructing the phenotypes using only the first five components achieved performance very similar to using all components, indicating that most of the information relevant to prediction is carried by the first components and the remainder can be filtered out without harming predictive performance (Fig 2).
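The rank-5 reconstruction used above can be sketched as a truncated SVD: keep only the first k components and rebuild the phenotype matrix from them, discarding the unreliable remainder before training targets are derived. The data and dimensions below are simulated stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative phenotype matrix (subjects x phenotypes); in the study
# this would be the covariate-adjusted phenotype matrix.
phenotypes = rng.normal(size=(300, 20))

# Truncated-SVD reconstruction: retain only the first k components.
U, s, Vt = np.linalg.svd(phenotypes, full_matrices=False)
k = 5
denoised = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The rank-k reconstruction retains exactly the share of variance
# carried by the first k singular values.
explained = (s[:k] ** 2).sum() / (s ** 2).sum()
print(f"rank-{k} reconstruction keeps {explained:.1%} of the variance")
```

Training on `denoised` instead of `phenotypes` is the "filtering out unreliable components" step: the discarded components contribute nothing to the reconstructed targets.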


Conclusions:
If not accounted for, covariates can inflate the performance of predictive models. We observed that, apart from the first few latent phenotypes, the remainder were not reliably identified over repetitions on the same dataset. Consistent with most of the predictive information being concentrated in the first components, removing the remaining components did not significantly impact the model's performance (Fig 2). In summary, our study sheds light on the intricate relationship between resting-state functional connectivity, predictability, and reliability of phenotypic information.
Higher Cognitive Functions:
Executive Function, Cognitive Control and Decision Making
Reasoning and Problem Solving
Learning and Memory:
Working Memory
Modeling and Analysis Methods:
Classification and Predictive Modeling 1
Methods Development 2
Keywords:
Cognition
Machine Learning
1|2 Indicates the priority used for review
References:
[1] Chen, J. (2023), 'Relationship between prediction accuracy and feature importance reliability: An empirical and theoretical study', NeuroImage, vol. 274, p. 120115, https://doi.org/10.1016/j.neuroimage.2023.120115
[2] Bertolero, M. (2020), 'Deep Neural Networks Carve the Brain at its Joints', arXiv, https://doi.org/10.48550/arXiv.2002.08891
[3] Demšar, J. (2006), 'Statistical comparisons of classifiers over multiple data sets', The Journal of Machine Learning Research, vol. 7, pp. 1-30, https://www.jmlr.org/papers/volume7/demsar06a/demsar06a.pdf
[4] Gur, R. C. (2010), 'A cognitive neuroscience-based computerized battery for efficient measurement of individual differences: standardization and initial construct validation', Journal of Neuroscience Methods, vol. 187, pp. 254-262, https://doi.org/10.1016/j.jneumeth.2009.11.017
[5] Ooi, L. (2022), 'Comparison of individualized behavioral predictions across anatomical, diffusion and functional connectivity MRI', NeuroImage, vol. 263, p. 119636, https://doi.org/10.1016/j.neuroimage.2022.119636
[6] Van Essen, D. C. (2012), 'The human connectome project: a data acquisition perspective', NeuroImage, vol. 62, no. 4, pp. 2222-2231, https://doi.org/10.1016/j.neuroimage.2012.02.018