Generalization of Brain-Behavior Predictions to Unharmonized Data Towards Real-World Applications

Poster No:

1478 

Submission Type:

Abstract Submission 

Authors:

Brendan Adkinson¹, Matthew Rosenblatt², Javid Dadashkarimi², Link Tejavibulya², Dustin Scheinost²

Institutions:

¹Yale School of Medicine, New Haven, CT; ²Yale University, New Haven, CT

First Author:

Brendan Adkinson  
Yale School of Medicine
New Haven, CT

Co-Author(s):

Matthew Rosenblatt  
Yale University
New Haven, CT
Javid Dadashkarimi, PhD  
Yale University
New Haven, CT
Link Tejavibulya  
Yale University
New Haven, CT
Dustin Scheinost  
Yale University
New Haven, CT

Introduction:

Despite the attention devoted to power, reproducibility, and generalizability, only a minority of neuroimaging studies undertake external validation. Compounding this issue, external validation is rarely performed under real-world conditions in which, unlike research settings, imaging and phenotypic data are largely unharmonized across sites. Neuroimaging research studies, by design, remove the between-site variation that future clinical applications will have to accommodate. The true effect sizes of predictive models can be evaluated only by including multiple datasets that differ in imaging parameters, patient demographics, and choice of clinical instruments. It is therefore imperative to assess whether predictions survive generalization across diverse dataset features. In this work, we combine multiple connectomes per participant, harmonize behavioral measures via PCA, and retain participants who have behavioral data but no imaging data when deriving those measures, and we show that the resulting predictive models generalize across three markedly unharmonized samples.

Methods:

Data from the Philadelphia Neurodevelopmental Cohort (PNC, n=1294, ages 8-21), Healthy Brain Network (HBN, n=1110, ages 6-17), and Human Connectome Project in Development (HCPD, n=428, ages 8-22) were used to build connectome-based predictive models (CPM) of language abilities and executive function. PCA-derived 'latent' measures of each cognitive construct were created from 23 individual tasks. Resting-state and task-fMRI data were collected on 3T Siemens Tim Trio (HBN and PNC) and 3T Siemens Prisma (HBN and HCPD) scanners. Imaging data (motion < 0.20 mm) were processed using BioImage Suite. Whole-brain functional connectivity matrices (268 x 268) were created using the Shen 268-node atlas. Connectomes were combined across rest and task runs within each dataset. Within-dataset models were trained and evaluated with 10-fold cross-validation; external validation applied a model trained on one dataset to each of the other datasets. Both within-dataset and externally validated models were evaluated with Pearson's r between predicted and observed values. Significance was assessed with permutation testing.
Supporting Image: FIgure1.png
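The within-dataset pipeline described above (edge selection, a linear model on summed network strength, 10-fold cross-validation, Pearson's r, and a permutation test) can be sketched roughly as follows. This is a simplified illustration on synthetic data, not the authors' implementation: the function names, the fixed number of selected edges (`n_edges`), the use of a single combined positive-minus-negative network, and the number of permutations are all assumptions made for this example. With the 268-node atlas, each participant's feature vector would hold the 268*267/2 = 35,778 unique edges rather than the 200 synthetic edges used here.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def edge_correlations(X, y):
    """Pearson r between every edge (column of X) and the behavior y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return Xc.T @ yc / denom

def fit_cpm(X, y, n_edges=5):
    """Select the most positively and most negatively correlated edges
    (assumption: a fixed count instead of a p-value threshold), then fit a
    line from network strength to behavior."""
    order = np.argsort(edge_correlations(X, y))
    pos, neg = order[-n_edges:], order[:n_edges]
    strength = X[:, pos].sum(axis=1) - X[:, neg].sum(axis=1)
    slope, intercept = np.polyfit(strength, y, 1)
    return pos, neg, slope, intercept

def predict_cpm(model, X):
    """Apply a fitted model to new participants."""
    pos, neg, slope, intercept = model
    strength = X[:, pos].sum(axis=1) - X[:, neg].sum(axis=1)
    return slope * strength + intercept

def cross_validated_r(X, y, n_folds=10, seed=0):
    """k-fold CV: edges are re-selected inside every training fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    pred = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        model = fit_cpm(X[train], y[train])
        pred[test_idx] = predict_cpm(model, X[test_idx])
    return pearson_r(pred, y)

def permutation_p(X, y, r_obs, n_perm=50, seed=1):
    """One-sided p-value: rerun the whole CV with shuffled behavior."""
    rng = np.random.default_rng(seed)
    null = np.array([cross_validated_r(X, rng.permutation(y))
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= r_obs)) / (1 + n_perm)

# Synthetic stand-in: 300 participants, 200 edges, 10 of which carry signal.
rng = np.random.default_rng(42)
X = rng.standard_normal((300, 200))
y = X[:, :5].sum(axis=1) - X[:, 5:10].sum(axis=1) + rng.standard_normal(300)

r_cv = cross_validated_r(X, y)
p_perm = permutation_p(X, y, r_cv)
print(f"cross-validated r = {r_cv:.2f}, permutation p = {p_perm:.3f}")
```

Selecting edges inside each training fold, and repeating the full procedure for every permutation, keeps the held-out participants from influencing feature selection.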
 

Results:

CPM significantly predicted language abilities (PNC: r=0.50, p<0.01, q²=0.24, MSE=1.05; HBN: r=0.27, p<0.01, q²=0.06, MSE=4.42; HCPD: r=0.22, p<0.01, q²=0.01, MSE=1.47) and executive function (PNC: r=0.39, p<0.01, q²=0.14, MSE=1.17; HBN: r=0.17, p<0.01, q²=0.02, MSE=2.03; HCPD: r=0.17, p=0.02, q²=-0.01, MSE=1.98) within each dataset. Results were similar when age, sex, race, socioeconomic status, head motion, and clinical symptom burden were added to the models as covariates. Training models in one dataset and testing in another yielded significant predictions across all six dataset pairings for both language abilities (r=0.13 to 0.35, all p<0.001) and executive function (r=0.1 to 0.28, all p<0.001).
Supporting Image: Figure2.png
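As a rough illustration of how the reported metrics relate to one another, the sketch below computes Pearson's r, MSE, and q2, and enumerates the train/test pairings. It assumes that q2 denotes the prediction coefficient of determination, 1 - SSE/SST, which the abstract does not define explicitly, and that the "six pairings" are the ordered train-to-test combinations of the three datasets. The function name and the toy values are hypothetical. The toy case shows why a positive r can coexist with a q2 at or below zero, as in the HCPD executive function result: r ignores the scale and offset of the predictions, whereas q2 and MSE penalize them.

```python
import numpy as np
from itertools import permutations

def prediction_metrics(y_obs, y_pred):
    """Pearson's r, mean squared error, and q2 = 1 - SSE/SST (assumed definition)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_obs - y_pred) ** 2)
    sst = np.sum((y_obs - y_obs.mean()) ** 2)
    return {
        "r": float(np.corrcoef(y_obs, y_pred)[0, 1]),
        "mse": float(sse / len(y_obs)),
        "q2": float(1.0 - sse / sst),
    }

# Ordered (train, test) pairings of three datasets: 3 * 2 = 6.
datasets = ["PNC", "HBN", "HCPD"]
pairings = list(permutations(datasets, 2))

# Toy example: predictions perfectly ordered but on the wrong scale.
y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = 3.0 * y_obs
metrics = prediction_metrics(y_obs, y_pred)

for train_name, test_name in pairings:
    print(f"train on {train_name}, test on {test_name}")
print(metrics)
```

In external validation the scale problem is especially likely, because the behavioral measures in the training and test datasets come from different instruments.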
 

Conclusions:

PNC, HBN, and HCPD differ markedly from one another in participant demographics (age, sex, and race), geographic distribution, and clinical symptom burden. They also lack harmonization in imaging acquisition parameters, fMRI tasks, and the behavioral paradigms used to assess language abilities and executive function. We demonstrate that robust and reproducible brain-behavior associations can nonetheless be obtained across such diverse dataset features, which are inherent to future clinical applications. Our results were achieved with state-of-the-field methodological approaches, including the combination of multiple connectomes, harmonization of behavioral measures via PCA, and preservation of participant data (e.g., use of behavioral data from 6745 PNC and 1281 HBN participants without imaging data to derive principal components). This work provides a foundation for future tests of the generalizability of brain-behavior predictions to clinical settings. Furthermore, our findings contribute to the ongoing discourse on the sample sizes required for reproducible brain-behavior association studies.
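The idea of deriving principal components from every participant with behavioral data, including those without imaging, and then scoring only the imaging subset can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the synthetic four-task battery, the use of only the first component, and the assumption of complete (non-missing) task data are all choices made for this example.

```python
import numpy as np

def fit_latent_measure(tasks_all):
    """Estimate z-scoring parameters and first-PC loadings from ALL
    participants who have behavioral data (rows), imaging or not."""
    mean = tasks_all.mean(axis=0)
    sd = tasks_all.std(axis=0, ddof=1)
    z = (tasks_all - mean) / sd
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # PCA signs are arbitrary; orient so higher task scores give higher latent scores.
    if loadings.sum() < 0:
        loadings = -loadings
    return mean, sd, loadings

def apply_latent_measure(tasks, mean, sd, loadings):
    """Project participants onto the previously estimated component."""
    return ((tasks - mean) / sd) @ loadings

# Synthetic stand-in: 500 participants with behavior, 100 of whom have imaging.
rng = np.random.default_rng(0)
n_all, n_imaging, n_tasks = 500, 100, 4
ability = rng.standard_normal(n_all)
tasks_all = ability[:, None] + 0.5 * rng.standard_normal((n_all, n_tasks))

params = fit_latent_measure(tasks_all)
imaging_scores = apply_latent_measure(tasks_all[:n_imaging], *params)
print(imaging_scores[:5])
```

Estimating the loadings on the larger behavioral sample uses data that would otherwise be discarded, while the connectome models are still trained only on participants who have both imaging and behavior.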

Modeling and Analysis Methods:

Classification and Predictive Modeling 1
Connectivity (e.g., functional, effective, structural) 2

Keywords:

Cognition
Data analysis
Development
FUNCTIONAL MRI
Language
Machine Learning
MRI
PEDIATRIC
Psychiatric Disorders
Statistical Methods

1|2 Indicates the priority used for review
