Generalization performance of sex classification models to multiple datasets

Poster No:

1447 

Submission Type:

Abstract Submission 

Authors:

Lisa Wiersch1,2, Patrick Friedrich1,2, Sami Hamdan1,2, Vera Komeyer1,2, Felix Hoffstaedter1,2, Kaustubh Patil1,2, Simon Eickhoff1,2, Susanne Weis1,2

Institutions:

1Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf, Düsseldorf, Germany, 2Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich, Jülich, Germany

First Author:

Lisa Wiersch  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany

Co-Author(s):

Patrick Friedrich  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Sami Hamdan  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Vera Komeyer  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Felix Hoffstaedter  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Kaustubh Patil  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Simon Eickhoff  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany
Susanne Weis  
Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf|Institute of Neuroscience and Medicine (INM-7), Research Centre Jülich
Düsseldorf, Germany|Jülich, Germany

Introduction:

Machine learning (ML) methods are powerful tools that are increasingly applied to study phenotypes based on neuroimaging data. Knowing to what extent the choice of training sample influences the generalizability of ML models is crucial for achieving the most accurate model performance in across-sample predictions. The present study investigates how the choice of training sample influences sex classification based on resting-state functional connectivity (RSFC) in a variety of cohorts differing in sample size, age range and imaging quality.

Methods:

We used samples from four independent cohorts: HCP (N = 878, age range = 22-37, mean age = 28.49; [1]), GSP (N = 854, age range = 21-35, mean age = 22.92; [2]), eNKI (N = 190, age range = 20-83, mean age = 46.02; [3]) and 1000Brains (N = 1000, age range = 21-85, mean age = 61.18; [4]). For each sample, we included healthy subjects aged 20 years or older, with a similar number of men and women who were matched for age. Sex classification models were trained on the data of each of the four cohorts individually and on a compound sample comprising 75% of the data of all four samples (N = 2190, age range = 20-85, mean age = 40.10). Following the parcelwise approach of Weis et al. (2020; [5]), we trained a separate sex classifier on the RSFC profile of each of 436 parcels, resulting in five sets of parcelwise classifiers (pwCs; [6]). Each classification model was trained with a support vector machine classifier implemented in Julearn [7]. All five pwCs were applied to test samples derived from the four original samples. In addition, the out-of-sample performance of all pwCs was evaluated on data from the AOMIC dataset (N = 370, age range = 20-26, mean age = 22.50), which was not included in the training of any of the pwCs.
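The reported compound sample size is consistent with taking 75% of each cohort. The minimal sketch below reproduces that arithmetic and enumerates the model/test pairings described above. Rounding each cohort's share down is an assumption (the abstract does not state the rounding rule), and the function and variable names are illustrative, not taken from the authors' code.

```python
# Cohort sizes as reported in the Methods section.
COHORT_SIZES = {"HCP": 878, "GSP": 854, "eNKI": 190, "1000Brains": 1000}
TRAIN_FRACTION = 0.75  # share of each cohort entering the compound sample


def compound_train_sizes(sizes, fraction):
    """Per-cohort training counts, rounded down (assumed rounding rule)."""
    return {name: int(n * fraction) for name, n in sizes.items()}


def evaluation_grid(cohorts, held_out="AOMIC"):
    """Pair every pwC set (four single-sample sets plus the compound set)
    with every test sample (four cohort test samples plus the held-out one)."""
    models = list(cohorts) + ["compound"]
    tests = list(cohorts) + [held_out]
    return [(model, test) for model in models for test in tests]


train_sizes = compound_train_sizes(COHORT_SIZES, TRAIN_FRACTION)
compound_n = sum(train_sizes.values())
grid = evaluation_grid(COHORT_SIZES)

print(train_sizes)
print("compound N =", compound_n)  # 2190, matching the reported N
print("model/test pairings:", len(grid))
```

Under the rounding-down assumption the per-cohort counts are 658, 640, 142 and 750, which sum to the reported N = 2190.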

Results:

For pwCs trained on single samples, pwC HCP achieved the highest mean cross-validation (CV) accuracy averaged across all 436 parcels (figure 1a). However, pwC HCP showed the lowest generalization in across-sample predictions, with mean accuracies ranging between 52% (eNKI) and 55% (1000Brains). In contrast, pwC GSP, pwC eNKI and pwC 1000Brains achieved lower mean CV accuracies than pwC HCP but higher generalization performance in across-sample predictions (figure 1a). The highest-classifying parcels per model application were consistently located in the temporal lobe, inferior parietal lobule, posterior cingulate gyrus and inferior frontal gyrus (figure 2a).
With the exception of pwC HCP, pwC compound achieved higher mean CV accuracy and generalization performance than the pwCs trained on single samples: mean accuracies in across-sample predictions ranged between 61% (HCP test sample) and 65% (GSP test sample, figure 1b), with individual parcels reaching up to 83% accuracy (eNKI test sample, figure 2b). The highest-classifying parcels were located in regions similar to those for the pwCs trained on single samples (figure 2b). Likewise, pwC compound showed the highest generalization performance in the out-of-sample prediction of the AOMIC sample, with a mean accuracy of 59% (figure 1c). Here, the highest-classifying parcels (up to 69%) were located in the inferior parietal lobule, inferior frontal gyrus and posterior cingulate gyrus (figure 2c).
Supporting Image: Figure1.png
Supporting Image: Figure2.png
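The two summary measures used in the Results (mean accuracy across parcels and the highest-classifying parcels) can be sketched as follows. The per-parcel accuracies are made-up toy values for six hypothetical parcels, not results from the study, and the helper name is illustrative.

```python
from statistics import mean


def summarize_parcel_accuracies(acc_by_parcel, top_k=3):
    """Return the mean accuracy across parcels and the top_k parcels by accuracy."""
    mean_acc = mean(acc_by_parcel.values())
    ranked = sorted(acc_by_parcel.items(), key=lambda item: item[1], reverse=True)
    return mean_acc, ranked[:top_k]


# Toy accuracies for six hypothetical parcels (the study used 436 parcels).
toy_accuracies = {
    "parcel_001": 0.52,
    "parcel_002": 0.61,
    "parcel_003": 0.58,
    "parcel_004": 0.70,
    "parcel_005": 0.55,
    "parcel_006": 0.64,
}

mean_acc, top_parcels = summarize_parcel_accuracies(toy_accuracies)
print(f"mean accuracy across parcels: {mean_acc:.2f}")
for name, acc in top_parcels:
    print(f"{name}: {acc:.2f}")
```

Reporting both numbers matters because a modest mean across all parcels can coexist with a few parcels that classify much more accurately, as in the pattern described above.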
 

Conclusions:

The present results show that classifiers trained on single samples generalized well to some, but not all, test samples. Specifically, classifiers trained on one sample performed well on one particular other sample and vice versa, for example HCP and 1000Brains, as well as eNKI and GSP. In contrast, the pwC trained on the compound sample outperformed the classifiers trained on single samples in classification accuracy. This was the case not only for the test samples, parts of which were used to train pwC compound, but also for the independent AOMIC sample. Overall, aggregating multiple samples for training sex classification models resulted in superior generalization performance, which may be attributed both to the large sample size and to the heterogeneous composition of the compound training sample.

Modeling and Analysis Methods:

Classification and Predictive Modeling 1
Connectivity (e.g. functional, effective, structural) 2

Keywords:

Computing
Data analysis
FUNCTIONAL MRI
Machine Learning
Statistical Methods
Other - sex classification

1|2 Indicates the priority used for review

References:

Caspers, S., Moebus, S., Lux, S., Pundt, N., Schütz, H., Mühleisen, T. W., . . . Amunts, K. (2014). Studying variability in human brain aging in a population-based German cohort - rationale and design of 1000BRAINS. Front Aging Neurosci, 6, 149. doi:10.3389/fnagi.2014.00149
Hamdan, S., More, S., Sasse, L., Komeyer, V., Patil, K. R., & Raimondo, F. (2023). Julearn: an easy-to-use library for leakage-free evaluation and inspection of ML models. arXiv preprint arXiv:2310.12568.
Holmes, A. J., Hollinshead, M. O., O'Keefe, T. M., Petrov, V. I., Fariello, G. R., Wald, L. L., . . . Buckner, R. L. (2015). Brain Genomics Superstruct Project initial data release with structural, functional, and behavioral measures. Sci Data, 2, 150031. doi:10.1038/sdata.2015.31
Nooner, K. B., Colcombe, S. J., Tobe, R. H., Mennes, M., Benedict, M. M., Moreno, A. L., . . . Milham, M. P. (2012). The NKI-Rockland Sample: A Model for Accelerating the Pace of Discovery Science in Psychiatry. Frontiers in neuroscience, 6, 152. doi:10.3389/fnins.2012.00152
Van Essen, D. C., Smith, S. M., Barch, D. M., Behrens, T. E., Yacoub, E., Ugurbil, K., & Consortium, W. U.-M. H. (2013). The WU-Minn Human Connectome Project: an overview. NeuroImage, 80, 62-79. doi:10.1016/j.neuroimage.2013.05.041
Weis, S., Patil, K. R., Hoffstaedter, F., Nostro, A., Yeo, B. T. T., & Eickhoff, S. B. (2020). Sex Classification by Resting State Brain Connectivity. Cereb Cortex, 30(2), 824-835. doi:10.1093/cercor/bhz129
Wiersch, L., Friedrich, P., Hamdan, S., Komeyer, V., Hoffstaedter, F., Patil, K. R., ... & Weis, S. (2023). Sex classification from functional brain connectivity: Generalization to multiple datasets. bioRxiv, 2023-08.