Translating between effect size and classification accuracy

Poster No:

1976 

Submission Type:

Abstract Submission 

Authors:

Chris Camp¹, Rongtao Jiang², Stephanie Noble³, Dustin Scheinost¹

Institutions:

¹Yale University, New Haven, CT, ²Yale School of Medicine, New Haven, CT, ³Northeastern University, Boston, MA

First Author:

Chris Camp  
Yale University
New Haven, CT

Co-Author(s):

Rongtao Jiang  
Yale School of Medicine
New Haven, CT
Stephanie Noble  
Northeastern University
Boston, MA
Dustin Scheinost  
Yale University
New Haven, CT

Introduction:

Biomarkers in neuroimaging are typically evaluated on the strength of their association, or effect size, with a phenotype of interest. Alternatively, machine learning models can be trained on potential biomarkers to classify individuals and evaluated on their accuracy. Although classification accuracy depends on the effect size of the features of interest, no work thus far has illustrated the translation between Cohen's d effect sizes and classification accuracy in models of real-world data. We used simulated neuroimaging data to investigate the relationship between effect size and classification accuracy. We further explored the effects of variance, sample size, and reliability in both univariate and multivariate models to develop a comprehensive understanding of this relationship. We also surveyed the range of effect sizes observed in brain-behavior associations using group differences and feature covariance from the UK Biobank. This work contextualizes practical classification accuracies within conventional effect sizes.

Methods:

We used normal probability distributions to model effect sizes between two samples. We estimated maximum classification accuracy with the cumulative distribution function of the normal distribution. This allowed us to measure the area of overlap between the two groups and therefore the best possible classifier accuracy. By convolving cumulative distribution functions, we then estimated classifier accuracy in multivariate settings. We measured the relationship between effect size and classification accuracy while manipulating the number of features and the covariance between them. To contextualize these results within neuroimaging, we used the range of effect sizes and covariances observed across 3,571 imaging features from the UK Biobank.
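The link between effect size and best-case accuracy can be sketched with a textbook result. This is an illustration under stated assumptions, not the convolution procedure described above: it assumes two equally sized groups, normally distributed features with equal variance, the same Cohen's d on every feature, and a single shared pairwise correlation r. Under those assumptions the optimal classifier's accuracy is the normal CDF evaluated at half the Mahalanobis distance between the group means. The function name `max_accuracy` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def max_accuracy(d, n_features=1, r=0.0):
    """Best-case two-group accuracy under the assumptions stated above.

    d          : Cohen's d, assumed identical for every feature
    n_features : number of features
    r          : assumed common pairwise correlation between features
    """
    if n_features < 1:
        raise ValueError("n_features must be at least 1")
    if not 0.0 <= r < 1.0:
        raise ValueError("this sketch only handles 0 <= r < 1")
    # Mahalanobis distance between group means for an equicorrelated
    # covariance matrix: d * sqrt(p / (1 + (p - 1) * r)).
    distance = d * sqrt(n_features / (1.0 + (n_features - 1) * r))
    # With equal priors and equal covariance, the optimal decision
    # boundary sits midway between the means, so accuracy is Phi(D / 2).
    return NormalDist().cdf(distance / 2.0)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(d, round(max_accuracy(d), 3))
```

With a single feature this reduces to Phi(d / 2), so d = 0 gives chance-level accuracy of 0.5. Because the assumptions are idealized, values from this sketch need not match the simulated results exactly.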

Results:

As expected, increasing the number of features while holding Cohen's d constant across them increased classification accuracy [Fig. 1]. However, the number of features required for a given accuracy increased substantially with smaller effect sizes. At a Cohen's d of 0.8, an accuracy of 0.9 was achieved with 10 features. At a Cohen's d of 0.2, however, the number of features required increased to 170. Covariance between features diminished classification accuracy [Fig. 2]. With no covariance, nearly perfect classification accuracy could be achieved with 100 features for a Cohen's d of 0.5. However, with features correlated at Pearson's r = 0.5, classification accuracy plummeted to just 0.65. Using 3,571 imaging features from the UK Biobank, we computed the Cohen's d effect size of sex differences for each feature and the Pearson's r correlation between each pair of features. The 50th, 75th, 90th, and 99th percentiles of the absolute Cohen's d values were 0.15, 0.43, 0.74, and 1.05, respectively, while the same percentiles of the Pearson's r values were 0.02, 0.07, 0.19, and 0.51.
Supporting Image: Fig1.jpg
Supporting Image: Fig2.jpg
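
The growth in required feature count at small effect sizes can be illustrated by inverting the same textbook relationship. This is a rough sketch, not the authors' procedure: it assumes independent, equal-variance normal features that all share the same Cohen's d, and equal group sizes, so that accuracy equals Phi(d * sqrt(p) / 2) for p features. The function name `features_needed` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def features_needed(d, target_accuracy):
    """Smallest number of independent features, each with Cohen's d,
    whose best-case accuracy reaches target_accuracy (assumptions above)."""
    if d <= 0:
        raise ValueError("d must be positive")
    if not 0.5 < target_accuracy < 1.0:
        raise ValueError("target_accuracy must lie strictly between 0.5 and 1")
    # Solve Phi(d * sqrt(p) / 2) >= target for p:
    # p >= (2 * z / d) ** 2, where z is the normal quantile of the target.
    z = NormalDist().inv_cdf(target_accuracy)
    return ceil((2.0 * z / d) ** 2)


if __name__ == "__main__":
    for d in (0.8, 0.5, 0.2):
        print(d, features_needed(d, 0.9))
```

Under these assumptions the required count scales with 1 / d², which is why small effect sizes are so costly. For a target accuracy of 0.9 the sketch returns 11 features at d = 0.8 and 165 at d = 0.2, close to the 10 and 170 reported above; the small differences may reflect rounding or the grid of feature counts examined.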
 

Conclusions:

We developed a method to model the classification accuracy of normal distributions as a function of effect size. This framework allows us to observe this relationship across data conditions, including feature count and covariance. We observed a positive relationship between feature count and accuracy and a negative relationship between covariance and accuracy. The impact of covariance between features on accuracy was particularly notable and reflects the importance of choosing orthogonal features when developing classifiers. Finally, we used real neuroimaging data from the UK Biobank to contextualize these ranges of values within imaging features.

Modeling and Analysis Methods:

Classification and Predictive Modeling 2
Methods Development
Multivariate Approaches 1
Univariate Modeling

Keywords:

Data analysis
Experimental Design
Machine Learning
Modeling
Multivariate
Statistical Methods
Univariate

1|2 Indicates the priority used for review

Provide references using author date format

This research was conducted using resources from the UK Biobank (application numbers 49636 and 95864).

Bycroft, C., et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature, 562(7726), 203-209.