Higher criticism statistics for fMRI: a secondary analysis of NARPS team results

Poster No:

1323 

Submission Type:

Abstract Submission 

Authors:

Benedikt Sundermann1,2, Anke McLeod3, Christian Mathys3,2

Institutions:

1Evangelisches Krankenhaus Oldenburg, Medical Campus University of Oldenburg, Oldenburg, Lower Saxony, 2Research Center Neurosensory Science, University of Oldenburg, Oldenburg, Germany, 3Evangelisches Krankenhaus Oldenburg, Medical Campus University of Oldenburg, Oldenburg, Germany

First Author:

Benedikt Sundermann  
Evangelisches Krankenhaus Oldenburg, Medical Campus University of Oldenburg|Research Center Neurosensory Science, University of Oldenburg
Oldenburg, Lower Saxony|Oldenburg, Germany

Co-Author(s):

Anke McLeod  
Evangelisches Krankenhaus Oldenburg, Medical Campus University of Oldenburg
Oldenburg, Germany
Christian Mathys  
Evangelisches Krankenhaus Oldenburg, Medical Campus University of Oldenburg|Research Center Neurosensory Science, University of Oldenburg
Oldenburg, Germany|Oldenburg, Germany

Introduction:

The Neuroimaging Analysis Replication and Prediction Study (NARPS) (Botvinik-Nezer et al. 2020) highlighted the heterogeneity of results obtained by different teams analyzing the same task-fMRI dataset (Botvinik-Nezer et al. 2019) and thus potential problems associated with analytical variability. Most NARPS teams used mass-univariate testing combined with cluster-based thresholding for multiple comparison (MC) correction. This raises the question of to what extent the variability of conclusions across NARPS teams was driven by the limited ability of mass-univariate approaches to identify consistent effects within areas of interest across the different original analyses.
In this secondary analysis of NARPS data, we used an alternative statistical approach, higher criticism (HC), to search for at least rare or weak effects in intermediate team results (unthresholded result maps). HC tests the global hypothesis that such effects are present among a large number of primary statistical tests by quantifying an excess of low p-values (Donoho and Jin 2015). HC has been introduced to assess subthreshold effects in fMRI while avoiding limitations of conventional mass-univariate testing (Gerlach et al. 2021, Sundermann et al. 2023).
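For orientation, one common form of the HC statistic (following Donoho and Jin 2015) is written out here. The notation is introduced for illustration only: N is the number of primary tests (here, voxels), p_(i) is the i-th smallest p-value, and alpha_0 is a tuning fraction. The abstract does not state which HC variant or which alpha_0 was used.

```latex
\[
HC_{N,i} = \sqrt{N}\,\frac{i/N - p_{(i)}}{\sqrt{p_{(i)}\,\bigl(1 - p_{(i)}\bigr)}},
\qquad
HC_N^{*} = \max_{1 \le i \le \alpha_0 N} HC_{N,i}
\]
```

Under the global null hypothesis with independent tests, the p-values are uniformly distributed, so p_(i) stays close to i/N. A large value of HC* therefore indicates an excess of small p-values.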
The purpose of this analysis was to assess whether HC-based global hypothesis testing for rare/weak effects, applied to result maps from different teams, reduces the variability of results.

Methods:

Analysis of publicly available intermediate results (unthresholded result maps; aggregate data, not individual subjects) from 64 teams in NARPS. Calculation of z-value maps with the originally published NARPS code and conversion to p-value maps representing equivalents of two-sided tests, as a prerequisite for the rationale of p-value histogram interpretation underlying HC (different from the original one-sided NARPS hypothesis tests). Extraction of p-values of all covered voxels within the regions of interest (ventromedial prefrontal cortex, ventral striatum, and amygdala) from the NARPS common thresholding analysis. Visualization of p-value histograms and global hypothesis testing using the classical HC statistic. Comparison of HC-based global hypothesis test results with original team decisions for the 9 ROI/hypothesis combinations.
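The conversion and testing steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this analysis: the function names, `alpha0 = 0.5`, the Monte Carlo null calibration, and the synthetic z-values are all hypothetical. In particular, the abstract does not state how the HC test was calibrated, and the simulation below assumes independent tests, which does not hold for spatially correlated voxels.

```python
import numpy as np
from scipy.stats import norm


def two_sided_p(z):
    """Two-sided p-values for z-values under a standard normal null."""
    z = np.asarray(z, dtype=float)
    return 2.0 * norm.sf(np.abs(z))


def higher_criticism(p, alpha0=0.5):
    """HC*: maximum of HC_{N,i} over the smallest alpha0 * N p-values."""
    p = np.sort(np.asarray(p, dtype=float).ravel())
    n = p.size
    # Guard against p == 0 or p == 1, which would give a zero denominator.
    p = np.clip(p, np.finfo(float).tiny, np.nextafter(1.0, 0.0))
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    k = max(1, int(np.floor(alpha0 * n)))
    return float(hc[:k].max())


def null_threshold(n, alpha=0.05, n_sim=2000, alpha0=0.5, seed=0):
    """Monte Carlo (1 - alpha) quantile of HC* for n uniform p-values.

    ASSUMPTION: independent tests; illustrative only for correlated voxels.
    """
    rng = np.random.default_rng(seed)
    sims = [higher_criticism(rng.uniform(size=n), alpha0) for _ in range(n_sim)]
    return float(np.quantile(sims, 1.0 - alpha))


# Synthetic stand-in for the z-values of one ROI in one team map (not NARPS data).
rng = np.random.default_rng(42)
z_roi = rng.standard_normal(1000)
z_roi[:30] += 3.0  # a small fraction of voxels carrying an effect

p_roi = two_sided_p(z_roi)
hc_obs = higher_criticism(p_roi)
reject = hc_obs > null_threshold(p_roi.size)
print(f"HC* = {hc_obs:.2f}, global null rejected: {reject}")
```

With real data, `z_roi` would hold the z-values of all covered voxels in one ROI of one team's unthresholded map, and the decision would be repeated per team and ROI/hypothesis combination.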

Results:

Most p-value histograms exhibited typical distributions as a prerequisite for an HC-based analysis. The global null hypothesis was rejected more frequently in HC-based analyses (82.1 % of tests) compared with rejected null hypotheses in the original NARPS team results (26.9 %), see Figure. For 5 out of 9 hypotheses with ambiguous results (null hypothesis rejected in 29.1 %) in the original analysis, HC rejected the global null hypothesis in a majority of teams (87.8 %). For 3 out of 9 hypotheses (amygdala) with originally negative results (null hypothesis rejected in 4.1 %), HC-based results were ambiguous (null hypothesis rejected in 68.2 % of tests). Secondary analyses revealed a small association of smoothness estimates from team maps with the HC decisions.

Conclusions:

HC-based analyses revealed at least rare or weak effects within ROIs despite negative or ambiguous findings in conventional mass-univariate analyses of task-based fMRI data of mixed gambles reported in NARPS. HC thus reduced the variability of conclusions in a subset of NARPS hypotheses. HC-based findings also include effects in the amygdala, discussed as a negative result in the original fMRI study (Tom et al. 2007). Thus, HC-based analyses appear to have a higher probability of positive results than conventional thresholding of mass-univariate tests. However, they did not resolve the problem of heterogeneous results across NARPS teams in general. Rather, HC shifted ambiguity towards hypotheses with negative results in the original analyses. While there is a small effect of spatial map smoothness, smoothness does not appear to be the main driver of differences in HC-based results.

Modeling and Analysis Methods:

Activation (e.g. BOLD task-fMRI) 1
Exploratory Modeling and Artifact Removal 2

Keywords:

Data analysis
FUNCTIONAL MRI
Meta-Analysis
Open Data
Statistical Methods

1|2 Indicates the priority used for review
Supporting Image: Fig.png
Figure: Comparison of original conclusions by 64 teams in NARPS vs. conclusions based on HC global hypothesis testing

References:

Botvinik-Nezer, R., Holzmeister, F., Camerer, C. F., Dreber, A., Huber, J., Johannesson, M., et al. (2020). 'Variability in the analysis of a single neuroimaging dataset by many teams.' Nature, vol. 582, no. 7810, pp. 84-88.

Botvinik-Nezer, R., Iwanir, R., Holzmeister, F., Huber, J., Johannesson, M., Kirchler, M., Dreber, A., Camerer, C. F., Poldrack, R. A. and Schonberg, T. (2019). 'fMRI data of mixed gambles from the Neuroimaging Analysis Replication and Prediction Study.' Sci Data, vol. 6, no. 1, pp. 106.

Donoho, D. and Jin, J. (2015). 'Higher Criticism for Large-Scale Inference, Especially for Rare and Weak Effects.' Statistical Science, vol. 30, no. 1, pp. 1-25.

Gerlach, A. R., Karim, H. T., Kazan, J., Aizenstein, H. J., Krafty, R. T. and Andreescu, C. (2021). 'Networks of worry-towards a connectivity-based signature of late-life worry using higher criticism.' Transl Psychiatry, vol. 11, no. 1, pp. 550.

Sundermann, B., Pfleiderer, B., McLeod, A. and Mathys, C. (2023). 'Seeing more than the tip of the iceberg: Approaches to subthreshold effects in functional magnetic resonance imaging of the brain.' PsyArXiv Preprints. https://doi.org/10.31234/osf.io/fyhst

Tom, S. M., Fox, C. R., Trepel, C. and Poldrack, R. A. (2007). 'The neural basis of loss aversion in decision-making under risk.' Science, vol. 315, no. 5811, pp. 515-518.