SAN: mitigating spatial covariance heterogeneity in cortical thickness data in multi-scanner studies

Poster No:

1964 

Submission Type:

Abstract Submission 

Authors:

Rongqian Zhang¹, Linxi Chen¹, Lindsay Oliver², Aristotle Voineskos², Jun Young Park¹

Institutions:

¹University of Toronto, Toronto, Ontario; ²Centre for Addiction and Mental Health, Toronto, Ontario

First Author:

Rongqian Zhang  
University of Toronto
Toronto, Ontario

Co-Author(s):

Linxi Chen  
University of Toronto
Toronto, Ontario

Lindsay Oliver  
Centre for Addiction and Mental Health
Toronto, Ontario

Aristotle Voineskos  
Centre for Addiction and Mental Health
Toronto, Ontario

Jun Young Park  
University of Toronto
Toronto, Ontario

Introduction:

In neuroimaging studies, combining data collected from multiple study sites or scanners is becoming common as a way to increase the reproducibility of scientific discoveries. At the same time, using different scanners introduces unwanted variation (inter-scanner biases) that needs to be corrected before downstream analyses. While statistical harmonization methods such as ComBat (Johnson et al., 2007) have become popular for mitigating inter-scanner biases in neuroimaging, recent methodological advances have shown that harmonizing heterogeneous covariances results in higher data quality. Our work proposes a new statistical harmonization method called SAN (Spatial Autocorrelation Normalization via Gaussian Process) that preserves homogeneous covariance in vertex-level cortical thickness data across different scanners. We use an explicit Gaussian process to characterize scanner-invariant and scanner-specific variations and to reconstruct spatially homogeneous data across scanners. We demonstrate the utility of the proposed method using cortical thickness data from the Social Processes Initiative in the Neurobiology of the Schizophrenia(s) (SPINS) study, registered to the fsaverage5 space with approximately 10,000 vertices.

Methods:

SAN uses probabilistic modeling to characterize covariance heterogeneity in multi-scanner studies. SAN decomposes data into heterogeneous (i) non-spatial variations and (ii) spatial variations modeled through the Gaussian process, both of which reveal batch effects. Supported by extensive exploratory data analyses, SAN assumes that the underlying spatial autocorrelations are scanner-invariant while the corresponding variance terms are specific to each scanner. Although working with V = 10,000 vertices could be computationally intensive, we use a computationally feasible method-of-moments approach that harmonizes N > 300 images within 1 hour on a laptop.
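The modeling assumption above (a shared spatial autocorrelation with scanner-specific variances) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the SAN implementation or the API of the SAN R package: vertices are placed on a line instead of a cortical surface, the correlation is exponential with a known range `phi`, and the names `mom_variances` and `harmonize` are hypothetical. The moment step regresses the off-diagonal sample covariances on the shared correlation to estimate each scanner's spatial variance, and takes the remaining diagonal variance as the non-spatial part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy geometry: V vertices on a line with unit spacing,
# standing in for pairwise distances on the cortical surface.
V, n, phi = 60, 400, 5.0
coords = np.arange(V, dtype=float)
dist = np.abs(coords[:, None] - coords[None, :])
R = np.exp(-dist / phi)  # scanner-invariant spatial correlation (phi assumed known)

def simulate(sigma2, tau2):
    """Draw n images with covariance sigma2 * R + tau2 * I."""
    C = sigma2 * R + tau2 * np.eye(V)
    return rng.standard_normal((n, V)) @ np.linalg.cholesky(C).T

def mom_variances(Y):
    """Moment estimates of the spatial (sigma2) and non-spatial (tau2) variances."""
    S = np.cov(Y, rowvar=False)
    off = ~np.eye(V, dtype=bool)
    # Least squares through the origin of off-diagonal covariances on R.
    sigma2 = np.sum(S[off] * R[off]) / np.sum(R[off] ** 2)
    tau2 = max(np.mean(np.diag(S)) - sigma2, 1e-8)
    return sigma2, tau2

def harmonize(Y, params, ref):
    """Whiten with the scanner covariance, then recolor with a reference covariance."""
    L_s = np.linalg.cholesky(params[0] * R + params[1] * np.eye(V))
    L_ref = np.linalg.cholesky(ref[0] * R + ref[1] * np.eye(V))
    mean = Y.mean(axis=0)
    Z = np.linalg.solve(L_s, (Y - mean).T)  # whitened data, shape (V, n)
    return (L_ref @ Z).T + mean

# Two simulated scanners with different variance terms but the same correlation.
data = {"A": simulate(1.0, 0.5), "B": simulate(2.0, 1.0)}
est = {s: mom_variances(Y) for s, Y in data.items()}
ref = tuple(np.mean([est[s][k] for s in est]) for k in range(2))
harmonized = {s: harmonize(Y, est[s], ref) for s, Y in data.items()}

for s in data:
    print(s, "estimated (sigma2, tau2):", np.round(est[s], 2))
```

In this toy setting the scanner means are left untouched so that only the covariance structure changes; a full pipeline would also adjust for covariates and scanner-specific means.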

Results:

To characterize covariance heterogeneity, we developed a new measure called the "CovarF statistic," which extends the F test statistic to covariance values. Figure 1 shows that covariance heterogeneity is prominent in localized areas of the brain including, but not limited to, the pericalcarine, caudal anterior cingulate, paracentral, precentral, postcentral, superior temporal, middle temporal, insular, and entorhinal cortices. Covariance heterogeneity is also amplified when cortical thickness data are surface-smoothed at 5 mm and 10 mm. Therefore, we used the unsmoothed cortical thickness data for harmonization.
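The abstract does not give the formula for the CovarF statistic, so the following is only one plausible reading of "extending the F test statistic to covariance values", written as a sketch. It assumes that, for each vertex and each of its neighbors, the per-subject cross-products of scanner-centered values are compared across scanners with a one-way ANOVA F statistic, and that the result is averaged over the neighborhood. The function names, the line-shaped neighborhood, and the simulated data are all hypothetical.

```python
import numpy as np

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of 1-D samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = np.concatenate(groups).mean()
    between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / (k - 1)
    within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n_total - k)
    return between / within

def covar_f(Y, scanner, neighbors):
    """Assumed CovarF-style map: Y is (N, V), scanner is (N,), neighbors[v] lists vertex indices."""
    labels = np.unique(scanner)
    Yc = Y.astype(float).copy()
    for s in labels:
        Yc[scanner == s] -= Yc[scanner == s].mean(axis=0)  # center within scanner
    out = np.zeros(Y.shape[1])
    for v, nbrs in enumerate(neighbors):
        stats = []
        for u in nbrs:
            prod = Yc[:, v] * Yc[:, u]  # per-subject contribution to cov(v, u)
            stats.append(f_statistic([prod[scanner == s] for s in labels]))
        out[v] = np.mean(stats)
    return out

# Toy data: scanner B carries extra shared variation in vertices 8..12.
rng = np.random.default_rng(1)
V, n = 20, 200
scanner = np.repeat(np.array(["A", "B"]), n)
Y = rng.standard_normal((2 * n, V))
Y[n:, 8:13] += 2.0 * rng.standard_normal((n, 1))
neighbors = [[u for u in (v - 1, v + 1) if 0 <= u < V] for v in range(V)]

stat = covar_f(Y, scanner, neighbors)
print(np.round(stat, 1))
```

Under these assumptions, large values mark vertices whose local covariances differ between scanners, which matches the interpretation given for Figure 1.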

We compared the harmonization performance of SAN to other statistical harmonization methods, including ComBat (Fortin et al., 2018), CovBat (Chen et al., 2022) and RELIEF (Zhang et al., 2023). SAN was most effective in reducing the CovarF statistic throughout the brain while existing methods did not fully address covariance heterogeneity in a few regions.
Supporting Image: OHBM_SAN_Figure1.png
   ·The CovarF statistic characterizes the covariance heterogeneity in a neighborhood surrounding each vertex. A higher CovarF statistic implies higher heterogeneity in covariances.
Supporting Image: OHBM_SAN_Figure2.png
   ·After applying SAN, the heterogeneity of covariance decreases significantly compared to other harmonization methods (ComBat, CovBat, and RELIEF).
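To see why adjusting only means and variances can leave covariance heterogeneity in place, consider a vertex-wise location-scale adjustment. The sketch below is a simplified stand-in in the spirit of ComBat, without covariates or empirical Bayes shrinkage, and is not the ComBat algorithm itself; `location_scale_adjust` and the simulated two-vertex data are hypothetical. Because each vertex is shifted and rescaled separately, the between-vertex correlation within each scanner is unchanged.

```python
import numpy as np

def location_scale_adjust(Y, scanner):
    """Align per-vertex means and variances across scanners (simplified, no shrinkage)."""
    labels = np.unique(scanner)
    grand_mean = Y.mean(axis=0)
    pooled_sd = np.sqrt(np.mean([Y[scanner == s].var(axis=0, ddof=1) for s in labels], axis=0))
    out = np.empty_like(Y, dtype=float)
    for s in labels:
        idx = scanner == s
        z = (Y[idx] - Y[idx].mean(axis=0)) / Y[idx].std(axis=0, ddof=1)
        out[idx] = z * pooled_sd + grand_mean
    return out

def simulate_pair(rng, n, rho, shift, scale):
    """Two vertices with correlation rho, plus a scanner-specific shift and scale."""
    C = np.array([[1.0, rho], [rho, 1.0]])
    return shift + scale * (rng.standard_normal((n, 2)) @ np.linalg.cholesky(C).T)

rng = np.random.default_rng(2)
n = 500
Y = np.vstack([simulate_pair(rng, n, 0.2, 0.0, 1.0),
               simulate_pair(rng, n, 0.8, 1.0, 2.0)])
scanner = np.repeat(np.array(["A", "B"]), n)

adj = location_scale_adjust(Y, scanner)
corr_before = {s: np.corrcoef(Y[scanner == s], rowvar=False)[0, 1] for s in ("A", "B")}
corr_after = {s: np.corrcoef(adj[scanner == s], rowvar=False)[0, 1] for s in ("A", "B")}
print("correlation before:", corr_before)
print("correlation after: ", corr_after)
```

After the adjustment, both scanners share the same per-vertex mean and variance, but the gap between their correlations remains, which is the kind of residual heterogeneity that covariance-aware methods aim to remove.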
 

Conclusions:

In vertex-level cortical thickness data, spatial covariance appears to be the most important source of batch effects. SAN, which explicitly uses pairwise distance information to model inter-scanner effects, effectively harmonized data for downstream analysis. SAN is publicly available as an R package at https://github.com/junjypark/SAN, which supports user-friendly implementation of the method. SAN expands the spectrum of harmonization methods, both to higher dimensions (vertex level) and in methodological formulation. Integrating it with recent developments in spatial-extent inference (e.g., Park et al., 2022; Weinstein et al., 2022; Pan et al., 2023) is expected to facilitate spatial localization of imaging biomarkers.

Modeling and Analysis Methods:

Exploratory Modeling and Artifact Removal 2
Methods Development
Multivariate Approaches 1

Neuroanatomy, Physiology, Metabolism and Neurotransmission:

Cortical Anatomy and Brain Mapping

Novel Imaging Acquisition Methods:

Anatomical MRI

Keywords:

Computing
Cortex
Data analysis
Modeling
MRI
Multivariate
Open-Source Software
Schizophrenia
Spatial Normalization
Statistical Methods

1|2 Indicates the priority used for review
