Poster No:
2261
Submission Type:
Abstract Submission
Authors:
Yu-Wei Wang1, Chao-Gan Yan1
Institutions:
1Institute of Psychology, Chinese Academy of Sciences, Beijing, Beijing
First Author:
Yu-Wei Wang
Institute of Psychology, Chinese Academy of Sciences
Beijing, Beijing
Co-Author:
Chao-Gan Yan
Institute of Psychology, Chinese Academy of Sciences
Beijing, Beijing
Introduction:
Pooling multi-site datasets is the dominant trend for expanding sample sizes in the neuroimaging field, thereby enhancing the statistical power and reproducibility of research findings. Nevertheless, the heterogeneity that arises from aggregating data across imaging sites obstructs efficient inference. In particular, removing such site effects generally requires a certain level of programming expertise. To streamline the harmonization of site effects using advanced methodologies, we introduce the DPABI Harmonization module. This versatile tool is agnostic to specific analysis methods and integrates a range of techniques, including linear models, ComBat/CovBat (Johnson 2007; Fortin 2017; Fortin 2018; Yu 2018; Chen 2022), the subsampling maximum-mean-distance algorithm (SMA, Zhou 2018), and the invariant conditional variational auto-encoder (ICVAE, Moyer 2020). It equips neuroscientists with an easy-to-use and transparent harmonization workflow, ensuring the feasibility of post-hoc analysis for multi-site studies.
Methods:
DPABI Harmonization (Fig 1.A) is open-source and distributed under GNU/GPL, available at http://www.rfmri.org/dpabi. It supports Windows, macOS and Linux operating systems. We have continually updated the toolbox since its first release.
Step 1: Set up the computational environment. Timing: 1-5 minutes.
Users can find the latest version of Harmonization at https://github.com/Chaogan-Yan/DPABI and download it either via the green "Code" button on that page or with git clone https://github.com/Chaogan-Yan/DPABI.git.
Step 2: Prepare brain and demographic data. Timing: 10-30 minutes.
› Data preparation (choose one of three input options):
i) Organize the brain data into a single .mat/.xlsx file and add it with the "Add Image" button;
ii) Use the original .nii/.nii.gz/.gii/.gii.gz/.mat files of the individual subjects and place them under one directory. Add this directory as "Parent Directory" within "Add Directory/Sites" and leave "Reference File for Sites" empty;
iii) If the files are arranged by site, they must be named with the same path in every site directory. Add any one site directory as "Parent Directory" within "Add Directory/Sites" and choose any one .nii/.nii.gz/.gii/.gii.gz/.mat file of this site as the "Reference File for Sites".
› Demographic file:
All input options require that the subjects be aligned with the covariate rows of the demographic file. For option i), this is a prerequisite. For options ii) and iii), an alternative to checking the alignment manually is to add to the demographic file a column named "FileList" for voxel-based and network-based files, or two columns named "FileList_LH" and "FileList_RH" for vertex-based files. Once this file is loaded, the files are read in the order given by "FileList", whether or not images have already been added.
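The role of the "FileList" column can be illustrated with a minimal Python sketch. It is not part of DPABI (a MATLAB toolbox) and only mirrors the idea that the row order of the demographic table fixes the order in which files are loaded. The table contents, the columns "SubjectID", "Site" and "Age", the file paths, and the function name load_order are hypothetical; only the column name "FileList" comes from the text above.

```python
import csv
import io

# Hypothetical demographic table; only the "FileList" column name is taken
# from the abstract. All other columns and paths are illustrative.
DEMOGRAPHIC_CSV = """SubjectID,Site,Age,FileList
sub-01,SiteA,23,/data/SiteA/sub-01.nii
sub-02,SiteB,31,/data/SiteB/sub-02.nii
sub-03,SiteA,27,/data/SiteA/sub-03.nii
"""


def load_order(demographic_text):
    """Return (file paths, covariate rows), both in the table's row order."""
    rows = list(csv.DictReader(io.StringIO(demographic_text)))
    if not rows or "FileList" not in rows[0]:
        raise ValueError("demographic table needs a 'FileList' column")
    files = [row["FileList"] for row in rows]
    # A duplicated path would pair one image with two covariate rows.
    if len(set(files)) != len(files):
        raise ValueError("duplicate entries in 'FileList'")
    return files, rows


files, rows = load_order(DEMOGRAPHIC_CSV)
for path, row in zip(files, rows):
    # Each image is paired with the covariates found on the same row.
    print(row["SubjectID"], row["Site"], path)
```

Because files and covariates are read from the same rows, subject-covariate alignment holds by construction rather than by manual checking.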
Step 3: Method settings. Timing: 10-15 minutes.
Fig 1.B shows the four sub-GUIs for method settings. First load the demographic file, then set the corresponding parameters according to the requirements of the specific application.
Step 4: Compute. Timing: variable.
The runtime depends on the size of the data to be harmonized, the chosen method, and the hardware and software environment.

Results:
We report the runtime of all methods on an example dataset of 41 subjects, each scanned once on each of three scanners, with 38810 features per image (Table 1). Parametric ComBat/CovBat is very fast. However, based on our evaluations, nonparametric ComBat/CovBat is preferred over the parametric version. The code structure is illustrated in Fig. 2.
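To make the data layout concrete, the following Python sketch applies the simplest approach in the family listed in the Introduction, a linear model with additive site terms, to synthetic data shaped like the example dataset (41 subjects, three scanners). It is a generic illustration, not the DPABI implementation. The age covariate, the simulated effects, the function names, and the reduced feature count (500 instead of 38810, to keep the example light) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# 41 subjects x 3 scanners as in the example dataset; the feature count is
# reduced here (the abstract's dataset has 38810 features per image).
n_subjects, n_sites, n_features = 41, 3, 500

# One row per scan: subject 0 on sites 0, 1, 2, then subject 1, and so on.
site = np.tile(np.arange(n_sites), n_subjects)
age = np.repeat(rng.uniform(20, 60, n_subjects), n_sites)  # hypothetical covariate

# Synthetic features: age effect + additive site shift + noise (all assumed).
site_shift = rng.normal(0.0, 1.0, (n_sites, n_features))
noise = rng.normal(0.0, 1.0, (n_subjects * n_sites, n_features))
Y = 0.05 * age[:, None] + site_shift[site] + noise


def design(site, age, n_sites):
    """Intercept, dummy columns for sites 1..n_sites-1 (site 0 is the reference), age."""
    dummies = (site[:, None] == np.arange(1, n_sites)).astype(float)
    return np.column_stack([np.ones(len(site)), dummies, age])


def remove_site_effects(Y, site, age, n_sites):
    """Fit Y = X @ beta by least squares and subtract only the fitted site terms."""
    X = design(site, age, n_sites)
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    site_cols = slice(1, n_sites)  # columns holding the site dummies
    return Y - X[:, site_cols] @ beta[site_cols]


Y_h = remove_site_effects(Y, site, age, n_sites)
print(Y_h.shape)
```

Refitting the same model on Y_h yields site coefficients of approximately zero, while the age coefficient is unchanged. This sketch corrects only additive mean shifts relative to the reference site. ComBat additionally models site-specific scaling with empirical Bayes shrinkage, which is not shown here.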
Conclusions:
We designed DPABI Harmonization to deliver a flexible, well-coordinated, and cohesive analysis experience. To achieve this, we commit to staying abreast of field developments and adapting to advancing technologies for continuous harmonization. Users have the option to share their needs, questions, or advice on our online forum at http://rfmri.org/Networking or can reach out directly through the authors' emails ycg.yan@gmail.com or dwong6275@gmail.com. Let's build DPABI Harmonization together!
Modeling and Analysis Methods:
Methods Development 2
Neuroinformatics and Data Sharing:
Workflows 1
Keywords:
Data analysis
FUNCTIONAL MRI
Informatics
Workflows
Other - Software, multi-site, harmonization
1|2 Indicates the priority used for review
References:
Ashburner, J. (2012), "SPM: a history." Neuroimage 62(2): 791-800.
Chen, A. A. (2022), "Mitigating site effects in covariance for machine learning in neuroimaging data." Hum Brain Mapp 43(4): 1179-1195.
Fortin, J. P. (2018), "Harmonization of cortical thickness measurements across scanners and sites." Neuroimage 167: 104-120.
Fortin, J. P. (2017), "Harmonization of multi-site diffusion tensor imaging data." Neuroimage 161: 149-170.
Johnson, W. E. (2007), "Adjusting batch effects in microarray expression data using empirical Bayes methods." Biostatistics 8(1): 118-127.
Marek, S. (2022), "Reproducible brain-wide association studies require thousands of individuals." Nature 603(7902): 654-660.
Moyer, D. (2020), "Scanner invariant representations for diffusion MRI harmonization." Magn Reson Med 84(4): 2174-2189.
Pomponio, R. (2020), "Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan." Neuroimage 208: 116450.
Wang, Y. W. (2023), "Comprehensive evaluation of harmonization on functional brain imaging for multisite data-fusion." Neuroimage 274: 120089.
Yan, C. G. (2013), "Standardizing the intrinsic brain: towards robust measurement of inter-individual variation in 1000 functional connectomes." Neuroimage 80: 246-262.
Yan, C. G. (2016), "DPABI: Data Processing & Analysis for (Resting-State) Brain Imaging." Neuroinformatics 14(3): 339-351.
Yu, M. (2018), "Statistical harmonization corrects site effects in functional connectivity measurements from multi-site fMRI data." Hum Brain Mapp 39(11): 4213-4227.
Zhou, H. H. (2018), "Statistical tests and identifiability conditions for pooling and analyzing multisite datasets." Proc Natl Acad Sci U S A 115(7): 1481-1486.