Multiverse analysis on depression biomarkers from EEG

Poster No:

1634 

Submission Type:

Abstract Submission 

Authors:

Yasmin Hollenbenders1, Christoph Maier2, Alexandra Reichenbach1

Institutions:

1Heilbronn University, Heilbronn, Baden-Württemberg, 2Medical Informatics, Heilbronn University, Heilbronn, Baden-Württemberg

First Author:

Yasmin Hollenbenders  
Heilbronn University
Heilbronn, Baden-Württemberg

Co-Author(s):

Christoph Maier  
Medical Informatics, Heilbronn University
Heilbronn, Baden-Württemberg
Alexandra Reichenbach  
Heilbronn University
Heilbronn, Baden-Württemberg

Introduction:

Major depressive disorder (MDD) is one of the most common mental disorders (GBD 2019 Mental Disorders Collaborators, 2022), but its diagnosis is rather subjective (Cai et al., 2018). Electroencephalography (EEG) offers one route to objective biomarkers. Studies show that EEG-derived features can distinguish MDD patients from healthy controls (HC), but they report contradictory results regarding which features are discriminatory (De Aguiar Neto & Rosa, 2019). One substantial problem is that EEG data acquisition and processing are not standardized (Kołodziej et al., 2021). To support clinical decision-making, however, biomarkers need to be robust against such variations. This multiverse analysis (Steegen et al., 2016) systematically investigates the effect of processing steps on the performance of α-band depression biomarkers in diagnostic classification.

Methods:

We used a public dataset with 5 min of resting-state, eyes-closed EEG (HC: 28, MDD: 30) (Mumtaz, 2016). Thirteen channels were chosen for analysis: frontal (Fp1/2, F3/4, F7/8, Fz), central (C3/4), parietal (P3/4), and occipital (O1/2). Data were bandpass filtered (1-40 Hz) and artifacts were removed with ICLabel (Li et al., 2022). The effect of processing steps on biomarker performance was investigated with a multiverse analysis constructing 720 paths over the processing steps (Fig. 1). A) Data normalization (3 levels): none, z-normalization over all channels of each subject, and z-normalization for each channel of each subject. B) Window length (4 levels): non-overlapping windows of 5, 10, 15, and 20 sec. Subjects with fewer than 10 windows were excluded from further analyses. This left 41, 36, and 23 subjects for the 5- and 10-sec, 15-sec, and 20-sec windows, respectively, from whom we sub-sampled 10 windows each. C) Feature extraction (9 features + combination of all features (all)): absolute and relative α-band power, α peak frequency (αpf), and, from the upper signal envelope (env), kurtosis (kurt), skewness (skew), median, interquartile range, variance, and range. D) Aggregation (2 levels): 10 individual values or their median. E) Classification algorithm (3 levels): Logistic Regression, Random Forest, and Support Vector Machine. The classification models were trained with six-fold cross-validation. On the resulting accuracies, we conducted an ANOVA to compare the processing steps and t-tests against chance level to assess the statistical robustness of the models.
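
The path count follows from crossing the levels of steps A to E. The minimal sketch below enumerates the paths; the level labels are illustrative names chosen here, not identifiers from the study code.

```python
from itertools import product

# Levels of each processing step as listed in the Methods.
# The string labels are hypothetical names used only for this illustration.
normalization = ["none", "z_all_channels", "z_per_channel"]
window_sec = [5, 10, 15, 20]
features = [
    "abs_alpha_power", "rel_alpha_power", "alpha_peak_freq",
    "env_kurtosis", "env_skewness", "env_median",
    "env_iqr", "env_variance", "env_range",
    "all",  # combination of the nine single features
]
aggregation = ["individual", "median"]
classifier = ["logistic_regression", "random_forest", "svm"]

# One path = one choice per processing step.
paths = list(product(normalization, window_sec, features, aggregation, classifier))
print(len(paths))  # 3 * 4 * 10 * 2 * 3 = 720
```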
Supporting Image: PipelineMultiverseAnalysis.png
   ·Figure 1
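
The two z-normalization variants of step A differ only in the scope over which mean and standard deviation are computed. The following sketch illustrates the difference on toy values; the use of the population standard deviation and the dictionary layout (channel name mapped to samples) are assumptions made for this example.

```python
from statistics import fmean, pstdev

def z_all_channels(subject):
    """z-normalize one subject with a single mean/SD pooled over all channels."""
    pooled = [x for samples in subject.values() for x in samples]
    mu, sd = fmean(pooled), pstdev(pooled)
    return {ch: [(x - mu) / sd for x in samples] for ch, samples in subject.items()}

def z_per_channel(subject):
    """z-normalize each channel of one subject with its own mean/SD."""
    out = {}
    for ch, samples in subject.items():
        mu, sd = fmean(samples), pstdev(samples)
        out[ch] = [(x - mu) / sd for x in samples]
    return out

# Toy data (not study data): two channels with different amplitude scales.
subject = {"Fz": [1.0, 2.0, 3.0], "O1": [10.0, 20.0, 30.0]}
pooled = z_all_channels(subject)
per_ch = z_per_channel(subject)
```

In this toy case, normalization over all channels keeps the amplitude difference between Fz and O1, whereas channel-wise normalization maps both channels to the same values.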
 

Results:

All processing steps except normalization have a significant influence on diagnostic accuracy (all p<.001). For further analysis, we used subject-wise normalization to keep the regional information of electrodes. Aggregating to the median (63.4±17.0%) achieves higher accuracy than individual values (F(1,3600)=59.467, p<.001). However, the latter yields more statistically robust models (15 vs 7). 15-sec windows (65.7±13.5%) achieve significantly higher accuracies than the other window lengths (all t(2158)>5.288, p<.001), while 10-sec windows yield the most statistically robust models (8). Random Forest performs significantly better than the other classifiers (all t(2878)>3.243, p<.001), yet Logistic Regression yields the most statistically robust models (8). Env skew scores the highest accuracies (8/9, t(862)>3.737, p<.001), followed by all (7/9, t(862)>4.847, p<.001), αpf (6/9, t(862)>4.519, p<.001), and env kurt (6/9, t(862)>7.337, p<.001). However, the order of the most statistically robust markers differs: env kurt (7), αpf (5), env skew (4), and all (2).
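
A test against chance level, as used for the robustness counts above, can be sketched as a one-sample t-test on the cross-validation accuracies of one model. The sketch assumes a chance level of 50% for the two-class problem and uses made-up fold accuracies, not results from this study; the exact test setup of the analysis is not specified here.

```python
from math import sqrt
from statistics import fmean, stdev

def t_against_chance(accuracies, chance=0.5):
    """One-sample t statistic and degrees of freedom for accuracies vs. chance."""
    n = len(accuracies)
    se = stdev(accuracies) / sqrt(n)  # standard error of the mean (sample SD)
    return (fmean(accuracies) - chance) / se, n - 1

# Illustrative accuracies of one model over six folds (hypothetical values).
fold_acc = [0.67, 0.58, 0.75, 0.50, 0.67, 0.58]
t_stat, df = t_against_chance(fold_acc)
```

The p-value follows from the t distribution with df degrees of freedom, for example via scipy.stats.ttest_1samp(fold_acc, popmean=0.5), if SciPy is available.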
Supporting Image: ResultsMultiverseAnalysis.png
   ·Figure 2
 

Conclusions:

We demonstrate the effect of processing on the performance of biomarkers for depression diagnosis with a multiverse analysis. Biomarkers used in clinical settings need high discriminatory value as well as robustness. The biomarkers of this study achieved neither. Furthermore, none of the variations of the processing steps yields a clear advantage. We can thus recommend neither a set of methods nor specific biomarkers. This study is restricted by the choice of biomarkers, and additional EEG features need to be considered. Nonetheless, we demonstrate the relevance of systematic work on the influence of data processing methods to resolve contradictory results in biomarker research.

Disorders of the Nervous System:

Psychiatric (eg. Depression, Anxiety, Schizophrenia) 2

Modeling and Analysis Methods:

Classification and Predictive Modeling
EEG/MEG Modeling and Analysis 1
Multivariate Approaches
Task-Independent and Resting-State Analysis

Keywords:

Computational Neuroscience
Data analysis
Design and Analysis
DISORDERS
Electroencephalography (EEG)
Experimental Design
Machine Learning
Psychiatric Disorders
Other - Major Depressive Disorder; Multiverse Study

1|2 Indicates the priority used for review

References:

Cai, H. (2018), 'A Pervasive Approach to EEG-Based Depression Detection', Complexity, vol. 2018, pp. 1–13

De Aguiar Neto, F. S. (2019), 'Depression biomarkers using non-invasive EEG: A review', Neuroscience & Biobehavioral Reviews, vol. 105, pp. 83–93

GBD 2019 Mental Disorders Collaborators (2022), 'Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019', The Lancet Psychiatry, vol. 9, no. 2, pp. 137–150

Kołodziej, A. (2021), 'No relationship between frontal alpha asymmetry and depressive disorders in a multiverse analysis of five studies', eLife, vol. 10

Li, A. (2022), 'MNE-ICALabel: Automatically annotating ICA components with ICLabel in Python', Journal of Open Source Software, vol. 7, no. 76, p. 4484

Mumtaz, W. (2016), 'MDD Patients and Healthy Controls EEG Data (New)', figshare Dataset

Steegen, S. (2016), 'Increasing Transparency Through a Multiverse Analysis', Perspectives on Psychological Science, vol. 11, no. 5, pp. 702–712