Poster No:
1561
Submission Type:
Abstract Submission
Authors:
Rukuang Huang1, Chetan Gohil2, Mark Woolrich3
Institutions:
1OHBA, Department of Psychiatry, University of Oxford, Oxford, Oxfordshire, 2University of Oxford, Oxford, Oxford, 3University of Oxford, Oxford, Oxfordshire
First Author:
Rukuang Huang
OHBA, Department of Psychiatry, University of Oxford
Oxford, Oxfordshire
Co-Author(s):
Introduction:
Data-driven models such as the Hidden Markov Model (HMM; Baker et al., 2014) are attracting growing attention because they can infer fast temporal dynamics in functional networks in an unsupervised manner. However, these dynamic network models are limited in that they provide only a group-level description, e.g. of the brain regions and spectral content in each brain network. Whilst subject-specific networks can be estimated post hoc, this does not allow the model to discover and benefit from subject-wise structure in the population, e.g. sub-groupings of subjects. We propose an extension to the HMM that incorporates embedding vectors (cf. word embeddings in natural language processing) into the group model. Applying this model to resting-state and task MEG data, we show that the learnt embedding vectors capture meaningful sources of variation across a population. These include sub-groupings related to demographics and systematic differences such as scanner type or measurement site.
Methods:
We assume that the data are generated by an HMM in which each subject has their own set of state covariances, generated by our novel subject encoding block. The subject encoding block groups together subjects with similar covariances using a combination of Bayesian hierarchical modelling and embedding vectors. Inference is performed with a variant of the EM algorithm. The resulting model is referred to below as SE-HMM.
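The generative assumption above can be illustrated with a minimal NumPy sketch. It is not the osl-dynamics implementation: the linear map from embedding vectors to deviations of the Cholesky factor, the zero-mean Gaussian emissions, the fixed transition matrix, and all names (`group_covs`, `dev_weights`, `subject_covariances`, `simulate_subject`) and toy sizes are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions, not the values used in the abstract).
n_subjects, n_states, n_channels, embed_dim = 6, 3, 4, 2
n_samples = 500


def random_spd(dim, rng):
    """Draw a random symmetric positive-definite matrix."""
    a = rng.standard_normal((dim, dim))
    return a @ a.T / dim + np.eye(dim)


# Group-level covariance for each state.
group_covs = np.stack([random_spd(n_channels, rng) for _ in range(n_states)])

# One embedding vector per subject (random here; learnt in the real model).
embeddings = rng.standard_normal((n_subjects, embed_dim))

# Hypothetical linear map from an embedding to a deviation of the
# lower-triangular Cholesky factor of each state covariance.
tril_rows, tril_cols = np.tril_indices(n_channels)
dev_weights = 0.1 * rng.standard_normal((n_states, embed_dim, len(tril_rows)))


def subject_covariances(embedding):
    """Subject-specific state covariances as deviations from the group."""
    covs = np.empty((n_states, n_channels, n_channels))
    for k in range(n_states):
        chol = np.linalg.cholesky(group_covs[k])
        delta = np.zeros((n_channels, n_channels))
        delta[tril_rows, tril_cols] = embedding @ dev_weights[k]
        factor = chol + delta
        # factor @ factor.T is positive semi-definite; the small ridge
        # keeps it strictly positive definite.
        covs[k] = factor @ factor.T + 1e-6 * np.eye(n_channels)
    return covs


# "Sticky" transition matrix shared by all subjects (an assumption).
trans_prob = np.full((n_states, n_states), 0.05 / (n_states - 1))
np.fill_diagonal(trans_prob, 0.95)


def simulate_subject(covs, rng):
    """Sample a state sequence and zero-mean Gaussian observations."""
    states = np.empty(n_samples, dtype=int)
    states[0] = rng.integers(n_states)
    for t in range(1, n_samples):
        states[t] = rng.choice(n_states, p=trans_prob[states[t - 1]])
    data = np.stack(
        [rng.multivariate_normal(np.zeros(n_channels), covs[k]) for k in states]
    )
    return states, data


subject_covs = np.stack([subject_covariances(e) for e in embeddings])
states, data = simulate_subject(subject_covs[0], rng)
```

In this sketch, subjects with nearby embedding vectors receive similar state covariances by construction, which is the property the subject encoding block is described as exploiting. The hierarchical prior and the EM-style inference are omitted.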
Datasets:
We demonstrate use cases of the proposed model with 3 publicly available MEG datasets: two resting-state datasets and one visual task dataset. The datasets were preprocessed, source-reconstructed and parcellated to 38 regions with the OSL toolbox. The first resting-state dataset (the Cam-CAN dataset, Taylor et al., 2017) contains eyes-closed data from 612 healthy participants. These data were collected using an Elekta scanner. In the visual task MEG dataset (Wakeman and Henson, 2015), each of the 19 healthy participants was scanned 6 times, during which 3 types of visual stimuli were shown. These data were also collected using an Elekta scanner. The second resting-state dataset (Nottingham) was collected using a CTF scanner. It contains eyes-closed data from 64 healthy participants, collected at Nottingham University, UK, as part of the MEGUK partnership.
Results:
In Figure 1, we show results from SE-HMM trained on the visual task data, with each session assigned its own embedding vector. The session-pairwise distances between embedding vectors are plotted in a), which shows a clear block diagonal structure: sessions from the same subject are more similar than sessions from different subjects. Session-pairwise distances between the covariances inferred by SE-HMM are compared with those from HMM dual estimation in b) and c), where we see that SE-HMM infers covariances that form better-separated clusters.
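The block diagonal structure described above can be illustrated with a small synthetic sketch. The embedding vectors here are simulated rather than learnt, the Euclidean metric is an assumption (the abstract does not state which distance was used), and the sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sizes: 6 sessions per subject mirrors the visual task design;
# the number of subjects and embedding dimension are arbitrary.
n_subjects, n_sessions, embed_dim = 4, 6, 3

# Simulated embeddings: sessions scatter tightly around a subject centre.
subject_centres = 3.0 * rng.standard_normal((n_subjects, embed_dim))
session_embeddings = np.repeat(subject_centres, n_sessions, axis=0)
session_embeddings = session_embeddings + 0.1 * rng.standard_normal(
    session_embeddings.shape
)
subject_ids = np.repeat(np.arange(n_subjects), n_sessions)

# Session-pairwise Euclidean distances between embedding vectors.
diff = session_embeddings[:, None, :] - session_embeddings[None, :, :]
distances = np.linalg.norm(diff, axis=-1)

# Compare within-subject and between-subject distances.
same_subject = subject_ids[:, None] == subject_ids[None, :]
off_diagonal = ~np.eye(len(subject_ids), dtype=bool)
within = distances[same_subject & off_diagonal].mean()
between = distances[~same_subject].mean()
```

With sessions ordered by subject, small within-subject distances appear as blocks along the diagonal of `distances`, and comparing `within` against `between` gives a simple numerical summary of that structure.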
In Figure 2, we train SE-HMM on the combined Nottingham and Cam-CAN datasets. In a), dataset and age information are encoded in different directions of the embedding space. In b), we again see a block diagonal structure in the pairwise distances between embedding vectors. In c), both SE-HMM and HMM dual-estimated covariances form well-defined clusters; however, d) shows that, on 3 metrics, SE-HMM separates scanner types better. SE-HMM can also summarise the differences in state-specific spectral content between scanner types: e) shows the group-level power and FC for each state, as well as the differences in power and PSD between the datasets.

Figure 1: SE-HMM finds subgroups in the dataset.

Figure 2: SE-HMM finds systematic differences between datasets acquired from different scanners.
Conclusions:
We proposed a novel generative model that explicitly models subject variability in a principled way, together with a method for efficient inference. Through a Bayesian prior, the model pools information across individuals about how they may deviate from the group average. The embedding vectors additionally allow the model to group together similar data and help in interpreting finer-grained structure in a population. Source code is available in the osl-dynamics toolbox (Gohil et al., 2023).
Modeling and Analysis Methods:
Bayesian Modeling
Connectivity (eg. functional, effective, structural) 1
EEG/MEG Modeling and Analysis
Methods Development
Multivariate Approaches 2
Keywords:
Computational Neuroscience
Data analysis
Machine Learning
MEG
Modeling
Open-Source Code
1|2 Indicates the priority used for review
References:
Baker, A. P. et al. (2014). Fast transient networks in spontaneous human brain activity. eLife, 3:e01867.
Wakeman, D. G. and Henson, R. N. (2015). A multi-subject, multi-modal human neuroimaging dataset. Scientific Data, 2(1):1–10.
Taylor, J. R. et al. (2017). The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) data repository: Structural and functional MRI, MEG, and cognitive data from a cross-sectional adult lifespan sample. NeuroImage, 144:262–269.
Gohil, C. et al. (2023). osl-dynamics: A toolbox for modelling fast dynamic brain activity. bioRxiv, pages 2023–08.