Decoding naturalistic auditory and visual information using high-density diffuse optical tomography

Poster No:

1378 

Submission Type:

Abstract Submission 

Authors:

Morgan Fogarty1, Kalyan Tripathy2, Zachary Markow1, Jason Trobaugh1, Joseph Culver2

Institutions:

1Washington University in St. Louis, St. Louis, MO, 2Washington University School of Medicine, St. Louis, MO

First Author:

Morgan Fogarty, MS  
Washington University in St. Louis
St. Louis, MO

Co-Author(s):

Kalyan Tripathy  
Washington University School of Medicine
St. Louis, MO
Zachary Markow  
Washington University in St. Louis
St. Louis, MO
Jason Trobaugh  
Washington University in St. Louis
St. Louis, MO
Joseph Culver, PhD  
Washington University School of Medicine
St. Louis, MO

Introduction:

Decoding naturalistic information from functional neuroimaging data has important neuroscience and clinical implications. Prior studies using functional magnetic resonance imaging (fMRI) have reconstructed visual experiences (Nishimoto et al., 2011) and continuous language (Tang et al., 2023) from neural responses. However, the physical constraints of fMRI make some naturalistic studies and widespread application impractical. High-density diffuse optical tomography (HD-DOT) is an optical imaging modality that uses many overlapping light measurements to reconstruct images similar to fMRI (Eggebrecht et al., 2014). HD-DOT offers a quiet, open scanning environment and, in some cases, portability, enabling a wide range of decoding studies. Here, we evaluate the performance of decoding audiovisual movie clip identity from HD-DOT data using a template-matching approach. This work establishes the feasibility of decoding complex auditory and visual stimulus information from optical neuroimaging data.

Methods:

Our HD-DOT system has 128 sources and 125 detectors distributed across the posterior, lateral, and dorsal surfaces of the head with a grid spacing of 11 mm. Highly sampled adults (n=3, 20-31 years old) watched a library of 20 five-minute animated audiovisual movie clips twice each over 5 imaging sessions. A spatiotemporal template-matching approach was used to decode clip identity from the HD-DOT data (Tripathy et al., 2021). Data from each session were divided into a template run and a test run, with one viewing of each movie clip in each half. Voxel-wise Pearson correlation coefficients were computed between the test data and the template data for each clip and averaged across voxels to yield a mean template correlation. Decoded clip identity was assigned to the template with the maximum mean correlation with the test run. Confusion matrices were generated by tallying the number of times each test clip was decoded as each of the possible options across all imaging sessions included in the analysis. Mean decoding accuracy was reported as the percentage of trials across all sessions that were decoded correctly. To assess factors that affect decoding performance, the number of templates and the clip duration were varied, and decoding accuracy was reevaluated for each parameter.
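The template-matching step described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' actual pipeline: the array names, shapes, and the use of NumPy are assumptions, and details such as masking of low-signal voxels are omitted.

```python
import numpy as np

def decode_clip(test_run, templates):
    """Decode clip identity by spatiotemporal template matching.

    test_run  : (n_voxels, n_timepoints) response to an unknown clip
    templates : (n_clips, n_voxels, n_timepoints) template responses,
                one per candidate clip, from an independent viewing

    Returns the index of the template whose mean voxel-wise Pearson
    correlation with the test run is highest.
    """
    mean_corrs = []
    for template in templates:
        # Pearson correlation across time for each voxel, then
        # averaged over voxels to give one score per candidate clip.
        r = [np.corrcoef(test_run[v], template[v])[0, 1]
             for v in range(test_run.shape[0])]
        mean_corrs.append(np.nanmean(r))
    return int(np.argmax(mean_corrs))
```

Applying this decoder to every test clip in a session and tallying (presented, decoded) pairs yields the confusion matrix; the fraction of trials on the main diagonal is the decoding accuracy.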

Results:

Participants viewed four movie clips twice in each imaging session. Cortical responses between independent viewings of every possible pairing of clips reveal strong correlations between runs in which the participant viewed the same movie clip (Fig 1A). Averaging inter-run correlation maps across all possible pairings of matched and mismatched movies illustrates that movies evoke reproducible, clip-specific patterns of brain activity (Fig 1B). Clip identity was decoded through template matching (Fig 1C) and aggregated across sessions and subjects (Fig 1D). The bright main diagonal of the confusion matrix illustrates that the decoded clip usually matched the clip that was presented, with an accuracy of 92.3±4.4%. Decoding improved with increased clip duration but was already significantly above chance with as little as 15 seconds of data (Fig 2A). Using a fixed clip duration of 45 seconds, the number of decoding choices was varied between 4 and 16. Decoding accuracy decreased with an increasing number of templates but remained well above chance (Fig 2B). Varying both clip duration and the number of templates across all sessions (Fig 2C-E) resulted in accuracies greater than chance for decoding 8 clip segments (90-second clips, accuracy = 76.0±5.0%, chance = 12.5%), 16 clip segments (45-second clips, accuracy = 53.4±6.3%, chance = 6.25%), and 64 clip segments (15-second clips, accuracy = 23.3±4.2%, chance = 3.13%).
Supporting Image: MOV_decode_Fig1.png
Supporting Image: MOV_decode_Fig2.png
 

Conclusions:

This work illustrates that complex, naturalistic information can be decoded from HD-DOT data. These results encourage further studies applying more intricate decoding algorithms to HD-DOT data, for instance to reconstruct novel scenes from optical neuroimaging data.

Modeling and Analysis Methods:

Classification and Predictive Modeling 1

Novel Imaging Acquisition Methods:

NIRS 2

Perception, Attention and Motor Behavior:

Perception: Auditory/ Vestibular
Perception: Visual

Keywords:

Hearing
Machine Learning
Modeling
Near Infra-Red Spectroscopy (NIRS)
OPTICAL
Optical Imaging Systems (OIS)
Vision
Other - High density diffuse optical tomography; Naturalistic stimuli; Neural decoding

1|2 indicates the priority used for review

References:

Eggebrecht, A. T., Ferradal, S. L., Robichaux-Viehoever, A., Hassanpour, M. S., Dehghani, H., Snyder, A. Z., Hershey, T., & Culver, J. P. (2014). Mapping distributed brain function and networks with diffuse optical tomography. Nature Photonics, 8(6), 448-454. https://doi.org/10.1038/nphoton.2014.107
Nishimoto, S., Vu, A. T., Naselaris, T., Benjamini, Y., Yu, B., & Gallant, J. L. (2011). Reconstructing visual experiences from brain activity evoked by natural movies. Current Biology, 21(19), 1641-1646. https://doi.org/10.1016/j.cub.2011.08.031
Tang, J., LeBel, A., Jain, S., & Huth, A. G. (2023). Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience. https://doi.org/10.1038/s41593-023-01304-9
Tripathy, K., Markow, Z. E., Fishell, A. K., Sherafati, A., Burns-Yocum, T. M., Schroeder, M. L., Svoboda, A. M., Eggebrecht, A. T., Anastasio, M. A., Schlaggar, B. L., & Culver, J. P. (2021). Decoding visual information from high-density diffuse optical tomography neuroimaging data. NeuroImage, 226. https://doi.org/10.1016/j.neuroimage.2020.117516