Decoding Number of Syllables from Human Intracranial Electroencephalography.

Poster No:

977 

Submission Type:

Abstract Submission 

Authors:

Gyuwon Lee1, Chun Kee Chung1

Institutions:

1Seoul National University, Seoul, Korea, Republic of

First Author:

Gyuwon Lee  
Seoul National University
Seoul, Korea, Republic of

Co-Author:

Chun Kee Chung  
Seoul National University
Seoul, Korea, Republic of

Introduction:

Speech decoding has been a prominent field in human brain-computer interface (BCI) research over the past decade. While successful decoding has been reported for attempted tasks, including overt and mimed speech, producing intelligible sounds from imagined speech remains unsolved. This difficulty arises from the absence of auditory feedback and from neural substrates distinct from those of attempted speech. To address this, we used an innate linguistic property, the number of syllables, to predict the imagined spoken word. In this study, we investigated how the number of syllables is encoded in imagined speech and decoded it based on those findings.

Methods:

We recruited six patients with drug-resistant epilepsy who underwent intracranial electroencephalography (iEEG) for clinical purposes. During a session, 108 words were presented to each patient, each displayed for 3 seconds and preceded by a fixation slide lasting one second. These words were grouped into four classes based on the number of syllables (1, 2, 3, >4 syllables). Patients were instructed to mentally speak the presented word at any time while iEEG signals were recorded. A deep learning classifier, comprising a single bidirectional gated recurrent unit (GRU) layer and one fully connected layer, was used to predict the number of syllables. Each frequency band (delta: 1–4 Hz, theta: 4–8 Hz, alpha: 8–13 Hz, beta: 13–30 Hz, low gamma: 30–70 Hz, high gamma: 70–170 Hz) of the iEEG signals was used as model input in turn. The input was bootstrapped 20 times to generate an accuracy distribution, and the significance of the accuracy was assessed using surrogate data.
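The surrogate-based significance test described above can be sketched as follows. This is a minimal, illustrative sketch only: the label-shuffling "decoder", the 1000-surrogate count, and the balanced 108-trial label set are assumptions for demonstration, not the authors' actual pipeline.

```python
import random


def accuracy(preds, labels):
    """Fraction of trials where the predicted class matches the true class."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


def surrogate_threshold(labels, n_surrogates=1000, alpha=0.001, seed=0):
    """Estimate a chance-level accuracy threshold from label-shuffled surrogates.

    Each surrogate shuffles the trial labels, scores the shuffled labels
    against the true ones, and the (1 - alpha) quantile of the resulting
    accuracy distribution serves as the significance threshold: a decoder
    must exceed it to be considered above chance at level alpha.
    """
    rng = random.Random(seed)
    accs = []
    for _ in range(n_surrogates):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # destroys any label-signal correspondence
        accs.append(accuracy(shuffled, labels))
    accs.sort()
    return accs[int((1 - alpha) * (len(accs) - 1))]


# Hypothetical balanced labels: 108 trials over 4 syllable classes
labels = [i % 4 for i in range(108)]
threshold = surrogate_threshold(labels)
```

With four balanced classes, the surrogate accuracy distribution centers near the 25% chance level, and the p < 0.001 quantile lands somewhat above it, consistent with a threshold in the low thirties as reported in the Results.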

Results:

The chance-level accuracy estimated from surrogate data was 32.19% (p < 0.001), and features surpassing this threshold were considered significant. The envelope of the alpha band showed the greatest number of significant features. These significant features were observed across diverse regions, including the posterior superior temporal gyrus (pSTG), ventral sensorimotor cortex (vSMC), posterior middle temporal gyrus (pMTG), medial occipital gyrus, and angular gyrus. With these features, decoding of the number of syllables achieved an accuracy exceeding 40% across the four classes.

Conclusions:

In this study, we demonstrated that the number of imagined syllables can be successfully decoded from iEEG signals. Moreover, we observed that processing related to the number of imagined syllables occurs across various regions.

Higher Cognitive Functions:

Executive Function, Cognitive Control and Decision Making
Imagery 1

Language:

Speech Production 2
Language Other

Keywords:

Data analysis
ELECTROCORTICOGRAPHY
Electroencephalography (EEG)
Language
Machine Learning

1|2 Indicates the priority used for review

References:

Meng, K. (2023) 'Continuous synthesis of artificial speech sounds from human cortical surface recordings during silent speech production', Journal of Neural Engineering, vol. 20, no. 4.

Metzger, S.L. (2023) 'A high-performance neuroprosthesis for speech decoding and avatar control', Nature, vol. 620, no. 7976, pp. 1037–1046.