Recognition Of Speech From Brain Activity

Speech is produced in the human cerebral cortex. Brain waves associated with speech processes can be directly recorded with electrodes located on the surface of the cortex.

Researchers from Karlsruhe Institute of Technology (KIT) and Wadsworth Center have now shown that it is possible to reconstruct basic speech units, words, and complete sentences of continuous speech from these brain waves and to generate the corresponding text.

Tanja Schultz, who conducted the present study with her team at the Cognitive Systems Lab of KIT, said:

“It has long been speculated whether humans may communicate with machines via brain activity alone. As a major step in this direction, our recent results indicate that both single units in terms of speech sounds as well as continuously spoken sentences can be recognized from brain activity.”

Brain-to-Text System

The results were achieved by an interdisciplinary collaboration of researchers in informatics, neuroscience, and medicine. Methods for signal processing and automatic speech recognition were developed and applied in Karlsruhe.

Christian Herff and Dominic Heger, who developed the Brain-to-Text system as part of their doctoral studies, comment:

“In addition to the decoding of speech from brain activity, our models allow for a detailed analysis of the brain areas involved in speech processes and their interaction.”

The present work is the first that decodes continuously spoken speech and transforms it into a textual representation. For this purpose, cortical information is combined with linguistic knowledge and machine learning algorithms to extract the most likely word sequence.
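The idea of combining cortical evidence with linguistic knowledge to find the most likely word sequence can be illustrated with a small Viterbi-style search. The sketch below is not the authors' implementation: it works on whole words rather than speech sounds, and the function name `decode`, the per-segment scores, and the bigram probabilities are all made-up values chosen for illustration only.

```python
import math

def decode(segment_scores, bigram_logp, start="<s>", lm_weight=1.0):
    """Viterbi-style search for the most likely word sequence.

    segment_scores: one dict per speech segment, mapping each candidate
        word to a log-likelihood of the recorded signal given that word
        (hypothetical output of some neural-signal model).
    bigram_logp: dict mapping (previous_word, word) to log P(word | previous_word).
    """
    floor = math.log(1e-6)  # assumed score for word pairs the language model has not seen
    # best[word] = (total log score, word path) of the best sequence ending in `word`
    best = {start: (0.0, [])}
    for scores in segment_scores:
        new_best = {}
        for word, signal_logp in scores.items():
            candidates = [
                (prev_score + signal_logp
                 + lm_weight * bigram_logp.get((prev, word), floor),
                 path + [word])
                for prev, (prev_score, path) in best.items()
            ]
            new_best[word] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]

# Toy data (invented): the signal alone slightly prefers "train" over "brain".
segments = [
    {"the": math.log(0.6), "a": math.log(0.4)},
    {"brain": math.log(0.45), "train": math.log(0.55)},
    {"decodes": math.log(0.5), "decades": math.log(0.5)},
]
bigrams = {
    ("<s>", "the"): math.log(0.5), ("<s>", "a"): math.log(0.5),
    ("the", "brain"): math.log(0.4), ("the", "train"): math.log(0.1),
    ("a", "brain"): math.log(0.1), ("a", "train"): math.log(0.1),
    ("brain", "decodes"): math.log(0.3), ("brain", "decades"): math.log(0.01),
    ("train", "decodes"): math.log(0.01), ("train", "decades"): math.log(0.01),
}

with_lm = decode(segments, bigrams)
without_lm = decode(segments, bigrams, lm_weight=0.0)
print(with_lm)
print(without_lm[:2])
```

With the language model switched on, the search settles on "the brain decodes" even though the signal scores alone favor "train" in the second segment. Setting `lm_weight` to zero removes the linguistic knowledge and the decoder follows the signal scores only.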

Speech From Thought

Currently, Brain-to-Text is based on audible speech. However, the results are an important first step for recognizing speech from thought alone.

The brain activity was recorded in the USA from seven patients with epilepsy, who participated voluntarily in the study during their clinical treatment.

An electrode array was placed on the surface of the cerebral cortex (electrocorticography, or ECoG) for their neurological treatment. While patients read aloud sample texts, the ECoG signals were recorded with high resolution in time and space.

Later on, the researchers in Karlsruhe analyzed the data to develop Brain-to-Text.

In addition to basic science and a better understanding of the highly complex speech processes in the brain, Brain-to-Text might be a building block to develop a means of speech communication for locked-in syndrome patients in the future.

Reference:

Christian Herff, Dominic Heger, Adriana de Pesters, Dominic Telaar, Peter Brunner, Gerwin Schalk, Tanja Schultz.
Brain-to-text: decoding spoken phrases from phone representations in the brain.
Frontiers in Neuroscience, 2015; 9. DOI: 10.3389/fnins.2015.00217

Photo: Brain activity recorded by electrocorticography (blue circles). From the activity patterns (blue/yellow), spoken words can be recognized. Credit: CSL/KIT