A new artificial intelligence system called a semantic decoder can translate a person’s brain activity, recorded while they listen to a story or silently imagine telling one, into a continuous stream of text. The system, developed by researchers at the University of Texas at Austin, could help people who are mentally aware but physically unable to speak, such as those debilitated by strokes, communicate intelligibly again.
The study, published in the journal Nature Neuroscience, was led by Jerry Tang, a doctoral student in computer science, and Alex Huth, an assistant professor of neuroscience and computer science at UT Austin. The work relies in part on a transformer model, similar to the ones that power OpenAI’s ChatGPT and Google’s Bard.
Unlike other language decoding systems in development, this system does not require subjects to have surgical implants, making the process non-invasive. Participants also are not limited to using words from a prescribed list. Brain activity is measured with an fMRI scanner after extensive decoder training, during which the individual listens to hours of podcasts in the scanner. Later, provided the participant is willing to have their thoughts decoded, hearing a new story or imagining telling a story allows the machine to generate the corresponding text from brain activity alone.
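The article does not spell out the decoding algorithm, but the workflow it describes, training on many hours of story listening and then reconstructing new text from brain activity alone, generally combines two pieces: an encoding model that predicts fMRI responses from features of heard language, and a language model that proposes candidate wordings to be scored against the recorded scan. The Python sketch below is only a toy illustration of that idea; embed(), propose_continuations(), the random data, and all sizes are hypothetical stand-ins, not the researchers’ code.

```python
# Toy sketch of an encoding-model-based decoder, not the study's released code.
# Step 1 fits a linear map from language features to simulated fMRI responses;
# step 2 keeps whichever candidate word sequences best explain a new response.
import numpy as np
from numpy.random import default_rng

rng = default_rng(0)
N_VOXELS, DIM = 200, 64          # hypothetical sizes: fMRI voxels, feature dimension

def embed(tokens):
    """Stand-in for a transformer-based featurizer of a word sequence."""
    seed = abs(hash(" ".join(tokens))) % (2**32)
    return default_rng(seed).standard_normal(DIM)

# --- 1) Train the encoding model on (story features -> fMRI response) pairs ---
train_segments = [["i", "was", "driving"], ["she", "said", "no"], ["we", "went", "home"]]
X_train = np.stack([embed(s) for s in train_segments])
true_map = rng.standard_normal((DIM, N_VOXELS))                  # simulated brain
Y_train = X_train @ true_map + 0.1 * rng.standard_normal((len(train_segments), N_VOXELS))
W, _, _, _ = np.linalg.lstsq(X_train, Y_train, rcond=None)       # regularized regression in practice

# --- 2) Decode: score candidate continuations against new brain activity ---
def propose_continuations(prefix):
    """Stand-in for a language model proposing likely next words."""
    return [prefix + [w] for w in ["driving", "screaming", "home"]]

def decode_step(candidates, recorded_response, beam_width=2):
    """Keep the candidates whose predicted fMRI response best matches the scan."""
    scored = sorted(candidates, key=lambda c: np.sum((embed(c) @ W - recorded_response) ** 2))
    return scored[:beam_width]

# Pretend scan recorded while the participant hears "i was driving"
new_response = embed(["i", "was", "driving"]) @ true_map
candidates = propose_continuations(["i", "was"])
print(decode_step(candidates, new_response))
```

In a real system the candidate texts would be extended word by word, with the encoding model repeatedly pruning the beam, and the slow hemodynamic response measured by fMRI would also have to be modeled; the sketch only shows the core idea of ranking proposed language by how well it predicts the recorded activity.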
“For a non-invasive method, this is a real breakthrough compared to what’s been done before, which is usually single words or short sentences,” Huth said. “We are getting the model to decode continuous language for extended periods of time with complicated ideas.”
The result is not a word-for-word transcript. Instead, the researchers designed the decoder to capture the gist of what is said or thought, albeit imperfectly. About half the time, once the decoder has been trained on a participant’s brain activity, the machine produces text that closely (and sometimes precisely) matches the intended meaning of the original words.
For example, in the experiments, the thoughts of a participant who heard a speaker say, “I don’t have my driver’s license yet,” were translated as, “She hasn’t even started learning to drive yet.” And a participant who heard the words, “I didn’t know whether to scream, cry, or run. Instead I said, ‘Leave me alone!’” had them decoded as, “I started screaming and crying, and then she just said, ‘I told you to leave me alone.’”
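The article does not say how such matches are judged, but the examples make clear why exact wording is the wrong yardstick: the decoded sentences share the meaning of the originals while reusing few of the same words. The short, hypothetical Python snippet below scores the two quoted pairs with plain word overlap to make that point; an actual evaluation would rely on semantic-similarity measures rather than this toy metric.

```python
# Toy illustration, not the study's evaluation code: plain word overlap between the
# original and decoded sentences quoted above is low even though the gist matches,
# which is why a meaning-level comparison is needed to judge the decoder.
import re

def word_set(text):
    """Lowercased word tokens with punctuation stripped."""
    return set(re.findall(r"[a-z']+", text.lower()))

def overlap(reference, decoded):
    """Jaccard overlap between the two word sets (0 = disjoint, 1 = identical)."""
    ref, dec = word_set(reference), word_set(decoded)
    return len(ref & dec) / len(ref | dec)

pairs = [
    ("I don't have my driver's license yet",
     "She hasn't even started learning to drive yet"),
    ("I didn't know whether to scream, cry, or run. Instead I said, 'Leave me alone!'",
     "I started screaming and crying, and then she just said, 'I told you to leave me alone.'"),
]

for original, decoded in pairs:
    print(f"word overlap: {overlap(original, decoded):.2f} -> {decoded}")
```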
Beginning with an earlier version of the paper that appeared as a preprint online, the researchers addressed questions about potential misuse of the technology. The paper describes how decoding worked only with cooperative participants who had voluntarily taken part in training the decoder. Results for individuals on whom the decoder had not been trained were unintelligible, and if participants on whom the decoder had been trained later put up resistance, for example by thinking other thoughts, the results were similarly unusable.
“We take concerns that it could be used for bad purposes very seriously and have worked to prevent it,” Tang said. “We want to make sure that people only use these types of technologies when they want to and that they help them.”
In addition to having participants listen to or think about stories, the researchers asked subjects to watch four short, silent videos while in the scanner. The semantic decoder was able to use their brain activity to accurately describe certain events in the videos.
The system is currently impractical for use outside of the laboratory because of its reliance on an fMRI machine and the time that requires. But the researchers believe the work could transfer to other, more portable brain-imaging systems, such as functional near-infrared spectroscopy (fNIRS).
“fNIRS measures where there is more or less blood flow in the brain at different points in time, which, it turns out, is exactly the same kind of signal that fMRI is measuring,” Huth said. “So our exact kind of approach should translate to fNIRS,” although, he noted, the resolution with fNIRS would be lower.
This work was supported by the Whitehall Foundation, the Alfred P. Sloan Foundation, and the Burroughs Wellcome Fund.
The other co-authors on the study are Amanda LeBel, a former research assistant in Huth’s lab, and Shailee Jain, a graduate student in computer science at UT Austin.
Alexander Huth and Jerry Tang have filed a PCT patent application related to this work.