The brain activity of a paralysed woman is being translated into words spoken by an avatar. This milestone could help others who have lost speech.
At Ann Johnson’s wedding reception 20 years ago, her gift for speech was vividly evident. In an ebullient 15-minute toast, she joked that she had run down the aisle, wondered if the ceremony programme should have said “flutist” or “flautist” and acknowledged that she was “hogging the mic”.
Just two years later, Johnson — then a 30-year-old teacher, volleyball coach and mother of an infant — had a cataclysmic stroke that paralysed her and left her unable to talk.
In August, scientists reported a remarkable advance toward helping her, and other patients, speak again. In a milestone of neuroscience and artificial intelligence, implanted electrodes decoded Johnson’s brain signals as she silently tried to say sentences. Technology converted her brain signals into written and vocalised language and enabled an avatar on a computer screen to speak the words and display smiles, pursed lips and other expressions.
The research, published in the journal Nature, marks the first time spoken words and facial expressions have been directly synthesised from brain signals, experts say. Johnson chose the avatar, a face resembling hers, and researchers used her wedding toast to develop the avatar’s voice.
“We’re just trying to restore who people are,” said the team’s leader, Dr Edward Chang, the chair of neurological surgery at the University of California, San Francisco.
“It let me feel like I was a whole person again,” Johnson, now 48, wrote to me.
The goal is to help people who cannot speak because of strokes or conditions such as cerebral palsy and Lou Gehrig’s disease (or amyotrophic lateral sclerosis). To work, Johnson’s implant must be connected by cable from her head to a computer, but her team and others are developing wireless versions. Eventually, researchers hope, people who have lost speech may converse in real time through computerised pictures of themselves that convey tone, inflection and emotions such as joy and anger.
“What’s quite exciting is that just from the surface of the brain, the investigators were able to get out pretty good information about these different features of communication,” said Dr Parag Patil, a neurosurgeon and biomedical engineer at the University of Michigan, who was asked by Nature to review the study before publication.
Johnson’s experience reflects the field’s fast-paced progress. Just two years ago, the same team published research in which a paralysed man, who went by the nickname Pancho, used a simpler implant and algorithm to produce 50 basic words such as “hello” and “hungry” that were displayed as text on a computer after he tried to say them.
Johnson’s implant has nearly twice as many electrodes, increasing its ability to detect brain signals from speech-related sensory and motor processes linked to the mouth, lips, jaw, tongue and larynx. Researchers trained the sophisticated AI to recognise not individual words, but phonemes, or sound units such as “ow” and “ah” that can ultimately form any word.
“It’s like an alphabet of speech sounds,” said David Moses, the project manager.
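To see roughly why phonemes are such a powerful target, consider a toy decoder that assembles words from a stream of recognised sound units. This is an illustrative sketch only: the dictionary and function names are invented for this article, and the team’s actual system uses deep neural networks trained on Johnson’s brain signals, not a lookup table.

```python
# Illustrative sketch only: a toy decoder that assembles words from a
# stream of recognised phonemes, using a small hypothetical
# pronunciation dictionary. None of these names come from the study.

# Hypothetical pronunciations: phoneme sequence -> word.
PRONUNCIATIONS = {
    ("HH", "AH", "L", "OW"): "hello",
    ("HH", "AW"): "how",
    ("AA", "R"): "are",
    ("Y", "UW"): "you",
}

def words_from_phonemes(phonemes):
    """Greedily match the longest known phoneme sequence at each position."""
    words, i = [], 0
    while i < len(phonemes):
        match = None
        # Try the longest candidate first so "HH AH L OW" beats "HH AW".
        for length in range(len(phonemes) - i, 0, -1):
            candidate = tuple(phonemes[i:i + length])
            if candidate in PRONUNCIATIONS:
                match = candidate
                break
        if match is None:
            i += 1  # Skip an unrecognised phoneme rather than fail.
        else:
            words.append(PRONUNCIATIONS[match])
            i += len(match)
    return " ".join(words)

# A phoneme stream such as a decoder might emit:
print(words_from_phonemes(["HH", "AH", "L", "OW", "HH", "AW", "AA", "R", "Y", "UW"]))
# -> "hello how are you"
```

The payoff is generality: a few dozen phonemes can spell any English word, whereas Pancho’s earlier system was limited to its fixed 50-word list.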
While Pancho’s system produced 15 to 18 words per minute, Johnson’s rate was 78 words per minute, drawn from a much larger vocabulary. Typical conversational speech is about 160 words per minute.
When researchers began working with her, they didn’t expect to try the avatar or audio. But the promising results were “a huge green light to say, ‘Okay, let’s try the harder stuff; let’s just go for it,’” Moses said.
They programmed an algorithm to decode brain activity into audio waveforms, producing vocalised speech, said Kaylo Littlejohn, a graduate student at the University of California, Berkeley, and one of the study’s lead authors, along with Moses, Sean Metzger, Alex Silva and Margaret Seaton.
“Speech has a lot of information that is not well preserved by just text, like intonation, pitch, expression,” Littlejohn said.
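As a rough intuition for what audio preserves that text cannot, the toy sketch below renders a waveform from a pitch contour and a loudness envelope; a rising contour sounds like a question, a falling one like a statement. Everything here is invented for illustration; the study’s decoder is a learned neural model, not sine synthesis.

```python
# Illustrative sketch only: synthesising a crude waveform from a pitch
# contour and loudness envelope, the kind of prosodic information
# ("intonation, pitch, expression") that plain text throws away.
import numpy as np

SAMPLE_RATE = 16_000  # audio samples per second

def synthesise(pitch_hz, loudness, duration_s=1.0):
    """Render a sine tone whose frequency and amplitude vary over time.

    pitch_hz and loudness are arrays of per-frame values (standing in
    for hypothetical decoder outputs), interpolated up to audio rate.
    """
    n = int(SAMPLE_RATE * duration_s)
    t = np.arange(n) / SAMPLE_RATE
    frames = np.linspace(0, duration_s, len(pitch_hz))
    f = np.interp(t, frames, pitch_hz)   # per-sample frequency
    a = np.interp(t, frames, loudness)   # per-sample amplitude
    phase = 2 * np.pi * np.cumsum(f) / SAMPLE_RATE  # integrate frequency
    return a * np.sin(phase)

# A rising pitch contour reads as a question; a falling one as a statement.
question = synthesise(pitch_hz=np.linspace(180, 260, 20),
                      loudness=np.full(20, 0.8))
statement = synthesise(pitch_hz=np.linspace(220, 150, 20),
                       loudness=np.full(20, 0.8))
```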
Working with a company that produces facial animation, researchers programmed the avatar with data on muscle movements. Johnson then tried to make facial expressions for happy, sad and surprised, each at high, medium and low intensity. She also tried to make various jaw, tongue and lip movements. Her decoded brain signals were conveyed on the avatar’s face.
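Facial animation systems commonly represent a face as a set of weighted muscle activations, often called blendshapes. The sketch below shows the general idea of turning a decoded expression and intensity into such weights; the labels and numbers are hypothetical, not taken from the study.

```python
# Illustrative sketch only: driving an avatar by converting a decoded
# expression and intensity into "blendshape" weights, a standard
# facial-animation representation. All values here are invented.

# Hypothetical mapping from expression to the facial muscles it uses.
EXPRESSIONS = {
    "happy":     {"mouth_smile": 1.0, "cheek_raise": 0.6},
    "sad":       {"mouth_frown": 1.0, "brow_lower": 0.5},
    "surprised": {"jaw_open": 0.8, "brow_raise": 1.0},
}
INTENSITY = {"low": 0.3, "medium": 0.6, "high": 1.0}

def blendshape_weights(expression, intensity):
    """Scale the expression's muscle activations by the decoded intensity."""
    scale = INTENSITY[intensity]
    return {muscle: weight * scale
            for muscle, weight in EXPRESSIONS[expression].items()}

print(blendshape_weights("happy", "high"))
# -> {'mouth_smile': 1.0, 'cheek_raise': 0.6}
```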
Through the avatar, she said, “I think you are wonderful,” and, “What do you think of my artificial voice?”
“Hearing a voice similar to your own is emotional,” Johnson told the researchers.
She and her husband, William Johnson, a postal worker, even engaged in conversation. She said through the avatar: “Do not make me laugh.” He asked how she was feeling about the Toronto Blue Jays’ chances. “Anything is possible,” she replied.
The field is moving so quickly that experts believe federally approved wireless versions might be available within the next decade. Different methods might be optimal for certain patients.
Nature also published another team’s study involving electrodes implanted deeper in the brain, where they detect the activity of individual neurons, said Dr Jaimie Henderson, a professor of neurosurgery at Stanford University and the team’s leader. Henderson, who was motivated by his childhood experience of watching his father lose speech after an accident, said their method might be more precise but less stable, because specific neurons’ firing patterns can shift.
Their system decoded sentences at 62 words per minute that the participant, Pat Bennett, 68, who has ALS, tried to say from a large vocabulary. That study didn’t include an avatar or sound decoding.
Both studies used predictive language models to help guess words in sentences. The systems don’t just match words but are “figuring out new language patterns” as they improve their recognition of participants’ neural activity, said Melanie Fried-Oken, an expert in speech-language assistive technology at Oregon Health & Science University, who consulted on the Stanford study.
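A toy version of the idea: each candidate word gets a score from the decoder plus a score for how plausibly it follows the previous word, and the best combination wins. The probabilities below are invented for illustration, chosen to echo the Johnson example discussed below; real systems use far larger language models.

```python
# Illustrative sketch only: how a predictive language model can rescue a
# noisy decoder. Each candidate word is scored by a hypothetical neural
# decoder and by how well it follows the previous word.
import math

# Hypothetical decoder confidences for each position in the sentence.
decoder_scores = [
    {"maybe": 0.9},
    {"we": 0.8, "wheat": 0.2},
    {"lost": 0.5, "that": 0.5},   # genuinely ambiguous signal
    {"them": 0.6, "name": 0.4},
]

# Hypothetical bigram probabilities: P(word | previous word).
bigram = {
    ("we", "lost"): 0.3, ("we", "that"): 0.01,
    ("lost", "them"): 0.4, ("lost", "name"): 0.01,
    ("that", "them"): 0.01, ("that", "name"): 0.05,
}

def best_sentence(scores):
    """Greedy search: pick the word maximising decoder x language-model score."""
    sentence = []
    prev = None
    for options in scores:
        def total(word):
            lm = bigram.get((prev, word), 0.001) if prev else 1.0
            return math.log(options[word]) + math.log(lm)
        prev = max(options, key=total)
        sentence.append(prev)
    return " ".join(sentence)

print(best_sentence(decoder_scores))  # -> "maybe we lost them"
```

On the decoder’s evidence alone, “lost” and “that” are a coin flip; the language model tips the choice towards the sequence that reads like English.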
Neither approach was completely accurate. When using large vocabulary sets, they incorrectly decoded individual words about one-quarter of the time.
For example, when Johnson tried to say, “Maybe we lost them,” the system decoded, “Maybe we that name.” But in nearly half of her sentences, it correctly deciphered every word.
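The one-quarter figure is a word error rate: the minimum number of word substitutions, insertions and deletions needed to turn the decoded sentence into the intended one, divided by the sentence length. A minimal sketch, applied to the article’s own example:

```python
# Illustrative sketch: counting word errors the way decoding accuracy is
# typically measured, as the edit distance between the intended and
# decoded sentences, computed over words.

def word_errors(reference, hypothesis):
    """Minimum substitutions, insertions and deletions between word lists
    (standard Levenshtein distance over words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the
    # first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)]

errors = word_errors("maybe we lost them", "maybe we that name")
print(errors, "errors in 4 words")  # -> 2 errors in 4 words
```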
Researchers found that people on a crowdsourcing platform could correctly interpret the avatar’s facial expressions most of the time. Interpreting what the voice said was harder, so the team is developing a prediction algorithm to improve that. “Our speaking avatar is just at the starting point,” Chang said.
Experts emphasise that these systems aren’t reading people’s minds or thoughts. Rather, Patil said, they resemble baseball batters who “are not reading the mind of the pitcher but are kind of interpreting what they see the pitcher doing” to predict pitches.
Still, mind reading may ultimately be possible, raising ethical and privacy issues, Fried-Oken said.
Johnson contacted Chang in 2021, the day after her husband showed her my article about Pancho, the paralysed man the researchers had helped. Chang said he initially discouraged her because she lived in Saskatchewan, Canada, far from his lab in San Francisco, but “she was persistent.”
William Johnson, 48, arranged to work part-time. “Ann’s always supported me to do what I’ve wanted,” including leading his postal union local, he said. “So I just thought it was important to be able to support her in this.”
She started participating last September. Travelling to California takes them three days in a van packed with equipment, including a lift to transfer her between wheelchair and bed. They rent an apartment there, where researchers conduct their experiments to make it easier for her. The Johnsons, who raise money online and in their community to pay for travel and rent for the multiyear study, spend weeks in California, returning home between research phases.
“If she could have done it for 10 hours a day, seven days a week, she would have,” William Johnson said.
Determination has always been part of her nature. When they began dating, she gave him 18 months to propose, which he said he did “on the exact day of the 18th month,” after she had “already gone and picked out her engagement ring.”
Ann Johnson communicated with me in emails composed with the more rudimentary assistive system she uses at home. She wears eyeglasses affixed with a reflective dot that she aims at letters and words on a computer screen.
It’s slow, allowing her to generate only 14 words per minute. But it’s faster than the only other way she can communicate at home: using a plastic letter board, a method William Johnson described as “her just trying to show me which letter she’s trying to look at and then me trying to figure out what she’s trying to say”.
The inability to have free-flowing conversations frustrates them. When discussing detailed matters, he sometimes says something and receives her response by email the next day.
“Ann’s always been a big talker in life, an outgoing, social individual who loves talking, and I don’t,” he said, but her stroke “made the roles reverse, and now I’m supposed to be the talker.”
Ann Johnson was teaching high school maths, health and physical education, and coaching volleyball and basketball when she had her brainstem stroke while warming up to play volleyball. After a year in a hospital and a rehabilitation facility, she came home to her 10-year-old stepson and her 23-month-old daughter, who has now grown up without any memory of hearing her mother speak, William Johnson said.
“Not being able to hug and kiss my children hurt so bad, but it was my reality,” Ann Johnson wrote. “The real nail in the coffin was being told I couldn’t have more children.”
For five years after the stroke, she was terrified. “I thought I would die at any moment,” she wrote, adding, “The part of my brain that wasn’t frozen knew I needed help, but how would I communicate?”
Gradually, her doggedness resurfaced. Initially, “my face muscles didn’t work at all”, she wrote, but after about five years, she could smile at will.
She was entirely tube-fed for about a decade but decided she wanted to taste solid food. “If I die, so be it,” she told herself. “I started sucking on chocolate.” She took swallowing therapy and now eats finely chopped or soft foods. “My daughter and I love cupcakes,” she wrote.
When Johnson learned that trauma counsellors were needed after a fatal bus crash in Saskatchewan in 2018, she decided to take a university counselling course online.
“I had minimal computer skills and, being a maths and science person, the thought of writing papers scared me,” she wrote in a class report. “At the same time, my daughter was in grade 9 and being diagnosed with a processing disability. I decided to push through my fears and show her that disabilities don’t need to stop us or slow us down.”
Helping trauma survivors remains her goal. “My shot at the moon was that I would become a counsellor and use this technology to talk to my clients,” she told Chang’s team.
At first when she started making emotional expressions with the avatar, “I felt silly, but I like feeling like I have an expressive face again,” she wrote, adding that the exercises also enabled her to move the left side of her forehead for the first time.
She has gained something else, too. After the stroke, “It hurt so bad when I lost everything,” she wrote. “I told myself that I was never going to put myself in line for that disappointment again.”
Now “I feel like I have a job again,” she wrote.
Besides, the technology makes her imagine being in Star Wars: “I have kind of gotten used to having my mind blown.”
This article originally appeared in The New York Times.
Written by: Pam Belluck
Photographs by: Sarah Hylton
©2023 THE NEW YORK TIMES