If you have trouble speaking, you know how frustrating it can be to use a voice assistant such as Alexa or Siri. She starts responding before you’ve completed your thought, or she doesn’t understand you at all. Google has listened, and it announced this month that it was working on “new technology that makes speech recognition strikingly more responsive.” Voice commands may become more user-friendly for people with aphasia.
The first change is where the voice recognition software is stored. On most devices, that software runs on cloud servers. This means that when you speak into the device, it takes time for your message to travel to the server and for a response to be sent back. Google has shrunk the software, making it possible to load it onto the device itself.
Because the software is on the phone and not on a server, it can also do things that are currently impossible with Siri or Alexa:
Google also used its on-device technology to create a new feature for its future phones called Live Caption. Once activated, captions appear on screen for any speech playing on the phone, such as a video from a friend, or a podcast. Because the processing takes place on the phone, it works even in airplane mode.
Because the software is on the phone and not on a server, those captions can appear instantly, in sync with the spoken words.
Strokes and Software
Google also has an internal team called Project Euphonia, which “aims to adapt speech recognition to people with speech problems, for example, due to a stroke or disease.” In other words, the team is “using AI to improve computers’ abilities to understand diverse speech patterns, such as impaired speech.”
Because the software is on your phone and not in a cloud server, it can become individualized and trained to recognize your unique speech patterns.
If you’re excited about this, don’t just tell us — tell Google! They’ve set up a short form for people who are willing to record their unique speech patterns in order to improve the software. Please read their post before filling out the form so you understand what they are asking for.
Other Possible Applications
While it wasn’t part of the discussion, it is easy to see how voice assistants could be used for speech therapy practice in the future. A voice assistant is endlessly patient and available at all hours of the day to work on the same phrase, so with the speech recognition software stored locally on the device, it could mimic an actual speech therapy session, albeit with a robot.
Exciting advances on the horizon! This is the only information we have right now, but reach out to Google to hear more about this project.