The University of Illinois Urbana-Champaign (UIUC) is working with Apple and other tech giants on the Speech Accessibility Project, which aims to improve voice recognition systems for people whose speech patterns or disabilities current versions have trouble understanding.
While often derided for mishearing a user's request, voice recognition systems for digital assistants like Siri have become more accurate over the years, with advances that include on-device recognition. A new project aims to increase that accuracy further by focusing on people with speech impediments and disabilities.
Partnering with Apple, Amazon, Google, Meta, and Microsoft, as well as non-profits, UIUC's Speech Accessibility Project will try to expand the range of speech patterns that voice recognition systems can understand. This includes a focus on speech affected by diseases and disabilities, including Lou Gehrig's disease (amyotrophic lateral sclerosis, or ALS), Parkinson's disease, cerebral palsy, and Down syndrome.
In some cases, speech recognition systems could provide quality-of-life improvements to users with ailments that inhibit movement, but issues affecting the user's voice can reduce how well those systems work.
Under the Speech Accessibility Project, speech samples will be collected from individuals "representing a diversity of speech patterns," to create a private and de-identified dataset. That dataset, which will focus on American English at first, could then be used to train machine learning models to better handle those speech patterns.
The involvement of a wide array of tech companies that have virtual assistants or offer speech recognition features in their tools could help speed up development within the project. Rather than working in separate teams that could duplicate each other's results, the companies can collaborate directly through the project.
"Speech interfaces should be available to everybody, and that includes people with disabilities," said Mark Hasegawa-Johnson, a professor at UIUC. "This task has been difficult because it requires a lot of infrastructure, ideally the kind that can be supported by leading technology companies, so we've created a uniquely interdisciplinary team with expertise in linguistics, speech, AI, security, and privacy."