The technology behind AphaSay.
Three AI layers working together. None of them requires you to understand how it works.
What happens when you tap the microphone.
When a person with aphasia speaks into AphaSay, the audio passes through three steps before anyone hears the result. Each step uses a different AI model trained for a specific job.
Step 1 — Speech Recognition
The app sends your audio to a transcription model called gpt-4o-transcribe, OpenAI's speech recognition system released in 2025. It produces roughly seventy percent fewer hallucinations than older models like Whisper, which matters a lot when you are working with disordered speech.
The output of this step is a literal transcription. If you said "blurk kitchen wife cold," that is what comes back.
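For developers who want to see the shape of this step, here is a minimal sketch using OpenAI's Python SDK. The file name and client setup are placeholders; the real app records audio directly from the device microphone.

```python
# Sketch of Step 1: send recorded audio to gpt-4o-transcribe.
# File name and client setup are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("utterance.wav", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

# The output is the literal transcript, errors and all.
print(transcription.text)  # e.g. "blurk kitchen wife cold"
```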
Raw transcript
"blurk kitchen wife cold"
Step 2 — Sentence Reconstruction
The transcription gets sent to GPT-4o along with your patient profile. The profile includes things like your wife's name, the rooms in your house, your daily routine, and which words you have struggled with before.
GPT-4o uses all of this context to figure out the most likely sentence you were trying to produce. In a 2025 peer-reviewed study published in Scientific Reports, this approach reconstructed aphasic speech with eighty percent accuracy.
If the AI is not confident enough, the app shows you two or three options to choose from. You tap the right one.
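A simplified sketch of the reconstruction call might look like the block below. The system prompt and the profile text are illustrative assumptions, not AphaSay's actual prompt.

```python
# Sketch of Step 2: ask GPT-4o for the most likely intended sentence.
# The system prompt and profile text are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

profile = (
    "Wife's name: Fatima. Rooms: kitchen, bedroom, bathroom. "
    "Routine: asks for cold water most afternoons. "
    "Words often difficult: kitchen, bring."
)

raw_transcript = "blurk kitchen wife cold"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You reconstruct the intended sentence from aphasic speech. "
                "Use the patient profile for context and reply with the "
                "single most likely sentence.\n\nPatient profile: " + profile
            ),
        },
        {"role": "user", "content": raw_transcript},
    ],
)

print(response.choices[0].message.content)
# e.g. "I want my wife to bring me something cold from the kitchen."
```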
Confidence: 94%
"I want my wife to bring me something cold from the kitchen."
Step 3 — Speech Output
The reconstructed sentence is read aloud using either ElevenLabs voice cloning or OpenAI's text-to-speech engine. If you upload a few minutes of pre-stroke voice recordings, ElevenLabs creates a clone so the app speaks in your own voice.
If you do not have old recordings, you can pick a natural-sounding voice from a library.
Spoken aloud
Voice: [Your own cloned voice]
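When no cloned voice is available, the fallback is a plain text-to-speech call. Here is a minimal sketch using OpenAI's engine; the model and voice names are illustrative, and the ElevenLabs cloned-voice path would replace this call.

```python
# Sketch of Step 3: speak the reconstructed sentence aloud with
# OpenAI text-to-speech. Model and voice names are illustrative;
# a cloned ElevenLabs voice replaces this when old recordings exist.
from openai import OpenAI

client = OpenAI()

sentence = "I want my wife to bring me something cold from the kitchen."

with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="alloy",
    input=sentence,
) as response:
    response.stream_to_file("spoken_sentence.mp3")
```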
The app knows when it might be wrong.
Not every reconstruction is a sure thing. AphaSay handles uncertainty in three tiers; a rough sketch of the routing follows the list.
High confidence: The reconstruction is spoken immediately with no interruption. Most everyday conversation falls here once the profile is set up.
Medium confidence: The app shows you two or three options on screen. You tap the one you meant. The system learns from your selection.
Low confidence: The app asks you to try again or confirms a single best guess. This is rare once the patient profile has been used for a few weeks.
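In code, this routing can be as simple as two thresholds. The cutoffs in the sketch below are invented for illustration; the real thresholds are tuned inside the app and are not published.

```python
# Illustrative three-tier routing on reconstruction confidence.
# The 0.90 and 0.60 cutoffs are invented for this sketch.
def route_reconstruction(confidence: float, best: str, alternatives: list[str]) -> dict:
    if confidence >= 0.90:
        # High confidence: speak immediately, no interruption.
        return {"action": "speak", "sentence": best}
    if confidence >= 0.60:
        # Medium confidence: show two or three options to tap.
        return {"action": "choose", "options": [best, *alternatives[:2]]}
    # Low confidence: confirm the single best guess or ask to retry.
    return {"action": "confirm_or_retry", "sentence": best}

print(route_reconstruction(0.94, "I want my wife to bring me something cold from the kitchen.", []))
```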
The more the app knows about you, the better it gets.
The reconstruction model only works as well as the context you give it. AphaSay builds what we call a Living Patient Profile. It includes your family members and their names, the rooms in your home, your medications, your routine, and the topics you talk about most often.
This profile is fed into every reconstruction call. So if you say "wife... cold... thing," the AI knows your wife's name is Fatima, you have a fridge in your kitchen, and you ask for cold water every afternoon at three.
The profile also learns over time. The words you produce correctly get tracked. The words you struggle with get flagged. The AI gets better at predicting what you mean as it sees more of how you speak.
LIVING PATIENT PROFILE
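One way to picture the profile is as a small structured record that gets serialized into every reconstruction prompt. The field names in this sketch are assumptions drawn from the description above, not AphaSay's actual schema.

```python
# Illustrative shape of a Living Patient Profile. Field names are
# assumptions based on the description above, not AphaSay's schema.
from dataclasses import dataclass, field

@dataclass
class PatientProfile:
    family: dict[str, str]           # relation -> name, e.g. {"wife": "Fatima"}
    rooms: list[str]                 # rooms in the home
    medications: list[str]
    routine: list[str]               # e.g. "asks for cold water at 3 pm"
    frequent_topics: list[str]
    reliable_words: set[str] = field(default_factory=set)   # produced correctly
    difficult_words: set[str] = field(default_factory=set)  # flagged over time

    def as_prompt_context(self) -> str:
        """Serialize the profile into text prepended to every reconstruction call."""
        return (
            f"Family: {self.family}. Rooms: {', '.join(self.rooms)}. "
            f"Medications: {', '.join(self.medications)}. "
            f"Routine: {'; '.join(self.routine)}. "
            f"Topics: {', '.join(self.frequent_topics)}. "
            f"Words this person often struggles with: {', '.join(sorted(self.difficult_words))}."
        )
```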
This is not science fiction. The papers are published.
The core idea behind AphaSay — using a large language model to reconstruct aphasic speech — was validated in a 2025 paper in Scientific Reports titled "Reconstructing impaired language using generative AI for people with aphasia." The authors tested GPT-4o on nearly 2,000 utterances from 180 participants in the AphasiaBank database.
The result was 80% reconstruction accuracy. Not for cleaning up dysarthric (slurred) speech. For actual aphasic speech, with paraphasias, neologisms, and word-finding gaps.
AphaSay is the first product to take this research and package it into something a stroke survivor can actually use.