The technology behind AphaSay.
Three AI layers working together. None of them requires you to understand how it works.
What happens when you tap the microphone.
When a person with aphasia speaks into AphaSay, the audio passes through three steps before anyone hears the result. Each step uses a different AI model trained for a specific job.
Step 1 — Speech Recognition
The app sends your audio to a transcription model called gpt-4o-transcribe, OpenAI's speech recognition system released in 2025. It produces roughly seventy percent fewer hallucinations than older models like Whisper, which matters a lot when you are working with disordered speech.
The output of this step is a literal transcription. If you said "blurk kitchen wife cold," that is what comes back.
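For developers who want to see the shape of this step, here is a minimal sketch using OpenAI's Python SDK. The file name and client setup are placeholders; the real app records audio directly from the device microphone.

```python
# Sketch of Step 1: send recorded audio to gpt-4o-transcribe.
# File name and client setup are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("utterance.wav", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

# The output is the literal transcript, errors and all.
print(transcription.text)  # e.g. "blurk kitchen wife cold"
```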
Raw transcript
"blurk kitchen wife cold"
Step 2 — Sentence Reconstruction
The transcription gets sent to GPT-4o along with your patient profile. The profile includes things like your wife's name, the rooms in your house, your daily routine, and which words you have struggled with before.
GPT-4o uses all of this context to figure out the most likely sentence you were trying to produce. In a 2025 peer-reviewed study published in Scientific Reports, this approach reconstructed aphasic speech with eighty percent accuracy.
If the AI is not confident enough, the app shows you two or three options to choose from. You tap the right one.
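A simplified sketch of the reconstruction call might look like the block below. The system prompt and the profile text are illustrative assumptions, not AphaSay's actual prompt.

```python
# Sketch of Step 2: ask GPT-4o for the most likely intended sentence.
# The system prompt and profile text are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

profile = (
    "Wife's name: Fatima. Rooms: kitchen, bedroom, bathroom. "
    "Routine: asks for cold water most afternoons. "
    "Words often difficult: kitchen, bring."
)

raw_transcript = "blurk kitchen wife cold"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You reconstruct the intended sentence from aphasic speech. "
                "Use the patient profile for context and reply with the "
                "single most likely sentence.\n\nPatient profile: " + profile
            ),
        },
        {"role": "user", "content": raw_transcript},
    ],
)

print(response.choices[0].message.content)
# e.g. "I want my wife to bring me something cold from the kitchen."
```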
Confidence: 94%
"I want my wife to bring me something cold from the kitchen."
Step 3 — Speech Output
The reconstructed sentence is read aloud using either ElevenLabs voice cloning or OpenAI's text-to-speech engine. If you upload a few minutes of pre-stroke voice recordings, ElevenLabs creates a clone so the app speaks in your own voice.
If you do not have old recordings, you can pick a natural-sounding voice from a library.
Spoken aloud
Voice: [Your own cloned voice]
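When no cloned voice is available, the fallback is a plain text-to-speech call. Here is a minimal sketch using OpenAI's engine; the model and voice names are illustrative, and the ElevenLabs cloned-voice path would replace this call.

```python
# Sketch of Step 3: speak the reconstructed sentence aloud with
# OpenAI text-to-speech. Model and voice names are illustrative;
# a cloned ElevenLabs voice replaces this when old recordings exist.
from openai import OpenAI

client = OpenAI()

sentence = "I want my wife to bring me something cold from the kitchen."

with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="alloy",
    input=sentence,
) as response:
    response.stream_to_file("spoken_sentence.mp3")
```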
The app knows when it might be wrong.
Not every reconstruction is a sure thing. AphaSay handles uncertainty in three tiers; a rough sketch of the routing follows the list.
High confidence: The reconstruction is spoken immediately with no interruption. Most everyday conversation falls here once the profile is set up.
Medium confidence: The app shows you two or three options on screen. You tap the one you meant. The system learns from your selection.
Low confidence: The app asks you to try again or confirms a single best guess. This is rare once the patient profile has been used for a few weeks.
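In code, this routing can be as simple as two thresholds. The cutoffs in the sketch below are invented for illustration; the real thresholds are tuned inside the app and are not published.

```python
# Illustrative three-tier routing on reconstruction confidence.
# The 0.90 and 0.60 cutoffs are invented for this sketch.
def route_reconstruction(confidence: float, best: str, alternatives: list[str]) -> dict:
    if confidence >= 0.90:
        # High confidence: speak immediately, no interruption.
        return {"action": "speak", "sentence": best}
    if confidence >= 0.60:
        # Medium confidence: show two or three options to tap.
        return {"action": "choose", "options": [best, *alternatives[:2]]}
    # Low confidence: confirm the single best guess or ask to retry.
    return {"action": "confirm_or_retry", "sentence": best}

print(route_reconstruction(0.94, "I want my wife to bring me something cold from the kitchen.", []))
```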
The more the app knows about you, the better it gets.
The reconstruction model only works as well as the context you give it. AphaSay builds what we call a Living Patient Profile. It includes your family members and their names, the rooms in your home, your medications, your routine, and the topics you talk about most often.
This profile is fed into every reconstruction call. So if you say "wife... cold... thing," the AI knows your wife's name is Fatima, you have a fridge in your kitchen, and you ask for cold water every afternoon at three.
The profile also learns over time. The words you produce correctly get tracked. The words you struggle with get flagged. The AI gets better at predicting what you mean as it sees more of how you speak.
LIVING PATIENT PROFILE
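One way to picture the profile is as a small structured record that gets serialized into every reconstruction prompt. The field names in this sketch are assumptions drawn from the description above, not AphaSay's actual schema.

```python
# Illustrative shape of a Living Patient Profile. Field names are
# assumptions based on the description above, not AphaSay's schema.
from dataclasses import dataclass, field

@dataclass
class PatientProfile:
    family: dict[str, str]           # relation -> name, e.g. {"wife": "Fatima"}
    rooms: list[str]                 # rooms in the home
    medications: list[str]
    routine: list[str]               # e.g. "asks for cold water at 3 pm"
    frequent_topics: list[str]
    reliable_words: set[str] = field(default_factory=set)   # produced correctly
    difficult_words: set[str] = field(default_factory=set)  # flagged over time

    def as_prompt_context(self) -> str:
        """Serialize the profile into text prepended to every reconstruction call."""
        return (
            f"Family: {self.family}. Rooms: {', '.join(self.rooms)}. "
            f"Medications: {', '.join(self.medications)}. "
            f"Routine: {'; '.join(self.routine)}. "
            f"Topics: {', '.join(self.frequent_topics)}. "
            f"Words this person often struggles with: {', '.join(sorted(self.difficult_words))}."
        )
```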
This is not science fiction. The papers are published.
The core idea behind AphaSay — using a large language model to reconstruct aphasic speech — was validated in a 2025 paper in Scientific Reports titled "Reconstructing impaired language using generative AI for people with aphasia." The authors tested GPT-4o on nearly 2,000 utterances from 180 participants in the AphasiaBank database.
The result was 80% reconstruction accuracy. Not for cleaning up dysarthric (slurred) speech. For actual aphasic speech, with paraphasias, neologisms, and word-finding gaps.
AphaSay is the first product to take this research and package it into something a stroke survivor can actually use.