The Equity Test · BML-13.03

The AI That Doesn't Speak Your Language

Series 13: The Equity Test

By Syam Adusumilli · 10 min read · Cross-Cutting


Carmen Gutierrez is 74 years old, and the two-year gap in her care is a gap the system created. She and her husband Jorge, also 74, immigrated from Mexico when they were 32. They have been married 48 years. Their English is functional. They use it at the pharmacy, at the bank, at the grocery store on the corner of their block in San Antonio. They use it when they have to.

Their Spanish is the language of everything else. Their marriage runs in Spanish. Their jokes, their arguments, the stories they tell their grandchildren about the village where they grew up. Carmen’s medical history, the surgeries and the recoveries, the pregnancies and the losses, the years of work that wore down her knees and her shoulders, all of it lives in Spanish. When she is tired, she thinks in Spanish. When she is frightened, she prays in Spanish. When she cannot find a word, the word she cannot find is Spanish, because that is where her words live.

Two years ago, her neurologist recommended cognitive screening. The screening was conducted in English. Carmen scored in the borderline range. Her husband, watching from the corner of the room, knew the results were wrong before anyone told him. She had hesitated on questions she would have answered without thinking in Spanish. She had struggled with a timed section not because she could not think fast enough but because she was translating in her head before she answered, and the clock does not wait for translation.

The family spent two years navigating the gap. Two years of follow-up screenings in English that continued to produce borderline results. Two years of worry. Two years of Carmen wondering whether she was losing herself. When they finally found a Spanish-speaking neurologist who administered validated Spanish-language cognitive screening, the results were clear. Carmen had mild cognitive impairment. Not the borderline English results that could have been language interference. Definitive MCI, identifiable and treatable at an earlier stage than the one she was now in.

The MCI identified at year two was not the MCI that year-zero screening in Spanish would have found. It had two more years to progress. The two-year delay narrowed the intervention window. It did not close it. But the window is narrower now than it needed to be, and the narrowing was caused by a system that could not hear Carmen in the language she thinks in.

Why Language Matters for Cognitive Assessment

The neuroscience of bilingualism explains what happened to Carmen without excusing the system that let it happen. Cognitive processing in a second language is measurably slower than in a first language, even in fluent bilinguals. The difference is not large in everyday conversation. It becomes significant under the conditions of cognitive testing: time pressure, unfamiliar vocabulary, abstract reasoning tasks, and the anxiety of a clinical environment where the stakes feel high and the language feels foreign.

A bilingual person asked to name as many animals as possible in sixty seconds will produce fewer in their second language than their first. This is not a cognitive deficit. It is a language access effect. The words are there. The retrieval pathway runs through a different neural network, one that requires a fraction of a second more for each word. Under time pressure, fractions of seconds accumulate into points lost on a screening tool that was not designed to measure what it is actually measuring.
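
To make the accumulation concrete, the arithmetic can be sketched in a few lines of Python. The per-word timings are hypothetical placeholders chosen only to show the mechanism. They are not measurements from any study or screening tool, and the function name is invented for this illustration.

```python
# Hypothetical illustration: how a small per-word retrieval delay
# lowers the count on a 60-second verbal fluency task.
# All timings are made-up placeholders, not clinical data.

def words_in_window(seconds_per_word: float, window_seconds: float = 60.0) -> int:
    """Return how many whole words fit into the time window at a steady pace."""
    return int(window_seconds // seconds_per_word)

# Assumed pace in the first language: one word every 3.0 seconds.
first_language = words_in_window(seconds_per_word=3.0)

# Assumed pace in the second language: an extra half second per word.
second_language = words_in_window(seconds_per_word=3.5)

print("first language:", first_language)
print("second language:", second_language)
print("words lost to retrieval delay:", first_language - second_language)
```

Under these assumed numbers, a half-second delay per word removes several words from the total even though nothing about the speaker's cognition has changed. Real speech is not evenly paced, so this is only a simplified picture of the effect.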

The Montreal Cognitive Assessment, the screening tool most widely used for mild cognitive impairment, was developed and validated primarily in English and French. Translations exist for Spanish, Mandarin, and several dozen other languages, but translation is not validation. A translated test may use words that carry different levels of difficulty, cultural references that do not transfer across languages, or syntactic structures that change the cognitive demand of the task. Validation requires testing the translated instrument on a normative population of native speakers and establishing scoring norms for that population. For many language versions, this validation is incomplete or absent.
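
One way to see why scoring norms matter is to look at how a norm-referenced score is computed. The sketch below uses a standard z-score, which expresses a raw score as a distance from a population's average. The two sets of norms are invented placeholders, not published values for any version of any instrument, and the names `Norm` and `z_score` are illustrative only.

```python
# Hypothetical illustration of norm-referenced scoring.
# The norms below are invented placeholders, not published values.
from dataclasses import dataclass


@dataclass(frozen=True)
class Norm:
    """Average and spread of scores in a reference population."""
    mean: float
    sd: float


def z_score(raw: float, norm: Norm) -> float:
    """Distance of a raw score from the population mean, in standard deviations."""
    return (raw - norm.mean) / norm.sd


# Assumed norms for the population the test was originally validated on.
original_norms = Norm(mean=26.0, sd=2.0)

# Assumed norms for native speakers taking a properly validated translation.
validated_translation_norms = Norm(mean=24.0, sd=2.0)

raw_score = 23.0
print(z_score(raw_score, original_norms))               # further below average
print(z_score(raw_score, validated_translation_norms))  # closer to average
```

The same raw score sits much further below average under one set of norms than under the other. Without norms established on the right population, there is no way to know which reading is correct.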

Carmen’s English screening measured her English, not her cognition. The borderline result was an accurate measurement of a bilingual woman performing a cognitive task in her second language under time pressure. It was an inaccurate measurement of her cognitive capacity.

The Current Multilingual Landscape

Validated multilingual cognitive screening tools exist, but their availability and use are inconsistent. Spanish-language validated assessments are the most developed. The Spanish MoCA has been validated in several Spanish-speaking populations, though normative data varies by country of origin and educational background. A Mexican-born woman and a Cuban-born woman may perform differently on the same Spanish-language test because the vocabulary norms are different.

Mandarin-language cognitive assessments are available and increasingly validated, reflecting the size of the Mandarin-speaking elder population in the United States. Cantonese versions are less well established. Korean-language tools exist for several commonly used assessments. Vietnamese, Tagalog, and other languages spoken by large communities of older adults have fewer validated options, and the ones that exist are less consistently available in clinical settings.

The honest picture: better than ten years ago. Far short of what the population requires. In 2020, roughly 22 percent of Americans over 65 spoke a language other than English at home. The validated cognitive screening infrastructure serves a fraction of that population in their primary language.

The AI Problem

The AI health monitoring systems described across this publication operate in English. The health AI from Series 1, which checks in with the user daily, monitors speech patterns, asks about symptoms, and flags changes to clinicians, works in English. Its natural language processing was trained on English speech. Its cognitive monitoring algorithms were validated on English speakers. Its medication interaction alerts reference labels and instructions that may be available in English only.

For Carmen, this means the AI that should be monitoring her cognitive trajectory is doing so in a language that introduces systematic measurement error. Every daily check-in conducted in English produces data that blends cognitive signal with language noise. The AI cannot distinguish between a word-finding delay caused by advancing MCI and a word-finding delay caused by retrieving a word from the wrong language. Both look the same in the data. They are not the same.

The care coordination functions face a related problem. The AI that communicates with specialists, summarizes clinical notes, and generates care plans operates on English-language records. When Carmen’s records include notes from her Spanish-speaking neurologist, the AI must process them through a translation layer. Translation introduces latency, loses nuance, and occasionally produces errors that a monolingual system would never make. A medication instruction that is clear in Spanish may be ambiguous when translated, and ambiguity in medication instructions is not a minor problem.

Life Story Documentation in the Native Language

The memory exoskeleton from Series 5 is built on life story documentation. The system learns who the person is, what matters to them, what they remember, and what those memories mean, and uses that knowledge to support cognitive scaffolding as memory changes. The life story is the raw material from which the AI builds the person’s cognitive architecture.

Carmen’s life story is in Spanish. The village, the immigration, the pregnancies, the work, the years of building a life in a new country while keeping the old one alive in the language she speaks at home. A life story documented in English would capture the facts. It would miss the texture. The word Carmen uses for the courtyard where she played as a child does not translate. The way she describes her mother’s cooking is not a list of dishes but a rhythm of language that carries sensory memory. The AI baseline established from an English-language documentation process would measure change against a version of Carmen that was always incomplete.

What Is Genuinely Close

Among non-English languages, Spanish has the most advanced natural language processing in large language models. Clinical validation of Spanish-language AI health monitoring is in progress at several academic medical centers. The timeline for validated Spanish-language cognitive AI, a system that can monitor Carmen’s speech in Spanish and detect changes from a Spanish-language baseline, is one to two years for initial deployment.

Mandarin is several years behind Spanish in clinical validation quality for AI health monitoring. The NLP capability is strong. The clinical validation, the proof that the system accurately detects cognitive change in Mandarin speakers, is not yet complete.

Tagalog, Vietnamese, Korean, and other languages spoken by large communities of older adults in the United States are further behind still. The AI companies building health monitoring tools will reach these languages in order of market size, which means the smallest communities will wait the longest.

The honest timeline is this: Spanish speakers will have validated AI cognitive monitoring within two years. Mandarin speakers within three to four. Other language communities within five or more. The people who wait the longest are the people whose communities have the least commercial leverage.

What Families Can Do Now

Families navigating this gap today have specific options, none of which are ideal and all of which are better than the default.

Request cognitive screening in the patient’s native language. Validated Spanish-language assessments are available at most academic medical centers and many community health centers. Ask for them. If the provider does not offer native-language screening, ask why, and ask for a referral to a provider who does.

Bring a family member who is medically literate in the patient’s language to all appointments. Not a child translating for a parent, which reverses the family hierarchy in ways that are uncomfortable for everyone and compromise the clinical encounter. A family member who can participate in the medical conversation as an equal, who can catch the moment when the patient is struggling with language rather than cognition, and who can tell the clinician the difference.

Document the patient’s life story in their native language now, before it becomes clinically necessary. The life story documentation described in Series 5 is most powerful when it is built early, when the person is fully themselves and the memories are vivid. Record it in the language they dream in. The AI will eventually be able to use it. The family can use it now.

Carmen, Now

Carmen has a Spanish-speaking neurologist. Her cognitive AI is being established with a Spanish-language baseline. She is receiving care that accounts for who she is and what language she thinks in. The intervention is two years later than it needed to be.

The two years cost her something that cannot be given back. MCI progresses. The interventions available at the point of early identification (cognitive training, lifestyle modifications, medication management) are more effective the earlier they begin. Carmen began two years later than she should have. Her trajectory is different from the trajectory she would have had if the system had spoken her language at year zero.

She is not angry, exactly. Jorge is. He watched his wife take a test that measured the wrong thing, receive results that described the wrong person, and spend two years worrying about a diagnosis that was both less and more than what the test suggested. He knew the screening was wrong before anyone in the room with a degree did. He knew because he knows her in the language the system did not speak.

The system that should have found Carmen earlier did not speak her language. The one that is now monitoring her does. The harm from the delay is real. The delay is still the default for most people who think in a language the AI has not learned yet.

How this article connects to others in Blue Mirror.

BML-04.02 (The Cognitive Baseline Nobody Established) argues for establishing a cognitive baseline before there is a reason to need one; this article documents what happens when the baseline is established in the wrong language for a bilingual elder, producing a two-year diagnostic delay that 04.02's early-baseline argument is designed to prevent — the equity failure of the advice 04.02 gives.

BML-05.07 (The Story Only You Can Tell) describes life story documentation as the foundation of the memory exoskeleton; this article adds the language dimension that 05.07's framework requires but does not specify — Carmen's life story is in Spanish, and a documentation process that captures facts in English while missing the texture of the language the memories live in produces an incomplete foundation.

BML-13.01 (The AI That Hears You Wrong) covers the dialect bias failure within English; this article covers the parallel failure for bilingual elders in a second language, and the two together span the full range of language and speech failures in AI cognitive monitoring — the training data problem within English and the validation problem across languages.

BML-13.SYN (Design With, Not For) places Jorge Gutierrez in the changed product development room as the person who knew the cognitive screening was wrong before anyone with a degree did; the reader who has spent time with Jorge watching his wife take a test that measured the wrong thing will understand why his presence in the room is the synthesis's argument made concrete.

BGM's coverage of the immigrant aging experience (BGM-12H, Aging Between Two Countries) documented the structural conditions that produce Carmen's situation — the healthcare system's linguistic assumptions, the social isolation of the first-generation elder, the family dynamics of translating across a generational divide; readers who want the full diagnosis of what aging between two languages means will find that context in BGM.

Sources cited in this article.

Ardila, Alfredo, et al. "Neuropsychological Testing in Spanish Speakers." Applied Neuropsychology, vol. 1, no. 1-2, 1994, pp. 46-53.
Gollan, Tamar H., et al. "Frequency Drives Lexical Access in Reading but Not in Speaking: The Frequency-Lag Hypothesis." Journal of Experimental Psychology: General, vol. 140, no. 2, 2011, pp. 186-209.
United States Census Bureau. "Language Spoken at Home by Age." American Community Survey, 2020.
Rosselli, Monica, et al. "Culture, Ethnicity, and Level of Education in Alzheimer's Disease." Neurotherapeutics, vol. 19, 2022, pp. 26-54.
