
The Hiring Algorithm That Accidentally Became a Medical Exam
A friend of mine — brilliant software engineer, one of the best pattern thinkers I've ever met — told me he'd been rejected from eleven companies in a row. Not after technical rounds. Before them. He never made it past the initial "personality assessment."
He's autistic. And every one of those companies was using some version of the same AI-powered screening tool.
I kept thinking about that conversation when, in May 2024, the ACLU filed a formal complaint with the Federal Trade Commission against Aon Consulting. The allegation was stunning in its specificity: Aon's suite of AI hiring tools — marketed as "bias-free" and diversity-enhancing — was likely functioning as a covert disability screen. The tools measured traits like "liveliness," "positivity," and "emotional awareness." Traits that aren't just vague personality dimensions. They're near-perfect mirrors of the clinical criteria used to diagnose autism.
When I read the full complaint, sitting at my desk at eleven at night with a cold cup of chai, something clicked into place that had been bothering me for years. The AI hiring industry doesn't have a bias problem. It has an architecture problem. And no amount of "Responsible AI" branding is going to fix it.
The Promise That Broke
For the better part of a decade, the pitch from AI hiring vendors has been seductive and simple: humans are biased, algorithms are not. Let the machine decide, and you'll get fairer outcomes.
I bought into a version of this early on. When I started Veriprajna, I genuinely believed that if you could formalize decision-making — strip out the gut feelings, the "culture fit" hunches, the unconscious preference for people who remind you of yourself — you'd get something closer to meritocracy. The math would set us free.
Then I started looking at what these tools actually measure. And I realized the math was encoding exactly the biases it claimed to eliminate — just translating them into a language that looked objective.
Aon's flagship personality tool, ADEPT-15, evaluates candidates on fifteen dimensions. Things like "Liveliness" (are you outgoing or reserved?), "Awareness" (can you read between the lines?), "Composure" (are you calm under pressure or passionate?), and "Flexibility" (do you prefer routine or change?). The tool uses a forced-choice format — you pick between two statements — and adapts in real-time based on your previous answers.
On paper, it sounds sophisticated. In practice, it's asking: how neurotypical are you?
What Happens When a Hiring Tool Mirrors a Clinical Diagnosis?

This is the part that kept me up that night. I pulled up the Autism Spectrum Quotient — a standard clinical screening tool — and laid it next to Aon's ADEPT-15 construct definitions. The overlap wasn't subtle. It was structural.
The AQ measures social skills, attention shifting, attention to detail, communication, and imagination. ADEPT-15 measures "Liveliness," "Flexibility," "Structure," "Awareness," and "Assertiveness." These aren't distant cousins. They're the same constructs wearing different clothes.
When an algorithm penalizes someone for being "reserved" instead of "outgoing," it's not measuring job fitness. It's measuring social performance. And for someone whose brain processes social information differently — someone autistic, someone with ADHD, someone with social anxiety — that measurement is a trap disguised as a test.
The ACLU's complaint puts it bluntly: these assessments "closely track autism/mental health diagnostics." Under the Americans with Disabilities Act, employers can't administer medical examinations as part of the hiring process unless they're directly job-related. If a personality test is functionally indistinguishable from a clinical screening tool, what exactly is it?
I remember bringing this up with a colleague who'd spent years in industrial-organizational psychology. His first reaction was defensive — "These are validated psychometric instruments." My response: validated against what? Against a normative sample that was overwhelmingly neurotypical? That's not validation. That's circular reasoning wearing a lab coat.
The Video Interview Problem Is Worse Than You Think
Aon's second tool, vidAssess-AI, layers the personality model on top of asynchronous video interviews. Candidates record themselves answering questions. An NLP engine transcribes their speech, analyzes the content, and scores it against the ADEPT-15 personality framework.
Here's where it gets genuinely alarming. Natural language processing models are trained on massive text datasets that overwhelmingly reflect neurotypical communication patterns. The rhythm of typical speech. The expected cadences of confidence. The "normal" way to structure a narrative.
My team spent weeks testing how different speech patterns interact with commercial NLP systems. Flat intonation — common in autistic speakers — gets flagged as "lack of enthusiasm." Atypical pauses get interpreted as "uncertainty." Non-linear storytelling — the way many people with ADHD naturally organize thoughts, jumping between connected ideas before circling back — registers as "disorganized thinking."
None of this has anything to do with whether someone can do the job. All of it has everything to do with whether someone performs neurotypicality convincingly on camera.
Research from Duke University found that large language models systematically associate neurodivergent terms with negative connotations. In some models, "I have autism" scores as more negative than "I am a bank robber." When these same models power hiring tools through API integrations, they carry those associations straight into the screening process. No developer intended it. The architecture guaranteed it.
I wrote about the technical mechanics of this in more depth in the interactive version of our research, but the short version is this: you cannot fix emergent ableism with a wrapper around a biased model. The bias isn't a bug. It's a feature of how the system was built.
Why I Stopped Believing in "Bias-Free"
There was a moment — and I can place it precisely — when my thinking about this shifted from "we need better bias testing" to "the entire paradigm is wrong."
We were running an internal audit on a client's hiring pipeline. Standard stuff: demographic parity checks, adverse impact ratios, the metrics everyone uses. The numbers looked clean. Hire rates across demographic groups were within acceptable ranges. The client was happy. Their legal team was happy.
Then one of my engineers, Priya, asked a question that stopped the room: "What if the people who would have been screened out never applied in the first place?"
She was right. We were measuring fairness among the people who made it through the personality screen. But the screen itself had already filtered the candidate pool. We were auditing the survivors and calling it equity.
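Priya's point can be made concrete with toy numbers. Everything below is invented for illustration, but it shows the trap exactly: demographic parity computed only among people who passed the screen can look perfectly clean even when the screen itself removed one group disproportionately.

```python
# All numbers are invented to illustrate the audit gap, not real client data.
applicants    = {"neurotypical": 1000, "neurodivergent": 1000}
passed_screen = {"neurotypical": 800,  "neurodivergent": 200}  # the personality screen
hired         = {"neurotypical": 80,   "neurodivergent": 20}

for group in applicants:
    post_screen_rate = hired[group] / passed_screen[group]  # what our audit measured
    end_to_end_rate  = hired[group] / applicants[group]     # what it should have measured
    print(f"{group}: post-screen {post_screen_rate:.2f}, end-to-end {end_to_end_rate:.2f}")

# Post-screen hire rates are identical (parity!); end-to-end rates differ fourfold.
```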
That's when I understood the fundamental flaw in the "wrapper" approach to AI fairness. A wrapper takes an existing foundation model — GPT-4, whatever — passes data through it, and presents the output. You can add bias checks on top. You can post-process the results. But the model's internal representations have already encoded the biases of its training data. You're putting a fairness sticker on a fundamentally unfair machine.
The hiring data these models train on reflects decades of neurotypical preference. When the model deploys, its decisions feed back into future training sets. Reserved candidates get rejected, so the model learns that "reserved" predicts rejection, so it rejects more reserved candidates. The loop tightens. The bias compounds. And the dashboard says everything is fine.
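The compounding loop can be sketched with a deliberately simple toy model. The starting rates and the amplification factor are invented, but the dynamic is the one described above: a small initial skew widens with every retraining cycle while each individual step looks unremarkable.

```python
# Toy model of the feedback loop: each retraining cycle, the screen leans a
# little harder on whatever correlated with rejection in its own past output.
# Starting rates and AMPLIFY are invented for illustration.
reject_rate = {"outgoing": 0.30, "reserved": 0.35}  # small initial skew
AMPLIFY = 1.05  # hypothetical strength of the re-learned correlation

for cycle in range(1, 6):
    for style in reject_rate:
        reject_rate[style] = min(0.95, reject_rate[style] * AMPLIFY)
    gap = reject_rate["reserved"] - reject_rate["outgoing"]
    print(f"cycle {cycle}: rejection-rate gap = {gap:.3f}")  # the gap grows every cycle
```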
How Do You Actually Build Hiring AI That Doesn't Discriminate?

This is the question I've spent the last several years trying to answer. Not "how do you make AI less biased" — that framing accepts the current architecture and tries to patch it. The real question is: how do you build systems where the bias can't hide?
The approach we've developed at Veriprajna rests on one core insight: correlation is where discrimination hides. Traditional machine learning finds patterns in data. If neurotypical communication style correlates with getting hired, the model will use communication style as a proxy for hire-worthiness. It doesn't know it's discriminating. It's just optimizing.
To break this, you need causal reasoning, not just statistical pattern-matching.
We use something called Causal Representation Learning. Instead of asking "what features predict hiring success?", we ask "what features predict hiring success that aren't causally downstream of a protected characteristic?" It's a fundamentally different question, and it requires a fundamentally different architecture.
Think of it this way. Imagine a candidate's profile as a web of connected attributes. Some connections are legitimate — years of experience connects to skill level. But some connections run through protected territory — communication style connects to neurotype, which connects to how a personality test scores you, which connects to whether you get an interview. Causal Representation Learning maps these pathways and mathematically severs the illegitimate ones.
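Here is a minimal sketch of the pathway-severing idea. The graph, node names, and edges are all invented for illustration, and real causal discovery is far harder than a hand-written adjacency list — but the core operation is the same: find everything causally downstream of the protected node and exclude it from the admissible feature set.

```python
# Hypothetical causal graph for one candidate profile. Edge direction means
# "causally influences". Node names are invented for this sketch.
edges = {
    "years_experience":    ["skill_level"],
    "skill_level":         ["job_performance"],
    "neurotype":           ["communication_style"],   # protected characteristic
    "communication_style": ["personality_score"],
    "personality_score":   ["interview_invite"],
}

def descendants(node, graph):
    """All nodes causally downstream of `node`."""
    seen, stack = set(), list(graph.get(node, []))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, []))
    return seen

tainted = descendants("neurotype", edges)
all_nodes = set(edges) | {n for targets in edges.values() for n in targets}
admissible = all_nodes - tainted - {"neurotype", "job_performance", "interview_invite"}
print(sorted(admissible))  # → ['skill_level', 'years_experience']
```

Only features with no causal path from the protected node survive; everything routed through neurotype is severed before any model sees it.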
We pair this with adversarial training — a technique where we pit two models against each other. One model tries to predict job performance. The other tries to guess the candidate's disability status from the first model's internal representations. If the adversary succeeds, it means the predictor is leaking protected information, and the system penalizes it. Over training cycles, the predictor learns to make decisions that genuinely can't be reverse-engineered to reveal someone's neurotype.
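The two-player dynamic can be caricatured in a few lines. Real systems implement this with neural networks and gradient-reversal layers; every quantity below is invented. What the sketch shows is the equilibrium the training seeks: the predictor's representation stops leaking protected information, and the adversary's accuracy collapses toward chance.

```python
# Caricature of the adversarial objective; all numbers are invented.
# `leak` stands in for how much protected information the predictor's internal
# representation carries; the adversary can exploit exactly that much.
leak = 1.0   # start fully leaky
lr = 0.3     # hypothetical learning rate

for step in range(20):
    adversary_accuracy = 0.5 + 0.5 * leak   # chance level is 0.5
    penalty = adversary_accuracy - 0.5      # predictor is penalized for the leak
    leak = max(0.0, leak - lr * penalty)    # predictor learns to leak less

# After training, the adversary should be back near coin-flip accuracy.
print(round(leak, 3), round(0.5 + 0.5 * leak, 3))
```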
And then there's counterfactual testing — the part I find most intellectually honest. We take a real candidate's data, generate a synthetic twin where only the protected characteristic changes, and check whether the model's recommendation stays the same. Not "are group-level statistics balanced?" but "would this specific person get a different outcome if they weren't autistic?" That's the question the ADA actually asks. That's the question most hiring AI can't answer.
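A minimal version of the synthetic-twin check, assuming a black-box scoring function and a flat feature dictionary — both invented here for illustration, not Veriprajna's actual API:

```python
def counterfactual_consistent(model, candidate: dict, key: str, counterfactual) -> bool:
    """True iff changing ONLY the protected attribute leaves the decision unchanged."""
    twin = dict(candidate)
    twin[key] = counterfactual
    return model(candidate) == model(twin)

# Two toy black-box models (invented): one leaks the protected attribute, one doesn't.
biased = lambda c: "interview" if c["score"] > 0.6 and not c["autistic"] else "reject"
fair   = lambda c: "interview" if c["score"] > 0.6 else "reject"

cand = {"score": 0.8, "autistic": True}
print(counterfactual_consistent(biased, cand, "autistic", False))  # False — fails the check
print(counterfactual_consistent(fair,   cand, "autistic", False))  # True
```

The test is per-person, not per-group: a model can pass every parity dashboard and still fail this check for a specific candidate.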
For the full technical breakdown of these methods — the math behind interventional invariance, the adversarial loss functions, the structural causal models — see our technical research paper.
The Regulators Aren't Waiting Anymore
One thing the Aon complaint made unmistakably clear: the era of "move fast and audit later" is over.
The FTC's "Operation AI Comply" initiative has already resulted in enforcement actions against companies making unsubstantiated AI claims. DoNotPay got hit with a $193,000 fine for overpromising what its AI legal tool could do. Rytr was targeted for generating fake reviews. The FTC has been explicit: if you claim your tool is "bias-free," you'd better have the empirical evidence to prove it. "We trained it on big data" is not evidence. It's a confession.
The EEOC, meanwhile, has made algorithmic discrimination a top enforcement priority. Their position is straightforward: employers are legally responsible for discrimination caused by the AI tools they purchase, even if the vendor sold them a bill of goods about fairness. You can't outsource your civil rights obligations to a software contract.
People sometimes ask me whether this regulatory pressure will slow down AI adoption in hiring. I think it's the wrong question. The pressure will slow down bad AI adoption. It will accelerate the market for tools that can actually demonstrate fairness — not with marketing copy, but with auditable evidence. Companies that invested in rigorous architecture will have a massive advantage. Companies that bought wrappers will have a massive legal bill.
Designing for Brains That Work Differently
There's a deeper issue underneath the technical and legal arguments, and it's the one I care about most.
Most hiring AI is built on what disability scholars call the "medical deficit" model — the assumption that neurodivergent traits are deviations from a norm that need to be detected and screened out. The entire architecture presupposes that there's a "correct" way for a brain to work, and the algorithm's job is to find candidates whose brains work that way.
This is not just ethically bankrupt. It's strategically idiotic.
Neurodivergent individuals frequently excel at exactly the capabilities companies say they're desperate for: deep pattern recognition, sustained attention to detail, creative problem-solving that breaks out of conventional frames. A hiring system that screens for "liveliness" and "social boldness" is systematically filtering out the people most likely to see what everyone else misses.
At Veriprajna, we've started building what I think of as temporally elastic assessment systems. Instead of comparing every candidate to a neurotypical baseline — average response time, typical speech cadence, expected emotional expression — the system establishes an individual baseline during the early stages of interaction. It learns what "normal" looks like for this person, not for some abstract average.
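The idea reduces to a change of reference frame. In this sketch the "pause length" feature and every number are invented: the same five-second pause is an extreme outlier against a hypothetical population average, but entirely typical against the candidate's own calibration window.

```python
from statistics import mean, stdev

def personal_z(calibration: list[float], value: float) -> float:
    """Deviation of `value` from the candidate's OWN calibration window."""
    mu, sigma = mean(calibration), stdev(calibration)
    return (value - mu) / sigma if sigma else 0.0

pop_mu, pop_sigma = 2.0, 0.5              # hypothetical population pause statistics (s)
candidate_warmup = [5.1, 4.8, 5.3, 4.9]   # this candidate pauses longer, consistently

pause = 5.0
print(round((pause - pop_mu) / pop_sigma, 2))         # vs. population: a six-sigma "outlier"
print(round(personal_z(candidate_warmup, pause), 2))  # vs. self: well within normal range
```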
We also advocate hard for what should be obvious: every automated assessment must include a clear, penalty-free option to request a human alternative. The ADA requires reasonable accommodation. But beyond legal compliance, it's just good engineering. Any system that breaks when a user asks for a different interface is a fragile system.
The Question Nobody Wants to Answer
When I present this work, there's always a moment of uncomfortable silence. Usually it comes after I point out that the same AI tools Fortune 500 companies are using to "improve diversity" may be systematically excluding disabled candidates. Someone in the room — usually someone who signed the contract with the vendor — shifts in their seat.
The uncomfortable truth is that most enterprises have never audited their hiring AI for disability bias. They've checked for racial and gender disparities because those are the metrics regulators have historically focused on. But neurodivergence? It's not even in the dashboard.
The Aon complaint changes this. Not because Aon is uniquely bad — they're representative of an industry-wide approach. It changes things because it names the mechanism. It shows exactly how a "personality assessment" becomes a disability screen. And once you've seen it, you can't unsee it.
I think about my friend — the brilliant engineer who couldn't get past the personality screen. He eventually got hired by a company that did a live technical assessment instead. Within six months, he'd redesigned their entire data pipeline. The eleven companies that rejected him didn't just miss out on a good hire. They were told by an algorithm that he wasn't worth talking to.
That's not a bias problem. That's a broken system telling itself it works.
Where This Goes Next
The Aon-ACLU complaint isn't the end of something. It's the beginning of a reckoning that will reshape how every enterprise thinks about AI in human capital decisions.
By the time this wave of enforcement and litigation crests, the companies that will be standing are the ones that treated AI governance as an engineering discipline, not a PR exercise. The ones that demanded causal logic instead of correlation. The ones that audited for individual fairness, not just demographic parity. The ones that designed for the full spectrum of human cognition, not just the slice that happens to match the training data.
I didn't start Veriprajna to build compliance tools. I started it because I believe AI can be the most powerful equalizer in the history of hiring — but only if we build it right. Not wrappers on biased models. Not personality proxies dressed up as psychometrics. Deep systems that understand the difference between what a person can do and how their brain happens to be wired.
The algorithm that rejected my friend eleven times wasn't evil. It was just shallow. And in hiring, shallow is the same thing as discriminatory.
We can build deeper. We have to.


