
An AI "Fixed" a Wrecked Car and Denied the Claim. That's When I Knew the Industry Had a Problem.
I stared at two photos of the same car.
The first was taken by a policyholder after a rear-end collision. Crunched metal, paint scraped to bare steel, a bumper that looked like it had been used as a speed bump. The second photo — supposedly the same vehicle, processed through the insurer's shiny new AI tool — showed a pristine rear end. Smooth lines, perfect paint, not a scratch. The automated claims engine looked at that second image and did exactly what you'd expect: it denied the claim. Zero visible damage.
The policyholder, standing in their driveway next to a car that very obviously had a destroyed bumper, sued for bad faith. And the insurer was left holding a piece of digitally fabricated evidence that contradicted physical reality.
This is the "Pristine Bumper" incident, and when I first read the details, I felt a mix of horror and vindication. Horror because an AI had effectively committed evidence spoliation — altering a legal record in a way that harmed a real person. Vindication because this was the exact failure mode my team and I had been warning about for months, the reason we built Veriprajna the way we did.
The insurance industry doesn't have an AI problem. It has a truth problem. And the tools most carriers are rushing to adopt are making it worse.
The Night the Dent Disappeared
Let me explain what actually happened in that bumper case, because the technical mechanism matters.
The insurer had integrated a generative AI tool into their mobile claims app. The stated goal was innocent enough: "enhance" the quality of customer-uploaded photos so adjusters could see damage more clearly. Better lighting, sharper details, that kind of thing.
But here's what generative image models actually do. They're trained on billions of images to learn what things should look like. In the model's mathematical universe — its latent space — a "car" is overwhelmingly represented as a smooth, symmetrical object with unbroken surfaces. That's what cars look like in the vast majority of photos on the internet.
So when this model encountered a dent, it didn't see damage. It saw noise. A statistical anomaly. A deviation from the expected pattern of "car." And it did what it was designed to do: it removed the noise. The model used a process called inpainting to digitally smooth the crunched metal back into a perfect fender, pixel by pixel.
A dent, to a diffusion model, looks like noise. The model removes it. In art, that's a feature. In insurance, it's the automated spoliation of evidence.
This wasn't a bug. The model worked exactly as designed. That's the part that keeps me up at night.
Why Does Generative AI Keep Getting This Wrong?

I remember a conversation with a potential investor early on — maybe six months into building Veriprajna. He'd just come from a demo of another InsurTech startup, one that was using GPT-4 Vision to classify vehicle damage from photos. "Why aren't you just wrapping GPT?" he asked. "It's faster. It's cheaper. The demo looked great."
I pulled up two images on my laptop. One was a real photo of hail damage on a black sedan — tiny dimples invisible to the untrained eye, but clearly warping the reflections on the hood. The other was a deepfake I'd generated in about four minutes using a consumer-grade image tool: a pristine car with a digitally painted crack across the windshield.
I asked him: "Which one has real damage?"
He pointed to the deepfake.
That's the problem. Generative AI models — the ones powering the vast majority of "AI claims" startups right now — operate on semantic plausibility, not forensic reality. They're trained to understand what things look like, not what things are. A model that's brilliant at generating photorealistic images of cars is, by the exact same mechanism, terrible at determining whether the damage in a photo is real, synthetic, or digitally erased.
And the companies building on top of these models? Most of them are what the industry calls wrappers — thin interface layers over someone else's API. They don't own the model. They don't control the training data. They can't explain why a decision was made. If OpenAI retrains its model tomorrow to favor more "aesthetically pleasing" outputs, a wrapper's damage assessment tool might start repairing cars with greater enthusiasm, and the InsurTech company wouldn't even know it happened.
The insurer, meanwhile, retains 100% of the liability.
I wrote about this dependency problem in more depth in the interactive version of our research, but the short version is: if you don't own the brain making decisions about your claims, you don't control your risk.
What Happens When Fraudsters Get the Same Tools?
Here's the twist that makes this even worse.
While insurers are accidentally using AI to delete damage, fraudsters are using the same technology to manufacture it. The barrier to entry for insurance fraud has essentially collapsed.
Someone can now photograph a perfectly intact vehicle, open a consumer image-generation tool, and prompt it to "add a smashed front bumper" or "simulate fire damage." Modern inpainting handles lighting, shadows, and reflections with terrifying realism. A standard AI image classifier — the kind most carriers use — will look at that deepfake and confirm: yes, this is a smashed car. It fails because it evaluates content, not the structural fingerprint of how the image was generated.
It gets darker. Criminal rings are using generative AI to create synthetic identities — hyper-realistic faces of people who don't exist, fake driver's licenses, fabricated medical records. These digital phantoms purchase policies, pay premiums for a few months to build legitimacy, then file catastrophic claims. In life insurance, that means AI-generated obituaries and coroner's reports; in health insurance, X-rays showing fractures that never happened.
And the traditional defenses are failing. AI-generated images often have scrubbed or synthesized metadata. Human reviewers? Research shows they perform barely better than a coin flip at detecting high-quality deepfakes.
The same technology that lets an insurer "enhance" a photo lets a fraudster fabricate one. And most AI tools in the market can't tell the difference.
This is the arms race nobody in InsurTech wants to talk about honestly.
The Magnifying Glass, Not the Paintbrush

There was a specific moment when the philosophy behind Veriprajna crystallized for me. My team and I were arguing — genuinely arguing, voices raised — about our technical approach.
One of our engineers wanted to fine-tune a large vision-language model for damage classification. It would have been faster to build, easier to demo, and frankly, it would have looked more impressive to investors. "The market wants generative," he said. "That's where the funding is."
I pulled up the Pristine Bumper case on the conference room screen. "This is where generative gets you," I said. "A lawsuit and a fabricated record."
The room went quiet. Then our lead computer vision researcher — who'd spent years in industrial inspection before joining us — said something I've never forgotten: "An adjuster doesn't need a paintbrush. They need a magnifying glass."
That became our design principle. We don't generate anything. We don't modify a single pixel. We measure.
Our architecture has three layers, and each one treats the image as evidence, not raw material:
Semantic segmentation identifies damage at the pixel level. Not "this car is damaged" — that's useless. Our models classify every individual pixel: this pixel is undamaged paint, this pixel is a scratch, this pixel is a dent, this pixel is rust. The output is a precise mask overlaid on the original, untouched image. Because we know the physical dimensions of specific car parts — a 2024 Toyota Camry bumper is 180cm wide — we can calculate the exact square-centimeter area of damage. That number feeds directly into repair estimation software.
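The pixel-to-area arithmetic is simple once you have a calibrated reference part. Here's a minimal sketch in Python — the class IDs, the `damage_area_cm2` helper, and the toy mask are illustrative assumptions, not our production code:

```python
import numpy as np

# Hypothetical class IDs for a per-pixel damage mask (illustrative schema).
UNDAMAGED, SCRATCH, DENT, RUST = 0, 1, 2, 3

def damage_area_cm2(mask: np.ndarray, part_width_px: int,
                    part_width_cm: float) -> dict:
    """Convert per-pixel damage counts into physical surface areas.

    mask           -- 2D array of class IDs, one per pixel
    part_width_px  -- width of the known reference part in the image, in pixels
    part_width_cm  -- real-world width of that part (e.g. 180 cm for a bumper)
    """
    cm_per_px = part_width_cm / part_width_px   # linear scale factor
    px_area_cm2 = cm_per_px ** 2                # area of one pixel in cm^2
    return {
        name: float(np.count_nonzero(mask == cls) * px_area_cm2)
        for name, cls in [("scratch", SCRATCH), ("dent", DENT), ("rust", RUST)]
    }

# Toy example: a 200-px-wide bumper crop containing a 40x30-px dent region.
mask = np.zeros((100, 200), dtype=np.uint8)
mask[30:60, 80:120] = DENT
areas = damage_area_cm2(mask, part_width_px=200, part_width_cm=180.0)
```

With a 180 cm bumper spanning 200 pixels, each pixel covers 0.81 cm², so the 1,200-pixel dent region maps to 972 cm² — the kind of number that can feed directly into an estimation system.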
Monocular depth estimation solves the problem that killed the bumper case: understanding 3D geometry from a flat photo. By training on massive datasets of car geometries with LiDAR ground truth, our models learn what the curvature of a wheel arch should look like, what the flatness of a door panel means. A dent shows up as a sinkhole in the depth map. We calculate gradients — a steep gradient means a sharp crease that probably needs panel replacement; a shallow gradient means a soft dent repairable with paintless dent repair. We can estimate the displaced volume of metal. Not a guess. A measurement.
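The crease-versus-soft-dent triage reduces to gradient analysis on the depth map. A minimal sketch, with an entirely illustrative threshold (real systems would calibrate this per panel and per vehicle):

```python
import numpy as np

def classify_dent(depth_mm: np.ndarray,
                  crease_grad_mm_per_px: float = 0.5) -> dict:
    """Triage a dent from its depth map (illustrative threshold only).

    depth_mm -- per-pixel deviation from the expected panel surface, in mm
    """
    gy, gx = np.gradient(depth_mm)              # depth change per pixel
    grad_mag = np.hypot(gx, gy)
    return {
        "max_depth_mm": float(depth_mm.max()),
        "max_gradient": float(grad_mag.max()),
        # Steep walls suggest a sharp crease (panel replacement);
        # shallow walls suggest a soft dent (paintless dent repair).
        "recommendation": ("replace_panel"
                           if grad_mag.max() > crease_grad_mm_per_px
                           else "paintless_dent_repair"),
    }

# Synthetic example: a smooth 12 mm Gaussian dent on a 101x101 patch.
y, x = np.mgrid[0:101, 0:101]
depth = 12.0 * np.exp(-((x - 50) ** 2 + (y - 50) ** 2) / (2 * 20.0 ** 2))
result = classify_dent(depth)
```

The Gaussian dent is 12 mm deep but its walls are gentle, so the gradient stays below the crease threshold and the sketch recommends paintless dent repair — the same depth with a sharp-edged profile would flip the recommendation.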
Specular reflection analysis is the layer I'm most proud of, because it catches what everything else misses. Modern cars are shiny. Their surfaces act as mirrors. A dent on a glossy black car might not change the color of the pixels at all — but it warps the reflection. Straight lines in the environment (horizons, power lines, building edges) should follow the car's body curvature when reflected. A dent acts like a funhouse mirror, causing those lines to pinch, swirl, or break. We trained our models to decouple paint color from reflection patterns and reconstruct the surface normal map — a 3D vector representing the angle of the surface at every pixel. This detects hail damage invisible to the naked eye, structural buckling far from the impact site, and even previous repairs where sanding marks disrupt the clear coat's specularity.
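One way to make the funhouse-mirror intuition concrete: trace a reflected straight edge across a panel and measure how badly it deviates from the smooth curve the body geometry predicts. This toy metric (a low-order polynomial fit standing in for the panel's expected curvature) is a simplification of the idea, not the actual reconstruction pipeline:

```python
import numpy as np

def reflection_distortion_score(xs: np.ndarray, ys: np.ndarray,
                                degree: int = 2) -> float:
    """Toy distortion metric for a reflected line traced across a panel.

    On a healthy panel, a reflected straight edge follows the panel's
    smooth curvature, so a low-order polynomial fits it well. A dent
    makes the reflection pinch or swirl, leaving large local residuals.
    """
    coeffs = np.polyfit(xs, ys, degree)         # expected smooth curve
    residuals = ys - np.polyval(coeffs, xs)
    return float(np.max(np.abs(residuals)))     # worst local deviation

xs = np.linspace(0.0, 100.0, 201)
smooth = 0.002 * xs ** 2 + 0.1 * xs        # healthy panel: gentle curve
dented = smooth.copy()
dented[90:110] += 5.0 * np.hanning(20)     # localized pinch from a dent
score_smooth = reflection_distortion_score(xs, smooth)
score_dented = reflection_distortion_score(xs, dented)
```

The healthy trace scores near zero; the pinched one scores orders of magnitude higher — a measurable anomaly even when the paint color of every pixel is unchanged.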
For the full technical breakdown of all three layers, see our research paper.
Why Can't Insurers Just Explain Their AI Decisions?

This is the question regulators are now asking, loudly, and most carriers don't have a good answer.
The NAIC — the National Association of Insurance Commissioners — issued a Model Bulletin that fundamentally changed the compliance landscape. It places responsibility for AI outcomes squarely on the insurer, even when the AI is a third-party tool. You cannot hide behind the wrapper excuse. If your vendor's model hallucinates or discriminates, you are liable. The bulletin mandates written governance programs, due diligence on vendor data lineage and model architecture, and — critically — the ability to explain any AI-driven decision to a policyholder.
Try explaining a claim denial powered by a generative model. "The model's probabilistic distribution preferred a smooth bumper" is not going to survive a courtroom.
Now compare that to what our system produces: "The claim was processed based on the detection of damage on the rear-left quarter panel. The system identified a scratch measuring 14cm in length and a dent with a surface area of 45cm², validated by depth map analysis." That's empirically verifiable. That's admissible.
The EU AI Act goes further. AI used for insurance risk assessment involving natural persons is classified as high-risk, triggering mandatory data governance, automatic event logging, and human oversight requirements. Our mask overlay technology — where the adjuster sees the original photo with a togglable analysis layer — is specifically designed for this. We don't replace the human. We augment them. They remain the decision-maker, which is a critical safe harbor under the Act.
And then there's spoliation. In the US legal system, altering evidence relevant to a legal proceeding — even unintentionally — can result in sanctions, adverse inference instructions (where the jury is told to assume the lost evidence was damaging to you), or summary judgment. When a generative AI tool introduces synthetic pixels into a claim photo, that is technically alteration. If the original was overwritten, that's spoliation.
We hash every original image with SHA-256 the instant it arrives. Our AI reads the image buffer but never writes to it. All analysis — masks, depth maps, reports — is saved as separate sidecar files linked to the original hash. Every access is logged. The evidence stays pristine.
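The pattern is small enough to show in full. This is a hypothetical sketch of the hash-and-sidecar discipline described above — the file layout and function names are illustrative, but the invariant is the one that matters: the original is written once, keyed by its own digest, and analysis artifacts only ever reference that digest.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def ingest_evidence(image_bytes: bytes, vault: Path) -> str:
    """Store an original image immutably; return its SHA-256 digest."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    original = vault / f"{digest}.jpg"
    if not original.exists():
        original.write_bytes(image_bytes)   # write-once original
    return digest

def save_analysis(vault: Path, digest: str, findings: dict) -> Path:
    """Write a sidecar report linked to the original only by its hash."""
    sidecar = vault / f"{digest}.analysis.json"
    sidecar.write_text(json.dumps({"original_sha256": digest, **findings}))
    return sidecar

def verify_integrity(vault: Path, digest: str) -> bool:
    """Re-hash the stored original and confirm it is bit-identical."""
    data = (vault / f"{digest}.jpg").read_bytes()
    return hashlib.sha256(data).hexdigest() == digest

# Usage sketch (a temp directory stands in for the evidence vault):
vault = Path(tempfile.mkdtemp())
digest = ingest_evidence(b"fake-jpeg-bytes", vault)
report = save_analysis(vault, digest, {"dent_area_cm2": 45})
intact = verify_integrity(vault, digest)
```

Because the analysis lives in sidecar files, any tampering with the original — by a human or by an over-eager model — breaks the hash check immediately.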
If your AI can't prove it didn't alter the evidence, you've already lost the case before it starts.
The Arms Race Nobody Prepared For
People ask me sometimes whether deterministic computer vision is "enough" — whether we're being too conservative by refusing to use generative models.
I think they're asking the wrong question.
The right question is: what happens when your claims system can't distinguish between a real photo and a synthetic one? What happens when a fraudster's deepfake passes your AI classifier with higher confidence than a legitimate claim? What happens when your "enhancement" tool quietly fabricates evidence in a case that ends up in federal court?
Those aren't hypotheticals. They're happening now. And the carriers using general-purpose generative models as their first line of defense are bringing a paintbrush to a forensic investigation.
Our models are deterministic. You cannot prompt-inject a semantic segmentation network. You can't sweet-talk a depth estimation model into ignoring a dent. These systems operate on pixel intensity gradients and texture analysis — they extract features from the physical properties of light hitting a camera sensor. There's no instruction-following mechanism to exploit.
That's not conservatism. That's engineering for a world where the adversary has access to the same generative tools you do.
The Adjuster's Screen
I want to end with an image — not a photo, but a picture of what I think the future looks like.
An adjuster opens their dashboard. They don't see a "fixed" car. They don't see an AI's best guess at what the car might have looked like before the accident. They see the actual photo, taken by the policyholder, with a togglable damage mask showing exactly where the AI detected scratches, dents, and rust. They see a depth heatmap revealing that the dent on the rear quarter panel is 12mm deep with a steep gradient — sharp crease, likely needs replacement. They see the reflection analysis flagging subtle buckling three inches from the impact site that no human eye would catch.
They see an audit trail explaining every finding. And they make the call.
The AI didn't decide. It illuminated. The evidence wasn't altered. It was revealed.
That's the difference between a system that creates plausible fictions and one that measures inconvenient truths. The insurance industry was built on the principle that you pay for what actually happened — not for what a model thinks probably happened. Every pixel in a claim photo is a piece of evidence. The moment you let an AI change even one of them, you've left the domain of truth and entered the domain of probability.
And probability, in a courtroom, is just another word for reasonable doubt.