Healthcare Insurance AI Governance

Your AI Makes Coverage Decisions.
Can You Defend Them in Court?

The Lokken v. UnitedHealth class action established that a 90% appeal overturn rate is not merely a technical problem. It is evidence of breach of contract. A federal court is now reviewing nH Predict's internal development documents, training data, and validation reports.

If your Medicare Advantage plan uses AI in utilization management, prior authorization, or claims processing, the question is not whether your algorithms will face scrutiny. It is whether they will survive it.

90%

AI denials reversed on appeal

Lokken v. UnitedHealth litigation filings

$19.7B

Annual provider spending fighting denials

AMA / industry data, 2025

March 2026

CMS PA metrics now publicly reported

CMS-0057-F Phase 2 deadline

How Utilization Management AI Creates Liability

The nH Predict failure was not a software bug. It was an architectural flaw that applies to most AI systems deployed in Medicare Advantage coverage decisions today.

The Mechanics of Algorithmic Denial

Here is how a typical UM AI workflow generates liability. A prior authorization request arrives with a diagnosis code (ICD-10), a procedure code (CPT/HCPCS), patient demographics, and clinical notes. The AI model cross-references this against a training dataset of historical claims to predict length of stay, medical necessity, or approval probability.

The failure point is what the model weights versus what it ignores. nH Predict weighted diagnosis-based recovery timelines heavily but assigned minimal weight to individual clinical indicators like blood oxygen levels, caregiver availability, or comorbidity interactions. A patient with methemoglobinemia (a life-threatening blood disorder) was discharged based on the average recovery timeline for her diagnosis group, not her actual clinical status. Her family paid $16,768 out-of-pocket to prevent premature discharge.

This is not an edge case. It is the predictable outcome of deploying a correlation-driven model in a domain where individual clinical variation determines medical necessity. The model optimizes for population-level throughput. Medicare coverage standards require individual-level clinical judgment.

When NaviHealth managers narrowed the acceptable variance from nH Predict's projections from 3% to 1%, they converted a decision-support tool into an automated gatekeeper. Clinicians who overrode the algorithm faced disciplinary action. At that point, the "human-in-the-loop" became performative, and every denial generated by the system carried the full weight of contractual and regulatory liability.

The Contractual Trap

Your Evidence of Coverage documents promise that coverage decisions are made by "clinical services staff" and "physicians." If your AI makes the determination and a human rubber-stamps it, you have the same breach-of-contract exposure the Lokken court identified. Review your EOC language against your actual UM workflow. If they diverge, opposing counsel will find the gap.

The Discovery Problem

The March 2026 discovery order in Lokken (2026 WL 658883) granted plaintiffs access to AI development documents, training data specifications, and validation reports. Every MAO should now assume their AI documentation is discoverable. If your model lacks structured decision logs, version-controlled training data records, and documented validation results, you cannot defend what you cannot reconstruct.

The Regulatory Timeline You Cannot Ignore

Three regulatory forces are converging on healthcare AI governance simultaneously. Each has specific deadlines, specific requirements, and specific penalties.

CMS-0057-F: The Prior Authorization Final Rule

Jan 1, 2026 (in effect)

72-hour expedited PA turnaround. 7-day standard. No reopening of approved inpatient admissions except for fraud.

Mar 31, 2026 (current)

Public reporting of 8 PA metrics: denial rates, turnaround times, appeal overturn rates at contract level.

Jan 1, 2027

HL7 FHIR Prior Auth APIs required (CRD, DTR, PAS). Full electronic PA transaction trail.

State AG Enforcement

The Texas AG settled the first healthcare generative AI investigation (Pieces Technologies, September 2024), and the Texas Responsible AI Governance Act took effect in January 2026, granting the AG broad civil investigative demand power. Pennsylvania has introduced legislation requiring human provider review before any AI-driven denial, mandatory insurer disclosure of AI use, and annual compliance statements.

Multi-state MAOs face a patchwork: each state may impose different AI transparency, audit, and disclosure requirements. A single governance architecture must satisfy all of them.

EU AI Act (for plans with global operations)

Healthcare AI classified as "high-risk" under Annex III. Full compliance obligations by August 2027. Penalties up to 6% of global annual turnover. Requirements include risk management plans, training data documentation, human oversight mechanisms, and continuous post-deployment monitoring.

The convergence risk: CMS is simultaneously scaling its own AI-powered audit capability. Payment Year 2020 RADV audits began in February 2026 using anomaly detection to flag unsupported diagnoses and statistical outliers. CMS audits your AI while requiring you to govern it. The plans that build governance infrastructure first turn compliance from a burden into a competitive advantage.

Who Else Solves This (and Where They Stop)

Every MAO evaluating AI governance has five conventional options. Each addresses part of the problem. None addresses all of it.

AI Governance Platforms (Credo AI, Holistic AI, IBM Watsonx)
What you get: policy packs, compliance dashboards, bias monitoring, automated evidence collection.
Where it stops: monitors existing models but does not rebuild flawed decision architecture. If your UM AI is fundamentally wrong (like nH Predict), monitoring it better does not fix it.
Typical cost: $150K-500K/yr platform license.

PA Automation Vendors (Cohere Health, FinThrive, Availity)
What you get: faster PA processing, reduced admin cost (47% claimed by Cohere), improved turnaround times.
Where it stops: optimizes throughput, not defensibility. Does not produce per-decision explanations, demographic disparity analysis, or litigation-ready audit trails.
Typical cost: $200K-1M/yr depending on volume.

Big 4 / Large SIs (Deloitte, Accenture, McKinsey)
What you get: strategy, governance framework design, platform selection, implementation management.
Where it stops: they deploy packaged governance platforms (Credo AI, Watsonx) and write policy documents. They do not build custom explainability middleware for your specific Facets/QNXT configuration. Engagements take 6-18 months.
Typical cost: $500K-5M+ per engagement.

Claims Platform Vendors (Cognizant/TriZetto Facets, HealthEdge)
What you get: AI add-ons native to their claims platform, integrated analytics, UM modules.
Where it stops: conflict of interest. The same companies maintaining your claims platform sell AI add-ons for it and are not incentivized to surface governance gaps in their own systems. Vendor lock-in compounds the problem.
Typical cost: bundled with platform contract.

Internal Build
What you get: full control, no vendor dependencies, customization to your specific claims workflows.
Where it stops: requires specialized talent (ML engineers who understand CMS regulations, claims adjudication workflows, and legal defensibility simultaneously). Most MAO data science teams are optimized for analytics, not governance architecture. Build timeline is 12-24 months if the team exists.
Typical cost: $1-3M+ in talent and infrastructure.

Veriprajna
What you get: algorithmic audit, explainability middleware, CMS compliance architecture, and litigation readiness, custom-built for your claims stack.
Where it stops: we are a consultancy, not a platform. We build and hand off. If you need a permanent SaaS monitoring dashboard, you still need a governance platform (we help you select and integrate the right one). We do not replace your clinical operations team's judgment.
Typical cost: scoped per engagement.

What We Build for Medicare Advantage Organizations

Each capability is custom-built to integrate with your existing claims processing stack. We do not sell a platform. We build the specific governance infrastructure your plan needs.

Algorithmic Decision Audit

We reverse-engineer your UM AI to map every decision pathway. SHAP attribution analysis across a representative denial sample produces a feature importance map: which inputs drive denials, which clinical indicators are underweighted, and where demographic proxies (zip code, dual-eligible status) introduce disparity.

The output is a court-defensible audit report with feature attribution maps, demographic disparity analysis, and a risk-ranked list of decision pathways most likely to fail on appeal. For vendor black-box models, we include a vendor transparency assessment documenting what your vendor can and cannot produce under discovery.

Typical timeline: 6-10 weeks for a single UM model.
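To make the attribution step concrete, here is a minimal sketch assuming a toy linear denial-score model with invented feature names, weights, and baselines. For a linear model, the Shapley attribution of a feature reduces to its weight times its deviation from the baseline; real engagements run the shap library against your actual UM model rather than this stand-in.

```python
# Hypothetical sketch: per-decision feature attribution for a linear
# denial-score model. All feature names, weights, and baselines below
# are illustrative assumptions, not values from any real model.

BASELINE = {"recovery_timeline_days": 18.0, "prior_utilization": 0.40,
            "functional_limitation_score": 0.50}
WEIGHTS  = {"recovery_timeline_days": -0.03, "prior_utilization": 1.20,
            "functional_limitation_score": -0.90}

def attributions(case: dict) -> dict:
    """Per-feature contribution to the denial score vs. the baseline case."""
    return {f: WEIGHTS[f] * (case[f] - BASELINE[f]) for f in WEIGHTS}

def importance_map(cases: list[dict]) -> dict:
    """Mean absolute attribution across a denial sample: the feature importance map."""
    totals = {f: 0.0 for f in WEIGHTS}
    for c in cases:
        for f, a in attributions(c).items():
            totals[f] += abs(a)
    return {f: totals[f] / len(cases) for f in totals}

sample = [
    {"recovery_timeline_days": 12, "prior_utilization": 0.8,
     "functional_limitation_score": 0.2},
    {"recovery_timeline_days": 25, "prior_utilization": 0.6,
     "functional_limitation_score": 0.9},
]
print(importance_map(sample))
```

The aggregate map answers the audit's first question (which inputs drive denials overall); the per-case attributions answer the litigation question (why this patient was denied).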

Explainability Middleware

A decision explanation layer that sits between your claims platform (Facets, QNXT, HealthEdge) and your UM AI. Every coverage determination gets a structured explanation: which input features drove the decision, the model's confidence score, and a natural-language rationale a physician reviewer can read in under 30 seconds.

For low-confidence predictions or cases with comorbidities not well-represented in training data, the system routes to human review with pre-populated clinical context. This is not a monitoring dashboard. It is an architectural intervention that makes every individual decision auditable and explainable.

Integration points: REST API, HL7 FHIR-compatible, batch and real-time modes.
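A minimal sketch of the routing rule described above. The confidence floor, the rare-comorbidity list, and the rule that denials always receive clinician sign-off are illustrative assumptions for this example, not a fixed policy or a real API.

```python
# Hypothetical routing sketch: low-confidence predictions and cases with
# comorbidities under-represented in training data go to human review.
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.85                # assumption: below this, never auto-decide
RARE_COMORBIDITY_CODES = {"D74.9"}     # e.g. methemoglobinemia, sparse in training data

@dataclass
class Determination:
    recommendation: str                # "approve" or "deny"
    confidence: float
    comorbidities: frozenset

def route(d: Determination) -> str:
    """Return 'auto' or 'human_review' for a model recommendation."""
    if d.confidence < CONFIDENCE_FLOOR:
        return "human_review"
    if d.comorbidities & RARE_COMORBIDITY_CODES:
        return "human_review"
    if d.recommendation == "deny":
        return "human_review"          # denials always get clinician sign-off
    return "auto"

print(route(Determination("approve", 0.95, frozenset())))   # auto
print(route(Determination("deny", 0.97, frozenset())))      # human_review
```

The routing decision itself is logged alongside the explanation, so the audit trail shows not just what the model recommended but why a human did or did not see it.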

CMS Compliance Architecture

We design the technical infrastructure for CMS-0057-F compliance: PA metric collection pipelines mapping to all 8 required metrics, demographic fairness monitoring aligned with NIST AI RMF's MEASURE function, and an immutable audit trail for every AI-assisted coverage determination.

For the January 2027 FHIR API mandate, we build the CRD/DTR/PAS integration layer so your PA workflow produces a complete electronic transaction record by design. Plans that build this now can turn the compliance burden into operational intelligence: real-time visibility into PA patterns, bottlenecks, and denial hotspots before CMS sees them.

Scope: middleware that plugs into your existing claims stack. Not a platform replacement.

Litigation Readiness Engineering

After the March 2026 Lokken discovery order, every MAO should architect AI systems for legal defensibility from day one. We build tamper-evident decision logging with append-only storage and cryptographic hashing, version-controlled model documentation, and structured explanation records that meet the evidentiary standards emerging from the case.

We also run red-team exercises simulating plaintiff discovery requests. Our team walks through exactly what opposing counsel would request, what your systems can currently produce, and where the gaps create exposure. The goal is to identify defensibility gaps before litigation forces you to confront them under time pressure.

Deliverable: discovery readiness report + technical remediation plan.
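The hash-chain idea behind tamper-evident logging can be sketched in a few lines. This is a simplified illustration, not our production logger; a real deployment adds signed timestamps, write-once storage, and external anchoring of the chain head.

```python
# Minimal hash-chain sketch: each record's hash covers its content plus the
# previous record's hash, so any after-the-fact edit breaks the chain.
import hashlib
import json

class DecisionLog:
    def __init__(self):
        self.records = []
        self._last_hash = "0" * 64          # genesis value

    def append(self, record: dict) -> str:
        payload = json.dumps(record, sort_keys=True) + self._last_hash
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.records.append({"record": record, "hash": digest,
                             "prev": self._last_hash})
        self._last_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any altered record invalidates everything after it."""
        prev = "0" * 64
        for entry in self.records:
            payload = json.dumps(entry["record"], sort_keys=True) + prev
            if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

log = DecisionLog()
log.append({"case": "A-1", "model": "um-v3.2", "decision": "human_review"})
log.append({"case": "A-2", "model": "um-v3.2", "decision": "approve"})
print(log.verify())                                  # True
log.records[0]["record"]["decision"] = "deny"        # tampering...
print(log.verify())                                  # ...breaks the chain: False
```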

How an Engagement Works

Every engagement starts with the audit. The audit findings determine what to build. We do not prescribe a solution before understanding your specific claims architecture, UM workflows, and regulatory exposure.

1

Algorithmic Audit (6-10 weeks)

We map your AI decision pathways, run SHAP attribution on a representative sample of denials, analyze demographic disparity patterns, and assess your vendor's documentation against discovery standards. Output: a risk-ranked report identifying which decision pathways carry the highest litigation and regulatory exposure.

Requires: access to model predictions and input features (not source code), 12-24 months of denial data with outcomes, claims system architecture documentation.

2

Architecture Design (4-6 weeks)

Based on audit findings, we design the explainability middleware, compliance pipelines, and litigation readiness infrastructure specific to your claims stack. This phase produces detailed technical specifications, integration diagrams, and a phased implementation plan.

Joint working sessions with your data science, clinical operations, and compliance teams. We need to understand not just the technology but the human workflow around it.

3

Build and Integration (8-16 weeks)

We build the governance middleware, integrate it with your claims platform, validate explanation quality against clinical reviewer feedback, and stress-test the system against edge cases identified during the audit. Deployment is incremental: one decision category at a time, starting with the highest-risk pathways.

Caveat: integration timelines depend heavily on your claims platform's API maturity. Facets (TriZetto) and QNXT have different middleware requirements. HealthEdge's API layer is generally more accessible. We scope realistically.

4

Handoff and Governance Operationalization (4-6 weeks)

We transfer ownership to your team with full documentation, runbooks, and monitoring protocols. We help establish or restructure your AI governance committee with a defined charter, escalation procedures, and a model change management process. The system is yours to operate.

Optional: quarterly governance review retainer for ongoing model validation, regulatory change assessment, and audit trail verification.

Total engagement timeline: 22-38 weeks from audit kickoff to full handoff. The audit phase (Phase 1) can run as a standalone engagement if you need to understand your exposure before committing to a build. Many plans start there.

Medicare Advantage AI Governance Readiness Assessment

Answer six questions about your current AI governance posture. The assessment produces a readiness score with specific next steps you can act on immediately, whether or not you engage Veriprajna.

1. Can you produce a per-decision explanation for any coverage determination your AI assisted with in the last 12 months?

2. Do you track denial rates segmented by patient demographics (age cohort, geography, dual-eligible status)?

3. What is your clinical reviewer override rate when the AI recommends a denial?

4. Do you have an AI governance committee with documented authority to halt or modify AI deployments?

5. Are you compliant with CMS-0057-F Phase 2 (PA metric reporting due March 31, 2026)?

6. If opposing counsel issued a discovery request for your AI's decision logic, training data, and validation results tomorrow, could you produce them within 30 days?

Questions Medicare Advantage Plans Ask About AI Governance

How do we audit AI algorithms used in prior authorization for Medicare Advantage?

Start with a decision pathway decomposition. Your UM AI makes coverage determinations based on inputs (diagnosis codes, procedure codes, patient demographics, historical utilization patterns). The audit traces every pathway to identify which features drive denials. We run SHAP attribution analysis across a representative sample of recent denials to produce a feature importance map.

The critical output is a disparity analysis: denial rates segmented by age cohort, geographic region, dual-eligible status, and diagnosis group. If your algorithm denies post-acute care at 22% for one demographic segment and 9% for another, that gap needs an explanation that will survive a plaintiff deposition.
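The segmentation itself is simple; what matters is running it routinely and explaining the gaps before someone else finds them. A minimal sketch, using an invented dual-eligible split and an illustrative sample that mirrors the 22%/9% example above:

```python
# Sketch of the disparity analysis: denial rates segmented by a demographic
# field, with the largest pairwise gap surfaced. Field names and sample
# data are illustrative assumptions.
from collections import defaultdict

def denial_rates(decisions, segment_key):
    """Denial rate per segment, as {segment: rate}."""
    counts = defaultdict(lambda: [0, 0])    # segment -> [denials, total]
    for d in decisions:
        seg = d[segment_key]
        counts[seg][1] += 1
        if d["denied"]:
            counts[seg][0] += 1
    return {seg: denials / total for seg, (denials, total) in counts.items()}

def disparity_gap(rates):
    """Largest gap between any two segments, in percentage points."""
    return (max(rates.values()) - min(rates.values())) * 100

decisions = (
    [{"dual_eligible": True,  "denied": True}]  * 22 +
    [{"dual_eligible": True,  "denied": False}] * 78 +
    [{"dual_eligible": False, "denied": True}]  * 9  +
    [{"dual_eligible": False, "denied": False}] * 91
)
rates = denial_rates(decisions, "dual_eligible")
print(rates)                          # {True: 0.22, False: 0.09}
print(round(disparity_gap(rates), 1))  # 13.0 points: needs a defensible explanation
```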

The audit also examines your model's training data vintage. If your UM AI was trained on 2019-2021 claims data, it learned denial patterns from a period when CMS oversight was lighter. Those patterns may no longer reflect current medical necessity standards or the clinical guidelines CMS references in its audit protocols. We flag stale training data as a litigation risk factor and recommend retraining schedules aligned with CMS guideline update cycles.

For plans running vendor black-box models (which describes most MAOs), the audit includes a vendor transparency assessment: what documentation does your vendor provide about model architecture, training data composition, and validation methodology? After the Lokken discovery order, this documentation is discoverable. If your vendor cannot produce it, that gap is your liability.

What does the nH Predict class action mean for other health plans using AI?

The Lokken v. UnitedHealth case established two precedents that apply to every MAO using AI in coverage decisions. First, the court ruled that substituting AI for the physician review promised in policy documents constitutes a potential breach of contract. If your member-facing materials say coverage decisions are made by "clinical staff," but your workflow routes determinations through an algorithm before (or instead of) physician review, you have the same contractual exposure UnitedHealth faces.

Second, the March 2026 discovery order (2026 WL 658883) granted plaintiffs access to internal AI development documents, training data specifications, and validation reports. This means every MAO should assume their AI documentation is discoverable in future litigation.

The practical implications: review your Evidence of Coverage documents and Summary of Benefits for language about how coverage decisions are made. If they reference "clinical review by physicians," your AI workflow must demonstrably support (not replace) that review. Implement decision logging that captures the AI recommendation, the human reviewer's assessment, and whether the human agreed or overrode the algorithm. Plans that can show a genuine human-in-the-loop process with documented override rates have a fundamentally different litigation posture than plans where the AI output is rubber-stamped.

How do we make AI coverage decisions court-defensible?

Court-defensibility requires three layers. The explanation layer produces a structured rationale for each coverage determination that a non-technical audience (judge, jury, CMS auditor) can understand. This is not a raw SHAP plot. It is a natural-language statement like: "Coverage for 14 additional days of skilled nursing was denied because the model weighted diagnosis recovery timeline (42% influence) and prior utilization pattern (31% influence) above the patient's reported functional limitations (8% influence)." When opposing counsel asks why a specific patient was denied, you produce this record in minutes.
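As a sketch, the rendering step from attribution scores to that plain-language rationale might look like the following. The labels and percentages mirror the example above and are illustrative, not output from a real model.

```python
# Hypothetical renderer: turn per-decision feature influences into a
# plain-language rationale a judge, jury, or CMS auditor can read.
def rationale(decision: str, service: str, influences: dict) -> str:
    ranked = sorted(influences.items(), key=lambda kv: kv[1], reverse=True)
    parts = [f"{label} ({int(share * 100)}% influence)" for label, share in ranked]
    return (f"Coverage for {service} was {decision} because the model weighted "
            + ", ".join(parts[:-1]) + f" above {parts[-1]}.")

print(rationale(
    "denied", "14 additional days of skilled nursing",
    {"diagnosis recovery timeline": 0.42,
     "prior utilization pattern": 0.31,
     "the patient's reported functional limitations": 0.08},
))
```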

The audit trail layer captures decision metadata with tamper-evident logging: model version, input features, confidence score, routing decision (auto-approve, auto-deny, or human review), reviewer identity, and final determination. We use append-only storage with cryptographic hashing so the record cannot be altered after the fact. In Lokken, one of UnitedHealth's vulnerabilities was the inability to reconstruct exactly how nH Predict reached specific determinations for specific patients.

The override documentation layer tracks every instance where a human reviewer disagreed with the AI recommendation. Courts will examine your override rate. If it is near zero, it suggests the human review is performative. If it is 15-25%, it demonstrates genuine clinical judgment. We help you establish thresholds and escalation protocols that produce a defensible override pattern.
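A sketch of how that override pattern can be monitored, treating the 15-25% range discussed above as a heuristic band rather than a regulatory standard:

```python
# Hypothetical override monitor: compute the clinician override rate on
# AI-recommended denials and flag rates suggesting rubber-stamping (too low)
# or model quality problems (too high). Thresholds are illustrative.
def override_rate(reviews):
    """Share of AI denial recommendations that a clinician overrode."""
    denials = [r for r in reviews if r["ai_recommendation"] == "deny"]
    if not denials:
        return None
    overrides = sum(1 for r in denials if r["final"] != "deny")
    return overrides / len(denials)

def flag(rate, low=0.15, high=0.25):
    if rate is None:
        return "no_data"
    if rate < low:
        return "possible_rubber_stamping"
    if rate > high:
        return "model_quality_concern"
    return "within_defensible_band"

reviews = ([{"ai_recommendation": "deny", "final": "deny"}] * 8 +
           [{"ai_recommendation": "deny", "final": "approve"}] * 2)
print(override_rate(reviews))        # 0.2
print(flag(override_rate(reviews)))  # within_defensible_band
```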

What does CMS-0057-F require for AI in prior authorization by 2027?

CMS-0057-F unfolds in three phases. Phase 1 (January 1, 2026, now in effect): MA plans must process expedited PA requests within 72 hours and standard requests within 7 calendar days. Plans cannot reopen previously approved inpatient admissions except for fraud or clear error. This operational change affects AI-assisted workflows because models optimized for throughput now face hard turnaround deadlines that may conflict with human review requirements.
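The Phase 1 clocks themselves are simple to encode; the operational challenge is everything upstream of them. A minimal sketch, simplifying timezone handling and the exact clock-start rules:

```python
# Sketch: CMS-0057-F Phase 1 decision deadlines (72-hour expedited,
# 7-calendar-day standard) from a PA request timestamp. Clock-start and
# timezone rules are deliberately simplified assumptions here.
from datetime import datetime, timedelta

def decision_deadline(received: datetime, expedited: bool) -> datetime:
    return received + (timedelta(hours=72) if expedited else timedelta(days=7))

received = datetime(2026, 3, 2, 9, 30)
print(decision_deadline(received, expedited=True))   # 2026-03-05 09:30:00
print(decision_deadline(received, expedited=False))  # 2026-03-09 09:30:00
```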

Phase 2 (March 31, 2026, the current deadline): Plans must publicly report 8 PA metrics at the contract level, including approval and denial rates, average turnaround times, and appeal overturn rates. This reporting makes your AI's denial patterns visible to regulators, plaintiff attorneys, the media, and competitors. If your denial rate is significantly above the MA average (15.7% as of 2025 data), expect scrutiny.
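A sketch of the contract-level rollup for three of the eight metrics (denial rate, average turnaround, appeal overturn rate). The record fields are illustrative assumptions; a production pipeline would compute all eight metrics from your adjudication system of record.

```python
# Hypothetical PA metrics rollup for CMS-0057-F Phase 2 public reporting.
def pa_metrics(requests):
    total = len(requests)
    denied = [r for r in requests if r["decision"] == "deny"]
    appealed = [r for r in denied if r.get("appealed")]
    overturned = [r for r in appealed if r.get("overturned")]
    return {
        "denial_rate": len(denied) / total,
        "avg_turnaround_hours": sum(r["turnaround_hours"] for r in requests) / total,
        "appeal_overturn_rate": (len(overturned) / len(appealed)) if appealed else None,
    }

requests = [
    {"decision": "approve", "turnaround_hours": 40},
    {"decision": "approve", "turnaround_hours": 60},
    {"decision": "deny", "turnaround_hours": 70, "appealed": True, "overturned": True},
    {"decision": "deny", "turnaround_hours": 30, "appealed": True, "overturned": False},
]
print(pa_metrics(requests))
# denial_rate 0.5, avg_turnaround_hours 50.0, appeal_overturn_rate 0.5
```

Running this rollup internally, ahead of each reporting deadline, means you see what regulators and plaintiff attorneys will see before they do.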

Phase 3 (January 1, 2027): Plans must implement HL7 FHIR-based Prior Authorization APIs, specifically Clinical Decision Rules (CRD), Documentation Templates and Rules (DTR), and Prior Authorization Support (PAS). This is a significant IT investment. The FHIR mandate effectively creates a standardized electronic record of every PA transaction, making your AI decision pipeline more transparent and auditable by design.

Plans that build their compliance architecture now, rather than scrambling in Q3 2026, can turn this mandate into a governance advantage. CMS suspended certain transparency requirements (health equity expertise on UM committees, plan-level metric breakdowns) in June 2025, but the core reporting and API mandates remain.

How do we set up an AI governance committee for a health plan?

The governance committee must bridge three domains that rarely talk to each other inside an MAO: clinical operations (who understands medical necessity criteria and CMS coverage guidelines), technology (who understands the AI models, their training data, and their failure modes), and legal/compliance (who understands the litigation and regulatory exposure).

We recommend a 7-9 person committee with defined roles: a Chief Medical Officer or VP of Clinical Operations as chair, a data science lead who can explain model behavior in plain language, a compliance officer tracking CMS and state regulatory requirements, legal counsel with health insurance litigation experience, a member services representative who sees the downstream impact of denial decisions, and 2-3 rotating clinical reviewers who interact with the AI daily.

The committee should meet monthly with a standing agenda: review AI decision metrics (denial rates by segment, override rates, appeal outcomes), assess any model changes or retraining events, evaluate new regulatory requirements, and triage any flagged incidents.

What makes a governance committee effective versus performative is authority. The committee needs a documented mandate to halt AI deployments, require retraining, or mandate human review for specific decision categories. If the committee can only recommend but not enforce, it exists for optics. After the Lokken case, a committee with enforcement authority is a litigation defense asset. One without it is a liability because it demonstrates awareness of risks without action.

What is the real cost of AI denial litigation for a Medicare Advantage plan?

The cost model has four layers. Direct litigation costs for a class action of Lokken's scope run $5-15M in legal fees over 3-5 years, depending on whether the case settles or goes to trial. That figure does not include potential damages, which in a class of millions of Medicare beneficiaries could reach hundreds of millions.

Regulatory remediation costs follow litigation. CMS can impose civil monetary penalties, require corrective action plans, and in extreme cases suspend enrollment. The average corrective action plan implementation costs MAOs $2-8M in technology, process redesign, and independent monitoring.

Operational disruption is the hidden cost. The Lokken discovery order required UnitedHealth to produce internal AI documents, diverting engineering and legal teams from operational work. For a mid-size MAO (500K-2M members), comparable discovery compliance would consume 6-12 months of a data science team's capacity.

Reputational damage affects Star Ratings, member retention, and broker relationships. MA plans compete on Stars; a public AI governance failure that generates media coverage depresses member satisfaction scores (CAHPS), which flow into Star calculations. A one-star drop costs approximately $500 per member per year in bonus payments. For a plan with 1M members, that is $500M annually. The business case for governance is straightforward: a comprehensive algorithmic audit and compliance architecture costs a fraction of any single component of litigation exposure.

Technical Research

Our analysis of algorithmic governance in healthcare insurance, including the full nH Predict case study and the regulatory compliance framework.

The Governance Frontier: Algorithmic Integrity, Enterprise Liability, and the Transition from Predictive Wrappers to Deep AI Solutions

Technical deep-dive on the nH Predict failure, causal AI alternatives, FDA credibility framework mapping, and the NIST AI RMF operationalization for healthcare payers.

Your AI's Denial Patterns Are Now Public Record

CMS-0057-F Phase 2 requires public reporting of PA metrics as of March 31, 2026. Regulators, plaintiff attorneys, and the media can see your numbers.

A corrective action plan after a CMS audit costs $2-8M. A class action defense runs $5-15M before damages. A comprehensive algorithmic audit and governance architecture costs less than either and prevents both.

Algorithmic Audit & Risk Assessment

  • ✓ Decision pathway mapping for UM AI models
  • ✓ SHAP-based feature attribution and disparity analysis
  • ✓ Vendor transparency assessment (discovery readiness)
  • ✓ Risk-ranked exposure report with remediation priorities

Governance Architecture & Build

  • ✓ Explainability middleware for Facets/QNXT/HealthEdge
  • ✓ CMS-0057-F compliance infrastructure (metrics + FHIR APIs)
  • ✓ Tamper-evident decision logging and audit trails
  • ✓ Governance committee charter and operationalization