Why Companies Lose AI Liability Cases

In The AI Liability Squeeze, we covered the insurance industry’s new AI exclusions and what they mean for your business. The court rulings now coming down explain why insurers are pricing the risk this way. Companies are losing AI liability cases because their leaders couldn’t manage the boundary where the AI’s job ends and a human’s accountability begins. Having a governance policy statement or slide deck is not enough. The only protection is a documented, enforced boundary that holds up when lawyers ask hard questions under oath.

We came to this conclusion after looking at many recent cases, but we’ll focus on three of them here.


The Pattern

Kisting-Leung v. Cigna (E.D. Cal., March 2025) exposed what happens when “human oversight” becomes a theatrical performance. Cigna’s PxDx algorithm denied over 300,000 patient claims in a two-month period. Company physicians spent an average of 1.2 seconds “reviewing” each denial, apparently without opening patient files. The court was unimpressed by the argument that a physician technically pressed the button. The plan language required a “medical director” to make case decisions, and the court ruled that reading it to let an algorithm do the analysis, even if a human director technically “pushes the button,” conflicts with the plain language of the plan.

1.2 seconds, no file review. That’s a rubber stamp with a medical degree.

Estate of Lokken v. UnitedHealth Group (D. Minn., Feb. 2025) followed a similar script. UHG’s nH Predict AI tool determined Medicare Advantage coverage, allegedly overriding the judgment of treating physicians who had examined patients in person. Employees who deviated from the AI’s recommendations were allegedly disciplined and terminated. The result: 90% of AI-generated denials were overturned on appeal. The human review layer hadn’t just been weakened; it had collapsed. The organization built a system that punished the very oversight it claimed to prioritize.

Then came Pieces Technologies (TX AG settlement, Sept. 2024), the first state attorney general enforcement action against a healthcare AI company. Pieces claimed its clinical AI tool had a hallucination rate of less than 0.001%. Apparently based on that assurance, hospitals deployed it on live patient data with no verification infrastructure to test the claim. The Texas AG’s office was blunt: “AI companies offering products used in high-risk settings owe it to the public and to their clients to be transparent about their risks, limitations, and appropriate use.” It wasn’t enough to trust the vendor’s numbers. The hospitals needed an independent process for checking whether the AI was right.
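To make “independent verification” concrete, here is a rough back-of-the-envelope sketch in Python of how much checking it takes to support a claim like “less than 0.001%.” It is our illustration, not anything from the Pieces settlement. The function name `samples_needed` is hypothetical, and the sketch assumes each reviewed output is an independent pass/fail check against ground truth in which no errors are found.

```python
import math


def samples_needed(claimed_rate: float, confidence: float = 0.95) -> int:
    """Error-free samples needed before zero observed errors supports
    a true error rate below `claimed_rate` at the given confidence.

    Assumption (ours): samples are independent pass/fail checks.
    If the true rate were `claimed_rate`, the chance of seeing zero
    errors in n samples is (1 - claimed_rate) ** n. We solve for the
    smallest n that pushes that chance down to (1 - confidence).
    """
    if not 0.0 < claimed_rate < 1.0:
        raise ValueError("claimed_rate must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # log1p(-x) computes log(1 - x) accurately for very small x.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-claimed_rate))


# 0.001% expressed as a fraction is 1e-5.
required = samples_needed(1e-5)
print(f"Error-free checked outputs needed: {required:,}")
```

Under those assumptions, a claimed rate of 0.001% needs on the order of 300,000 independently checked outputs with zero errors before the data supports it at 95% confidence. A figure that small is expensive to verify, which is a good reason to ask how it was measured.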

In each of these cases, humans were nominally involved. But none of these organizations could produce documentation showing who was responsible for reviewing AI outputs, when escalation was required, or what “good enough” verification meant. The humans were present. The boundaries were absent.

Insurers see the same problem the courts do. They price risk based on predictability, and when a company can’t articulate its human-AI boundaries, insurers can’t model the risk. When they can’t model the risk, one of two things happens: blanket exclusions, or premium increases that make the coverage meaningless anyway.

In the first post, we highlighted WR Berkley’s exclusion clause covering “inadequate or deficient policies, practices, procedures, or training relating to AI.” Read that clause again in light of the cases above. It’s the insurer essentially saying, “if you can’t define the boundary, we won’t cover either side of it.”

The trend has accelerated. Munich Re has made its underwriting position explicit. They need to understand how AI was implemented before they’ll write a policy. Undocumented AI governance is unmanaged operational risk, and insurers price it accordingly.

Willis Towers Watson highlighted a related concept in December 2025: scholars have described the human in the loop as a “liability sponge” who absorbs the blame but has little to no ability to mitigate risks or prevent harms. This is exactly what happened at Cigna and UHG. Physicians and employees were positioned as oversight, but the system gave them neither the time nor the authority to oversee anything. From an insurance perspective, nominal oversight fails for the same reason it fails in court. It creates liability without creating control.

The industry is drawing a clear line between “AI present” and “AI governed.” Underwriters want evidence — documented processes, defined roles, enforced escalation paths. Assurances and org charts don’t count. Only evidence of human review and verification counts.

The Standard

If the losing cases show what breaks, a few recent developments show what holds.

Walters v. OpenAI (May 2025, Gwinnett County Superior Court, GA) is the clearest example. ChatGPT hallucinated that a radio host had embezzled funds from a nonprofit. The host sued. The court granted summary judgment for OpenAI, crediting three specific elements: extensive on-screen disclaimers warning users that outputs may be inaccurate, terms of use explicitly disclosing hallucination risk, and expert testimony that OpenAI “leads the AI industry in attempting to reduce and avoid mistaken output.”

This is the first decided case we know of where documented AI safety practices and disclosure directly won a legal defense. OpenAI won by demonstrating what it had done to define and manage the boundary between AI output and user reliance.

The DOJ’s updated Evaluation of Corporate Compliance Programs (Sept. 2024) sends the same signal from the enforcement side. Federal prosecutors must now evaluate a company’s approach to AI governance when deciding charges and penalties, including whether the company has conducted risk assessments for AI use, what governance structures and controls it has implemented, and how it trains employees on new technologies. Documented governance is now explicitly a mitigating factor in federal prosecution decisions.

The insurance market is responding in kind. Armilla AI and A-LIGN launched an ISO 42001 certification program in December 2025, under which organizations that complete certification “gain preferential access to Armilla’s affirmative AI insurance, with coverage tailored to their risk profile and validated controls.” The market is no longer just penalizing the absence of governance; it is rewarding its presence with preferential access to coverage.

The thread is consistent. What wins is artifacts — written processes, signed-off escalation paths, training records — that prove the boundary between AI and human action has been documented and enforced.

Figure: Loss vs. win comparison. Same technology, opposite outcomes; the difference is documented human oversight.

Courts, regulators, and insurers are converging on the same expectation. The CFPB stated it plainly: “There is no ‘advanced technology’ exception to Federal consumer financial laws.” The tool is new; the accountability isn’t.

A February 2026 article in the Harvard Journal of Law and Technology drew the distinction between “meaningful oversight” and “warm body roles where an operator lacks influence.” The EU AI Act’s Article 14 codified the same principle, requiring that human oversight be substantive and addressing “automation bias” — the documented tendency for humans to rubber-stamp AI output rather than genuinely review it.

The standard is the same across every domain: if you use AI, you must be able to show a documented, enforced process for human oversight. A human in the room and a name on an org chart aren’t proof of anything. The proof is a process with defined triggers, clear responsibilities, and a paper trail.
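As a rough sketch of what such a paper trail could look like in software, the Python below records who reviewed an AI output, against which criteria, and for how long, and flags reviews that should escalate. Every name and threshold here (`ReviewRecord`, `MIN_REVIEW_SECONDS`, `ESCALATION_CONFIDENCE`, the JSON Lines log file) is a hypothetical choice of ours for illustration, not a legal standard or anything drawn from the cases above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

# Hypothetical thresholds; a real organization would set and justify its own.
MIN_REVIEW_SECONDS = 60.0     # shorter reviews are treated as rubber stamps
ESCALATION_CONFIDENCE = 0.90  # lower model confidence triggers escalation


@dataclass(frozen=True)
class ReviewRecord:
    """One human review of one AI output: who, against what, and when."""
    output_id: str
    reviewer: str               # a named person, not a job title
    criteria: tuple[str, ...]   # the checks the reviewer actually applied
    seconds_spent: float
    model_confidence: float
    decision: str               # e.g. "approve", "deny", "override"
    reviewed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def needs_escalation(record: ReviewRecord) -> bool:
    """Defined triggers: low confidence, a too-fast review, or no criteria."""
    return (
        record.model_confidence < ESCALATION_CONFIDENCE
        or record.seconds_spent < MIN_REVIEW_SECONDS
        or not record.criteria
    )


def log_review(record: ReviewRecord, path: str) -> dict:
    """Append the review to a JSON Lines file and return the logged entry."""
    entry = asdict(record)
    entry["escalated"] = needs_escalation(record)
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(entry) + "\n")
    return entry
```

With these illustrative thresholds, a 1.2-second review like the ones described in the Cigna case would be flagged for escalation instead of passing silently, and the log would show it. The design choice that matters is that the record is written at review time, so it exists before anyone asks for it.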

Every organization deploying AI in a consequential decision should be able to answer three questions on demand: who reviewed this output, against what criteria, and where is the record? If the answer is a job title and a verbal assurance, the boundary doesn’t exist yet, and a court, a regulator, or an insurer will eventually point that out. Reach out if you want help building that boundary before you need to defend it.
