A radar satellite image with two near-identical dark patches: floodwater and a mountain radar shadow

When a $2M Flood Payout Rests on Telling Water From a Shadow

Ashutosh Singhal, May 19, 2026

The first time it really landed for me, I was looking at a single grayscale image on a second monitor at close to midnight. It was a Sentinel-1 radar frame — the free European satellite data everyone in flood work starts with — and somewhere in that frame was a flood. The radar paints smooth water as dark, almost black, because a flat water surface bounces the signal away from the satellite instead of back to it. So I was looking for the dark patch.

The problem was that there were two dark patches. One was floodwater. The other was the shadow a mountain ridge casts in radar geometry, where the terrain blocks the beam and nothing comes back. On the screen, at that hour, they looked like twins. And I remember the specific, uncomfortable thought: somewhere there is an insurer who is going to pay or not pay a claim based on which of these I call water.

That is the whole problem of satellite flood intelligence for parametric insurance, compressed into one image. A parametric policy doesn't wait for an adjuster to inspect a flooded warehouse. It pays automatically when a measured trigger fires — a satellite says the flood reached a certain extent, and the money moves. It's a beautiful idea for a world where catastrophe is accelerating. It also means the satellite's classification is the claim. There is no human standing in the water to overrule it.

A parametric trigger turns one reading of one satellite pixel into a wire transfer.

And the satellite gets it wrong in both directions.

The physics nobody wants on the slide deck

Here is the part that took me longest to accept, because I came in assuming it was a software problem I could out-engineer. Cloud shadows and floodwater are not merely similar in optical satellite imagery; in the bands that matter they are effectively indistinguishable. Both return very little near-infrared and shortwave-infrared light: water because it absorbs those wavelengths, shadow because little direct sunlight reaches the ground in the first place. Both have soft, amorphous boundaries. Both flatten the texture of whatever is underneath them into a uniform smear.

The standard water-detection indices — NDWI and MNDWI, the math every flood pipeline leans on — flag both as "water-like," because the underlying physics really is the same: reduced reflectance across the infrared bands. A single optical frame captured during a storm cannot reliably tell a shadow from a flood, and no amount of clever thresholding fixes a signal that is genuinely ambiguous at the source.
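To make that ambiguity concrete, here is a minimal sketch of the two indices in their standard form: NDWI as (green - NIR) / (green + NIR) and MNDWI as (green - SWIR) / (green + SWIR). The reflectance values and the zero threshold are hypothetical, picked only to illustrate the point; real pipelines work on calibrated rasters and tune the cut-off per scene.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), the form shared by NDWI and MNDWI."""
    return (a - b) / (a + b)


def ndwi(green: float, nir: float) -> float:
    """NDWI: green against near-infrared."""
    return normalized_difference(green, nir)


def mndwi(green: float, swir: float) -> float:
    """MNDWI: green against shortwave-infrared."""
    return normalized_difference(green, swir)


# Hypothetical surface reflectances as (green, NIR, SWIR). Illustration only.
PIXELS = {
    "floodwater": (0.06, 0.02, 0.01),
    "cloud_shadow": (0.04, 0.03, 0.02),
    "dry_soil": (0.12, 0.25, 0.30),
}

WATER_THRESHOLD = 0.0  # assumed cut-off; operational values are tuned per scene


def water_like(green: float, nir: float, swir: float,
               threshold: float = WATER_THRESHOLD) -> bool:
    """True when both indices exceed the threshold."""
    return ndwi(green, nir) > threshold and mndwi(green, swir) > threshold


for name, (green, nir, swir) in PIXELS.items():
    print(f"{name:13s} NDWI={ndwi(green, nir):+.2f} "
          f"MNDWI={mndwi(green, swir):+.2f} water_like={water_like(green, nir, swir)}")
```

With these made-up numbers the shadow pixel clears the threshold just as the flood pixel does. Raising the threshold would drop the shadow here, but only because I chose the numbers; in a real storm scene the two distributions overlap.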

I spent a month convinced the answer was a better model. We had the deep-learning literature on our side, and it's genuinely impressive — a modified DeepLabV3 network fed multi-polarization radar and terrain data hit 98.4% overall accuracy in published research; a U-Net with transformer attention reported 95.5%. Those numbers are real. I quoted them in an early pitch. What I didn't say out loud, because I didn't yet understand it, is that those accuracies are measured on curated benchmark scenes, and that the disaster training sets they learn from are deliberately tilted.

They're tilted for a humane reason. Missing a real flood can kill people, so the labels and loss functions penalize a missed flood far more harshly than a false alarm. The result is a generation of classifiers that are systematically trigger-happy — when the signal is marginal, they lean toward "flood." That is exactly the right bias for a humanitarian early-warning system and exactly the wrong bias for a $2M payout authorization, where a false alarm is not a rounding error. It's a wire transfer.
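One way to see why the same classifier is right for one job and wrong for the other is the textbook expected-cost decision rule: call "flood" when the probability exceeds cost_false_alarm / (cost_false_alarm + cost_missed_flood). The sketch below uses that rule with hypothetical cost figures. It is not how any particular published model was trained, only an illustration of how the cost ratio moves the operating point.

```python
def flood_threshold(cost_false_alarm: float, cost_missed_flood: float) -> float:
    """Probability above which calling "flood" has the lower expected cost.

    Calling flood costs (1 - p) * cost_false_alarm in expectation; calling dry
    costs p * cost_missed_flood. The two are equal at the returned value of p.
    """
    if cost_false_alarm <= 0 or cost_missed_flood <= 0:
        raise ValueError("costs must be positive")
    return cost_false_alarm / (cost_false_alarm + cost_missed_flood)


# Hypothetical cost ratios, for illustration only.
early_warning = flood_threshold(cost_false_alarm=1, cost_missed_flood=9)
payout = flood_threshold(cost_false_alarm=2_000_000, cost_missed_flood=2_000_000)

marginal_pixel = 0.30  # assumed flood probability for an ambiguous dark patch

print(f"early-warning threshold: {early_warning:.2f}, flags flood: {marginal_pixel > early_warning}")
print(f"payout threshold:        {payout:.2f}, flags flood: {marginal_pixel > payout}")
```

The same marginal pixel is a flood under the early-warning costs and not a flood under the payout costs. Nothing about the image changed; only the price of being wrong did.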

The demo that flagged a mountain as flooded

We built the obvious thing first. A single-frame classifier, radar-led so it could see through clouds, tuned hard, demoed clean on the scenes we'd picked. I was proud of it. Then a colleague who actually understood radar geometry ran it over a mountainous test region, and it lit up a slope as flooded with high confidence. Not a marginal call — a confident one. It had found a radar shadow, the same kind of dark patch that had unsettled me on that midnight frame, and called it water.

I still think about the imaginary email that scene would have generated if it had been a live trigger. An actuary, three time zones away, looking at our automated determination that a hillside was under two meters of water, writing back the only question that matters: are you sure that's water and not a shadow? And our system having no answer beyond a confidence score it had no right to.

A confidence number is not evidence. It's an opinion with decimal places.

That was the failure that paid for everything after it. Because the lesson wasn't "tune the model harder." The lesson was that no single frame and no single sensor can carry the proof, and I had been trying to win an argument that the physics had already lost.

Radar is sold as the cure: it sees through clouds, after all, which optical sensors can't. But radar backscatter drops over smooth water and over terrain shadow alike, and in rough country layover and foreshortening distort the picture further. It has its own failure modes too: false positives over arid ground, across freeze-thaw cycles, and in industrial crops, and, worst of all, missed water in cities, where the double-bounce of the signal off buildings can mask standing water entirely. NASA's own global near-real-time flood product makes the point at planetary scale: its one-day composite has so many false positives that NASA refuses to publish it in its public Worldview viewer. Only the two- and three-day composites, the ones that use time rather than a single look, are considered operational. New reservoirs read as floods in those products for up to three years, until the permanent-water mask catches up.

The pattern was sitting there the whole time. The fix for false positives was never a sharper image. It was a second look, and a third — the same scene across time.

What Actually Separates a Flood From a Shadow?

A 3x3 temporal matrix showing how floodwater, cloud shadow and radar shadow each behave across before, during and after

What separates a flood from a shadow is not how it looks in one frame. It's how it behaves across many. A mountain's radar shadow is in the same place every single pass, year-round — it's geometry, not weather. Floodwater appears against a dry baseline and recedes. A cloud shadow is gone on the next clear day. The discriminating signal lives in the temporal stack, not the pixel.
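As a rough sketch of that idea, the function below classifies a pixel from its history of "dark" flags rather than from any single frame. The category names, the 0.9 persistence cut-off, and the rule that a one-pass dark patch counts as transient are all my assumptions for illustration; they are not the production logic, and a flood that drains between passes would also land in the transient bucket, which is exactly why a second sensor matters.

```python
from enum import Enum
from typing import Sequence


class Signature(Enum):
    NO_SIGNAL = "not dark at event time"
    PERSISTENT = "dark in the baseline too: terrain shadow or permanent water"
    TRANSIENT = "dark for one pass only: cloud shadow candidate"
    FLOOD_RECEDED = "appeared against a dry baseline, then receded"
    FLOOD_ONGOING = "appeared against a dry baseline, still dark"
    UNRESOLVED = "new dark patch, no follow-up pass yet"


def classify_history(baseline: Sequence[bool], event: Sequence[bool],
                     persistent_share: float = 0.9) -> Signature:
    """Classify one pixel from dark/not-dark flags ordered in time.

    baseline: flags from passes before the event.
    event: flags from the event pass onward (event[0] is the event pass).
    persistent_share: assumed share of dark baseline passes that marks geometry.
    """
    if not event or not event[0]:
        return Signature.NO_SIGNAL
    if baseline and sum(baseline) / len(baseline) >= persistent_share:
        return Signature.PERSISTENT
    if len(event) == 1:
        return Signature.UNRESOLVED

    # Count consecutive dark passes starting at the event pass.
    dark_run = 0
    for dark in event:
        if not dark:
            break
        dark_run += 1

    if dark_run == 1:
        return Signature.TRANSIENT
    if dark_run == len(event):
        return Signature.FLOOD_ONGOING
    return Signature.FLOOD_RECEDED


dry_baseline = [False] * 10
print(classify_history([True] * 10, [True, True, True]).name)
print(classify_history(dry_baseline, [True, False, False]).name)
print(classify_history(dry_baseline, [True, True, False]).name)
```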

So we stopped asking "is this pixel water?" and started asking "did this place change in a way only a flood explains, against everything we know about how it normally looks?" That reframing is the entire architecture of what we ended up building, and it's why we lean on multiple sensors rather than betting on one. A radar frame to see through the storm clouds. Optical frames before and after to establish the dry baseline and confirm the recession. A terrain model to rule out the geometry that fakes water in both. Fusing radar and multispectral optical has been shown to improve detection in cloud- and shadow-affected areas by more than 23% over single-sensor methods — not because the sensors are individually better, but because each one's failure mode is something the others can see through.

This is the layer we build at Veriprajna: a satellite flood verification system that separates shadows from water using temporal SAR-optical fusion, and produces a forensic-grade evidence trail for every trigger event — not a verdict, an argument you can hand to an actuary or a court.

The honest engineering caveat, the one I'd want a buyer to hear before signing anything: the free data has a clock. The restored two-satellite Sentinel-1 constellation gives a six-day revisit. A flood peak that crests and drains between passes can be missed by data that's free but slow. So part of the craft is knowing when free Sentinel data is enough and when an event demands tasking a commercial radar satellite — which runs somewhere between a thousand and five thousand dollars a scene depending on resolution and urgency. That tradeoff is a per-event judgment, not a subscription tier.
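A back-of-envelope way to frame that judgment: if passes are evenly spaced and the flood starts at a random point in the cycle, the chance that at least one pass lands while the water is still there is roughly the inundation duration divided by the revisit interval, capped at one. The sketch below encodes that simplified model. The 0.9 capture target is a hypothetical figure, and the real decision also weighs resolution, urgency, and cost.

```python
def capture_probability(inundation_days: float, revisit_days: float = 6.0) -> float:
    """Chance that at least one evenly spaced pass falls inside the flood window.

    Assumes a uniformly random flood onset relative to the pass schedule and
    water that is detectable for the whole duration. Both are simplifications.
    """
    if inundation_days <= 0 or revisit_days <= 0:
        raise ValueError("durations must be positive")
    return min(1.0, inundation_days / revisit_days)


def should_task_commercial(inundation_days: float, revisit_days: float = 6.0,
                           min_capture: float = 0.9) -> bool:
    """True when free data alone falls short of the assumed capture target."""
    return capture_probability(inundation_days, revisit_days) < min_capture


for days in (1.0, 3.0, 6.0, 12.0):
    p = capture_probability(days)
    print(f"{days:4.1f}-day flood: capture chance {p:.2f}, "
          f"task commercial: {should_task_commercial(days)}")
```

Under this model a flash flood that lasts a day is caught by a six-day revisit about one time in six, which is the arithmetic behind treating tasking as a per-event call.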

Why Not Just Buy the Flood Report?

We had a real internal argument about whether any of this was worth building. The case against was simple: companies already sell flood reports. Why not resell those?

It's a fair challenge, and the vendors are genuinely good. ICEYE owns the largest commercial radar constellation in the world — 60-plus satellites — and during Hurricane Helene it pushed 150-plus images through the storm clouds and mapped 80,000-plus buildings in Florida. That's extraordinary. But you are buying their product, not building an intelligence capability, and per-event forensic verification at portfolio scale runs into their pricing fast. Floodbase, backed by Munich Re and partnered with the Capella radar constellation, has built a genuine end-to-end parametric solution — trigger design, pricing, payout certification. But you get their methodology, tuned to their sensor partnerships, not a system shaped to your specific book of business.

And the free gold standard has its own ceiling. Europe's Copernicus Emergency Management Service is the system a continent trusts for disaster mapping. Yet its rapid-mapping service tier runs 8am to 8pm Brussels time, on working days only. I keep coming back to Valencia, October 2024 — a year's worth of rain in eight hours, more than 227 dead. Copernicus took three to four days to publish the flood extent. When it came, it confirmed 15,633 hectares and roughly 190,000 people affected. The delay wasn't a bug. The critical first 24 hours fell overnight and into the evening, outside the service window. The continent's flood-intelligence system was, functionally, closed during the hours it was needed most.

So the argument resolved itself. The vendors are excellent at what they're built for. None of them is built to be a vendor-neutral verification layer that fuses whichever sensors the event actually requires and hands you defensible evidence for your trigger. That gap was the company.

The failure that hurts more: the flood that doesn't pay

Side-by-side of the two parametric failure modes: false positive draining reserves vs false negative leaving a real flood unpaid

For a long time I thought of this work as a false-positive problem — keep the system from paying for floods that didn't happen. Then I read about Nagaland.

A parametric flood scheme in Nagaland, India failed to trigger despite heavy rainfall and confirmed flooding on the ground. The satellite-derived threshold had been set too high relative to ground reality. People were flooded. The policy was supposed to pay. It didn't. That is the other failure mode, and in human terms it's the worse one.

This is what the industry calls basis risk — the gap between what the trigger measures and what the policyholder actually experiences. A false positive drains an insurer's reserves and invites fraud. A false negative destroys the policyholder's trust and breeds litigation. Both failures attack the same thing: the credibility of the parametric model itself — which makes it harder to sell and harder for regulators to approve. You cannot solve one direction by leaning harder the other way. Tightening the threshold to stop false payouts is exactly how you manufacture a Nagaland.
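A toy example shows why the threshold alone cannot win. The events below are invented: each pairs a satellite-measured flooded fraction with whether the site was actually flooded on the ground. Because one real flood measured lower than one non-flood, every threshold gets something wrong, and moving it only trades one kind of error for the other.

```python
from typing import List, Tuple

# (satellite-measured flooded fraction, flooded on the ground?). Hypothetical data.
EVENTS: List[Tuple[float, bool]] = [
    (0.05, False),
    (0.12, False),
    (0.18, True),   # a real flood the satellite measured low
    (0.22, False),  # a shadow-like false signal the satellite measured high
    (0.30, True),
    (0.45, True),
]


def count_errors(threshold: float,
                 events: List[Tuple[float, bool]] = EVENTS) -> Tuple[int, int]:
    """Return (false_payouts, missed_payouts) for a trigger at this threshold."""
    false_payouts = sum(1 for measured, flooded in events
                        if measured >= threshold and not flooded)
    missed_payouts = sum(1 for measured, flooded in events
                         if measured < threshold and flooded)
    return false_payouts, missed_payouts


for threshold in (0.10, 0.20, 0.25):
    false_payouts, missed_payouts = count_errors(threshold)
    print(f"threshold {threshold:.2f}: {false_payouts} false payout(s), "
          f"{missed_payouts} missed payout(s)")
```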

You don't fix basis risk by choosing which way to be wrong. You fix it by being able to prove what was actually under water.

That's why the output of our system isn't a yes/no. It's a per-pixel confidence map with an audit trail — what was observed, by which sensors, at what times, with what certainty, and what was ruled out. Evidence that survives actuarial review and legal scrutiny, because in this business the trigger decision is eventually going to be argued by someone who wasn't there.
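To give a feel for what such a record might hold, here is a hypothetical shape for one pixel's evidence entry. The field names, sensor labels, timestamps, and values are illustrative assumptions, not our actual schema; the point is only that each confidence value travels with the observations and the exclusions behind it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Observation:
    sensor: str        # which instrument made the observation
    acquired_utc: str  # ISO 8601 acquisition time
    finding: str       # what was seen at this pixel


@dataclass(frozen=True)
class PixelEvidence:
    pixel_id: str
    flood_confidence: float               # 0.0 to 1.0
    observations: Tuple[Observation, ...]
    ruled_out: Tuple[str, ...]            # alternative explanations excluded

    def __post_init__(self) -> None:
        if not 0.0 <= self.flood_confidence <= 1.0:
            raise ValueError("flood_confidence must be between 0 and 1")

    def to_json(self) -> str:
        """Serialize the full record so it can be archived with the trigger."""
        return json.dumps(asdict(self), indent=2)


# Entirely made-up example values.
record = PixelEvidence(
    pixel_id="tile-0001/row-120/col-344",
    flood_confidence=0.93,
    observations=(
        Observation("optical", "2026-01-02T10:15:00Z", "dry baseline"),
        Observation("radar", "2026-01-09T05:40:00Z", "low backscatter"),
        Observation("optical", "2026-01-14T10:15:00Z", "water receded"),
    ),
    ruled_out=("terrain shadow (slope model)", "permanent water (baseline)"),
)

print(record.to_json())
```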

Why Is Parametric Flood Cover Suddenly Everyone's Problem?

The reason any of this matters more each year is brutally simple: the water is winning. Global insured natural-catastrophe losses hit $129B in 2025. And the protection gap — the share of catastrophe losses nobody insured — sits around 52 to 56% globally. In 2024, climate disasters caused $368B in losses, with $223B of that uninsured, a 61% gap. In the US, annual flood losses to single-family homes run $24.4B, and roughly 70% of that is uninsured.

Parametric insurance exists to close that gap, and it's growing fast — a $21-24B market in 2026, projected toward $38.7B by 2030 — precisely because it can pay in days instead of months and reach people traditional insurance never touches. In March 2026, a parametric flood scheme designed by a consortium including AXA Climate, Swiss Re and ICEYE went live in Lagos, Nigeria, covering four million people. Four million people whose claims will be decided by a satellite's read of where the water was.

That scale is the argument for getting the verification right. When the trigger covers four million people, "probably flooded" is not a defensible standard. The thing standing between a parametric program and a reputational catastrophe is whether someone can prove, frame by frame, that the trigger fired on water and not on a shadow.

People ask me whether AI weather prediction makes the satellite layer redundant, now that predictive flood models are getting genuinely good. It doesn't. A forecast tells you what might happen; a parametric payout demands proof of what did. Others ask whether ground sensors, the physical depth gauges some insurers use to cut basis risk, make satellites unnecessary. They help enormously where they exist, but you cannot put a gauge on every kilometer of floodplain on Earth. Satellites are how you see the places no one instrumented, which, during a catastrophe, is most places.

The midnight frame with its two dark patches is still the truest picture of this work I have. One was water. One was a shadow. The entire discipline comes down to building a system that can tell you which — and then show its work to the person about to move the money. If you're designing or underwriting a parametric flood program and that distinction keeps you up at night, this is the problem we build for.

The water isn't going to get easier to read. The proof is the product.
