Split illustration: a flattering catalog render of jeans beside the same jeans as a 3D measurement mesh.

Artificial Intelligence · Fashion · E-Commerce

Returns Cost U.S. Retail $849 Billion. Most of the AI Trying to Fix Fashion Solves the Wrong Problem.

Ashutosh Singhal · May 17, 2026 · 12 min read

A few years ago I had a body-measurement demo that worked beautifully. A shopper stands against a wall, takes two photos with a phone, and a few seconds later we hand back dozens of body measurements accurate to within a centimeter or two. In our office, against good light and a blank wall, it was uncanny. I remember watching the numbers come back and thinking we had it.

Then we ran it in real bedrooms. Cluttered backgrounds, a lamp off to one side, a shopper in a baggy hoodie, the phone held at a tired angle after a long day. The same pipeline that hit one-to-two centimeters in the lab drifted to three-to-five centimeters in the wild. For a sizing system, that gap is the whole ballgame. Three centimeters at the waist is the difference between a size that fits and a size that comes back.

That month — watching demo-grade accuracy fall apart on contact with actual customers — is where I stopped believing in any single magic fix for fashion returns. It taught me the thing this entire essay is about: the fit problem is mechanical, not visual, and it does not have one solution. That's also the premise behind the AI fit prediction system we now build at Veriprajna.

The Most Expensive Number in Retail

Start with the number that should keep every e-commerce operator up at night: U.S. retail returns totaled $849.9 billion in 2025 (National Retail Federation). Fashion is the worst category by a wide margin — apparel return rates run 26 to 40 percent, against an e-commerce average around 20.8 percent.

And it is almost all the same failure. Sizing, fit, and color drive about 45 percent of all returns, and within apparel specifically, 53 to 70 percent of returns are fit-related. Seventy percent of online fashion shoppers say the inability to try something on is their biggest worry before buying. They are right to worry.

Fashion e-commerce loses more money to returns than to marketing, logistics, and fraud combined. The root cause is almost never the customer's mistake. It's that the garment didn't fit.

The economics get worse the closer you look. Processing a single return costs between $10 and $65, and in some cases consumes up to 65 percent of the item's original price. So a brand can "sell" a $40 top, eat the outbound shipping, eat the return shipping, eat the inspection and repackaging, and end up underwater on a transaction it counted as revenue. Return fraud alone costs retailers over $100 billion a year on top of all that.
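A back-of-the-envelope sketch in Python shows how quickly this goes underwater. Only the $40 price and the $10 floor on processing cost come from the figures above; the gross margin, the shipping costs, and the assumption that a returned item is restocked at full value are illustrative guesses, not data from any brand.

```python
def contribution_if_kept(price, gross_margin, outbound_ship):
    """Contribution from an order the shopper keeps."""
    return price * gross_margin - outbound_ship


def contribution_if_returned(outbound_ship, return_ship, processing):
    """Contribution from an order that is refunded in full.

    Assumes the item is restocked at full value, so the only loss is
    the cost of shipping it twice and handling it on the way back.
    """
    return -(outbound_ship + return_ship + processing)


def expected_contribution(price, gross_margin, outbound_ship,
                          return_ship, processing, return_rate):
    """Average contribution per order at a given return rate."""
    kept = contribution_if_kept(price, gross_margin, outbound_ship)
    returned = contribution_if_returned(outbound_ship, return_ship, processing)
    return (1 - return_rate) * kept + return_rate * returned


# Hypothetical $40 top: 50% gross margin, $5 shipping each way,
# $10 processing (the low end of the range quoted above).
top = dict(price=40.0, gross_margin=0.5, outbound_ship=5.0,
           return_ship=5.0, processing=10.0)

for rate in (0.0, 0.26, 0.40):
    value = expected_contribution(**top, return_rate=rate)
    print(f"return rate {rate:.0%}: ${value:.2f} per order")
```

Under these assumptions a kept order contributes $15 and a returned one loses $20, so at a 40 percent return rate the average order nets roughly a dollar. Push the processing cost toward the top of the range and the average goes negative well before that.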

Most teams I talk to are trying to solve this with a size chart and, lately, a slick virtual try-on. I'll tell you why both of those are aiming at the wrong target.

Why Couldn't a Size Chart Ever Solve This?

A size chart asks four one-dimensional numbers — bust, waist, hip, inseam — to describe a three-dimensional surface. That's the original sin. It was never going to be enough information.

It gets worse because the industry has no standardized grading system. A "Medium" in one brand maps to a completely different body geometry than a "Medium" in another. I've seen the data: a women's size 6 waistline can vary by up to five inches across brands. Ninety-one percent of shoppers need a different size depending on which brand they're buying. Vanity sizing makes it deliberately worse — brands quietly shift their labels to flatter people, which means cross-brand comparison is essentially meaningless.

So shoppers do the only rational thing. They bracket. Sixty-three percent of online shoppers now order multiple sizes of the same item intending to return all but one; among Gen Z shoppers the figure is 51 percent. Sizing uncertainty alone drives about 42 percent of that behavior.

Bracketing is a quiet catastrophe for a retailer. It doubles your outbound shipping, locks up inventory during the return cycle, and guarantees that at least half of what you ship is coming straight back. The customer isn't being difficult. They've correctly concluded that your size chart can't be trusted, so they've turned their own living room into a fitting room at your expense.

The Trap I Almost Walked Into

After our measurement pipeline stumbled, the advice I kept getting was some version of "just bolt on a virtual try-on." Generative AI had made it cheap and gorgeous — Google rolled out Shopping virtual try-on across the US, UK, and India; Zalando launched a pilot with Levi's across fourteen European markets. The images these systems produce are genuinely beautiful. For a while I was tempted to believe a beautiful image was the answer.

It isn't, and understanding why is the hinge of this whole business.

Generative virtual try-on works by predicting statistically likely pixels — it paints a photorealistic picture of a garment on your body. But it has no idea whether it's showing you a size M or a size L. It cannot tell you the hip is two centimeters too narrow for the fabric's stretch limit. The diffusion model doesn't know whether it's draping four-way-stretch ponte or 14-ounce raw selvedge denim with zero give. It's a rendering, not a measurement.

A virtual try-on makes the guess look convincing. It doesn't make the guess correct.

And the data backs this up uncomfortably. Generative try-on demonstrably lifts conversion and engagement — people love playing with it. But there is no published evidence that GenAI-only try-on actually reduces fit-related returns. Worse, these models carry a slimming bias: they tend to render the garment more flatteringly than reality, which can increase the gap between expectation and the box that arrives. You can raise conversion and raise returns at the same time. I've watched teams celebrate the first number and never connect it to the second.

The Pair of Jeans That Explains Everything

Diagram of seated jeans on a 3D mesh: red strain at the thigh, calm waist, with fit callouts.

Let me make this concrete with the case that finally clarified it for me — denim, the highest-return category in all of apparel, at 20 to 25 percent.

Picture a shopper buying premium jeans. She measures her waist, matches the size chart perfectly at 71 centimeters, and orders a size 28. The jeans arrive. The waist fits exactly as promised. But the thigh binds the moment she sits down, because the 14-ounce raw denim has no stretch and the size chart never had a thigh measurement at all. The generative try-on, meanwhile, had shown her a flattering image of jeans that looked great standing still.

Neither tool captured the actual physics: this fabric's tensile stiffness means it cannot bridge the difference between standing hip geometry and seated hip geometry. The fabric was always going to lose that fight, and no amount of better photography would have warned her.

This is what I mean by mechanical. A physics-based approach simulates the garment as a material object. It knows the fabric's bending rigidity (how it drapes), its tensile stiffness (how it stretches), and its shear behavior (how it conforms to a curve). It drapes the digital pattern onto a 3D body mesh and computes the strain at every point. High strain at the thigh shows up as a red zone on the mesh before a single unit ships. That's not a guess based on what other shoppers experienced. It's a calculation on the actual fabric and the actual body.
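A deliberately crude sketch of that calculation, in Python: it treats the thigh as a single hoop and compares the stretch the fabric would need against the stretch it can give. This is a one-dimensional stand-in for illustration, not the finite-element drape a real simulator runs, and every number in it (the body and garment circumferences, the stretch limits) is a hypothetical value, not a measured fabric property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fabric:
    name: str
    max_strain: float  # hypothetical comfortable stretch limit, as a fraction


def required_strain(body_cm: float, garment_cm: float) -> float:
    """Fractional stretch needed for the garment to go around the body.

    Returns 0.0 when the garment is larger than the body (there is ease).
    """
    return max(0.0, (body_cm - garment_cm) / garment_cm)


def strain_zone(body_cm: float, garment_cm: float, fabric: Fabric) -> str:
    """Classify one measurement point the way a heatmap would color it."""
    strain = required_strain(body_cm, garment_cm)
    if strain == 0.0:
        return "green"   # garment has ease here
    if strain <= fabric.max_strain:
        return "amber"   # fabric stretches, but within its limit
    return "red"         # fabric would have to stretch past its limit


# Hypothetical numbers: a 57 cm garment thigh, and a thigh that measures
# 56 cm standing and 60 cm seated.
raw_denim = Fabric("14 oz raw denim", max_strain=0.01)
ponte = Fabric("four-way-stretch ponte", max_strain=0.15)

for pose, thigh_cm in (("standing", 56.0), ("seated", 60.0)):
    for fabric in (raw_denim, ponte):
        print(pose, "|", fabric.name, "|", strain_zone(thigh_cm, 57.0, fabric))
```

With these made-up inputs the raw denim is fine standing and red seated, while the ponte absorbs the same change. The waist measurement never enters into it, which is exactly what a size chart misses.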

When you've stared at a strain heatmap glowing red exactly where a returns reason-code would later say "too tight in the thigh," you stop thinking of fit as a recommendation problem and start thinking of it as an engineering one.

So Why Doesn't Everyone Just Simulate?

Four-panel comparison of fit-tech families: statistical, photo-based, generative try-on, physics.

Because physics simulation is expensive, slow to set up, and only pays off if you have the inputs to feed it. This is the part vendors gloss over, and it's the part that matters most when you're spending real money.

There are, honestly, four different families of fit technology, and the reason none of them is "the answer" is that each solves a different slice of the problem.

The cheapest to deploy is statistical size recommendation: True Fit, Bold Metrics, Fit Analytics. These match a shopper to a size using purchase history, returns data, and collaborative filtering across enormous brand networks. True Fit alone draws on twenty years of data spanning $616 billion in transactions and 91,000-plus brands, and in March 2026 it launched a shopping agent that exposes its fit intelligence to AI assistants. Bold Metrics partnered with Gap on an agentic sizing protocol and claims around a 34 percent reduction in returns. The trade-off is that these are black boxes: they tell you which size, never why, and they struggle on brand-new products where there's no behavioral data yet.

The family I personally worked in was photo-based body measurement. 3DLOOK's YourFit extracts 86 measurement points from one or two smartphone photos; in a six-month study with a swimwear brand it reported a 47 percent lower return rate and cut bracketing-related returns to 2 percent. But the trade-off is exactly the one that bit me — accuracy degrades outside controlled conditions, it asks the shopper to do work, and the underlying body models skew toward "average" builds. And then there's privacy, which I'll come back to.

Generative virtual try-on is the third family — beautiful, conversion-positive, and fit-blind, for all the reasons above.

The only family that actually reasons about the fabric is physics-based simulation: CLO3D (used by 860-plus companies), Style3D, and newer entrants running finite-element cloth simulation on a 3D body mesh. Perfitly, which combines a 3D avatar with full fabric draping, reported pulling a partner brand's returns from 28 percent down to around 10 percent. The catch is that it only works if you already have digitized patterns and fabric data, so it's realistic mainly for brands with mature 3D design workflows.

There is no best fit technology. There is only the right one for your SKU count, your data maturity, and the specific shape of your returns.

A brand with a 50,000-SKU fast-fashion catalog and no 3D patterns should not be buying physics simulation; statistical recommendation will move the needle faster and cheaper. A premium denim label with digitized patterns and a brutal thigh-fit problem is leaving money on the table if it doesn't simulate. The mistake is treating these as competitors rather than as different tools for different economics.
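As a rough sketch, that matching logic looks something like the Python below. The function name, the inputs, and especially the 10,000-SKU cutoff are hypothetical choices made for illustration; a real decision would start from a brand's own returns data, not from a three-branch rule.

```python
HIGH_SKU_CUTOFF = 10_000  # hypothetical threshold, chosen only for illustration


def suggest_fit_approach(sku_count: int,
                         has_digitized_patterns: bool,
                         fit_sensitive: bool) -> str:
    """Map a brand's situation to one fit-technology family."""
    # Physics simulation needs patterns and fabric data as inputs,
    # so it is only on the table when those already exist.
    if has_digitized_patterns and fit_sensitive:
        return "physics-based simulation"
    # Large catalogs without 3D assets get the cheapest-to-deploy option.
    if sku_count >= HIGH_SKU_CUTOFF or not fit_sensitive:
        return "statistical size recommendation"
    # Smaller, fit-sensitive catalogs without patterns can measure the body instead.
    return "photo-based body measurement"


# The two brands described above.
print(suggest_fit_approach(50_000, has_digitized_patterns=False, fit_sensitive=True))
print(suggest_fit_approach(300, has_digitized_patterns=True, fit_sensitive=True))
```

Generative try-on is left out on purpose: it is a conversion tool, not a fit tool, so it never appears as an answer to a returns problem.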

The Feature My Own Lawyer Killed

Nobody puts this part in the sales deck. The day we were ready to ship the body-scan feature, it ran into a wall I hadn't taken seriously enough: biometric privacy law.

Under GDPR, a 3D body scan can fall under "special category" biometric data, the most heavily protected tier there is. In Illinois, BIPA imposes strict rules on collecting, storing, and sharing exactly this kind of data, with real teeth, and more US states are expanding similar laws every year. I watched a consent-flow screen go up that effectively blocked our most impressive feature for a chunk of the country until we redesigned how it worked.

That redesign turned out to be a feature, not a bug. The answer is on-device processing — do the measurement math on the shopper's phone and never let the raw scan leave it. It's now emerging as the privacy-first standard, and it happens to also be the thing that makes shoppers comfortable enough to actually use the tool. But you have to design for it from the start. Bolt a body-scanner onto a fashion site without thinking about BIPA, and you haven't built a fit tool; you've built a class-action lawsuit with a try-on button.

Is AI Fit Prediction Real, or Just Hype?

People ask me this constantly, and the honest answer is: the underlying market is real and growing fast — the size-and-fit prediction segment is projected to roughly triple from about $1.05 billion in 2024 toward $2.95 billion by 2029 — but most of the deployments are hype, because they pick a technology before they understand the problem.

So we don't sell a single product. We start with a brand's own returns data and ask which failure is actually costing them money — wrong size selected, or wrong fit expectation. Then we match the approach to the economics: statistical recommendation for high-SKU catalogs, photo-based measurement pipelines for fit-sensitive categories, and physics-based simulation for the brands whose 3D design maturity can support it — vendor-neutral, privacy-compliant, and built around the specific return patterns in their data. That's the entire philosophy behind what we built, and it comes directly from the month I spent watching a one-size-fits-all answer fall apart.
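A minimal sketch of what that first triage could look like, in Python. The reason codes and the way they are bucketed are invented for illustration; every retailer's returns export uses its own codes, and deciding which codes mean "wrong size" versus "wrong fit expectation" is a judgment call that needs real data behind it.

```python
from collections import Counter

# Hypothetical mapping from return reason codes to the two failure modes.
FAILURE_MODE = {
    "too_small": "wrong size selected",
    "too_large": "wrong size selected",
    "tight_in_thigh": "wrong fit expectation",
    "looked_different": "wrong fit expectation",
}


def triage_returns(reason_codes):
    """Count returns per failure mode; unmapped codes are counted as 'other'."""
    return Counter(FAILURE_MODE.get(code, "other") for code in reason_codes)


# Made-up sample of reason codes from a returns export.
sample = ["too_small", "tight_in_thigh", "tight_in_thigh",
          "changed_mind", "too_large", "tight_in_thigh"]

for mode, count in triage_returns(sample).most_common():
    print(f"{mode}: {count}")
```

Whichever bucket dominates points at a different family of tools: a size-selection problem leans toward recommendation, a fit-expectation problem toward measurement or simulation.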

The other question I get is whether any of this is worth the integration cost. I used to dread that one until I started watching the after-numbers land: shoppers who engage with well-built AI sizing convert at meaningfully higher rates, and return reductions of 25 to 50 percent in the first year are achievable. There's a regulatory tailwind too — the EU is moving to ban the destruction of unsold stock, and reverse logistics already burns hundreds of thousands of tonnes of CO2 a year. Every return you prevent is money, carbon, and inventory you keep.

The lesson that demo taught me has held up for years now. A flattering picture will sell the garment. Only physics — or the right data, honestly applied — will make sure it stays sold. Fashion has spent a decade trying to photograph its way out of a problem that was always going to be solved with a measurement.