Generic computer vision fails at the edges: bald heads mistaken for soccer balls, dust particles flagged as killer defects, shadows triggering phantom brakes. We build physics-constrained vision systems that reject the impossible before it becomes expensive.
Whether you're running automated cameras in stadiums, inspecting wafers at 10nm, or classifying defects on a production line, the problem is the same: your detector finds patterns, but it doesn't understand physics. A ball can't teleport. A defect has parallax. A shadow has no depth. We embed these physical constraints directly into your vision pipeline, closing the gap between detection and understanding.
In October 2020, Pixellot's automated camera system at Inverness Caledonian Thistle tracked a linesman's bald head for an entire match instead of the ball. The system used a standard CNN detector (likely YOLO-family) that processed each frame independently. Under stadium floodlights, the linesman's head produced specular highlights with pixel gradients statistically indistinguishable from a white soccer ball. The detector assigned 98% confidence to "ball" on the head, while the actual ball (moving fast, blurring through shadows) scored 80%. The system followed the highest-confidence signal. It had no mechanism to check that a "ball" moving at 3 mph at a constant height of 1.7 meters, attached to a vertical cylindrical object, violates every kinematic constraint of a soccer ball in play. The fix isn't better training data. It's physics.
KLA dominates semiconductor inspection with 63% market share and their 2900 series can detect features as small as 10nm. But detection isn't the bottleneck. The nuisance defect problem is: at advanced process nodes, a broadband scan captures thousands of anomalies per wafer. Most are surface artifacts, dust particles, or pattern noise that won't affect yield. Each one requires classification. A 1% yield loss at advanced nodes translates to millions in lost revenue because a single wafer can cost tens of thousands of dollars. The industry standard is deep learning classifiers trained on historical defect libraries, but these classifiers have no model of how light physically interacts with a pit versus a stain versus a process residue. When the fab transitions to a new process node (say, gate-all-around at 2nm), the classifier's training data is obsolete and the nuisance rate spikes. Physics-based defect models that understand parallax, material reflectance, and topographic scattering separate real defects from noise regardless of process node.
On production lines using AI-based quality control, you rarely know when a CV model is wrong. Without real-time ground-truth labels, drift builds quietly while production continues. A lighting angle shifts after maintenance. A lens hazes over weeks. A fixture wears. False rejects rise (rework loops, throughput friction) or false accepts creep in (escape risk, warranty exposure). By the time a quality escape surfaces, it triggers broad containment, widened quarantine, re-inspection, and manual review. The cost of poor quality runs approximately 20% of total sales for average manufacturers. Physics constraints serve as invariant anchors: the physical properties of a correctly manufactured part don't change when the lighting shifts. A physics-informed system measures whether the observed image is consistent with the known geometry and material properties, not just whether it "looks like" a good part compared to historical training images.
| Provider | Domain | What They Ship | Physics Integration | Where They Fall Short |
|---|---|---|---|---|
| Pixellot | Sports broadcasting | AI automated cameras, auto-tracking, multi-angle. 150+ leagues, partnership with GameChanger. | Basic Kalman filtering for track smoothing. Multi-hypothesis tracking in V4 largely fixed the bald-head class of errors. | New failure modes: jersey OCR under motion blur, offside projection on non-flat pitches. Physics is post-hoc smoothing, not a constraint layer. |
| Hawk-Eye (Sony) | Sports officiating | Multi-camera triangulation, skeletal tracking (29 points per player). NFL, MLB, ATP. | Strong geometric constraints via multi-camera calibration. | Expensive ($1M+ per venue). Proprietary and closed. Requires dedicated infrastructure (6-8 4K/8K cameras per venue). |
| KLA Corporation | Semiconductor inspection | 2900 series broadband inspection, 10nm sensitivity. 63% market share in process control. | Rule-based defect physics models baked into specific process nodes. | Models are process-node-specific. New node transitions cause nuisance rate spikes. $2.3B R&D investment signals they know the gap exists. |
| Cognex | Manufacturing QA | VisionPro ViDi deep learning, edge learning on-camera (5-10 training images). | None at inference. Traditional machine vision handles measurement/metrology. | Data-driven only. Susceptible to silent drift. 90% reduction in setup time but no physics grounding. |
| NVIDIA | Platform/infrastructure | Metropolis ecosystem (1,000+ companies), Omniverse for digital twin simulation, Cosmos for synthetic data. | Physics at training time (rendering), not inference. Omniverse simulates physics for synthetic data generation. | Platform, not solution. Physics stops at training. The deployed model is still purely data-driven. |
| Veo | Sports (grassroots) | D2C AI cameras, 40,000+ clubs, 100 countries, 4M+ matches filmed. | Minimal. Consumer-grade tracking. | Not physics-constrained. Consumer price point means limited compute for constraint layers. |
| Big 4 / Large SIs | Cross-industry | Platform implementations (NVIDIA, cloud APIs), integration services, change management. | Implement vendor physics tools. Don't build custom constraint layers. | They deploy platforms. Building a custom Kalman filter pipeline tuned to your specific physics isn't in their playbook. Engagements run $500K-$5M+ and take 6-18 months. |
| Cloud APIs | General purpose | Pre-trained detection/classification, easy API integration, pay-per-call. | None. Frame-independent inference by design. | No temporal consistency. No physics constraints. The "90% trap": fast to 90% accuracy, impossible to close the last 10% without domain-specific physics. |
The gap is consistent across every segment: physics is either absent, confined to training, or locked inside a proprietary system. Nobody offers custom physics constraint layers as a service, integrated into your existing pipeline, tuned to your specific domain physics. That's what we build.
We add a deterministic verification layer between your detector and your action system. Every detection passes through three gates before it's accepted: a Kalman filter kinematic gate (is this motion physically possible given the object's mass and the time delta?), an optical flow gate (does the pixel motion inside the bounding box match the expected velocity profile?), and a geometric gate (does the object size satisfy 3D perspective constraints relative to the camera position?). We tune the physics model to your domain. Projectile dynamics for ball tracking. Parallax geometry for wafer inspection. Road-plane constraints for autonomous navigation. The gates reject false positives that visual confidence alone cannot catch.
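A minimal sketch of how the three gates chain, assuming a simple `Detection` record and boolean gate functions (the names and interfaces here are illustrative, not our production API):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple        # (x, y, w, h) in pixels
    confidence: float # detector's visual confidence, 0..1

def verify(detection, gates):
    """Deterministic verification layer: a detection must pass every
    physics gate (kinematic, optical flow, geometric) before it reaches
    the action system, regardless of its visual confidence."""
    for name, gate in gates:
        if not gate(detection):
            return False, name  # log which gate rejected it
    return True, "accepted"

# Illustrative stub gates; real gates wrap the Kalman filter,
# the optical flow field, and the calibrated camera model.
gates = [
    ("kinematic",    lambda d: True),
    ("optical_flow", lambda d: True),
    ("geometric",    lambda d: True),
]
ok, reason = verify(Detection((512, 380, 22, 22), 0.92), gates)
```

The point of the structure is that rejection reasons are logged per gate, which is what makes the production monitoring described later possible.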
For semiconductor fabs and precision manufacturing, we build defect classifiers that model how light physically interacts with surface anomalies. A real pit scatters light differently than a dust particle. A process residue has different reflectance than a short circuit. We use multi-view geometry and physics-based rendering models to characterize each anomaly by its physical properties, not just its visual appearance. This means the classifier generalizes across process nodes because the physics of light-material interaction doesn't change when you move from FinFET to gate-all-around.
Model drift is the silent killer of production CV. We build architectures that use physics invariants as stability anchors. The physical geometry of a correctly manufactured part doesn't change when a lighting angle shifts or a lens hazes. We encode these invariants into the system so that environmental variation affects the raw signal but not the physics-verified output. This reduces emergency retraining cycles from monthly to quarterly or less, and catches drift before it causes quality escapes.
When physics-informed neural networks (PINNs) make sense for your application, we build the training pipeline. PINNs add a physics loss term to the standard data loss: the network is penalized not just for missing the target, but for violating the governing equations (Navier-Stokes, projectile motion, conservation of energy). The result is a model that needs less training data, generalizes better to unseen conditions, and produces physically plausible outputs. We handle the hard parts: lambda tuning (the physics loss weight), convergence stabilization, and discontinuity handling (ball hitting a post, wafer edge effects) that cause naive PINN implementations to fail.
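As a concrete sketch of the composite loss (numpy, with a finite-difference residual standing in for the autograd-based residual a real PINN would use; the free-flight equation y'' = -g and the weight `lam` are the assumptions here):

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def physics_residual(y_pred, dt):
    """Finite-difference second derivative of the predicted height
    trajectory; for free flight y'' = -g, so the residual y'' + g
    should be zero everywhere the physics holds."""
    y_dd = (y_pred[2:] - 2 * y_pred[1:-1] + y_pred[:-2]) / dt**2
    return y_dd + G

def pinn_loss(y_pred, y_obs, dt, lam=0.1):
    """Data loss + lam * physics loss. lam is the main tuning knob:
    too high and the network ignores the data, too low and it
    ignores the physics."""
    data_loss = np.mean((y_pred - y_obs) ** 2)
    physics_loss = np.mean(physics_residual(y_pred, dt) ** 2)
    return data_loss + lam * physics_loss
```

An exact ballistic trajectory drives both terms to zero; a trajectory that fits the data but bends the wrong way pays the physics penalty.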
Here's exactly what happens when a physics-gated system processes the Inverness match scenario, frame by frame.
The detector finds the ball at coordinates (512, 380) with 92% confidence. The Kalman filter initializes: position (512, 380), velocity estimated at 18 m/s eastward from previous frames. State uncertainty is low. The optical flow at the detection region shows strong rightward motion consistent with a kicked ball. All three gates pass. The system accepts the detection and updates the track.
On the next frame, the detector returns two candidates: Candidate A, the actual ball at 80% confidence, blurred through the shadows; and Candidate B, the linesman's head at 98% confidence, a bright specular highlight under the floodlights.
Gate 1 (kinematic): the filter predicted the ball would be near (531, 376) based on its velocity and gravity. Candidate A's innovation (residual) is 1.4 pixels. Candidate B's innovation is 669 pixels, a Mahalanobis distance of 47 standard deviations. Anything above 3 sigma is rejected. B is eliminated before it reaches the next gate.
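The innovation gate is a few lines of linear algebra. In this sketch the innovation covariance `S` is an assumed value (roughly 14 px standard deviation per axis), chosen so that a 669-pixel miss lands at about 47 sigma as in the scenario above:

```python
import numpy as np

def mahalanobis_gate(predicted, observed, innovation_cov, sigma_max=3.0):
    """Return (distance in sigmas, accepted). A detection whose
    innovation exceeds sigma_max is kinematically impossible given
    the filter's prediction and is rejected."""
    innovation = np.asarray(observed, float) - np.asarray(predicted, float)
    d = np.sqrt(innovation @ np.linalg.inv(innovation_cov) @ innovation)
    return d, d <= sigma_max

S = np.diag([200.0, 200.0])   # assumed innovation covariance
predicted = (531.0, 376.0)

# Candidate A: the ball, ~1.4 px from the prediction -> well inside 3 sigma.
d_a, ok_a = mahalanobis_gate(predicted, (532.0, 377.0), S)
# Candidate B: the head, 669 px away -> roughly 47 sigma, rejected.
d_b, ok_b = mahalanobis_gate(predicted, (531.0 - 669.0, 376.0), S)
```
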
Gate 2 (optical flow): Candidate A shows a flow field of 450 pixels/second rightward, consistent with a ball at 18 m/s. Even if B had passed Gate 1, its flow field shows near-zero motion (a stationary head). A "ball" with zero velocity in mid-play violates the expected profile. Second rejection.
Gate 3 (geometric): Candidate A subtends 22 pixels at this distance, consistent with a 22cm ball at 12 meters from the camera. Candidate B subtends 45 pixels. A 22cm ball at 12 meters cannot subtend 45 pixels. Third rejection.
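The geometric gate is a pinhole-camera consistency check. Assuming a focal length of about 1200 px (an illustrative value that reproduces the 22-pixel figure for a 0.22 m ball at 12 m; the tolerance is likewise an assumption):

```python
def expected_subtense_px(object_size_m, distance_m, focal_px):
    """Pinhole model: pixels subtended = focal_px * size / distance."""
    return focal_px * object_size_m / distance_m

def geometric_gate(observed_px, object_size_m, distance_m, focal_px,
                   tol=0.35):
    """Reject detections whose apparent size deviates from the
    perspective-predicted size by more than tol (relative error)."""
    expected = expected_subtense_px(object_size_m, distance_m, focal_px)
    return abs(observed_px - expected) / expected <= tol

# 1200 * 0.22 / 12 = 22 px: the ball passes.
ball_ok = geometric_gate(22, 0.22, 12, 1200)
# A 45 px blob at the same range is twice the expected size: rejected.
head_ok = geometric_gate(45, 0.22, 12, 1200)
```
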
The system follows Candidate A (the actual ball) with 80% visual confidence, rejecting Candidate B despite its 98% confidence. Physics overrules pixels.
This same architecture applies to any domain where objects obey physical laws. In a semiconductor fab, the "Kalman gate" becomes a parallax consistency check across inspection angles. In manufacturing QA, the "optical flow gate" becomes a surface reflectance model. The framework is the same; the physics changes.
We instrument your existing CV pipeline to measure exactly where it fails: false positive rates by category, latency per inference step, edge-case frequency. We identify which physical constraints apply to your domain and which detection failures they would prevent. Deliverable: a constraint specification document with projected false positive reduction and a go/no-go recommendation. If physics constraints won't meaningfully improve your system, we tell you.
We build the physics layer and integrate it into your pipeline. This isn't a separate system; it's a verification layer that sits between your existing detector and your action logic. We tune the Kalman filter state model to your object dynamics, calibrate optical flow thresholds to your camera setup, and validate geometric constraints against your physical environment. Timeline depends on complexity: a single-camera sports tracker is 8 weeks. A multi-view semiconductor inspection system with custom physics models is 16.
We deploy to production with monitoring. We instrument every gate to log rejection reasons, measure false positive and false negative rates against your acceptance criteria, and verify that the physics constraints don't add unacceptable latency to your pipeline. We tune thresholds based on production data, not lab conditions. Deliverable: a production system with documented performance baselines and a drift monitoring dashboard.
What takes longer
Multi-camera calibration in venues with non-standard layouts. Process-node transitions in semiconductor (the physics model needs characterization data from the new node). Integration with legacy PLCs or SCADA systems that don't expose real-time data feeds.
Answer six questions about your current CV deployment. Get a specific analysis of which physics constraints would help and what false positive reduction to expect.
1. What does your vision system track or inspect?
2. What is your current false positive rate?
3. Does your system process frames independently or maintain temporal state?
4. How often do you retrain your models due to environmental drift?
5. What is your latency budget per frame?
6. Do you have physics models for your domain (kinematic equations, material properties, geometric constraints)?
Traditional false positive reduction works by raising the confidence threshold: require 95% confidence instead of 80%. This reduces false positives but inevitably increases false negatives because legitimate detections with lower confidence get rejected too. Physics constraints work orthogonally. They don't touch the confidence threshold. Instead, they verify whether a detection is physically possible regardless of its visual confidence score. A bald head at 98% confidence is still physically impossible as a ball, so it's rejected. A ball at 75% confidence that matches the kinematic prediction is accepted. The false positive rate drops because physically impossible detections are eliminated. The false negative rate holds or improves because legitimate detections at lower confidence pass the physics check. In semiconductor inspection, this means catching real defects that a high confidence threshold would miss (faint but physically real pits) while rejecting nuisance signals that happen to look like defects (surface particles with high visual similarity but wrong parallax behavior).
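The orthogonality is easiest to see as a two-factor acceptance rule (an illustrative sketch; `conf_floor` is an assumed threshold, not a recommended value):

```python
def accept(confidence, physically_possible, conf_floor=0.5):
    """Physics and confidence are independent axes: raising conf_floor
    trades false positives for false negatives, while the physics check
    removes impossible detections at any confidence level."""
    return physically_possible and confidence >= conf_floor

accept(0.98, physically_possible=False)  # bald head: rejected
accept(0.75, physically_possible=True)   # real ball: accepted
```
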
Yes, and that's the standard approach. The physics layer sits between your detector and your action system. Your existing detector (YOLO, EfficientDet, a custom CNN, a cloud API) continues to generate candidate detections. The physics layer evaluates each candidate against kinematic, optical flow, and geometric constraints before passing it downstream. Integration points depend on your architecture: if you're running inference on-device, the physics layer runs on the same hardware (Kalman filter updates are computationally cheap compared to CNN inference). If you're using a cloud API, the physics layer can run at your edge or in your processing pipeline. Typical integration adds 1-3ms per frame for the Kalman filter and optical flow gates. Geometric gate latency depends on the complexity of your 3D model but rarely exceeds 5ms. Total added latency: 2-8ms. For systems already running at 25-60fps (16-40ms per frame), this fits within the budget.
Retraining addresses drift but not the fundamental problem: a retrained model can still make physically impossible predictions because it has no concept of physics. Expanding training data helps with coverage but has diminishing returns on edge cases (you can't train away the laws of physics). A physics constraint pipeline build runs $80K-$250K depending on complexity. Single-camera single-object tracking (sports) is at the low end. Multi-view semiconductor inspection with custom physics models is at the high end. Compare that to the ongoing cost of the problem: a semiconductor fab where each scrapped wafer costs tens of thousands of dollars and nuisance-driven manual review burns engineer hours at $150-200/hr. A sports broadcaster whose automated camera misses key plays loses subscribers. A manufacturer spending a fifth of revenue on quality costs, much of it driven by false rejects that physics constraints would prevent. The physics layer is a one-time build with low maintenance cost because physics doesn't drift. The laws of projectile motion won't change next quarter.
Pixellot's V4 multi-hypothesis tracking largely fixed the "bald head" class of errors. Hawk-Eye's multi-camera triangulation with skeletal tracking is the gold standard for officiated sports. But the market has moved beyond the top tier. The FIFA World Cup gets Hawk-Eye's $1M+ per venue setup. The 40,000+ clubs using Veo's consumer cameras don't. The gap is in mid-tier and grassroots sports: leagues that need automated broadcasting with better-than-consumer accuracy but can't afford Hawk-Eye infrastructure. Physics constraints on a single-camera setup close a meaningful portion of that accuracy gap at a fraction of the cost. Specifically: occlusion handling through physics-based prediction (maintaining track when a player blocks the ball), multi-object disambiguation (two overlapping players distinguished by kinematic profiles, not just appearance), and camera motion compensation (separating camera pan from object motion using inertial constraints).
This is exactly the scenario where physics constraints have the highest impact. Node transitions break data-driven classifiers because the training data is from the old node. The visual signatures change: new materials, new geometries, new etching patterns. But the physics of defect imaging doesn't change at the same rate. A real pit still scatters light based on its depth and sidewall angle. A particle still shows parallax between inspection angles based on its height above the surface. A process residue still has a reflectance profile determined by its material composition. We build defect classifiers that use these physics-based features alongside visual features. During node transitions, the physics features remain discriminative even when the visual features lose their predictive power. Practical timeline: 2-3 weeks for the domain physics audit to characterize the new node's imaging physics, 12-16 weeks for the classifier build including validation against your defect library from the new node.
Every physics model is an approximation. A Kalman filter assumes Newtonian dynamics, which breaks down for objects with complex aerodynamics (a knuckleball swerves unpredictably due to turbulent airflow separation). An epipolar geometry model assumes rigid surfaces, which breaks for flexible materials. We handle this in three ways. First, every gate has a configurable confidence threshold. If the Mahalanobis distance is borderline (between 3 and 5 sigma), the detection is flagged for downstream verification rather than hard-rejected. Second, we use the Unscented Kalman Filter (UKF) instead of the Extended Kalman Filter (EKF) for nonlinear dynamics. UKF propagates sigma points through the actual nonlinear function rather than linearizing, which handles moderate nonlinearity (spin, drag, uneven surfaces) without the EKF's Taylor-series approximation error. Third, for genuinely complex physics (turbulent flow, novel materials), we use PINNs to learn the governing equations from data while constraining the solution space. The physics model isn't a hard cage. It's a guardrail that flexes at the edges but prevents catastrophic errors at the center.
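The first mechanism, the soft gate, reduces to a three-band decision using the 3- and 5-sigma thresholds mentioned above (a minimal sketch; the band edges are configurable per deployment):

```python
def soft_gate(sigma, accept_below=3.0, reject_above=5.0):
    """Three-band gate: clear passes are accepted, clear violations
    are rejected, and borderline detections (between 3 and 5 sigma
    here) are flagged for downstream verification instead of being
    hard-rejected, so an imperfect physics model fails gracefully."""
    if sigma <= accept_below:
        return "accept"
    if sigma <= reject_above:
        return "flag"
    return "reject"
```
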
Explore the technical foundations behind our physics-constrained vision methodology.
A deep technical exploration of why generic computer vision fails in production environments and how physics-constrained architectures (Kalman filters, optical flow, PINNs) close the gap between detection and understanding.
Read the whitepaper

Edge cases consume 80% of engineering time, 90% of support costs, and 100% of liability exposure.
A physics-constrained vision system doesn't eliminate edge cases. It eliminates physically impossible edge cases, which is most of them. The engineering time you spend debugging false positives, retraining for drift, and reviewing nuisance defects goes to building features instead.