We build custom verification pipelines that wrap fine-tuned open-weight LLMs around your existing formal engine (JasperGold, VC Formal, Questa Formal, or SymbiYosys) and run entirely on your own hardware. No RTL leaves your network. No vendor lock-in. Opinionated about SystemVerilog assertions, honest about what formal can and cannot prove, and fluent in RISC-V, AXI4, and 3nm tape-out economics.
14%
first-silicon success
Wilson / Siemens 2024
$10–40M
mask set, 5nm to 3nm
SemiAnalysis 2024
70%
respins caused by spec drift
Wilson / Siemens 2024
The 2024 Wilson Research Group / Siemens EDA Functional Verification study put first-silicon success at 14%, the lowest number in twenty years of tracking. In 2020 it was 32%. The cause is not lazy engineering. It is complexity outpacing the verification tools, a spec that mutates faster than the testbench, and a new class of failure that generalist LLMs introduce into RTL. We see five hallucination modes in HDL code the industry has not yet named cleanly.
Class 1: code that does not compile. Caught by Verilator, Icarus, or the synthesis front-end in seconds. This is the class the industry already knows how to handle.
Class 2: sequential-thinking races. LLMs trained on Python and C write Verilog as if statements executed sequentially. They use blocking assignments (=) inside clocked always_ff blocks where non-blocking (<=) is required. The simulator may schedule events in an order that masks the race, synthesis produces different logic, and the silicon deadlocks.
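A toy illustration of how cheaply this class can be screened before simulation. The Python scan below flags blocking `=` inside `always_ff` blocks; it is a line-based sketch only (real linters such as Verible work on a parsed AST) and assumes one statement per line with begin/end-delimited blocks.

```python
import re

def blocking_in_always_ff(verilog_src: str) -> list[str]:
    """Flag blocking assignments (=) inside always_ff blocks.

    Crude line-based sketch: assumes one statement per line and
    begin/end-delimited blocks. Production linting parses the AST.
    """
    findings = []
    in_ff = False
    depth = 0                                  # begin/end nesting
    for lineno, line in enumerate(verilog_src.splitlines(), 1):
        code = line.split("//")[0]             # strip line comments
        if re.search(r"\balways_ff\b", code):
            in_ff, depth = True, 0
        if in_ff:
            depth += len(re.findall(r"\bbegin\b", code))
            depth -= len(re.findall(r"\bend\b", code))
            # lone '=' that is not part of '<=', '==', '>=', '!='
            if re.search(r"(?<![<>=!])=(?!=)", code):
                findings.append(
                    f"line {lineno}: blocking '=' in always_ff: {code.strip()}"
                )
            if depth <= 0 and re.search(r"\bend\b", code):
                in_ff = False
    return findings
```

A check this shallow is exactly what the synthesis front-end will not do for you: the code is legal SystemVerilog either way, so only a style rule catches it before the race reaches silicon.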
Class 3: protocol violations. The code compiles and passes 90% of directed tests. Then it asserts WVALID before AWREADY, or holds VALID high while flipping the data, or violates a sub-clause buried on page 84 of the AMBA spec. The chip works on the internal test harness and hangs the moment it is connected to a third-party memory controller. We catch this with pre-verified SVA libraries for each protocol, not with more simulation cycles.
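For flavor, one such rule (payload stability while VALID awaits READY) can be modeled in a few lines of Python over a cycle trace. This toy monitor stands in for a single property of a pre-verified SVA library; it is not the full AXI protocol.

```python
def check_valid_stable(trace):
    """Monitor one AXI-style handshake rule over a clocked trace.

    trace: list of (valid, ready, data) tuples, one per clock cycle.
    Rule: once VALID is asserted with READY low, VALID must stay high
    and the payload must stay stable until READY accepts it.
    """
    errors = []
    pending = None                      # payload awaiting READY
    for cyc, (valid, ready, data) in enumerate(trace):
        if pending is not None:
            if not valid:
                errors.append(f"cycle {cyc}: VALID dropped before READY")
            elif data != pending:
                errors.append(f"cycle {cyc}: payload changed while VALID high")
        if valid and ready:
            pending = None              # handshake completed this cycle
        elif valid:
            pending = data if pending is None else pending
    return errors
```

The point of the example: a directed test that never stalls READY will never exercise this rule, which is why the bug survives 90% of directed tests and dies on the third-party controller.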
Class 4: vacuous properties. The LLM generates an SVA property. The formal engine proves it. You ship. But the property was trivially true because the antecedent never fires. This is worse than no verification, because you now hold a certificate that says "proven" on a buggy design. Any formal flow that does not run vacuity checks is theater. Siemens has been warning about this since 2017, and the field still ships tools without it.
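The failure is easy to reproduce in miniature. Below, a hypothetical arbiter trace never asserts the grant, so an implication-style property "passes" while checking nothing; the second return value is the vacuity signal a formal flow must surface. Toy zero-delay semantics, not real SVA scheduling.

```python
def holds(trace, antecedent, consequent):
    """Evaluate a same-cycle 'antecedent implies consequent' over a trace.

    Returns (property_holds, antecedent_ever_fired). The second value is
    the vacuity signal: 'proven' with fired == False means nothing was
    actually checked.
    """
    fired = False
    for state in trace:
        if antecedent(state):
            fired = True
            if not consequent(state):
                return False, fired
    return True, fired
```

An engine that reports only the first value hands you the "proven" certificate described above; an engine that reports both tells you the proof is empty.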
Class 5: clock-domain-crossing bugs. LLMs see signal names, not clock domains. They connect a 2 GHz CPU-domain signal directly to a 400 MHz peripheral-domain flop, skip the double-flop synchronizer, and simulation cannot catch it because RTL simulation does not model metastability. Accellera opened a CDC/RDC/Glitch interoperability standard in 2024 precisely because the fragmentation across SpyGlass, Questa CDC, and Conformal CDC was breaking sign-off.
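Structurally, the easy half of CDC detection is a netlist walk. The sketch below (signal and domain names are hypothetical) flags assignments whose source and destination clock domains differ and whose destination is not a known synchronizer output; production CDC tools additionally reason about glitches, reconvergence, and handshake protocols.

```python
def find_unsynchronized_crossings(assigns, domain, synchronizers):
    """Flag clock-domain crossings that lack a synchronizer.

    assigns: iterable of (dst_signal, src_signal) from the elaborated netlist.
    domain: dict mapping signal -> clock-domain name.
    synchronizers: set of dst signals known to be double-flop sync outputs.
    Structural sketch only.
    """
    issues = []
    for dst, src in assigns:
        d_src, d_dst = domain.get(src), domain.get(dst)
        if d_src and d_dst and d_src != d_dst and dst not in synchronizers:
            issues.append(f"{src} ({d_src}) -> {dst} ({d_dst}): no synchronizer")
    return issues
```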
Why this matters in dollars: 70% of respins are caused by spec changes, not pure logic bugs (2024 Wilson / Siemens data), so a verification flow that only catches logic bugs addresses a subset of the problem. Classes 2 through 5 above are the ones that still blow tape-outs, because they bypass simulation and only show up in silicon. A 5nm respin costs $10M in masks plus a 3-to-6-month schedule slip, and on an 18-month product cycle a 6-month slip can erase half of lifetime revenue.
Your real alternatives are not theoretical. They are the three EDA giants (who you almost certainly already pay), six well-funded agentic AI startups pitching you at DVCon and DAC, Big 4 systems integrators, and the specialist formal consultancies. We have no product to sell against them. We help you pick, integrate, and operate the right combination.
| Option | What they actually do | Strengths | Honest gaps |
|---|---|---|---|
| Cadence (JasperGold, Cerebrus AI Studio, ChipStack Super Agent) | Gold-standard formal engine. Multi-block RL-driven digital implementation. Agentic AI super agent announced Feb 2026. | JasperGold is the reference formal tool. Deep foundry integration. ~30% of EDA market. | Historical JasperGold baseline pricing ($225K base + $45K/seat) is out of reach for most early-stage RISC-V / AI accelerator startups. Cloud-first agentic features do not meet IP-sensitive on-prem requirements. |
| Synopsys (VC Formal, DSO.ai, AgentEngineer) | L4 agentic workflow (AgentEngineer, March 2026), claimed 2 to 5x productivity. RL-based design space exploration. $35B Ansys acquisition adds multiphysics. | Deepest customer base. Every large fabless already has a VC Formal contract. AgentEngineer is the most credible vendor agentic stack today. | Opinionated custom flows are not their business. They will not tell you to use an open-weight model or SymbiYosys. Small shops get templated attention. |
| Siemens EDA (Questa Formal, Questa CDC, Catapult HLS) | Strong Questa formal and CDC franchise. Publishes the Wilson study. Deepest automotive ISO 26262 track record. | Automotive qualification expertise. Good CDC / RDC story. Tool qualification packages ready. | Agentic AI story lags Cadence and Synopsys. Less RISC-V ecosystem focus. |
| ChipAgents ($74M total, Feb 2026) | Multi-agent RTL design and verification. DVCon 2026 demo of multi-agent Root Cause Analysis with no human in the loop. | Strongest pure-play agentic story. Matter Ventures (TSMC-backed), Bessemer, Micron, MediaTek, Ericsson on the cap table. | Cloud platform. On-prem / air-gapped deployment pathway is unclear for IP-sensitive customers. Integration into an existing Jenkins/CI sign-off flow is still DIY. |
| Normal Computing ($85M+ total, Mar 2026) | Auto-formalization: an LLM translates engineer intent into formal properties and proves them. Samsung Catalyst led the last round. ARIA Scaling Compute programme. | Closest peer on the LLM + formal thesis. Claims half of the top 10 semiconductor design firms are using Normal EDA. Delivered real silicon (CN101). | Product, not consultancy. Not a fit if you need custom fine-tuning on your proprietary RTL corpus or integration into a legacy flow you will not rip out. |
| Axiomise (specialist formal consultancy) | formalISA app deployed across Ibex, CVA6, cheriot-ibex, 0riscy, cv32e40p, WARP-V. Found 65+ bugs in Ibex, including six debug-unit branch bugs. | The most credible RISC-V formal verification track record in the industry. Real, publishable bug finds. Deep ISA expertise. | Small team. Formal methods only; no LLM-assisted SVA generation, no on-prem LLM story, no integration with the agentic AI wave. |
| Big 4 / large SIs (Accenture, Deloitte, Wipro, HCL) | Large VLSI / verification services practices. Headcount on the shelf. | Scale. Offshore delivery. Existing MSA with your procurement. | Body-shop economics. Opinionated AI verification architecture is not their business. The partner who sold you the engagement has never written an SVA property in their life. |
| Veriprajna (vendor-neutral custom build) | Fine-tune an open-weight coder LLM on your RTL corpus, wrap it around whichever formal engine you already own, wire it into your Jenkins/CI, add vacuity and coverage metrics. All on your hardware. | No product to push. On-prem / air-gapped by default. RISC-V, AXI4, RISC-V debug, and formal coverage economics are our comfort zone. Honest about what formal can and cannot do. | We do not replace your formal engine. We do not ship a qualified ISO 26262 tool of our own. Spec drift and organizational change are problems consulting cannot solve; we can only design around them. |
Pricing, funding, and product information reflect public disclosures through early 2026. Always verify current terms directly with each vendor.
Every engagement is custom. These are the five shapes most fabless customers end up asking for, and the opinionated choices we make inside each.
A fine-tuned open-weight coder model (Qwen 2.5 Coder, DeepSeek Coder, Llama 3.3, or Mistral Large) running on your own H100 or H200 cluster, wrapped around whichever formal engine you already own. Zero RTL ever leaves your network.
What we reach for: vLLM for inference, LoRA adapters per IP family so the base weights stay shared, local RAG over your spec documents and past bug history, a thin orchestration layer that calls JasperGold, VC Formal, Questa Formal, or SymbiYosys through their Tcl/Python APIs. The LLM never runs the solver. It writes properties and interprets counter-examples.
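A minimal sketch of that orchestration boundary, assuming your engine launcher is a CLI that prints PASS or FAIL. The `engine_cmd` wrapper and its output convention are assumptions for illustration, not any vendor's actual interface.

```python
import pathlib
import subprocess
import tempfile

def prove(property_sv: str, design_files: list, engine_cmd: list) -> str:
    """Run one LLM-drafted SVA property through an external formal engine.

    The LLM only writes `property_sv`; the solver does the proving.
    `engine_cmd` is whatever wraps your engine's CLI (an sby launcher, a
    JasperGold Tcl batch script, etc.) and is assumed to print PASS/FAIL.
    Returns 'proven', 'cex' (counter-example found), or 'error'.
    """
    with tempfile.TemporaryDirectory() as tmp:
        prop = pathlib.Path(tmp) / "prop.sv"
        prop.write_text(property_sv)
        result = subprocess.run(
            engine_cmd + [str(prop)] + design_files,
            capture_output=True, text=True, timeout=3600,
        )
        out = result.stdout + result.stderr
        if "PASS" in out:
            return "proven"
        if "FAIL" in out:
            return "cex"        # hand the trace back to the LLM to interpret
        return "error"
```

The design choice the sketch encodes: the LLM sits strictly on the authoring and interpretation side of this function call, never inside the solver loop.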
Why this is not a hosted API: because your RTL is crown-jewel IP and your CISO is not going to sign a data processing agreement with a US or EU startup founded last year.
Pre-built SystemVerilog assertion libraries for AXI4, AXI4-Lite, APB, AHB, and TileLink compliance, plus RISC-V pipeline hazard detection, Load-Store Unit scoreboarding, debug unit correctness, and CSR access checking, tuned to your custom extension ISA.
The reference point: Axiomise found 65+ bugs in the Ibex core through formal, including six debug-unit branch bugs that simulation missed. Formal works on RISC-V. The bottleneck is the scarcity of engineers who can write the assertions. We build the library so your team does not have to.
Honest caveat: a curated assertion library is more reliable than LLM-from-scratch generation but still cannot prove the absence of every bug class. We pair it with COI (cone of influence) and mutation-based coverage analysis.
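The COI half of that analysis is, at its core, a reverse-reachability walk over the netlist fan-in graph. A sketch with hypothetical signal names; in practice the formal engine reports COI directly and we consume it.

```python
from collections import deque

def cone_of_influence(prop_signals, fanin):
    """Compute the cone of influence for a property's support signals.

    fanin: dict mapping signal -> list of signals that drive it.
    An empty or suspiciously small cone is the 'dead property' smell:
    the assertion cannot be constraining logic it never reaches.
    """
    seen, queue = set(), deque(prop_signals)
    while queue:
        sig = queue.popleft()
        if sig in seen:
            continue
        seen.add(sig)
        queue.extend(fanin.get(sig, []))
    return seen
```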
Your DV lead is getting pitched by ChipAgents, Normal Computing, MooresLabAI, Silimate, Bronco AI, and the in-house Cadence and Synopsys agentic products. Six products, six different claims, zero independent benchmarks on your actual RTL.
What we do: run a structured four-week bake-off on your codebase under NDA. Same test suite, same bug budget, same coverage targets. Honest report comparing bug-finding rate, false-positive rate, setup effort, integration debt, and the pricing terms each vendor actually offered you.
Why buyers trust us with this: we do not resell any of these products. If the right answer is "stay with JasperGold and add a thin LLM assist," we will say so.
Every pull request that touches RTL gets reviewed by a multi-agent pipeline before a human looks at it. One agent lints and checks style. A second runs a formal property set derived from the changed files. A third checks CDC and RDC paths. A fourth generates a human-readable summary with counter-example traces where properties failed.
Opinionated choice: we run the agents inside your existing CI (Jenkins, GitLab, BuildKite, whichever). We do not replace your CI with a new platform. The agents are services the pipeline calls. When you fire us, you keep the pipeline.
What we refuse to build: an agent that auto-merges RTL without a human review. Silicon is not a microservice. You cannot ship a hotfix to a chip.
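Under the hood, the review gate reduces to iterating plain callables inside the existing CI job. A sketch only; the agent names and the `Finding` shape are illustrative, not a product API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Finding:
    agent: str
    severity: str      # "info" | "warn" | "block"
    message: str

def review_pull_request(changed_files: List[str],
                        agents: List[Callable[[List[str]], List[Finding]]]):
    """Run every agent over the changed RTL files; block on any 'block'.

    Agents are plain callables so the CI system (Jenkins, GitLab,
    BuildKite) stays the orchestrator. Note the verdict is never
    'auto-merge': a human review is always the final gate.
    """
    findings = [f for agent in agents for f in agent(changed_files)]
    verdict = "needs-human-review"
    if any(f.severity == "block" for f in findings):
        verdict = "blocked"
    return verdict, findings
```

Keeping the agents as services the pipeline calls, rather than a platform that replaces the pipeline, is what makes the "when you fire us, you keep the pipeline" promise literal.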
This is the one place we think reinforcement learning for placement is actually worth deploying. The incumbents (Cadence Cerebrus, Synopsys DSO.ai) are tuned for monolithic 2D SoCs. The chiplet / UCIe wave has opened up a new class of floorplanning problem (inter-chiplet wire length, thermal stacking, bump pitch constraints) where the public tooling is immature.
What we build: a hybrid simulated-annealing + RL floorplanner on top of OpenROAD for the chiplet partitioning phase, with thermal constraints as a first-class reward term. Benchmarked against published ISPD / ICCAD results before we touch your design.
We acknowledge the AlphaChip controversy directly. Igor Markov's 2023 critique showed Google Circuit Training taking 32 hours where a tuned simulated annealing run took 12.5 hours and a Cadence commercial tool took 0.05 hours. We do not pitch RL as a replacement for tuned SA on well-understood problems. We use it where the design space is genuinely new and human intuition has no priors to draw on.
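The annealing half of that hybrid is the standard loop below. The cost callable is where the thermal reward term enters (kept generic here); the RL piece, not shown, replaces the uniform neighbor sampling with learned move proposals. Sketch only, demonstrated on a toy 1-D objective rather than a floorplan.

```python
import math
import random

def anneal(cost, neighbor, state, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Generic simulated-annealing loop with a pluggable composite cost.

    For the chiplet case the cost is roughly
    wirelength + lambda_thermal * hotspot_penalty; here 'cost' is any
    callable so the loop stays generic. Returns (best_state, best_cost).
    """
    rng = random.Random(seed)
    best, best_c = state, cost(state)
    cur, cur_c, t = state, best_c, t0
    for _ in range(steps):
        cand = neighbor(cur, rng)
        c = cost(cand)
        # accept downhill moves always, uphill with Boltzmann probability
        if c < cur_c or rng.random() < math.exp((cur_c - c) / max(t, 1e-9)):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand, c
        t *= cooling
    return best, best_c
```

Benchmarking this baseline first is how we keep ourselves honest: if the RL move generator cannot beat tuned annealing on your partitioning problem, we tell you so.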
Every engagement starts with a two-week scoping phase on a small block of your RTL before we touch anything larger. We would rather walk away at week two than burn your schedule on a bad fit. The typical cadence for a full build:
Read your spec, walk through your existing flow, pick one representative block (often a bus interface, arbiter, or a single RISC-V pipeline stage) and run our baseline formal harness on it. Output: a written report with the bug classes we see, the assertions we would build, and a cost estimate for the full engagement. If the answer is "you should keep doing what you are doing," we say so and bill for the two weeks only.
On-prem LLM stack deployed on your cluster. Base model fine-tuned with LoRA adapters on your RTL corpus. RAG indexed over your specs and past bug database. Hooks into your formal engine, your Jenkins/CI, and your issue tracker. We instrument everything with proof coverage, vacuity, and bounded-depth metrics from day one.
We port or write the SVA library (protocol compliance, pipeline, CDC) for your top 3 to 5 IP blocks. We run the formal regression. We triage findings with your DV lead. Your team owns every assertion by the end of the phase. No black boxes.
Your engineers run the flow for two full sprints with us watching. We document every opinionated choice we made so the next person can understand why. We exit. Optional retainer for regression tuning if you prefer.
Timelines are honest ranges, not sales numbers. A 2-stage pipeline block can be done in three weeks. A full RISC-V core with custom extensions runs closer to five months. We say so up front and we do not squeeze to hit an artificial date.
Three inputs. The calculator tells you the mask-cost exposure, the expected schedule slip, and the revenue at risk from one silicon respin at your node. The numbers come from the 2024 Wilson Research Group / Siemens study, recent SemiAnalysis mask cost data, and typical 18-month product cycles. Use it in your next tape-out readiness review. The output recommends specific actions you can take without hiring us.
Mask cost exposure
per respin, one layer set
Schedule slip
typical range
Revenue at risk
from missed market window
Recommended actions (in order)
Mask set cost ranges from SemiAnalysis and public TSMC / Samsung disclosures. First-silicon success base rate (14%) from the 2024 Wilson Research Group / Siemens Functional Verification Trend Report. Revenue impact assumes an 18-month product cycle where a 6-month slip erodes approximately 50% of lifetime revenue.
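The calculator's arithmetic is nothing more than the figures quoted above. A sketch, with the linear revenue-erosion scaling called out explicitly as an assumption (calibrated to the one stated data point: 6-month slip on an 18-month cycle erodes ~50%), not a market model.

```python
def respin_exposure(node_nm, lifetime_revenue_m, slip_months=6,
                    product_cycle_months=18):
    """Back-of-envelope exposure for one silicon respin.

    Mask costs use the range quoted on this page ($10M at 5nm, $40M at
    3nm); other nodes need a current vendor quote, so we refuse to guess.
    Revenue erosion scales linearly from the stated calibration point,
    which is an assumption for illustration.
    """
    mask_cost_m = {5: 10, 3: 40}
    if node_nm not in mask_cost_m:
        raise ValueError("no public figure for this node; get a mask quote")
    erosion = min(1.0, 0.5 * (slip_months / 6) * (18 / product_cycle_months))
    return {
        "mask_cost_usd_m": mask_cost_m[node_nm],
        "slip_months": slip_months,
        "revenue_at_risk_usd_m": round(lifetime_revenue_m * erosion, 1),
    }
```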
These are real questions from fabless and RISC-V customers. Each answer adds depth not covered in the sections above.
No. Every deployment architecture we ship runs on your hardware. Fine-tuned model weights live on your cluster. LoRA adapters with your IP-specific tuning live behind your firewall. vLLM inference runs on your GPUs. RAG indexes your spec documents from your own document store. Our engineers access the environment through your standard VPN and SSO with audit logging. For defense, aerospace, and SCIF customers we ship the entire stack on signed offline update bundles and do not require any outbound connection from the environment. The one exception is the initial base-model download, which is done on an unclassified system and then transferred in. If you need a stricter air gap than that, we have done it.
Vacuity is the failure mode we worry about most, and it is the reason every formal flow we ship runs a three-layer check. First, the formal engine's native vacuity check (JasperGold and VC Formal both have one; SymbiYosys needs a wrapper we provide). Second, a mutation-based sanity check where we inject a bug into the design and confirm the assertion fires. An assertion that passes vacuity but does not catch injected bugs is not buying you anything. Third, a COI (cone of influence) report showing exactly which signals each property reaches. If a property has an empty COI it is dead code and we delete it. These are the same metrics Siemens has been publishing about in Verification Horizons since 2017 and we treat them as table stakes.
Not directly for sign-off, and we will not pretend otherwise. ISO 26262 requires tool qualification (TCL2 or TCL3 depending on how you use the tool) with a documented qualification package. Synopsys, Cadence, and Siemens all ship qualified flows; a custom LLM-assisted tool is not on that list. What we do build for automotive customers is an AI-assist layer that runs alongside the qualified tool, not in place of it. The qualified tool still produces the sign-off evidence. Our layer accelerates assertion authoring, reviews properties for vacuity, and flags CDC paths for human inspection. The qualification chain on your signed-off tool is untouched. ASIL D customers should also plan on a documented independence review between the assist layer and the qualified verification, which we help you structure.
You might. Both are well-funded, technically credible, and have real customers. The reason teams come to us after evaluating them is usually one of three things. First, the cloud deployment model did not clear their security review (common). Second, they needed fine-tuning on a proprietary custom-extension ISA that the product team could not prioritize. Third, they wanted a custom integration into an existing Jenkins / regression / sign-off flow that the product team cannot support without a six-figure professional services engagement. If none of those apply to you, the product is probably the right answer and we will say so. If they do apply, we build the custom layer and leave you with a system your own engineers can maintain. On pilots, we recommend putting all three options on the same RTL for four weeks. The bake-off is cheap compared to a wrong bet.
We think Igor Markov's critique was technically correct on the specific numbers. Google Circuit Training at 32 hours versus tuned simulated annealing at 12.5 hours and a Cadence commercial tool at 0.05 hours is not a story of RL winning placement for mainstream SoCs. That does not mean RL is useless for silicon. It means the 2020 framing was wrong. The places where we think RL placement earns its compute today are chiplet and 3D-IC floorplanning where the design space is genuinely new, thermal-aware analog layout where existing tools are weak, and transfer learning across closely related RISC-V IP families where an agent trained on your previous generation gives you a warm-start. We do not pitch RL placement against DSO.ai or Cerebrus on a monolithic digital SoC at 5nm. That is a fight we would lose and you would pay for.
Honestly, this is the hardest problem in verification and no AI tool solves it cleanly. What we do is treat the spec as a first-class input to the verification flow. The LLM watches the spec repo (Confluence, Google Docs, Git, whichever you use) and flags properties whose underlying assumption has changed. When a reviewer marks a section of the spec as revised, the dependent properties get re-run automatically and the delta report goes to the DV lead before the next regression closes. This does not eliminate spec drift. Nothing does. It makes the drift visible in hours instead of in silicon. The single biggest win we see on this is catching "spec changed two sprints ago and nobody re-ran the affected formal properties" before it propagates through the hierarchy.
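The re-run trigger reduces to hashing spec sections and mapping each property to the section it encodes. A sketch with hypothetical section IDs and property names; the real flow also walks the design hierarchy to find transitively affected properties.

```python
import hashlib

def stale_properties(spec_sections, prop_to_section, last_seen_hash):
    """Flag properties whose underlying spec section changed since last run.

    spec_sections: dict section_id -> current text.
    prop_to_section: dict property name -> section_id it encodes.
    last_seen_hash: dict section_id -> hash recorded when the property
    set last passed. Returns the sorted list of properties to re-run.
    """
    current = {sid: hashlib.sha256(txt.encode()).hexdigest()
               for sid, txt in spec_sections.items()}
    return sorted(prop for prop, sid in prop_to_section.items()
                  if current.get(sid) != last_seen_hash.get(sid))
```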
No. JasperGold is the best commercial formal engine and we use it when the customer already owns it. What we add is the LLM-assist layer on top of it (assertion generation, counter-example interpretation, vacuity sanity checks) and a CI integration that most teams have not taken the time to build cleanly. The return on your existing JasperGold investment goes up, not down. If you do not own JasperGold and cannot justify the base + per-seat pricing, we will typically recommend a hybrid of Questa Formal (cheaper per seat) for bulk regression and SymbiYosys (open-source) for automated property debug. We have shipped this stack to RISC-V IP startups where a JasperGold purchase was not an option.
We have built useful flows for a 6-person RISC-V IP startup and for a 400-person AI accelerator company. The lower bound is the presence of at least one engineer who is comfortable reading SVA and interpreting a formal counter-example trace. If nobody on the team can read an SVA property, no LLM-assisted flow is going to close that gap, and you should hire or contract for that skill before engaging us or anyone else. Beyond that baseline, the engagement scales with how much RTL is in scope. A single bus-interface block is a six-week job. A full RISC-V core with custom extensions and an interconnect fabric is four to six months.
The interactive whitepapers that inform this page. Each is the deeper technical treatment of a single thesis, written for the DV lead who wants to see the math, the references, and the opinionated choices we made.
The "Formal Sandwich" architecture, LLM-generated SVA with SMT-based proof, counter-example guided refinement, and why neuro-symbolic beats wrapper-copilots for hardware. Our anchor paper on LLM + formal verification for fabless semiconductor design.
The honest version of the RL-for-placement story. Where reinforcement learning earns its compute (chiplets, 3D-IC, analog, transfer learning) and where simulated annealing and commercial tools still win. Includes our direct reading of the AlphaChip / Markov controversy.
On-prem LLM + formal engine integration, RISC-V assertion libraries, and vendor-neutral tool selection for fabless teams at 7nm through 2nm.
Two-week paid scoping on a block of your RTL before any larger commitment. If we do not see value, we say so and bill only the scoping phase.