The Problem
In December 2025, PJM Interconnection — the largest grid operator in the United States, serving 65 million people across 13 states — announced something that had never happened before. Its capacity auction came up 6,623 megawatts short of what the grid needs to keep the lights on in 2027/2028. That gap is roughly the output of six large power plants, simply missing from the system.
The auction hit the federal price cap of $333.44 per megawatt-day across the entire region. The total cost: $16.4 billion. Yet even at that ceiling price, there was not enough power to meet reliability targets. The reserve margin — your safety buffer against blackouts — dropped to 14.8%, well below the 20% standard designed to prevent large-scale outages.
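To put the cleared price in budget terms, here is a minimal sketch of the arithmetic. The 10 MW capacity obligation is a hypothetical figure chosen for illustration, and the function name is ours. Actual retail bills depend on how your supplier and tariff pass capacity charges through, so treat this as an order-of-magnitude check.

```python
def annual_capacity_cost(obligation_mw: float, price_per_mw_day: float, days: int = 365) -> float:
    """Capacity charge in dollars for one delivery year at a flat $/MW-day price."""
    return obligation_mw * price_per_mw_day * days


CAP_PRICE = 333.44  # $/MW-day, the cleared price cited above

# Hypothetical facility with a 10 MW capacity obligation (an assumption, not from the source).
cost = annual_capacity_cost(10, CAP_PRICE)
print(f"${cost:,.0f} per year")  # prints "$1,217,056 per year"
```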
Meanwhile, in Texas, the ERCOT grid faces a different but related crisis. Its queue of large customers waiting to connect — mostly data centers — has ballooned to 233 gigawatts. That is nearly three times the entire peak demand of the Texas grid. But ERCOT only added 23 gigawatts of new generation in 2025. The math simply does not work.
If your organization operates in PJM territory or depends on Texas power, these numbers should keep you up at night. The era of cheap, abundant electricity is ending.
Why This Matters to Your Business
This is not an abstract infrastructure problem. It hits your balance sheet directly.
A new analysis found that data center growth in the PJM region could drive $163 billion in cumulative capacity costs from 2028 through 2033. In the ComEd territory of Northern Illinois alone, that translates to $21.4 billion. Residential customers there face a projected $70-per-month increase. If you run industrial operations or large facilities in these zones, your exposure is proportionally larger.
Here is what your leadership team needs to understand:
- Your energy costs are rising structurally. The PJM auction cleared at the FERC price cap, which means the cap masked the true scarcity price. If regulators raise or remove it, prices are likely to climb again.
- Your reliability risk is growing. A 14.8% reserve margin means elevated risk of load-shedding — controlled blackouts — starting June 2027. If your operations cannot tolerate interruption, you need contingency plans now.
- Your expansion plans may stall. In ERCOT, 77% of large-load interconnection requests come from data centers. If you are planning a new facility in Texas, you are competing with 233 GW of requests on a grid that added only 23 GW of new generation in 2025. Roughly 35% of proposed gas plant projects have already withdrawn, citing turbine shortages and permitting delays.
- Regulatory pressure is intensifying. Texas passed Senate Bill 6, which mandates standardized interconnection rules and allows regulators to curtail data center power during emergencies. FERC Order 2023 requires new cluster study processes. Your compliance obligations are expanding.
The bottom line: your energy budget, your operational continuity, and your growth timeline are all at risk.
What's Actually Happening Under the Hood
The core problem is a retirement cliff. Between 2011 and 2023, PJM lost 54.2 gigawatts of thermal power plants — coal and gas facilities that run on demand, regardless of weather. Another 24 to 58 gigawatts could retire by 2030. That is up to 30% of installed capacity.
Here is why replacements are not keeping pace. Think of it like replacing a reliable car with bicycles. You would need a lot more bicycles to carry the same load. PJM's own analysis shows that replacing 1 megawatt of retiring thermal generation requires about 5.2 megawatts of solar or 14 megawatts of wind to deliver the same reliability. Solar and wind only produce power when the sun shines or the wind blows. A gas plant runs whenever you need it.
PJM adjusted its calculations to reflect this reality. It lowered the reliability credit — called Effective Load Carrying Capability, or ELCC — for intermittent resources like solar and wind. As more of these resources connect, each additional unit contributes less to keeping the grid stable during peak demand.
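The replacement ratios above can be turned into a quick planning calculation. The sketch below uses only the 5.2 and 14 figures from the text. The "implied credit" is simply the reciprocal of each ratio, measured relative to a thermal unit. It is our illustration and not PJM's published ELCC value for any resource class.

```python
# MW of nameplate needed per 1 MW of retiring thermal capacity, as cited in the text.
NAMEPLATE_PER_THERMAL_MW = {"solar": 5.2, "wind": 14.0}


def nameplate_needed(retiring_thermal_mw: float, resource: str) -> float:
    """Nameplate MW of `resource` that delivers the same reliability as the retiring thermal MW."""
    return retiring_thermal_mw * NAMEPLATE_PER_THERMAL_MW[resource]


def implied_relative_credit(resource: str) -> float:
    """Reliability credit per nameplate MW, relative to thermal (reciprocal of the ratio)."""
    return 1.0 / NAMEPLATE_PER_THERMAL_MW[resource]


# Hypothetical 1,000 MW coal plant retirement.
for resource in ("solar", "wind"):
    mw = nameplate_needed(1000, resource)
    credit = implied_relative_credit(resource)
    print(f"{resource}: {mw:,.0f} MW nameplate, implied credit {credit:.1%}")
```

Because the credit for each additional unit falls as more intermittent capacity connects, a fixed ratio like this will understate the nameplate required in later years.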
At the same time, demand is surging in specific locations. PJM reported a 5,250-megawatt increase in its demand forecast, driven almost entirely by data centers. In some zones, summer peak growth forecasts reach 6.4% annually. This concentrated load creates localized stress points where substations and transmission lines face overload risk.
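A 6.4% annual growth rate compounds quickly. The sketch below shows the arithmetic for a hypothetical zone with a 1,000 MW summer peak. The starting value and function names are assumptions for illustration, and the calculation assumes the growth rate holds constant, which real forecasts do not.

```python
import math


def peak_after(years: int, base_mw: float, annual_growth: float = 0.064) -> float:
    """Peak demand after `years` of constant compound growth."""
    return base_mw * (1 + annual_growth) ** years


def years_to_double(annual_growth: float = 0.064) -> float:
    """Years for peak demand to double at a constant growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


# Hypothetical zone starting at a 1,000 MW summer peak.
print(f"peak after 5 years: {peak_after(5, 1000):,.0f} MW")
print(f"years to double: {years_to_double():.1f}")
```

At this rate, a zone's peak doubles in a little over eleven years, which is shorter than the lead time for many transmission projects.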
The grid cannot physically grow fast enough, so it must become smarter. That is where AI enters, but not all AI is equal to this task.
What Works (And What Doesn't)
Let us start with what falls short when you apply AI to a problem this complex.
Generic chatbots and LLM wrappers. Bolting a large language model onto your operations dashboard gives you a conversational interface, not grid intelligence. It does not understand the physics of electricity flow.
Basic regression forecasting. Simple statistical models that predict load based on historical averages miss the non-linear dynamics of a grid where 54 gigawatts of dispatchable power disappeared in twelve years.
One-size-fits-all predictive analytics. Standard machine learning treats grid data as a flat spreadsheet. It ignores the physical structure — which substations connect to which, and how a failure at one node cascades through the network.
What actually works is AI that respects the physics and structure of the grid. Here is how it functions in practice:
Input: Real-time physical data, not just historical trends. Dynamic Line Rating (DLR) systems use IoT sensors and weather data to measure the actual capacity of transmission lines in real time. Traditional ratings assume worst-case conditions and leave 20-40% of capacity on the table. In Indiana and Ohio, AES deployed DLR on 345 kV lines and increased transfer capacity by 61% — at a cost of $0.39 million versus $1.63 million for traditional upgrades. That is a 76% cost reduction.
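To see why real-time weather unlocks capacity, consider a simplified steady-state heat balance: the current a line can carry is limited by how fast the conductor sheds heat. The sketch below is a toy model under stated assumptions. The convection coefficient, conductor dimensions, and weather readings are hypothetical, and production systems rely on the detailed correlations in standards such as IEEE 738 in place of this shortcut. The last lines check the cost figures quoted above.

```python
import math

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def line_rating_amps(ambient_c, wind_m_s, sun_w_m2,
                     max_conductor_c=75.0, diameter_m=0.028,
                     resistance_ohm_per_m=8.0e-5, emissivity=0.8, absorptivity=0.8):
    """Ampacity from a simplified heat balance: I^2 * R = q_conv + q_rad - q_sun (per metre)."""
    delta_t = max_conductor_c - ambient_c
    # Hypothetical convection coefficient in W/(m K); not a standard correlation.
    q_conv = (0.6 + 1.0 * math.sqrt(wind_m_s)) * delta_t
    q_rad = emissivity * SIGMA * math.pi * diameter_m * (
        (max_conductor_c + 273.15) ** 4 - (ambient_c + 273.15) ** 4)
    q_sun = absorptivity * diameter_m * sun_w_m2
    net_cooling = max(q_conv + q_rad - q_sun, 0.0)
    return math.sqrt(net_cooling / resistance_ohm_per_m)


# Static rating: conservative worst-case weather (hot, nearly still, full sun).
static = line_rating_amps(ambient_c=40.0, wind_m_s=0.6, sun_w_m2=1000.0)
# Dynamic rating: a hypothetical sensor reading on a cooler, breezier day.
dynamic = line_rating_amps(ambient_c=20.0, wind_m_s=3.0, sun_w_m2=600.0)
print(f"static {static:.0f} A, dynamic {dynamic:.0f} A, gain {dynamic / static - 1:.0%}")

# Cost comparison from the text: $0.39M for DLR versus $1.63M for a traditional upgrade.
cost_reduction = 1 - 0.39 / 1.63
print(f"cost reduction {cost_reduction:.0%}")  # prints "cost reduction 76%"
```

The gain in this toy model depends entirely on the assumed weather. The point is the mechanism: cooler air and more wind raise the safe current without any change to the hardware.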
Processing: Physics-aware AI models. Physics-Informed Neural Networks — or PINNs — embed the actual laws of electrical systems into the AI model itself. Instead of learning patterns from data alone, they enforce the equations that govern how generators, lines, and loads interact. These models solve stability analyses up to 87 times faster than conventional methods. Graph Neural Networks (GNNs) treat the grid as what it actually is — a network of connected nodes — and predict how failures propagate across the topology. GNN-based models have achieved an F1 score of 0.8935 for identifying substations at risk of failure within 30 days.
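The idea of embedding physical laws in the model can be sketched as a loss function with two parts: one penalizes disagreement with measurements, the other penalizes violation of the governing equation. The example below uses the textbook generator swing equation as the physics, which is our choice of illustration and is not named in the source. A real PINN would differentiate a neural network's output with automatic differentiation. Here finite differences on a sampled trajectory stand in for that step, and all machine parameters are hypothetical per-unit values.

```python
import math


def swing_residuals(delta, dt, inertia, damping, p_mech, p_max):
    """Residual of inertia*d2(delta)/dt2 + damping*d(delta)/dt = p_mech - p_max*sin(delta)."""
    residuals = []
    for k in range(1, len(delta) - 1):
        accel = (delta[k + 1] - 2 * delta[k] + delta[k - 1]) / dt ** 2
        speed = (delta[k + 1] - delta[k - 1]) / (2 * dt)
        residuals.append(inertia * accel + damping * speed
                         - (p_mech - p_max * math.sin(delta[k])))
    return residuals


def pinn_style_loss(predicted, measured, dt, physics_weight=1.0, **machine):
    """Data misfit at measured time steps plus mean squared physics residual."""
    data_loss = sum((predicted[i] - v) ** 2 for i, v in measured.items()) / len(measured)
    res = swing_residuals(predicted, dt, **machine)
    physics_loss = sum(r * r for r in res) / len(res)
    return data_loss + physics_weight * physics_loss


MACHINE = dict(inertia=0.1, damping=0.05, p_mech=0.8, p_max=1.6)  # hypothetical values
equilibrium = math.asin(MACHINE["p_mech"] / MACHINE["p_max"])
measured = {0: equilibrium, 20: equilibrium}  # two sparse sensor readings (time index -> angle)

steady = [equilibrium] * 21  # obeys the physics
# Matches both sensor readings but bulges away from equilibrium in between.
wobbly = [equilibrium + 0.3 * math.sin(math.pi * k / 20) for k in range(21)]

loss_steady = pinn_style_loss(steady, measured, dt=0.05, **MACHINE)
loss_wobbly = pinn_style_loss(wobbly, measured, dt=0.05, **MACHINE)
print(f"steady loss {loss_steady:.2e}, wobbly loss {loss_wobbly:.2e}")
```

Both candidate trajectories fit the two sensor readings equally well. Only the physics term separates them, which is what lets a physics-informed model stay plausible where data is sparse.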
Output: Explainable decisions with audit trails. When the AI identifies a cascading failure risk, it generates a graph-based explanation showing exactly which lines and substations are driving the risk. Your operators see the reasoning before they act. Your compliance team gets a documented logic trail. For energy and utilities organizations, this transparency is not optional — it is what regulators and boards require.
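A graph-based explanation can be as simple as recording how much each neighboring substation contributed to a node's risk score. The sketch below runs one round of neighbor aggregation, the basic operation inside a graph neural network, and keeps the per-neighbor contributions as an audit trail. The topology, local risk values, and fixed weights are all hypothetical. A trained GNN would learn its weights from data and stack several such rounds.

```python
# Hypothetical toy topology: substation -> directly connected substations.
GRID = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
# Hypothetical local risk estimates (for example, from equipment condition data).
LOCAL_RISK = {"A": 0.05, "B": 0.10, "C": 0.70, "D": 0.20, "E": 0.05}


def propagate(grid, risk, self_weight=0.6):
    """One aggregation round. Returns (scores, explanations) keyed by substation."""
    scores, explanations = {}, {}
    for node, neighbors in grid.items():
        share = (1 - self_weight) / len(neighbors)
        contributions = {n: share * risk[n] for n in neighbors}
        scores[node] = self_weight * risk[node] + sum(contributions.values())
        # Largest contributor first, so operators see the main driver at the top.
        explanations[node] = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return scores, explanations


scores, explanations = propagate(GRID, LOCAL_RISK)
for node in sorted(scores, key=scores.get, reverse=True):
    driver, amount = explanations[node][0]
    print(f"{node}: score {scores[node]:.2f}, top neighbor driver {driver} (+{amount:.2f})")
```

In this toy grid, substation A has low local risk but inherits exposure from its neighbor C, and the explanation list says so explicitly. That per-edge record is the kind of logic trail an operator or auditor can check.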
This approach extends beyond grid operations to the interconnection bottleneck. AI agents can automatically screen the 233 GW ERCOT queue, assign a likelihood-of-completion score to each project, and flag speculative "phantom" loads — requests from companies that filed at multiple sites just to hedge their options. This moves the queue from first-come-first-served to first-ready-first-served.
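The screening step can be sketched as a scoring function plus a duplicate-filing check. Everything in the example below is an assumption for illustration: the readiness fields, the hand-set weights, and the threshold for flagging a developer with many filings. A production system would learn the score from historical completion data instead of using fixed weights.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Request:
    project_id: str
    developer: str
    mw: float
    site_control: bool
    financing_secured: bool
    equipment_ordered: bool


def completion_score(r: Request) -> float:
    """Hand-weighted readiness score between 0 and 1 (weights are hypothetical)."""
    return 0.4 * r.site_control + 0.35 * r.financing_secured + 0.25 * r.equipment_ordered


def triage(queue, max_sites_per_developer=2):
    """Rank by readiness and flag developers with more filings than the threshold."""
    filings = Counter(r.developer for r in queue)
    ranked = sorted(queue, key=completion_score, reverse=True)
    return [(r.project_id, round(completion_score(r), 2),
             filings[r.developer] > max_sites_per_developer) for r in ranked]


queue = [
    Request("P1", "AlphaDC", 500, True, True, True),
    Request("P2", "HedgeCo", 300, False, False, False),
    Request("P3", "HedgeCo", 300, False, False, False),
    Request("P4", "HedgeCo", 300, True, False, False),
    Request("P5", "BetaSteel", 120, True, True, False),
]
for project_id, score, possible_phantom in triage(queue):
    print(project_id, score, "possible phantom" if possible_phantom else "")
```

Sorting by the score instead of by filing date is what turns first-come-first-served into first-ready-first-served. The flag is a prompt for human review, not an automatic rejection.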
You can also apply these tools to predict which power plants will retire next. Models using stacked LSTM and gradient boosting have achieved a mean absolute percentage error of just 1.072% in forecasting plant retirement timing. That gives your planning team years of lead time instead of months.
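When a vendor quotes a mean absolute percentage error, it helps to know how the number is computed. The sketch below implements the standard formula on made-up values. The source does not say which quantity was scored, so the "age at retirement" figures here are purely hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical example: plant age at retirement, in years.
actual_age = [40, 50]
predicted_age = [41, 49]
print(f"MAPE: {mape(actual_age, predicted_age):.2f}%")  # prints "MAPE: 2.25%"
```

MAPE depends heavily on the scale of the target. A one-year miss scored against a calendar year such as 2030 yields a far smaller percentage than the same miss scored against remaining years of life, so ask which target the figure refers to.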
The key architectural principle behind all of this is to build simulation, digital twin, and optimization systems that mirror your physical infrastructure in software, test scenarios at machine speed, and feed results into decision layers that humans can verify and trust. Combined with explainability and decision transparency, you get AI that your board, your regulators, and your operators can all stand behind.
Key Takeaways
- PJM's first-ever capacity shortfall — 6,623 MW — signals that America's largest grid cannot keep up with demand, putting reliability at risk by June 2027.
- Data center growth in PJM territory could drive $163 billion in cumulative capacity costs through 2033, directly raising electricity prices for all customers.
- Dynamic Line Rating technology increased transmission capacity by 61% on existing lines at 76% lower cost than traditional upgrades — no new construction needed.
- Physics-aware AI models solve grid stability analyses up to 87 times faster than conventional methods, and graph-based models can flag substations at risk of failure within 30 days.
- ERCOT's 233 GW interconnection queue can be triaged by AI that scores project feasibility and filters out speculative requests, moving the queue from first-come-first-served to first-ready-first-served.
The Bottom Line
The PJM shortfall and ERCOT queue crisis are not temporary disruptions — they are structural shifts that will raise your energy costs and threaten your operational continuity for years. AI can help, but only if it understands the physics of the grid, not just patterns in spreadsheets. Ask your AI vendor: when your model recommends a dispatch decision during a grid emergency, can it show my operators and regulators exactly which physical constraints it evaluated and why?