Biosecurity AI Safety / Pharma & Biotech

Your generative models already know how to design nerve agents. The question is what you've done about it.

In work published in 2022, Collaborations Pharmaceuticals ran its commercial de novo drug discovery model with the reward function inverted. In under six hours it produced 40,000 candidate molecules, including analogues of VX. That was MegaSyn, a 2019-era LSTM, running on a single workstation. The models your researchers use today are considerably more capable. This page is for the pharma, biotech, and foundation-model teams building a control framework that survives contact with an auditor, an incident, and an underwriter.

60%: attack success rate of GeneBreaker jailbreaks against Evo 2-40B on human-pathogen design tasks (arXiv 2505.23839, NeurIPS 2025).

< $50: GPU cost to reverse RMU unlearning on an open-weight biology LLM using Lucki et al.'s public-document relearning recipe (ICLR 2025).

Aug 2026: EU AI Act Articles 55 and 99 full application for GPAI providers with systemic risk. Penalties up to EUR 35M or 7% of global turnover.

The threat model your compliance deck is missing

The 2022 MegaSyn paper was not a warning. It was a receipt.

Fabio Urbina and co-authors at Collaborations Pharmaceuticals did not publish their VX-analogue experiment to sell a product. They published it to document something that security people had assumed was at least five years away: a commercial drug discovery model, trained on ordinary ChEMBL data, producing thousands of candidate chemical weapons in the time it takes to run a modest training job. No custom dataset. No exotic architecture. No nation-state compute. One engineer, one afternoon, one sign flip in the reward function.

Since then, three things have changed, none of them in a direction that favors the pharma chief compliance officer (CCO) reading this page.

Change 1

Generative models got a lot better at biology.

Evo 2 (Arc Institute / NVIDIA, 2025) is a 40B-parameter foundation model trained on 9.3 trillion DNA base pairs across the tree of life. It generates novel DNA, including protein-coding sequences, when prompted with a nucleotide sequence. ESM-3, RFdiffusion, and Chroma are open-weight. A researcher with a single H100 can generate thousands of novel toxin-homolog candidates per day. The walls that previously kept this work inside a handful of well-resourced labs no longer exist.

Change 2

The safety layer everyone was counting on got audited, and it failed.

The Paraphrase Project (Microsoft Research, Twist Bioscience, IDT, published in Science, October 2025) used EvoDiff to generate thousands of synthetic ricin and other toxin homologs. Homology-based screening at every major commercial DNA synthesis vendor failed to flag the paraphrased sequences. Patches were distributed, but the lesson is not "the patch worked." The lesson is that the last line of defense everyone assumed was in place was defeated by a small research team using an open-weight model.

Change 3

The federal safety net got rolled back.

Executive Order 14110 (the Biden AI safety EO, including its biosecurity provisions on dual-use foundation models and DNA synthesis screening) was rescinded by EO 14148 in January 2025. The Genesis Mission (EO, November 2025) now directs federal AI biology research with explicit acceleration language. Three instruments now stand in place of the rescinded framework: EU AI Act Articles 55 and 99, which come into full application on August 2, 2026; ISO/IEC 42001; and the FDA's January 6, 2025 draft guidance on AI-assisted drug development. The compliance burden did not disappear. It moved.

A concrete scenario your current stack does not catch

Imagine a medicinal chemistry researcher at a mid-size biotech running REINVENT 4 against an internal target. They change the scoring component from maximize-pIC50 to minimize-pIC50 on a known enzyme inhibitor scaffold geometrically close to a V-series nerve agent in the latent space of the RNN generator. REINVENT produces 10,000 SMILES strings overnight. Chemistry42's 460+ medicinal chemistry filters run; they reject PAINS and reactive groups, but the activity-cliff substitution (fluorine for hydroxyl) that converts a therapeutic into a lethal organophosphate passes the structural-alert rules because the toxicophore pattern does not match a curated list. The researcher reviews the top 50 candidates, selects three, and orders precursors from a commercial vendor wh