Synthetic Biology

The design-build-test cycle is broken. Here is where it breaks.

January 16, 2025


The design-build-test (DBT) cycle is the conceptual framework underlying almost all protein engineering work, and it's taught as a clean loop: design your variants, build them in the lab, test for activity, use the data to inform the next design round. In practice, the cycle is not clean and it is not a loop. It is a multi-month process with compounding delays, and the distribution of where that time actually goes is very different from where most teams intuitively believe it goes.

We've had detailed conversations with the people running protein engineering campaigns at growing synthetic biology labs and biomanufacturers — the ones doing the scheduling and the ones doing the bench work. When you ask them where the time goes, the immediate answer is almost always "build" — synthesis turnaround, expression failures, purification yields. That's the step that feels most constrained because it's the most visible and the most lab-resource-intensive. But when you map calendar time to each phase of a real campaign, the data looks different.

Where calendar time actually goes: a breakdown

A typical three-round protein engineering campaign targeting a novel oxidoreductase looks roughly like this in calendar time:

Round 1 design phase: 3–5 weeks. Structural analysis of reference enzyme, homology modeling, literature review on known active site residues, manual identification of mutagenesis targets, sequence alignment to find variation-tolerant positions.
Round 1 build phase: 4–6 weeks. Gene synthesis order, transformation and clone verification, expression culture, purification, QC.
Round 1 test phase: 1–2 weeks. Activity assay, thermostability screen, hit confirmation.
Round 2 design: 2–4 weeks. Data interpretation, decision about which variants to iterate on, design of second-generation mutations. Often requires additional structural modeling.
Round 2 build + test: 5–8 weeks.
Round 3 (if required) design + build + test: 10–14 weeks additional.

Total for a three-round campaign: 25–39 weeks from first design work to validated hit. That's roughly 6–9 months. And this is a successful campaign that reaches a hit by round 3. Failed campaigns that require reformulation (new scaffold, new expression host, revised activity target) add another round or partially reset, and it is not unusual for the calendar time to stretch to 12–18 months.
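The totals above can be checked by summing the low and high ends of each phase. Here is a minimal sketch that does so, using only the week ranges from the breakdown. The conversion of 52/12 weeks per month is our assumption for the months figure, and the function names are illustrative.

```python
# Phase durations in weeks as (label, low, high), copied from the breakdown above.
PHASES = [
    ("Round 1 design", 3, 5),
    ("Round 1 build", 4, 6),
    ("Round 1 test", 1, 2),
    ("Round 2 design", 2, 4),
    ("Round 2 build + test", 5, 8),
    ("Round 3 design + build + test", 10, 14),
]

WEEKS_PER_MONTH = 52 / 12  # assumption: average calendar month, about 4.33 weeks


def total_weeks(phases):
    """Return (low, high) total weeks across the given phases."""
    return sum(p[1] for p in phases), sum(p[2] for p in phases)


def design_only(phases):
    """Keep only phases itemized purely as design (label ends with 'design')."""
    return [p for p in phases if p[0].endswith("design")]


low, high = total_weeks(PHASES)
d_low, d_high = total_weeks(design_only(PHASES))

print(f"Campaign total: {low}-{high} weeks "
      f"({low / WEEKS_PER_MONTH:.1f}-{high / WEEKS_PER_MONTH:.1f} months)")
print(f"Itemized design time (rounds 1 and 2): {d_low}-{d_high} weeks")
```

Round 3's design time is bundled with build and test in the breakdown, so the design-only figure here is a lower bound on how much of the campaign is analyst time.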

The design phase is the bottleneck nobody measures

Here's the thing most teams don't fully account for: the 3–5 weeks in round 1 design is almost entirely manual analyst time. No synthesis vendor turnaround, no cell culture incubation time, no instrument queue — just a protein engineer sitting with PyMOL and homology models and literature, making decisions. That time is invisible in lab scheduling systems because it doesn't consume shared lab resources.

But it's not actually invisible to the campaign schedule. When round 1 design takes 4 weeks instead of 1.5 weeks, that directly shifts the entire campaign by 2.5 weeks. When the first design round produces a 4% hit rate (4 actives from 96 synthesized variants) because the structural reasoning was wrong about which residues were variation-tolerant, you're launching round 2 from a weaker information base, and round 2 design takes longer because you're now trying to work out why 92 variants failed while 4 worked.

The build phase has been compressed dramatically by commercial gene synthesis: Twist, IDT, and equivalent vendors have brought synthesis turnaround to 10–14 days for most simple gene orders. Expression and purification are genuinely hard to compress further without significant capital investment in automation or cell-free systems. The test phase is typically the shortest leg. The design phase is the one that has received the least systematic engineering attention, and it's the one where the compounding effect of a better process is largest.

Why rational design fails more than it should

Rational design — choosing specific residues to mutate based on structural analysis and mechanistic reasoning — works well when the protein structure is well-resolved, the mechanism is well-understood, and the target change (e.g., substrate scope expansion by 1–2 extra carbon atoms) is structurally local. It fails at a predictable rate when any of those conditions breaks down.

The most common failure mode we see: a team identifies 8–10 "promising" positions in the active site based on structural proximity to the substrate binding pose, designs single and double mutants at those positions, orders 64–96 variants, and returns 2–5 actives — all from positions that weren't in the original list but were identified as incidental mutations in surviving colonies. The structural proximity heuristic missed residues in the second coordination shell that gate substrate access through conformational dynamics, not through direct contact.

This failure mode is not a team competency problem — it's a structural information problem. Static crystal structures or homology models don't capture the ensemble of conformations the enzyme samples during catalysis. Identifying second-shell residues from a static structure requires additional analysis — molecular dynamics, normal mode analysis, evolutionary covariation — that adds another 2–4 weeks to the design phase and still doesn't solve the problem completely.
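The structural proximity heuristic described above amounts to a distance cutoff around the substrate. The sketch below shows why it drops second-shell residues from the candidate list. Everything in it is an illustrative assumption: the residue names, the coordinates, and the 5 Å and 10 Å cutoffs are made up for the example and do not come from a real structure or from any particular tool.

```python
import math

# Hypothetical substrate centroid and residue centroids, in angstroms.
# These values are invented for illustration only.
SUBSTRATE = (0.0, 0.0, 0.0)
RESIDUES = {
    "H57": (3.0, 1.0, 0.5),
    "D102": (4.0, -2.0, 1.0),
    "F215": (6.5, 3.0, -2.0),
    "L180": (8.0, -4.0, 2.0),
    "K310": (15.0, 9.0, 4.0),
}

FIRST_SHELL_CUTOFF = 5.0    # assumed cutoff for direct-contact residues
SECOND_SHELL_CUTOFF = 10.0  # assumed outer bound for the second shell


def classify_shells(residues, substrate,
                    first=FIRST_SHELL_CUTOFF, second=SECOND_SHELL_CUTOFF):
    """Bucket residues by straight-line distance from the substrate centroid."""
    shells = {"first": [], "second": [], "distal": []}
    for name, xyz in residues.items():
        distance = math.dist(xyz, substrate)
        if distance <= first:
            shells["first"].append(name)
        elif distance <= second:
            shells["second"].append(name)
        else:
            shells["distal"].append(name)
    return shells


shells = classify_shells(RESIDUES, SUBSTRATE)
print("Proximity heuristic would target:", shells["first"])
print("Second shell, left out of the list:", shells["second"])
```

Widening the cutoff would pull the second shell into the list, but distance alone says nothing about which of those residues gate substrate access through dynamics. That is the information a static structure doesn't carry.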

Directed evolution would catch these positions, but directed evolution at scale (generating libraries of 10^6–10^8 variants and selecting under high throughput) is not accessible to most labs building a metabolic pathway for a specific compound target. The scale required for directed evolution to reliably surface multi-residue epistatic effects is beyond what teams running a focused three-enzyme pathway project can practically execute.

What changes when design is faster and higher-quality

When we look at the campaigns where Fermvyne-generated designs compressed the most calendar time, the compression isn't primarily in the build phase. It's in the number of rounds. Teams that historically ran 3-round campaigns to reach a validated hit with specific activity and Tm targets are running 1–2 rounds. The mechanism is straightforward: a design step that filters for activity, solubility, and thermostability simultaneously before synthesis means round 1 has a higher hit rate. A higher hit rate in round 1 means either you're done in one round, or you're launching round 2 with more confident information about what works, which makes round 2 design faster.
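To make "filters simultaneously" concrete, here is a minimal sketch of a pre-synthesis filter that only passes candidates clearing all three criteria at once. It is not a description of how Fermvyne scores sequences. The `Candidate` fields, the score scales, the thresholds, and the example variants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    activity: float    # hypothetical predicted activity score, 0 to 1
    solubility: float  # hypothetical predicted solubility score, 0 to 1
    tm_celsius: float  # hypothetical predicted melting temperature


def passes_all(c, min_activity=0.6, min_solubility=0.5, min_tm=55.0):
    """True only if the candidate clears every threshold (assumed values)."""
    return (c.activity >= min_activity
            and c.solubility >= min_solubility
            and c.tm_celsius >= min_tm)


def select_for_synthesis(candidates, budget):
    """Keep candidates passing all filters, ranked by predicted activity."""
    passing = [c for c in candidates if passes_all(c)]
    passing.sort(key=lambda c: c.activity, reverse=True)
    return passing[:budget]


candidates = [
    Candidate("v1", activity=0.90, solubility=0.30, tm_celsius=60.0),  # fails solubility
    Candidate("v2", activity=0.70, solubility=0.80, tm_celsius=58.0),
    Candidate("v3", activity=0.80, solubility=0.60, tm_celsius=50.0),  # fails Tm
    Candidate("v4", activity=0.65, solubility=0.70, tm_celsius=62.0),
    Candidate("v5", activity=0.40, solubility=0.90, tm_celsius=70.0),  # fails activity
]

order = select_for_synthesis(candidates, budget=2)
print([c.name for c in order])
```

Ranking on activity alone would put v1 and v3 at the top of the order, and both would likely fail at expression or in the thermostability screen. Filtering on all three first is what keeps those slots from being spent.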

We're not saying computational design eliminates the need for wet-lab iteration — it does not. The build and test phases still happen, and they still take time. What changes is the expected number of rounds, and that changes the calendar time math considerably. Going from an expected 2.7 rounds (typical for a moderately novel substrate scope challenge) to 1.6 rounds saves not just the synthesis time of one round, but also the design time, the expression time, the purification time, and — less quantifiably — the organizational friction of restarting a campaign after a disappointing screen.
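As a rough illustration of that calendar math, the sketch below multiplies expected rounds by a per-round duration. The per-round range of 7–12 weeks is an assumption: it reuses the Round 2 figures from the breakdown earlier (2–4 weeks design plus 5–8 weeks build and test) and treats every round as costing the same, which real campaigns won't match exactly.

```python
# Assumed per-round cost in weeks, borrowed from the Round 2 figures above:
# design (2 to 4 weeks) plus build and test (5 to 8 weeks).
ROUND_WEEKS_LOW = 7.0
ROUND_WEEKS_HIGH = 12.0


def expected_weeks(expected_rounds, weeks_per_round):
    """Expected calendar weeks under a simple linear model (an assumption)."""
    return expected_rounds * weeks_per_round


def weeks_saved(rounds_before, rounds_after, weeks_per_round):
    """Expected weeks saved when the expected round count drops."""
    return (expected_weeks(rounds_before, weeks_per_round)
            - expected_weeks(rounds_after, weeks_per_round))


saved_low = weeks_saved(2.7, 1.6, ROUND_WEEKS_LOW)
saved_high = weeks_saved(2.7, 1.6, ROUND_WEEKS_HIGH)
print(f"Expected saving: {saved_low:.1f} to {saved_high:.1f} weeks per campaign")
```

The linear model leaves out the organizational friction mentioned above, so if anything it understates the difference.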

The parts of the cycle that are genuinely hard to compress

It's worth being honest about what doesn't improve even with better design. Expression and purification remain fundamentally biology-constrained. Some enzyme classes express poorly in standard E. coli BL21 hosts regardless of sequence quality — they require specialized chaperone co-expression, alternative hosts like Bacillus subtilis or Pichia pastoris, or cell-free systems. Getting soluble, active protein from a well-designed sequence still takes 3–5 weeks for standard workflows. Nobody is getting that to 3 days without either a major automation investment or a cell-free expression platform that has its own limitations.

Activity assay development is also not a bottleneck we can address from the design side. If your target reaction doesn't have a colorimetric or fluorescent assay format and requires HPLC quantification of product, your testing throughput is low regardless of how good your synthesis order is.

The design step is where the leverage is concentrated because it determines how many rounds of the expensive biology you need to run — and because it's the step that has historically received the least automated support in a typical lab workflow. That's the gap Fermvyne exists to close.