🧬🦠 Nucleic Acid Synthesis: Understanding the starting point

I still occasionally run into very informed people in bio who are surprised to hear about the horsepox synthesis story.

Some context

  • Poxviruses are the elephants among human pathogenic viruses, with genome size on the order of 100 kb (compare with SARS-CoV-2 or polio)
  • Yes, you can just synthesize horsepox DNA from scratch, divided into 4 parts (we can’t yet reliably do 100 kb chunks), and stir it into culture; the DNA uses the cell machinery for transcription and translation, and horsepox magically assembles from the constituent proteins.
  • This was a couple of years ago, and synthesis is getting better by the year, especially for long chunks of the genome.
  • Horsepox is the same order of complexity as smallpox. Physical samples of smallpox are contained in two places in the world (one of which is, not very reassuringly, a virology institute in Siberia that formerly worked on weaponizing both smallpox and filoviruses), but the full genome is publicly known and available to anyone.
  • Multiple labs pulled off successful de novo synthesis of SARS-CoV-2 during Covid.

⚜️⚜️⚜️

Below are some questions I asked GPT-4o (search enabled). Please let me know if any of the below seems wrong.

Which labs were able to produce analogs of SARS-CoV-2 using nucleic acid synthesis alone?

Several labs have synthesized SARS-CoV-2 or close analogs using DNA synthesis alone (i.e., without using physical viral samples). Key examples:

  • Christian Drosten’s lab (Charité, Berlin) – Early 2020. Developed a reverse genetics system based on synthetic DNA ordered from GenScript. Enabled full recovery of infectious SARS-CoV-2 from cloned DNA fragments.
  • Stanley Perlman & Volker Thiel (University of Bern) – Built synthetic SARS-CoV-2 using a bacterial artificial chromosome (BAC) system. Assembly relied entirely on synthetic cDNA fragments and in vitro transcription.
  • University of North Carolina (Ralph Baric’s lab) – Used a synthetic reverse genetics system previously developed for SARS-CoV-1 and MERS-CoV to generate SARS-CoV-2 analogs for functional studies and vaccine testing.
  • Synthetic Genomics (Venter Institute spinoff) – Collaborated with Baric’s group; capable of full-length coronavirus genome synthesis and recovery of live virus.

By early 2020, several groups had shown that SARS-CoV-2 could be reconstructed entirely from sequence data using commercial DNA synthesis, assembly in yeast or bacteria, and in vitro RNA transcription.

How many labs with such capabilities are there in the world? Order of magnitude estimate is sufficient.

Order-of-magnitude estimate: ~100–300 labs worldwide have the capability to synthetically reconstruct SARS-CoV-2 or similar RNA viruses from sequence alone.

Breakdown:

  • ~10–30 top-tier BSL-3/BSL-4 virology labs with proven experience in synthetic reverse genetics for coronaviruses or similar.
  • ~50–100 molecular biology/synthetic biology labs with access to: commercial gene synthesis, expertise in large-genome assembly (yeast, BAC, etc.), and high-containment facilities (or collaborators who have them).
  • ~50–100 biotech companies and defense-related institutions with advanced synthetic biology infrastructure.
  • The bottleneck isn’t gene synthesis or assembly (both widely available), but safe recovery of infectious virus, which requires both BSL-3 containment and virological expertise.
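A quick sanity check on the arithmetic: the sketch below just adds up the three category ranges quoted in the breakdown and compares the result with the headline figure. The category labels are shortened and the helper name `sum_ranges` is mine, not part of the GPT-4o answer.

```python
# Ranges (low, high) copied from the breakdown above; labels are shortened.
CATEGORIES = {
    "top-tier BSL-3/BSL-4 virology labs": (10, 30),
    "molecular/synthetic biology labs": (50, 100),
    "biotech companies and defense-related institutions": (50, 100),
}


def sum_ranges(ranges):
    """Add (low, high) ranges end to end: all lows together, all highs together."""
    ranges = list(ranges)
    return sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges)


total = sum_ranges(CATEGORIES.values())
print(total)  # (110, 230)
```

The categories sum to roughly 110–230, which sits inside the quoted ~100–300 headline, so the answer is at least internally consistent at the order-of-magnitude level. It says nothing about whether the individual ranges are right.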

Is it fair to say that live coronavirus is not needed and the only thing necessary to produce live coronavirus is the DNA fragments and appropriate culture?

Yes—that’s fair. To produce live SARS-CoV-2, you do not need an existing viral sample. What’s required:

  • DNA fragments encoding the viral genome, typically synthesized commercially.
  • Assembly system (e.g. yeast, BAC, or in vitro ligation) to construct the full-length cDNA.
  • In vitro transcription of the cDNA to generate infectious viral RNA.
  • Appropriate cell culture (e.g. Vero E6) to launch replication from the RNA.
  • BSL-3 containment for handling the resulting live virus.

This route—synthetic reverse genetics—was used for SARS-CoV-2 in early 2020, before many labs had physical virus. Genome data alone suffices if you have the tools and know-how.

How much would it cost to set up a new lab with such capabilities?

Fermi estimate: €1M–€5M to set up a lab capable of synthesizing and recovering live SARS-CoV-2.

Breakdown:

  • BSL-3 lab buildout: €1M–€3M depending on local regulations, HVAC, decontamination systems, and engineering controls.
  • Core equipment:
    • Cell culture: incubators, biosafety cabinets, centrifuges — ~€100k–€300k.
    • Molecular biology: thermocyclers, gel systems, nanodrop/qPCR — ~€50k–€150k.
    • In vitro transcription setup + RNA handling — ~€50k.
    • Optional: electroporators, transfection systems — ~€20k–€100k.
  • Synthetic DNA costs: Full SARS-CoV-2 genome in fragments — ~€5k–€25k depending on provider and complexity.
  • Personnel and training: Salaries for 3–6 staff (virologists, molecular biologists, safety officers) — ~€200k–€600k annually.
  • Compliance and certification: Varies by country but often adds €100k+ in consulting, legal, and procedural costs.

This assumes a fresh build. Retrofitting or using shared infrastructure could bring it under €1M.
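The same kind of check works for the cost figures. This sketch adds up the line items quoted above, under two assumptions of mine: the open-ended "€100k+" compliance figure is counted as €100k at both ends, and the optional equipment is included. Names such as `ONE_OFF` and `add_ranges` are illustrative only.

```python
# One-off cost ranges in euros (low, high), copied from the breakdown above.
# Assumption: "€100k+" for compliance is open-ended; 100k is used at both ends.
ONE_OFF = {
    "BSL-3 buildout": (1_000_000, 3_000_000),
    "cell culture equipment": (100_000, 300_000),
    "molecular biology equipment": (50_000, 150_000),
    "in vitro transcription setup": (50_000, 50_000),
    "optional electroporation/transfection": (20_000, 100_000),
    "synthetic DNA": (5_000, 25_000),
    "compliance and certification": (100_000, 100_000),
}
PERSONNEL_PER_YEAR = (200_000, 600_000)


def add_ranges(ranges):
    """Add (low, high) ranges end to end: all lows together, all highs together."""
    ranges = list(ranges)
    return sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges)


setup = add_ranges(ONE_OFF.values())
first_year = add_ranges([setup, PERSONNEL_PER_YEAR])
print(setup)       # (1325000, 3725000)
print(first_year)  # (1525000, 4325000)
```

Under those assumptions the items sum to about €1.3M–€3.7M one-off, or €1.5M–€4.3M including the first year of salaries, which falls within the quoted €1M–€5M. Again, this only checks that the answer adds up, not that the individual figures are accurate.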

What security checks did the 4 labs mentioned in the first response have to pass to synthesize the genomes?

The labs mentioned (Drosten, Thiel, Baric, Synthetic Genomics) used commercial DNA synthesis providers, which are subject to gene synthesis screening protocols, primarily:

  • IGSC Screening Framework (International Gene Synthesis Consortium): Screens all orders against databases of select agents and pathogens (e.g., U.S. Select Agents List, EU control lists). Orders flagged if sequences resemble dangerous pathogens or toxins. Requires verification of the customer’s identity and legitimacy of research.
  • Provider-specific checks: GenScript, IDT, Twist Bioscience, etc., perform internal risk assessments and may require proof of institutional biosafety approval before shipping full-length or high-risk gene fragments.
  • Institutional Biosafety Committee (IBC) approval: Labs must present the project plan, facilities (e.g., BSL-3), and personnel training for internal review. Approval is usually needed before synthesis orders are placed or virus recovery begins.
  • National or regional regulations: E.g., U.S. requires compliance with NIH Guidelines and possibly CDC/DoD oversight. Germany, Switzerland, and other countries have pathogen-specific handling regulations and import/export controls.

In practice, recovery of full viral genome is typically split across multiple fragments to stay below red-flag thresholds, then assembled in-house. Security checks are strongest at the synthesis and institutional levels, not centralized globally.

What was the reasoning behind approving the synthesis at all four labs?

The reasoning behind approving synthesis at all four steps—commercial synthesis, institutional biosafety, national regulation, and internal risk assessment—relies on scientific necessity, precedent, containment, and intent:

  • Scientific necessity:
    • Early in the pandemic, live virus wasn’t widely available. Synthesis was the only way for many labs to begin developing diagnostics, studying viral replication, or testing vaccines/antivirals.
    • Reverse genetics systems are essential tools in virology; SARS-CoV-1 and MERS-CoV had already been synthetically reconstructed in high-containment labs.
  • Precedent and legitimacy:
    • All four labs had a long record of responsible work with high-risk viruses, published methods, and clear public health goals (not dual-use or gain-of-function with nefarious intent).
    • Their institutions (e.g., UNC, Charité) had existing approvals for similar work.
  • Containment and oversight:
    • Synthesis was permitted only with BSL-3 containment and biosafety protocols in place.
    • Work had to be reviewed and approved by local biosafety committees (e.g., IBCs) and often required reporting to national authorities.
  • Intent and transparency:
    • Projects were publicly disclosed in peer-reviewed publications and framed as responses to the emerging threat.
    • Fragmented synthesis (non-infectious parts) ensured no individual order posed a risk; full virus recovery occurred only in approved, high-security environments.

In short: synthesis was allowed because it was necessary, controlled, and performed by trusted actors with proven capabilities and transparent goals.
