Tomorrow, at the #SystemsVirologyJournalClub, @samuelhking.bsky.social will present his work with @brianhie.bsky.social using AI genome language models to generate novel, high-fitness bacteriophages.
What if we could autocomplete DNA based on function?
Today in @Nature, we share semantic design—a strategy for function-guided design with genomic language models that leverages genomic context to create de novo genes and systems with desired functions. 🧵
www.nature.com/articles/s41...
In a new preprint from @brianhie.bsky.social's lab, the team reports the first generative design of viable bacteriophage genomes.
Leveraging Evo 1 & Evo 2, they generated whole genome sequences, resulting in 16 viable phages with distinct genomic architectures.
A landmark paper from Brian Hie’s group at the Arc Institute: the de novo design of a synthetic genome for an entirely novel biological entity.
www.biorxiv.org/content/10.1...
Genome foundation models, Evo 1 and Evo 2, have now generated viable bacteriophage genomes, demonstrating experimental validation of whole genomes designed by AI!
@arcinstitute.org @brianhie.bsky.social @samuelhking.bsky.social
Read more at GEN:
www.genengnews.com/topics/artif...
Also, check out our blog post, which gives a concise overview of the technical developments required for phage genome design: arcinstitute.org/news/hie-kin... Thanks to Arc Institute, Stanford Bioengineering, and all the other amazing people who supported this work 🧬
We’re beyond excited for a new era of genome design and to see where researchers might take this. Read more in our preprint, and reach out if you have questions or thoughts! www.biorxiv.org/content/10.1...
To explore the utility of our genome design method for creating resilient phage therapies, we evolved a generated phage cocktail against three different ΦX174-resistant E. coli strains. The generated cocktail rapidly overcame resistance against all strains while ΦX174 did not.
By directly competing the phages against each other, we observed several generated phages that outcompeted ΦX174 or showed faster lytic dynamics, highlighting our method's ability to design high-fitness mutations.
The viable generated phages harbored hundreds of novel mutations, many of which do not map to any sequence seen in nature. The cryo-EM structure of one phage revealed a genome packaging mechanism designed by Evo that was previously found lethal in rational engineering attempts.
We synthesized 285 generated phage genomes and tested them in E. coli C. Of these, 16 inhibited growth of E. coli C but showed no off-target infection in E. coli strains outside of ΦX174’s natural range, demonstrating the intended host specificity.
By fine-tuning Evo 1 and Evo 2 on Microviridae sequences, we honed the models’ understanding of ΦX174-like genomes, which allowed us to generate sequences fulfilling our design criteria with a high success rate.
ΦX174 is a small Microviridae phage that infects its host E. coli C. It has a very intricate genetic architecture, making it a challenging template. We established our design criteria on ΦX174 and Microviridae sequences, including a “tropism constraint” for host specificity.
We first needed clear design criteria to guide our genome generation process. As a design template, we chose ΦX174, a classic phage in molecular biology, which was the first genome ever sequenced and synthesized.
But can DNA language models generate complete, viable genomes? To investigate this, we developed a modular framework for designing phages that target a chosen bacterium, to maximize benefit for phage-based biotechnologies and therapeutics.
DNA language models such as Evo 1 and Evo 2, trained on millions of genomes, learn complex features of genomes at an unfathomable scale. These models work much like ChatGPT, except for DNA. We’ve previously shown that they can generate novel CRISPR-Cas systems, among other things.
Designing a genome is an incredibly complex task. The overwhelming number of considerations has limited what we’ve previously been able to achieve in synthetic biology.
We chose to generate bacteriophage genomes, given their utility in biotechnology and therapeutics, and because they are safe and feasible to test in the lab. Phages are viruses that infect and kill bacteria, and are emerging as a promising strategy to combat rising antibiotic resistance.
@claudiadriscoll.bsky.social @david-li.bsky.social @danguo.bsky.social @adititm.bsky.social Garyk Brixi @maxewilkinson.bsky.social
I’ll start by recognizing that this work wouldn’t have been possible without the incredible support of my PhD advisor @brianhie, and the brilliant labmates and scientists who I had the honor of working with:
Many of the most complex and useful functions in biology emerge at the scale of whole genomes.
Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first functional AI-generated genomes 🧵
We trained a genomic language model on all observed evolution, which we are calling Evo 2.
The model achieves an unprecedented breadth in capabilities, enabling prediction and design tasks from molecular to genome scale and across all three domains of life.
Excited to have the first project of my PhD out!! By leveraging the genomic language model Evo’s ability to learn relationships across genes (i.e., "know a gene by the company it keeps"), we show that we can use prompt engineering to generate highly divergent proteins with retained functionality. 🧵1/N