A relatively new startup called EvolutionaryScale has secured a massive tranche of cash to build AI models to generate novel proteins for scientific research.
EvolutionaryScale today announced that it raised $142 million in a seed round led by ex-GitHub CEO Nat Friedman, Daniel Gross and Lux Capital, with participation from Amazon and NVentures, Nvidia’s corporate venture arm. The company also released ESM3, an AI model it describes as a “frontier model” for biology, one that can create proteins for use cases like drug discovery and materials science.
“ESM3 takes a step toward a future of biology where AI is a tool to engineer from first principles, the way we engineer structures, machines, and microchips and write computer programs,” EvolutionaryScale co-founder and chief scientist Alexander Rives said in a statement.
Rives, along with Tom Sercu and Sal Candido, began developing generative AI models to explore proteins while at Meta’s AI research lab, FAIR, in 2019. After their team was disbanded, Rives, Sercu and Candido left Meta to continue building on the work they’d started.
Characterizing proteins can reveal the mechanisms of a disease, including ways to slow or reverse it, while creating proteins can lead to entirely new classes of drugs, tools and therapeutics. But the current process for designing proteins in the lab is costly in both computational and human resources.
Designing a protein entails coming up with a structure that could plausibly perform a task inside the body or in a product, then finding a protein sequence (the sequence of amino acids that make up a protein) likely to “fold” into that structure. Proteins must fold correctly into three-dimensional shapes in order to carry out their intended function.
Trained on a data set of 2.78 billion proteins, ESM3 can “reason over” the sequence, structure and function of proteins, Rives says, enabling the model to generate new proteins à la Google DeepMind’s AlphaFold. EvolutionaryScale is making the full 98-billion-parameter model available for non-commercial use through its Forge cloud developer platform and releasing a smaller version of the model for offline use.
EvolutionaryScale claims that it used ESM3 to generate a new variant of green fluorescent protein (GFP), the protein responsible for the glowing of jellyfish and luminescent colors in coral. A preprint paper on the company’s website details its work.
“We’ve been working on this for a long time, and we’re excited to share it with the scientific community and see what they do with it,” Rives continued.
EvolutionaryScale isn’t a charity, of course. The company, which has roughly 20 employees, tells TechCrunch that it plans to make money through a combination of partnerships, usage fees and revenue sharing. EvolutionaryScale might work with pharmaceutical companies to integrate ESM3 into their workflows, for example, or share revenue with researchers on breakthrough discoveries commercialized using ESM3.
To this end, EvolutionaryScale says that it’ll soon bring ESM3 and its derivatives to select AWS customers via AWS’ SageMaker AI dev platform, Bedrock AI platform and HealthOmics service. ESM3 will also be available to select customers using Nvidia’s NIM microservices, supported with an Nvidia enterprise software license.
EvolutionaryScale says that both AWS and Nvidia customers will be able to fine-tune ESM3 using their own data if they wish.
It could be a while before EvolutionaryScale turns a profit. In the company’s pitch deck, a copy of which Forbes managed to obtain last August, EvolutionaryScale repeatedly emphasized that it could take a decade for generative AI models to help design therapies. The firm will also have to fend off competitors like DeepMind spinoff Isomorphic Labs, which already has contracts with big pharma companies, as well as Insitro, publicly traded Recursion and Inceptive.
EvolutionaryScale’s big bet is scaling up its model training to incorporate data beyond proteins and create a general-purpose AI model for biotech applications.
“The incredible pace of new AI advances is being driven by increasingly large models, increasingly large data sets and increasing computational power,” an EvolutionaryScale spokesperson said. “The same holds true in biology. In research over the last five years, the ESM team has explored scaling in biology. We find that as language models scale, they develop an understanding of the underlying principles of biology, and discover biological structure and function.”
Sounds wildly ambitious to this reporter — but having deep-pocketed investors surely helps.