MIT scientists have released a powerful open-source AI model called Boltz-1 that could significantly accelerate biomedical research and drug development.
Developed by a team of researchers in the MIT Jameel Clinic for Machine Learning in Health, Boltz-1 is the first fully open-source model to achieve state-of-the-art performance at the level of AlphaFold3, the model from Google DeepMind that predicts the 3D structures of proteins and other biological molecules.
MIT graduate students Jeremy Wohlwend and Gabriele Corso were the lead developers of Boltz-1, together with MIT Jameel Clinic research affiliate Saro Passaro and MIT professors of electrical engineering and computer science Regina Barzilay and Tommi Jaakkola. Wohlwend and Corso presented the model at an event at MIT's Stata Center on December 5, where they said their ultimate goal is to foster global collaboration, accelerate discoveries, and provide a robust platform for advancing biomolecular modeling.
“We hope this can be a starting point for the community,” Corso said. “There's a reason we call it Boltz-1 and not Boltz. This is not the end of the story. We want the community to contribute as much as possible.”
Proteins play an essential role in nearly all biological processes. A protein's shape is closely tied to its function, so understanding protein structure is critical for designing new drugs or engineering new proteins with specific functionalities. But because of the extremely complex process by which a protein's long chain of amino acids folds into a three-dimensional structure, accurately predicting that structure has been a major challenge for decades.
DeepMind's AlphaFold2, which earned Demis Hassabis and John Jumper the 2024 Nobel Prize in Chemistry, uses machine learning to rapidly predict 3D protein structures that are so accurate they are indistinguishable from those derived experimentally by scientists. This open-source model has been used by academic and commercial research teams around the world and has driven many advances in drug development.
AlphaFold3 improves on its predecessors by incorporating a generative AI model, known as a diffusion model, that can better handle the uncertainty involved in predicting extremely complex protein structures. Unlike AlphaFold2, however, AlphaFold3 is not fully open source and is not available for commercial use, which prompted criticism from the scientific community and kicked off a global race to build a commercially available version of the model.
For their work on Boltz-1, the MIT researchers initially followed the same approach as AlphaFold3, but after studying the underlying diffusion model they explored possible improvements. They incorporated those that boosted the model's accuracy the most, such as new algorithms that improve prediction efficiency.
Along with the model itself, they open-sourced their entire pipeline for training and fine-tuning so that other scientists can build on Boltz-1.
“I’m very proud of Jeremy, Gabriele, Saro, and the rest of the team at the Jameel Clinic for making this release possible. This project took many days and nights of work, with unwavering determination to reach this goal. There are many exciting ideas for further improvements, and we look forward to sharing them in the coming months,” says Barzilay.
It took the MIT team four months of work and many experiments to develop Boltz-1. One of their biggest challenges was overcoming the ambiguity and heterogeneity of the Protein Data Bank, a collection of all the biomolecular structures that thousands of biologists have solved over the past 70 years.
“I spent many nights wrestling with this data. A lot of it is pure domain knowledge that you simply have to acquire. There are no shortcuts,” says Wohlwend.
Ultimately, their experiments show that Boltz-1 attains the same level of accuracy as AlphaFold3 on a diverse set of complex biomolecular structure predictions.
“What Jeremy, Gabriele, and Saro have achieved is nothing short of remarkable. Their hard work and persistence on this project have made biomolecular structure prediction more accessible to the broader community and will revolutionize advances in molecular science,” says Jaakkola.
The researchers plan to keep improving Boltz-1's performance and to reduce the time it takes to make predictions. They also invite researchers to try Boltz-1 through its GitHub repository and to connect with other Boltz-1 users on its Slack channel.
“We think it will take many, many years to improve these models. We are very interested in collaborating with others and seeing what the community does with this tool,” adds Wohlwend.
Mathai Mammen, CEO and president of Parabilis Medicines, calls Boltz-1 a “game-changing” model. “By open-sourcing this advance, the MIT Jameel Clinic and its collaborators are democratizing access to cutting-edge tools in structural biology,” he says. “This groundbreaking effort will accelerate the development of life-changing medicines. Many thanks to the Boltz-1 team for driving this profound step forward!”
“Boltz-1 will be of tremendous benefit to my lab and the entire community,” adds Jonathan Weissman, an MIT professor of biology and a member of the Whitehead Institute for Biomedical Research, who was not involved in the study. “We will see a wave of discoveries made possible by the democratization of this powerful tool.” Weissman adds that he expects the open-source nature of Boltz-1 to lead to a wide variety of creative new applications.
This work was also supported by a US National Science Foundation Expeditions grant; the Jameel Clinic; the US Defense Threat Reduction Agency's Discovery of Medical Countermeasures Against New and Emerging Threats (DOMANE) program; and the MATCHMAKERS project, supported by the Cancer Grand Challenges partnership funded by Cancer Research UK and the US National Cancer Institute.