Bio-IT World Keynote Highlights Collaborative Intelligence in AI-Driven Drug Discovery

June 2, 2026 - 21:10

BOSTON - A critical part of the conversation around the use of artificial intelligence (AI) in drug discovery focuses on the development of the foundation models that underpin AI-based applications and workflows. There are also discussions about federated learning and how it provides a secure path to accessing critical training data for AI models. These two themes underpinned the keynote panel that kicked off the second day of Bio-IT World Conference 2026, which took place last month in Boston.

Through presentations and a group discussion, the six-person panel painted a picture of the different types of foundation models and federated learning approaches, as well as ways to optimize AI’s performance for specific projects. Importantly, they discussed the AI Structural Biology (AISB) initiative, which provides a platform for pooling proprietary protein-ligand structure data to train OpenFold3, an AI model designed to precisely predict molecular interactions. In fact, several members of the panel were either directly or indirectly involved in the OpenFold consortium.   

Woody Sherman, PhD, founder and chief innovation officer, PsiThera [Uduak Thomas]

That group included Woody Sherman, PhD, founder and chief innovation officer at PsiThera, who serves as chair of the OpenFold executive committee. Sherman reiterated the benefits of open-source platforms and how they are making inroads into the drug discovery space.   

“We’re going to need these open platforms that we can all build on,” he said. “We can’t all be building our own foundation models from scratch. It just doesn’t make sense as an ecosystem. It is important to have these open platforms so that we can interact precompetitively, build the best foundation models,” and then “we can get into federated learning.”

From AlphaFold to OpenFold 

The non-profit OpenFold Consortium consists of scientists from over 40 technology companies, startups, pharma companies, and academic institutions. It builds on much of the progress made in the 2010s in predicting protein structures from sequences, “a foundational problem in biochemistry.” That progress was quantified at least in part by efforts like the Critical Assessment of Structure Prediction (CASP) competitions, said Mohammed AlQuraishi, PhD, an assistant professor of systems biology at Columbia University, during his presentation.

Mohammed AlQuraishi, PhD, assistant professor, systems biology, Columbia University [Uduak Thomas]

AlphaFold and later iterations of the platform “compressed decades of progress in about four years,” he said. Besides reliably predicting protein structures, AlphaFold provided “calibrated predictions” that gave biologists a sense of the accuracy of its predictions. But there were limitations.   

“It did not really have an understanding of anything other than just protein structure,” meaning it missed things like ions that were also part of the structure, he said. It also struggled to handle things like protein complexes, ligands, and cofactors. Another challenge was that although the computational models could predict targets that were closer to the training dataset used, their ability to make viable predictions dropped the further away the targets were from the training dataset. Additionally, “these models have limited ability to capture conformational changes,” AlQuraishi noted. “This becomes a major bottleneck in being able to reliably model allosteric modulators or cryptic pockets or similar types of systems.” Besides the technical limitations, there were also licensing limitations to consider.  

AlQuraishi positioned OpenFold as an open-source, high-performance, and reproducible alternative to AlphaFold that serves the community as a common platform for innovation. “Partly it’s a code base, essentially a set of tools that allow [scientists] to build these types of models and extend them and apply them,” AlQuraishi explained. “It’s also an academic-industry consortium that provides a steerable mechanism for industry to support science that is open source and that’s broadly useful, but it’s also in tune with the needs of industry.”

Federated learning and foundation models 

The next set of presentations made the argument for using federated learning to leverage proprietary biopharma datasets to train AI models. The presentation from Jonathan Gilbert, PhD, senior director, ecosystem growth and contributor partnerships at Eli Lilly, offered an example of how the pharma company has used federated learning to improve model predictions in different contexts.  

“It’s not surprising that companies are very sensitive to the proprietary data that they’ve spent incredible investments generating,” he said. With federated learning, models are trained in the environment where the data is housed, making it possible to “improve model performance while maintaining the privacy of the individual training sets.”   
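To make the idea concrete, the following is a minimal sketch of federated averaging, one common way of combining locally trained models. It is not a description of how Lilly, Apheris, or any other panelist's organization implements federated learning; the three sites, their toy datasets, the tiny linear model, and every function name here are hypothetical and chosen only for illustration. The point it shows is that each site trains on its own data and sends back only model parameters, never the data itself.

```python
from typing import List, Tuple

Params = Tuple[float, float]         # (slope, intercept) of the toy model y = w*x + b
Dataset = List[Tuple[float, float]]  # private (x, y) pairs that never leave a site


def local_update(params: Params, data: Dataset,
                 lr: float = 0.05, steps: int = 20) -> Params:
    """Run a few gradient-descent steps on one site's private data."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error on this site's data only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Params], sizes: List[int]) -> Params:
    """Combine site parameters, weighting each site by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b


def train(sites: List[Dataset], rounds: int = 50) -> Params:
    """Coordinator loop: only parameters travel between coordinator and sites."""
    params: Params = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(params, data) for data in sites]
        params = federated_average(updates, [len(d) for d in sites])
    return params


# Three hypothetical sites, each holding private samples of y = 2x + 1.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
w, b = train(sites)
print(f"shared model: y = {w:.3f}x + {b:.3f}")
```

In this toy setup the shared model should settle close to a slope of 2 and an intercept of 1, even though no site ever sees another site's samples. Real deployments add layers this sketch omits, such as secure aggregation, access controls, and harmonized data preparation.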

Eli Lilly launched the TuneLab platform in 2025, through which it provides access to its own AI and machine learning models to biotech companies at no cost, although those that choose to use the models are expected to contribute datasets to help improve them. “These are the same models that we use every day,” he said. “These models have been trained on decades of internal data sets. That’s maybe over a billion dollars in data that have been brought into models by Lilly.”

Jonathan Gilbert, PhD, senior director, Ecosystem Growth and Contributor Partnerships, Eli Lilly and Company. [Uduak Thomas]

Gilbert noted that since its launch, the appetite for TuneLab has been quite strong. At the time of the presentation, there were more than 75 partners in TuneLab, and it was being used in dozens of countries across three continents. Furthermore, during the meeting, Eli Lilly and Collaborative Drug Discovery (CDD), a provider of data management solutions for pharma and biotech, announced an agreement to integrate TuneLab into both the core and AI modules within the CDD Vault platform.  

For now, TuneLab is focused on models for small molecules and antibody development, but there are plans to release additional models in the near future. Lilly is also working on additional partnerships similar to the one with Collaborative Drug Discovery. “This is an active work in progress and [we are] thinking [about] how we can scale this,” Gilbert said. And how can “[we] build a community to improve those models such that we can create medicines faster for more people.” 

The presentation from José-Tomás Prieto, PhD, director of AI programs at Apheris, built on Gilbert’s presentation but focused on the complexities of implementing industrial federated learning setups. The key takeaway from his talk was that successfully implementing federated learning at an industrial scale is not a plug-and-play capability but rather a process that requires engineering rigor, data preparation without centralization, and enterprise-level deployment strategies. His company, Apheris, has experience with this process, as it provides solutions that power federated networks for drug discovery.

José-Tomás Prieto, PhD, director of AI programs, Apheris [Uduak Thomas]

One of the networks that they support is the AI Structural Biology Network, a collaboration that brings together several of the top 20 biopharma companies. Its intent is to allow AI models designed to predict the 3D structure of molecule complexes to be trained on proprietary protein structure data. The common denominator for these and other networks that Apheris supports is that the models are trained on proprietary data in a secure way, so the data never leaves the environments of any of the nodes in the network.  

“It’s obvious that there’s a lot of public data, but the public data skews toward well-characterized targets,” Prieto said. “The industry data complements that view, with more diverse data and sometimes higher quality data. And if there is something to learn about the AI world today is that you cannot necessarily model your way out of a data problem, and you can’t buy this data either.” Federated learning provides a solution to that problem. “It’s quite remarkable that a couple of years ago … it was mostly IT people making the decision of whether to use federated learning products,” he noted. “Today, we have business leaders trying to get ahold of this technology and leverage the power.” 

Prieto also discussed some important considerations for building federated networks. To provide a sense of the complexity involved, “each one of these companies have their own network constraints, their own firewall rules, their own compute window that they have to negotiate with the cloud providers to make sure that the compute comes online at the right time.”  

It is also important to consider “that data preparation without centralization is a new paradigm,” he continued. “Each company has [their] own ways of organizing the data or harmonizing your data,” as well as their own standards, “but at the same time you have to have comparable training setups so that the foundation models can actually learn from this.” Furthermore, “your federated learning partner has to be able to work with your processes, has to be able to understand how to streamline the reviews, the security [and] the privacy requirements,” among other things, before projects can move forward.

A key point that both Prieto and Arman Zaribafiyan, PhD, head of strategic alliances, AI simulation at SandboxAQ, emphasized was that while federated models provide broad generalizations, fine-tuning them on specific, project-level data is crucial for translating model performance into practical impact for drug programs.  

SandboxAQ has worked with the OpenFold consortium on its models and co-folding models, among other projects. “We are really living in exciting times when it comes to ML-accelerated drug discovery,” Zaribafiyan said. “There’s really an explosion of new models we see every day. And what we see at Sandbox with our partners in large pharma and biotech companies is that it’s getting a little bit overwhelming and harder to put these models into good use.” Furthermore, “a lot of these models are amazing in achieving great results on benchmarks, and it’s fascinating for publishing papers, but they fail to generalize to real drug discovery use cases.”

Arman Zaribafiyan, PhD, head of strategic alliances, AI simulation, SandboxAQ [Uduak Thomas]

Commenting on some of the lessons SandboxAQ has learned through its partnerships, Zaribafiyan noted that “fine-tuning could help a lot to bridge this gap.” SandboxAQ and others have published data showing that “even a small fine-tuning effort can dramatically change the predictive accuracy of these models.” He believes such platforms will become the norm over the next few years: “We’re going to see more and more of these federated platforms we use for both pooling data but also for fine-tuning these models on project-specific data.”

Zaribafiyan closed his presentation by announcing a new platform from SandboxAQ that connects quantitative models for drug discovery to large language models, allowing scientists to launch and run simulations and workflows using plain English, much like prompts written for ChatGPT. “No code required,” he said.

Foundational models at work in crop science and drug development 

Christina Taylor, PhD, senior science fellow and computational molecular design lead at Bayer, focused on how her company has leveraged foundational models and AI to drive decisions in crop science and pharma.

Christina Taylor, PhD, senior science fellow and computational molecular design lead, Bayer [Uduak Thomas]

“I think that this community-driven software has really allowed faster innovation in the field overall,” Taylor said, adding that “sharing some of these foundational architectures allows everyone to be able to drive biomolecular AI work” and “has driven some of the very quick advancements we’ve seen in the field over the past few years.” Community projects like this also save time and are more sustainable since “everybody doesn’t need to be training their own foundational models.”

To date, these models have helped Taylor and her team better solve crystal structures. “One of the big problems with solving crystal structures is actually determining the phase, and by doing protein, we’re able to actually solve these structures faster and more efficiently,” she said. “Another thing is taking these foundational models and fine-tuning them … we’re using that quite regularly to improve our development of biomolecular pharmaceuticals as well as some of our crop science traits.” Taylor and her team have also used the models to study protein-protein interactions and to model enzyme catalysis.

The post Bio-IT World Keynote Highlights Collaborative Intelligence in AI-Driven Drug Discovery appeared first on GEN - Genetic Engineering and Biotechnology News.
