Antibody developability: How to effectively navigate challenges and opportunities in predicting antibody properties?

For the fourth year in a row, Discngine organized its annual virtual gathering, the Discngine Meetup, with customers and partners from the drug discovery industry. The event was dedicated to early biotherapeutics discovery, focusing on “Innovative Strategies for Biotherapeutic Developability Assessment.”

One of the event highlights was the panel discussion “Cutting-edge technologies and tactics for advancing biotherapeutics drug discovery,” where industry and academic experts exchanged thoughts on the status and impact of new technologies that support antibody developability and speculated on future trends for the field.

The panel was chaired by Per Greisen (President & Head of Protein Design at BioMap), who led the discussion among the panelists:

  • Essam Metwally (Principal Scientist at MSD Merck)

  • Nels Thorsteinson (Director of Biologics at CCG)

  • Matthew Raybould (Postdoctoral Researcher, Oxford Protein Informatics Group, University of Oxford)

In this blog article, we will share insights from the panel including the current landscape, challenges, and transformative potential of new technologies in supporting antibody developability.

Introduction

Antibody developability properties

In the antibody discovery process, after the initial screening of candidates to select the most promising ones in terms of target affinity, the next critical step is developability assessment. This optimization process evaluates key parameters—such as immunogenicity, solubility, and stability—to ensure antibodies are suitable for large-scale production and therapeutic use. The implementation of effective developability assessments from the early stages of antibody discovery increases the likelihood of producing safe, effective, and manufacturable therapeutics for clinical application.[1]

However, producing antibodies with the necessary developability features is time-consuming and often challenging. Recent advances in artificial intelligence (AI) offer the potential to accelerate biotherapeutics’ entrance into the clinic by enabling the prediction of critical properties. Machine learning approaches can help generate and analyze large datasets to inform antibody design, evaluate candidate molecules more efficiently, and facilitate rapid, iterative optimization cycles.

 

Current Landscape and Challenges

In the field of antibody discovery, significant efforts have been made to improve developability predictions by combining various in silico methodologies.[2] AI methods excel at analyzing large-scale datasets and uncovering underlying patterns, which often gives machine learning (ML) algorithms high predictive accuracy. However, many descriptors used to predict antibody developability do not always correlate well, resulting in inaccurate predictions.[3]

While AI models may not deliver precise quantitative predictions of descriptor values, they are still highly valuable for developability assessments. In fact, AI's primary strength in biologics development lies in ranking candidates and flagging the most promising ones rather than in determining parameters with absolute accuracy. This is where these technologies are proving successful and could be further exploited in the future.

 I don’t think that AI, as it stands today, is truly generating novel ideas.
What it is doing, and it is really good at, is exploiting the underlying patterns that are inherent in the data, and predicting which is the best compound to develop next.
— Essam Metwally (Merck)
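The distinction between ranking and absolute accuracy can be made concrete with a small sketch. The example below uses invented melting temperatures (not data from the panel or any cited study) in which the hypothetical predictions are systematically too high but preserve the order of the candidates. It compares a rank-based metric (Spearman correlation) with an absolute-error metric (RMSE); all function and variable names are illustrative assumptions.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Synthetic, made-up melting temperatures (degrees C) for five candidates.
measured = [62.0, 65.5, 68.0, 71.0, 74.5]
# Hypothetical model output: biased high, but in the same order.
predicted = [70.0, 72.0, 75.5, 77.0, 83.0]

rank_agreement = spearman(measured, predicted)
absolute_error = rmse(measured, predicted)
print(f"Spearman: {rank_agreement:.2f}, RMSE: {absolute_error:.2f} degrees C")
```

In this toy case the absolute error is several degrees, yet the candidate ordering is fully preserved, which is the property that matters when the model is used to prioritize which molecules to test next.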

Beyond prediction accuracy, a significant challenge with AI methods is posed by data accessibility and quality. Ideal datasets are often siloed in the industry and not publicly available, and the scarcity of data inherently limits the application of these technologies. Even public-domain data is often dispersed across different studies or repositories and has limited standardization, which complicates curation and integration efforts. This lack of uniformity introduces variability in AI model outputs, ultimately impacting prediction reliability.

Furthermore, in the antibody developability field, far more data is available on successful antibodies than on failed ones, which leaves AI models with a partial and biased dataset.

 
There are challenges around data availability: the datasets that would be really great to have, are often the proprietary ones
— Matthew Raybould (University of Oxford)
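One common, generic way to soften the imbalance between successful and failed antibodies noted above is to weight each class inversely to its frequency during training. The sketch below is not something the panel prescribed; it uses an invented label set and the formula n_samples / (n_classes * class_count), which is the same heuristic scikit-learn applies for its "balanced" class weights.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Synthetic, made-up dataset: 90 antibodies that passed, 10 that failed.
labels = ["passed"] * 90 + ["failed"] * 10
weights = balanced_class_weights(labels)

# After weighting, both classes contribute equally to the total weight.
total_per_class = {cls: weights[cls] * labels.count(cls) for cls in weights}
print(weights)
print(total_per_class)
```

Reweighting does not create the missing information about failed molecules; it only stops the majority class from dominating the training objective.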

Opportunities and Transformative Potential

Despite these significant challenges, novel AI tools hold great promise for accelerating drug discovery and have been increasingly included in research pipelines to supplement traditional techniques.

For instance, one successful approach to pushing past the accuracy limits of antibody descriptor predictions has been to combine ML with physics-based methods such as molecular dynamics (MD). This strategy improved prediction accuracy in areas such as thermostability,[4] surpassing the results of either method alone. AI methods are, therefore, already catalyzing new ideas and accelerating timelines, enabling the generation of novel solutions that were previously difficult to obtain.
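As a rough illustration of what combining ML with physics-based descriptors can look like in practice, the sketch below concatenates sequence-derived features with MD-derived features and feeds the combined vector to a simple k-nearest-neighbour regressor. This is not the method of reference [4]: the feature names, the values, and the choice of regressor are all assumptions made for the example, and the data is invented.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, pstdev
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    seq_features: List[float]  # hypothetical: net charge, hydrophobic fraction
    md_features: List[float]   # hypothetical: mean RMSF, H-bond count from MD
    tm: Optional[float] = None  # measured melting temperature, if known


def combined(c: Candidate) -> List[float]:
    """Concatenate sequence-derived and MD-derived descriptors."""
    return c.seq_features + c.md_features


def predict_tm(train: List[Candidate], query: Candidate, k: int = 2) -> float:
    """Average the Tm of the k nearest training candidates (z-scored features)."""
    rows = [combined(c) for c in train]
    cols = list(zip(*rows))
    mus = [mean(col) for col in cols]
    sds = [pstdev(col) or 1.0 for col in cols]  # guard against zero spread

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, mus, sds)]

    q = scale(combined(query))
    dists = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(scale(r), q))), c.tm)
        for r, c in zip(rows, train)
    )
    return mean([tm for _, tm in dists[:k]])


# Synthetic, made-up training set.
train = [
    Candidate("A", [2.0, 0.30], [1.2, 40.0], 65.0),
    Candidate("B", [1.0, 0.35], [1.0, 44.0], 70.0),
    Candidate("C", [3.0, 0.40], [1.8, 35.0], 60.0),
    Candidate("D", [0.0, 0.28], [0.8, 47.0], 74.0),
]
query = Candidate("Q", [0.5, 0.30], [0.9, 45.0])

pred = predict_tm(train, query)
print(f"Predicted Tm for {query.name}: {pred:.1f}")
```

The point of the design is only that both descriptor families enter the same model; in a real pipeline the regressor, the descriptors, and the validation scheme would each need to be chosen and benchmarked against experimental data.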

The wider adoption of these tools is also restructuring the decision-making process in drug discovery, with modelers engaging in discussions with chemists and biologists much earlier in the discovery process. This early collaboration allows more extensive knowledge and data to be gathered from the beginning of a drug design project, so that teams move from ideation to implementation sooner and more efficiently.

Recent advancements in AI and experimental techniques (such as Next-Generation Sequencing and cryo-EM) are converging to tackle the challenge of predicting antibody properties, enhancing prediction accuracy and efficiency. Leveraging the synergy between AI and experimental techniques is crucial for addressing the complexity of biotherapeutics and accelerating the development of novel therapies.[5]

AI tools gave us the ability to generate new solutions that with previous methods would have been very hard to obtain, catalyzing ideation and timelines.
We know that these technologies need to converge, so you need the experiment to supplement the AI-powered theoretical studies
— Per Greisen (BioMap)
 

Future Trends

The increased availability of data (both experimental and in silico) and the refinement of AI models will likely drive antibody discovery and development in the near future.

The contribution of antibody-antigen 3D structure predictions given by AI tools such as AlphaFold3 [6] and RoseTTAFold [7] could immensely impact the quality of antibody property predictions. Especially in data-poor domains, the 3D representation offers extensive valuable information for calculating properties, facilitating the identification of trends within the dataset, and ultimately aiding in the prediction of antibody developability.

Another key area that could greatly benefit from the ability of AI algorithms to extract patterns from large amounts of data is the prediction of complementarity-determining regions (CDRs) in antibodies, particularly CDR3. CDR3 is especially challenging to predict because of its variability and complexity: this loop is often the most diverse region of the antibody and plays a critical role in antigen recognition and binding affinity. More accurate AI predictions could inform CDR3 loop design with greater confidence.

 
 


Additionally, new technologies in antibody sequence optimization are advancing to provide comprehensive information on both the variable domains and the entire binding site of antibodies.[8] Traditionally, researchers have focused more on heavy chains due to their greater sequence and structural diversity, often overlooking the role of light chains. However, recent research indicates that light chains are crucial in determining binding site specificity and diversity. A view of the complete binding site, including both heavy and light chains, will offer new insights and have a significant impact on antibody research.

Concerning data accessibility, even if pharmaceutical companies are unlikely to release their proprietary data, they could significantly help accelerate the antibody development field by sharing the methods developed using this data. This would enable smaller organizations and academic institutions to benefit from insights that otherwise stay behind corporate firewalls. However, the highest impact could come from a community benchmarking initiative, not limited to industry, that shares which methods are working and which are not, with the ultimate aim of solving developability issues.[9]

Finally, although AI models will surely have a significant impact on drug discovery, it is key that improvements in these technologies go hand in hand with advancements in experimental techniques that provide cheaper and more accessible ways of producing experimental data. For instance, significant advancements in high-resolution cryo-EM could provide information on a range of structures for each compound rather than a single conformation, offering unprecedented insights for antibody research.

We model the data because we don’t have the experimental ones.
If I had a wish to improve biologics research, it would be that we have this experimental data.
— Essam Metwally (Merck)

Conclusion

The integration of AI and machine learning into antibody property prediction is already transforming the landscape of biotherapeutic development. Despite the challenges of data accessibility and the need for improved prediction accuracy, the synergy between AI tools and experimental techniques is paving the way for more efficient and effective drug discovery processes.

As data availability continues to grow and AI models become more refined, the potential for these technologies to revolutionize antibody developability assessments is immense. However, it is crucial that advancements in AI are paralleled by progress in experimental data generation. By combining AI with robust experimental techniques, we can fully exploit the capabilities of both, leading to more accurate predictions and innovative solutions.

By fostering collaboration between industry and academia and encouraging the sharing of methodologies and insights, the field can overcome current limitations and accelerate the development of safe, effective, and manufacturable therapeutics.


Did you find our blog post insightful?

Check out the full recording of the panel discussion for even more in-depth details and engaging exchanges of ideas.


References

[1] Zhang, W. et al. Developability assessment at early-stage discovery to enable development of antibody-derived therapeutics. Antibody Therapeutics 6, 13–29 (2023). https://doi.org/10.1093/abt/tbac029

[2] Li, B. et al. PROPERMAB: an integrative framework for in silico prediction of antibody developability using machine learning. bioRxiv (2024). https://www.biorxiv.org/content/10.1101/2024.10.10.616558v2

[3] Park, E., Izadi, S. Molecular surface descriptors to predict antibody developability: sensitivity to parameters, structure models, and conformational sampling. mAbs 16, 2362788 (2024). https://doi.org/10.1080/19420862.2024.2362788

[4] Rollins, Z.A. et al. AbMelt: Learning antibody thermostability from molecular dynamics. Biophys J 123(17), 2921–2933 (2024). https://doi.org/10.1016/j.bpj.2024.06.003

[5] Wolf Pérez, A.-M., Lorenzen, N., Vendruscolo, M., Sormanni, P. Assessment of therapeutic antibody developability by combinations of in vitro and in silico methods. Methods in Molecular Biology 2313, 57–113 (2021). https://doi.org/10.1007/978-1-0716-1450-1_4

[6] Abramson, J., Adler, J., Dunger, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024). https://doi.org/10.1038/s41586-024-07487-w

[7] Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021). https://doi.org/10.1126/science.abj8754

[8] Gallo, E. The rise of big data: deep sequencing-driven computational methods are transforming the landscape of synthetic antibody design. J Biomed Sci 31, 29 (2024). https://doi.org/10.1186/s12929-024-01018-5

[9] Erasmus, M.F., Spector, L., Ferrara, F. et al. AIntibody: an experimentally validated in silico antibody discovery design challenge. Nat Biotechnol (2024). https://www.nature.com/articles/s41587-024-02469-9
