OR5K1
January 2024
Artificial Intelligence (AI)-based 3D protein structure prediction methods, such as AlphaFold (AF), have deeply impacted the field of structural biology. However, there is still much debate on whether AI models could be game changers for drug discovery. Many studies have been carried out in the last year to assess the reliability and applicability of AF predictions.
AF models have both strengths and weaknesses, but they have raised considerable interest in the drug discovery sector, leading to significant investments in the field. For instance, Isomorphic Labs (DeepMind's spin-off aiming to use AF to search for drugs) signed substantial deals with Novartis and Eli Lilly at the beginning of 2024.
An example of how AF models can be applied in drug discovery workflows was reported in a recent paper. The scientists developed a model refinement protocol that integrates AF-predicted models for OR5K1, a protein with scarce structural information, and obtained valuable information on the orthosteric binding site of the receptor. This protocol could also be applied to a broad range of other targets lacking structural knowledge.
OR5K1 is a class A G protein-coupled receptor (GPCR) belonging to the subfamily of odorant receptors (ORs), specialized in detecting pyrazine-based volatile compounds that determine certain aromas in food. Although ORs are the largest subfamily of class A GPCRs, they are structurally poorly characterized, with only one recently solved experimental structure from this family. In addition, ORs have low sequence similarity (less than 20%) to other structurally characterized GPCRs. This lack of structural information makes it particularly challenging to predict ligand binding for this target.
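To make the "less than 20%" figure more concrete, here is a minimal sketch of a pairwise comparison over two sequences that have already been aligned. It computes percent identity, which is a simpler and stricter proxy for similarity and is not necessarily the measure used in the paper. The function name `percent_identity`, the gap character, and the choice to divide by the number of gap-free columns are assumptions made for illustration, and the sequences in the usage line are toy strings, not real receptor sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumed
    convention; other denominators are also in common use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example (not real receptor sequences): 4 identical out of 5 compared
print(percent_identity("ACDE-F", "ACDQGF"))  # 80.0
```

With real sequences, the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.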
To overcome this issue, the researchers computationally predicted the OR5K1 3D structure with both multi-template homology modeling and AlphaFold2 (AF2), in order to compare the two resulting models and use them for further binding site studies. The OR5K1 AF2 model was obtained from the public AF database. Most regions of the model (except for the N-terminus and the extracellular loop 3, ECL3) had either a high (between 70 and 90) or very high (>90) confidence score (predicted local distance difference test, pLDDT; Image 1), thus providing a confident model for the largest part of the structure. The OR5K1 homology model (HM) was produced with a multi-template approach: because the sequence similarity between OR5K1 and other experimentally solved GPCRs was too low to build a good homology model from a single template, multiple proteins covering different portions of the sequence were used as templates.
Image 1. AF2 model of OR5K1 colored by pLDDT confidence score (on the right, legend of the pLDDT score in terms of confidence of the predicted structure). The picture of the AF model is produced with the 3decision® software.
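The confidence bands described above can be reproduced from a downloaded model file. AlphaFold database PDB files store the per-residue pLDDT in the B-factor column, so a minimal sketch could read that column for each C-alpha atom and count residues per band. The function names are hypothetical, and the labels for scores below 70 ("low" and "very low", split at 50) follow the AlphaFold database legend rather than the text above.

```python
from collections import Counter


def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from C-alpha ATOM records.

    Relies on fixed PDB columns: atom name 13-16, chain 22,
    residue number 23-26, B-factor (pLDDT in AF models) 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            scores[(chain, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT value to a confidence label."""
    if plddt > 90:
        return "very high"
    if plddt >= 70:
        return "high"
    if plddt >= 50:
        return "low"        # label assumed from the AlphaFold DB legend
    return "very low"       # label assumed from the AlphaFold DB legend


def band_counts(scores):
    """Count residues in each confidence band."""
    return Counter(confidence_band(value) for value in scores.values())
```

Calling `band_counts(plddt_per_residue(text))` on the contents of a model file gives a quick summary of how much of the structure falls into each band; listing the residues labeled "low" or "very low" would point to regions such as the termini or flexible loops.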
The AF2 model represents an inactive state of OR5K1, while the HM models an active state, so some differences in the binding site were expected. Still, the largest difference was the folding of the extracellular loop 2 (ECL2, Image 2), which is the most structurally diverse extracellular loop of GPCRs, making its modeling highly challenging. Since the first cryo-EM structure of an odorant receptor (OR51E2) with a high sequence similarity with OR5K1 was recently solved, an experimentally determined structure of this region was available for comparison with the HM and AF2 models. While the HM had a different folding of ECL2 (likely because of the low sequence similarity of the template for this region), the AF2 prediction showed a folding remarkably similar to the OR51E2 ECL2 solved by cryo-EM.
Image 2. Comparison of the OR5K1 AF2-predicted model of ECL2 (colored by confidence score pLDDT) and the experimentally determined structure of ECL2 of the receptor OR51E2 (in pink, PDB: 8F76). The superposition and the picture are produced with the 3decision® software.
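The loop comparison in Image 2 was made by superposition in 3decision®. As a rough, superposition-free alternative for putting a number on how similar two loop conformations are, one could compute a distance RMSD (dRMSD) over matched C-alpha atoms. This is a sketch of a different, simpler technique than the one used for the image. It assumes the residue correspondence between the two loops has already been established (for example, from a sequence alignment), and the function name `drmsd` is hypothetical.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two equally sized lists of (x, y, z) points.

    Compares all intra-structure pairwise distances, so no superposition
    is needed. coords_a[i] and coords_b[i] must be equivalent residues.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of points")
    if len(coords_a) < 2:
        raise ValueError("at least two points are required")

    squared_sum = 0.0
    pair_count = 0
    for i, j in combinations(range(len(coords_a)), 2):
        dist_a = math.dist(coords_a[i], coords_a[j])
        dist_b = math.dist(coords_b[i], coords_b[j])
        squared_sum += (dist_a - dist_b) ** 2
        pair_count += 1
    return math.sqrt(squared_sum / pair_count)
```

A value close to zero means the two loops have nearly the same internal geometry. Because dRMSD uses only internal distances, it cannot distinguish a structure from its mirror image, which is one reason superposition-based comparison is usually preferred for a final assessment.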
The scientists then focused on modeling the orthosteric binding site of OR5K1. Both the AF2 and HM models had binding sites that were not accessible for ligand binding and needed optimization before they could be used for docking. AF2 models have an intrinsic limitation: they are built as apo structures, so the modeling of the binding pocket is not guided by ligand information (although some studies have tried to overcome this issue, as discussed in our Discngine Labs on AlphaFold). After optimization, the AF2 and HM models identified the same key residues for binding and receptor activity, which were then experimentally confirmed by functional studies.
This work shows how using AF2 models in a computational workflow can provide valuable structural information for targets that lack structural knowledge, and how these models can be successfully used as a starting point for further model refinement. Studies of this type have been growing in number recently, providing new protocols based on AI-produced models that can bring new input to drug discovery efforts.
