In June 2024, I joined a computational chemistry team at AZ, specifically working with the molecular glues interest team. A lot of my work was based around assessing the accuracy of molecular glue-induced protein-protein interaction predictions—which quickly proved to be a highly complex and nuanced problem. I thought that I would write a quick post highlighting the potential for computational approaches to aid in solving this problem, as well as the limitations our current tools face. This will be Part 1 of two posts covering protein-protein interaction structure prediction!
Over 80% of proteins interact with another protein as part of their main biological function. Elucidating the identity, structure, and dynamics of protein-protein interactions (PPIs) can provide not only a detailed understanding of the mechanisms behind common subcellular functions, but also insight into the signaling pathways of disease. These pathways are often addressed through structure-based drug design (SBDD), where ligands are designed and optimized with an understanding of the cavities on the protein surface. This process is frequently aided by holo structures (protein crystal structures that are bound to a ligand), which provide crucial information about the interactions (or clashes!) that can form between the ligand and the amino acid residues of the protein.
Recently, there has been increased interest in the development of small-molecule stabilizers, or “molecular glues”, as a new modality for inhibiting or degrading pathogenic proteins. Molecular glues are distinct from traditional small molecule inhibitors or degraders in that they do not directly inhibit or degrade their target through specific binding. A traditional ligand tightly binds to a specific target, while molecular glues are bifunctional compounds that can cooperatively bind two proteins simultaneously. Notably, molecular glues do not require tight binding of either protein partner, since the ternary complex (two proteins + molecular glue) is largely stabilized through protein-protein contacts. The most common way for the ternary complex to form is for a presenter protein to bind the molecular glue first, creating a “neo-protein” interface that is distinct from the protein’s unbound form and that then facilitates target protein recognition and binding. The protein-protein contacts may be weak on their own, but they are enhanced by conformational changes in the protein interface upon molecular glue binding, typically through increased shape complementarity and hydrogen bonding.
Proteolysis targeting chimeras, or PROTACs, function similarly through a proximity-induced degradation pathway, but are chemically distinct from molecular glues as their binding mechanism does not typically depend on protein-protein interaction stabilization. PROTACs can be understood as two warheads bridged by a chemical linker. Each warhead selectively binds a distinct protein target (one of which is typically an E3 ligase). The goal of a PROTAC is to bring a target protein into proximity with the E3 ubiquitination machinery, labelling it for downstream degradation. PROTACs are often large, which means that they run into permeability and bioavailability issues, and they have the same design problems as traditional small molecule ligands: the warheads have to bind selectively and tightly on the individual protein surfaces (which any medicinal chemist can attest is no simple task)! Because of their PPI-dependent binding mechanism, molecular glues dodge these issues and have become an increasingly hot topic in the pharmaceutical industry as a way of targeting the undruggable proteome.[i][ii] Note that although molecular glues can degrade target proteins by gluing them to E3 ligases, they can also inhibit protein function by gluing them to other proteins (e.g. preventing a pathogenic protein from binding its downstream targets in a signaling cascade). This makes molecular glues a versatile and highly customizable modality to work with.
It's surprisingly difficult to find good figures comparing PROTACs and molecular glues, so here’s an exceptionally pretty one from this review.
However, if an SBDD approach is to be taken to design molecular glues, it is imperative to have a structure of the PPI beforehand. And this is quite tricky. As mentioned, molecular glues often stabilize weak, pre-existing PPIs, or induce interactions between two proteins that normally have no affinity for each other. How do you crystallize two proteins that don’t (or can’t) typically interact with each other without having the glue for it first? Even proteins that have a weak affinity for each other will be difficult to crystallize, since crystallization conditions are much harsher than, and significantly different from, what the proteins experience in the cell. Again, a glue would be helpful here to crystallize these complexes! And so we find ourselves in a chicken-or-the-egg situation, where it seems that the only way to find these molecular glues is through pure serendipity (and for decades, this was the only way molecular glues were discovered).[iii]
This is where computational modeling really has its time to shine. Developing high-confidence models of PPIs, especially those that are informed by some form of experimental data, provides a strong starting point for rationally designing molecular glues because it can provide a rough idea of the pocket formed between the two proteins that the ligand should occupy. However, there are a lot of caveats to this process:
Modeling the potential interaction between two proteins without the ligand means that the conformations of the protein faces may be wrong, and this could bias your models towards certain conformations that are ultimately inaccurate. Some proteins undergo significant conformational changes upon ligand binding, meaning the interaction face of the protein can look completely different (certain residues may become more or less exposed, certain pockets or grooves may form or disappear) between its bound and unbound forms. A great example of this is the Cereblon (CRBN) E3 ligase, which undergoes significant conformational rearrangement upon allosteric binding of a molecular glue ligand.[iv] Upon molecular glue binding, CRBN goes from an “open” conformation to a “closed” conformation, adopting a new interface necessary for facilitating the PPI (see figure below). Ultimately, the complementarity between protein faces that is allosterically enabled by ligand binding is very hard to predict and this remains a major pitfall of most ligand-less modeling workflows.
Another consideration for modeling molecular glue-induced PPIs is what to do when no pocket is formed between the surfaces in the PPI model. Sometimes proteins have very high complementarity, and the models may predict a very large, tight interaction between the two proteins, with the interfaces so close that there is no cavity for a molecular glue to occupy. What if one protein interactor is predicted to perfectly fill the active site of the other? Oh where, oh where should our little glue go? In a similar vein, the model could swing the other way and predict very minor inter-residue interactions, producing a very large (and likely inaccurate) cavity between the two proteins. In this case, it would not be fruitful to try to design a giant monstrosity of a ligand to fill that void, as large ligands are generally less desirable (poor cell permeability, complex synthesis, decreased stability).
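As a rough illustration of the two failure modes above, here is a minimal sketch that counts inter-chain residue contacts in a PPI model and flags interfaces that look suspiciously sparse or suspiciously tight. Everything here is an assumption made for illustration: the function names, the dictionary-of-coordinates input, the 5 Å distance cutoff, and especially the `min_ok` / `max_ok` thresholds, which are arbitrary placeholders and not validated values.

```python
from itertools import product
from math import dist


def interface_contacts(chain_a, chain_b, cutoff=5.0):
    """Return (res_a, res_b) pairs with any two atoms within `cutoff` angstroms.

    chain_a, chain_b: dict mapping a residue label to a list of (x, y, z)
    atom coordinates. The 5 A default is an assumed, commonly used cutoff.
    """
    contacts = []
    for (res_a, atoms_a), (res_b, atoms_b) in product(chain_a.items(), chain_b.items()):
        if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            contacts.append((res_a, res_b))
    return contacts


def flag_interface(contacts, min_ok=5, max_ok=40):
    """Crude label for an interface; thresholds are hypothetical placeholders."""
    n = len(contacts)
    if n < min_ok:
        return "sparse"   # few contacts: possibly an oversized, unrealistic cavity
    if n > max_ok:
        return "tight"    # many contacts: possibly no room left for a glue
    return "plausible"


# Toy coordinates (not from any real structure), one atom per residue.
chain_a = {"A:LEU10": [(0.0, 0.0, 0.0)], "A:GLU11": [(3.8, 0.0, 0.0)]}
chain_b = {"B:TRP50": [(0.0, 4.0, 0.0)], "B:LYS51": [(20.0, 0.0, 0.0)]}

contacts = interface_contacts(chain_a, chain_b)
print(contacts, flag_interface(contacts))
```

A contact count is only a first-pass sanity check. It says nothing about whether a druggable cavity actually exists at the interface, which would need a proper pocket-detection tool.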
And the final pitfall has to do with crystal structures (or increasingly, AlphaFold models) themselves. A crystal structure is only a single frame of the movie that is a protein wiggling and jiggling. Visualizing these movements is possible through molecular dynamics simulations, but these simulations are computationally expensive and need to be run for a very long time to see major conformational changes in protein structure. Crystallizing proteins isn’t easy: sometimes multiple mutations and truncations have to be made to the native protein sequence in order to get the protein to behave under crystallization conditions (which just means be stable and fold well). You can imagine how this can greatly skew the quality and scope of our PPI models if a potential interface is located on the truncated section of our protein. There are other hesitations to be had regarding crystal structures: this review does an excellent job of describing them.[v]
AlphaFold models, which may be used when there are no experimentally derived structures for a certain protein, can use the full native protein sequence (unlike many crystal structures), but they have their own unique problems. The most notable issue is the inaccurate structural prediction of intrinsically disordered regions or flexible loop regions (see figure below). The unstructured regions of proteins are usually truncated in order to ease the crystallization process, meaning only the well-folded regions of the protein are crystallized. However, flexible loop regions are often major hubs for protein-protein interaction and common sites for post-translational modifications, making them instrumental for modulating protein activity and function. Molecular glues can stabilize PPIs with flexible loop regions, ultimately enabling their partial crystallization and structural elucidation. But screening for new interactors containing disordered regions poses the same challenges I addressed earlier: flexible regions can adopt many different conformations, but some of these conformations are energetically unfavorable without a facilitating ligand. Chicken, egg, etc.
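One practical way to spot these unreliable regions before building a PPI model is to look at AlphaFold's per-residue confidence score (pLDDT), which AlphaFold writes into the B-factor column of its PDB-format output. The sketch below is a minimal, dependency-free example that assumes standard fixed-column PDB `ATOM` records. The function name, the choice to read only Cα atoms, and the default threshold of 50 (the usual "very low confidence" band) are illustrative choices rather than a prescribed workflow.

```python
def low_confidence_residues(pdb_lines, threshold=50.0):
    """Return (chain_id, residue_number) for residues whose pLDDT < threshold.

    Assumes AlphaFold-style PDB text, where the B-factor field
    (columns 61-66) holds the per-residue pLDDT. Only CA atoms are read,
    giving one value per residue.
    """
    flagged = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            plddt = float(line[60:66])          # B-factor column = pLDDT
            if plddt < threshold:
                chain_id = line[21]             # column 22
                res_num = int(line[22:26])      # columns 23-26
                flagged.append((chain_id, res_num))
    return flagged


# Usage (hypothetical file name):
# with open("my_alphafold_model.pdb") as handle:
#     print(low_confidence_residues(handle))
```

Low pLDDT often coincides with disordered or flexible regions, but it does not prove disorder; it only marks residues whose predicted coordinates should not be trusted when defining an interface.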
The caveats I’ve just described may paint a pretty grim picture of computational modeling, but there’s a lot of work being done to address all of these problems right now. With the advent of highly accurate, single-chain ML-based structure prediction (see Nobel Prize in Chemistry, 2024), the focus has rapidly shifted to using ML to predict PPIs, biomolecular binding, and larger systems like antibody/antigen complexes. In my next post, we’ll take a look at some ML-based tools that address the issues surrounding conformational sampling, ligand inclusion, and the use of experimental distance restraints. And, more importantly, we’ll discuss the current state of affairs in the molecular glue chicken-egg fiasco. On a personal note, I’ve absolutely loved working on these problems, and their solutions definitely take a certain type of creativity. I really hope that I can continue to work on similar problems in the future.
AS 01/01/25
[i] Wu, Hongyu, et al. "Molecular glues modulate protein functions by inducing protein aggregation: A promising therapeutic strategy of small molecules for disease treatment." Acta Pharmaceutica Sinica B 12.9 (2022): 3548-3566.
[ii] Garber, Ken. "The glue degraders." Nature Biotechnology (2024).
[iii] Schreiber, Stuart L. "The rise of molecular glues." Cell 184.1 (2021): 3-9.
[iv] Watson, Edmond R., et al. "Molecular glue CELMoD compounds are regulators of cereblon conformation." Science 378.6619 (2022): 549-553.
[v] Shoemaker, Susannah C., and Nozomi Ando. "X-rays in the cryo-electron microscopy era: structural biology’s dynamic future." Biochemistry 57.3 (2018): 277-285.