I wanted to highlight some small innovations in mass-tagging for proteomics, both as a writing exercise for myself and to share some new tools that you may not have seen before. To be clear, these are far from being the *most* important or the *most* impactful innovations in proteomics; I am choosing them because they made me stop and go "Oh, huh, that's pretty interesting." And to rein in the scope of this post, I'll be focusing on tandem mass tag (TMT) labelling advancements, as TMT is one of the most popular commercial tools available to proteomics scientists and the one I happen to have worked with most extensively. This post will assume a basic understanding of mass spectrometry and the fundamentals of proteomics. If you’re new to tandem mass tags or isobaric tags generally, I recommend reading this recent review for a more thorough introduction to the topic.
Introduction
Last year I was introduced to the awesome world of targeted proteomics and I grew to appreciate the quantitative power of parallel reaction monitoring (PRM) and its ability to zero in on targets (especially useful for monitoring protein turnover or post-translational modifications on specific proteins). For many proteomics subfields, including targeted proteomics, technical advancements in instrumentation, labelling modalities, and data analysis software are a constant effort by academia and industry alike. The ultimate goal is to make highly sensitive experiments cleaner and higher throughput, and to make highly complex data analysis faster and more accurate. In this post we'll be looking at the former: making experiments better, faster, stronger...
...and there are really three major ways of doing this:
1. Creating new covalent labels or mass tags
2. Improving sample preparation workflows (decreasing sample loss)
3. Increasing multiplexing capabilities (decreasing instrument acquisition time)
Background
I’ll go over how TMT works at a very high level and introduce the ratio distortion problem (an issue that makes it difficult to achieve quantitative accuracy with TMT). Tandem mass tags are isobaric tags, meaning all tags (in a set) have the same mass going into the mass spectrometer. The tags are then differentiated inside the instrument, where they experience dissociation forces (MS2) that cause them to break up into smaller fragments. These fragments have distinctly different masses. In the diagram below we can see a few examples of TMT tags. They all have the same three components:
amine reactive handle (purple) — this part of the tag will react with the N-terminus and lysine residues (K) of our peptides (digested protein). This part of the tag remains attached to the peptide after fragmentation.
mass balancer (green) — this part of the tag incorporates heavy C and N atoms in order to balance out the total mass of the molecule, ensuring the tag is isobaric. This part of the tag can also remain attached to the peptide after fragmentation.
reporter ion (red) — this part of the tag falls off during fragmentation. The total mass of the reporter ion is low, and so it is easy to differentiate those mass peaks at the lower end of our mass spectra. Each reporter ion is unique in mass due to the number of heavy C or N atoms it incorporates.
Multiplexing works like this: say you have 6 samples or even 6 biological replicates. Rather than analyzing each sample individually, we can tag each sample with a specific TMT label (so Sample #1 would get the TMT-126 label, Sample #2 the TMT-127N label, Sample #3 the TMT-128N label, and so on). We can then pool all 6 samples into 1 vial and inject that as a single experiment. Instead of 6 experiments, now we have just 1! Magic? Not quite. Since the tags are isobaric (all the same mass) they will show up as one peak in our initial MS1 scan. We select that peak for MS2 fragmentation, which is when the reporter ions will fall off. We can then quantify the reporter ion peaks in our MS2 scan. Theoretically, if the sample is efficiently labelled, the reporter ion should give us a 1:1 readout of peptide abundance in a particular sample. So, if Sample #1 has twice as much of a certain peptide as Sample #2, we would see a reporter ion peak at 126 m/z that is twice as big as the reporter ion peak at 127 m/z.
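As a minimal sketch of that readout, assuming made-up reporter intensities and nominal 6-plex channel names (none of these numbers come from a real experiment), each sample's relative abundance is simply its channel's share of the summed reporter ion signal:

```python
def relative_abundance(reporter_intensities):
    """Return each channel's fraction of the total reporter ion signal."""
    total = sum(reporter_intensities.values())
    if total == 0:
        raise ValueError("no reporter ion signal in this spectrum")
    return {channel: value / total for channel, value in reporter_intensities.items()}


# Hypothetical MS2 reporter ion intensities for a single peptide (arbitrary units).
spectrum = {
    "126": 2000.0,
    "127": 1000.0,
    "128": 1000.0,
    "129": 1000.0,
    "130": 1000.0,
    "131": 2000.0,
}

fractions = relative_abundance(spectrum)

# Sample #1 (channel 126) versus Sample #2 (channel 127).
ratio_1_vs_2 = fractions["126"] / fractions["127"]
print(f"Sample #1 / Sample #2 = {ratio_1_vs_2:.1f}")
```

Real pipelines also correct for reagent isotopic impurities and normalize for loading differences between channels, both of which this sketch leaves out.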
However, this method isn’t foolproof. Sometimes, when we select the initial TMT-labeled mass peak in our MS1 scan, that selection window isn’t as tight as we would like it (and this is simply a limitation of instrument resolution). A wider window can let other peptides that are super similar in mass (and are untagged) get selected for MS2 fragmentation, which means those peptides get blown apart and they will also generate small fragments that are similar in mass to our reporter ion. At this stage, we can’t differentiate between the garbage fragment ion and the reporter ion if they are both 126 m/z, and so the reporter ion peak is “distorted” and no longer represents an accurate 1:1 quantification of our peptide-of-interest. The figure below illustrates this effect.
The best way to avoid reporter ion ratio distortion is to use MS3. The extra selection of the MS3 scan means that we can filter out the garbage fragment ions in MS2, effectively eliminating “false” reporter ions (more on that here). Not all instruments have MS3 capabilities and, depending on the type of work you are doing, you may not even need it. But if you are trying to aggressively limit false positives and to have the most accurate data possible (such as in clinical applications), MS3 should always be the way to go.
The reality is that TMT reporter ion distortion is significantly more complex than what I’ve briefly outlined here. If you’re interested in reading more about those complexities, I recommend checking out these two posts from Phil Wilmarth’s blog.
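To put rough numbers on ratio distortion, here is a toy model. It assumes the interfering ions add the same amount of signal to both reporter channels, which is a simplification I am making for illustration, and the intensities are invented:

```python
def observed_ratio(true_a, true_b, interference):
    """Ratio of two reporter channels after the same background signal is added to both."""
    return (true_a + interference) / (true_b + interference)


# Hypothetical true reporter signals: a real 2:1 difference between two samples.
true_a, true_b = 2000.0, 1000.0

clean = observed_ratio(true_a, true_b, interference=0.0)
distorted = observed_ratio(true_a, true_b, interference=1000.0)

print(f"no interference:   {clean:.2f}")
print(f"with interference: {distorted:.2f}")
```

Under this assumption the measured ratio gets pulled toward 1:1, so a real two-fold change reads as something smaller.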
More, More, More! 27- and 29-plex TMT Profiling
Wang, Zhen, et al. "27-Plex tandem mass tag mass spectrometry for profiling brain proteome in Alzheimer’s disease." Analytical Chemistry 92.10 (2020): 7162-7170.
Sun, Huan, et al. "29‐Plex tandem mass tag mass spectrometry enabling accurate quantification by interference correction." Proteomics 22.19-20 (2022): 2100243.
There are many reasons why scientists want to multiplex proteomics experiments: increased throughput, less sample variability, fewer “missing values”, and more accurate quantification at the peptide level. Essentially, the higher the multiplexing capacity, the more time and money we save (and everyone wants to save money, which is why we've quickly jumped from 6-plex to 11-plex to 16-plex to 18-plex TMT reagents in a little under a decade).
So when I first read this 27-plex paper from Junmin Peng (Peng Lab @ St. Jude Children's Hospital), I couldn't believe how simple (and obvious) this mega-plexing method was: just combine two sets of TMT tags of different mass, in this case 11-plex and 16-plex TMT, and add their channels together! Duh! The implication here is that now you can monitor 27 separate conditions (11 + 16) in a single experiment, acquiring far more data without increasing instrument runtime, which is a huge jump in high-throughput capabilities. In 2022, the Peng Lab released an updated 29-plex method that used the new 18-plex TMTpro and 11-plex TMT, a trend I foresee continuing.
How it works: The logic is pretty simple. The 11-plex mass tag has a different scaffold and a lower mass than the newer 16-plex mass tag (also termed TMTpro). The labelling chemistry remains the same, with both tags sporting identical amine-reactive groups (blue). Because 16-plex TMT has a longer balancer region (green) and a different reporter ion (red), the two tag sets differ enough in mass to be distinguished in an MS1 scan (each tag adds roughly 229 Da vs 304 Da to the peptide). Either isobaric mass peak can then be selected for MS2 fragmentation and analysis, which is when we can identify the specific peptide through the generated backbone fragments, as well as quantify its abundance using the reporter ion readout.
Figure 1A
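To see how far apart the two versions of the same peptide land in MS1, here is a small sketch. The tag masses are the commonly used monoisotopic additions for TMT and TMTpro (worth checking against your search engine's modification definitions), while the peptide mass, number of labelling sites, and charge are hypothetical:

```python
PROTON_MASS = 1.007276          # Da
TMT_TAG_MASS = 229.162932       # Da added per TMT (11-plex) tag
TMTPRO_TAG_MASS = 304.207146    # Da added per TMTpro (16-plex) tag


def labelled_mz(peptide_mass, n_sites, tag_mass, charge):
    """m/z of a peptide carrying n_sites tags at the given positive charge state."""
    return (peptide_mass + n_sites * tag_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide: unlabelled monoisotopic mass of 1000.5 Da,
# two labelling sites (N-terminus plus one lysine), charge 2+.
peptide_mass, n_sites, charge = 1000.5, 2, 2

mz_tmt = labelled_mz(peptide_mass, n_sites, TMT_TAG_MASS, charge)
mz_tmtpro = labelled_mz(peptide_mass, n_sites, TMTPRO_TAG_MASS, charge)
delta = mz_tmtpro - mz_tmt

print(f"TMT-labelled:    {mz_tmt:.4f} m/z")
print(f"TMTpro-labelled: {mz_tmtpro:.4f} m/z")
print(f"separation:      {delta:.4f} m/z")
```

The separation scales with the number of tags per peptide and shrinks with charge state, but even a singly tagged peptide is shifted by about 75 Da between the two tag sets.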
The real-world application: The 27-plex TMT method was used to profile the human brain proteome of Alzheimer’s disease (AD), although this part of the paper was by no means the “heavy lifter”. The data from all 11-, 16-, and 27-plex experiments showed a 30% downregulation of proteins found in synapses, which aligned with previously reported findings — and it’s always good when your method performs as you expect it to!
Some pros: Notably, the 27-plex method was able to identify >410,000 peptides, nearly 1.5x more compared to either the 11-plex or 16-plex methods. Perhaps the more impressive technical result is Figure 5D-F, where we see a great linear correlation of fold change in AD/control proteins between the individual 11-plex or 16-plex experiments and the 27-plex experiment. Essentially, this indicates a relatively limited effect of reporter ion ratio distortion in a combined 27-plex experiment despite increased sample complexity! And what this really means is that you can actually use this higher multiplexing method to achieve comparable results without sacrificing accuracy along the way.
Figures 5D-F
Some cons: The published method uses fractionation prior to LC-MS analysis since their instrument does not have MS3 capabilities (MS3 is currently the gold standard for tackling reporter ion distortion). Fractionation isn’t bad, but it does increase the length of your experiments, which was the big issue that multiplexing was supposed to address! By splitting your sample into many separate fractions for individual analysis, you can better separate out interfering ions that are similar in mass to your TMT samples. This reduces reporter ion ratio distortion mainly by decreasing the number of ions the spectrometer sees during the elution gradient (rather than looking at all the ions in a sample it’s looking at 1/12 of that amount, for example). Fewer ions means less noise and fewer coeluting contaminants in our spectra, and so the reporter ion readouts will, in turn, be less riddled with similar-mass garbage.
When we take a closer look at the Methods, for an individual multiplexing experiment there are 40 concatenated fractions that are each analyzed for 90 minutes, making the total experiment runtime 60 hours. In an ideal world (where we don’t need fractionation), we would be able to inject our pooled 27-plex sample straight onto the LC-MS, turning a 60-hour experiment into a 90-minute experiment (40x faster!!!). Ultimately, it’s a balancing game: more fractions means more proteome coverage and less...means less. Simple as that.
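The arithmetic is easy to check. This sketch uses the 40 fractions and 90-minute runs quoted above and ignores column equilibration, blanks, and any other overhead between injections (my simplification):

```python
def total_runtime_hours(n_fractions, minutes_per_run):
    """Total instrument time in hours, ignoring overhead between injections."""
    return n_fractions * minutes_per_run / 60


fractionated = total_runtime_hours(n_fractions=40, minutes_per_run=90)
single_shot = total_runtime_hours(n_fractions=1, minutes_per_run=90)
speedup = fractionated / single_shot

print(f"fractionated: {fractionated:.1f} h")
print(f"single shot:  {single_shot:.1f} h")
print(f"speedup:      {speedup:.0f}x")
```

Changing `n_fractions` to 12 or 24 gives a quick feel for where a given experiment might sit between coverage and instrument time.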
Skipping the Pitfalls with sCIP-TMT
Burton, Nikolas R., and Keriann M. Backus. "Functionalizing tandem mass tags for streamlining click-based quantitative chemoproteomics." Communications Chemistry 7.1 (2024): 80.
Burton, Nikolas R., et al. "Solid-phase compatible silane-based cleavable linker enables custom isobaric quantitative chemoproteomics." Journal of the American Chemical Society 145.39 (2023): 21303-21318.
This two-author paper from the Backus Lab @ UCLA piqued my interest because it framed itself as a new-and-improved TMT labelling protocol for looking at protein druggability, rather than just boosting protein identification and quantification. Their proposed sCIP-TMT protocol not only introduces a new labelling chemistry (Goal 1, see Introduction) but also decreases sample preparation time (Goal 2).
The workflow specifically targets cysteine residues (which any drug hunter loves for their versatility) and notes that although there are many cysteine-targeting workflows already out there, they all suffer from late-stage isobaric labeling. Well, why do we care about that? As outlined in their paper, the conventional workflow is as follows: cysteine biotinylation -> tryptic digestion into peptides -> isobaric tagging -> LC-MS analysis. Late-stage tagging introduces greater sample variability since it extends the sample preparation process, during which additional reagents and impurities may be introduced prior to injection!
How it works: The solution proposed here is a new type of tag that combines the isobaric tag, the biotin moiety, and an azide reactive handle (pictured below, Figure 2A). Now that all three components are combined into one reagent, samples can be labelled at an earlier stage and don't require any additional labelling steps downstream. So, rather than having to pool all our samples at the very end of our workflow, we get to do it right from the beginning. What this means to the lay scientist is that this new protocol cuts down on prep time, leaves less room for errors or sample loss to occur, and reduces sample-to-sample variance. Who doesn’t love that?
Figures 2A and 2B
I'm not going to go over how the sCIP-TMT reagent is made, but the important thing to note is that it has an azido group (N3), and that's the handle that lets the tag be clicked onto the labelled cysteines in the proteins of our experiment. The previously reported sCIP reagent did have multiplexing capability, but it was pretty low (6-plex), so the great improvement here is 1) using a commercially available reagent that 2) boosts multiplexing capability to 10-plex. Their sCIP-TMT tag seems to be pretty sturdy, with the authors pointing out that the TMT reporter ion is generated in high abundance at 36% NCE (this is the bog-standard collision energy used for TMT experiments), and that there aren't any other major fragments made at higher collision energies (which is great! we don't need extra noise in our spectra). Figure 4C and Figure 4D show that the TMT ratios aren't succumbing to extreme ratio distortion, except for maybe the 130C channel in the FAIMS experiments, where something funky is going on. Sidenote: these figures are also a great example of what I briefly mentioned earlier. MS3 is truly the gold standard for tightening up those ratios; I mean, look at those teeny tiny error bars! Overall, the sCIP-TMT reagent isn't too fragile and clearly gets the job done as an isobaric tag! But how does it function as a cysteine labeler?
Figures 4C and 4D
The real-world application: Using the novel sCIP-TMT reagent, Burton and Backus are able to perform a 10-plex fragment screen for reactive cysteines, ultimately identifying >19,000 cysteines on almost 6,000 proteins, and are able to do so much faster and cheaper than previously published approaches. I'm not an expert in cysteine reactivity screens, so I can't really speak confidently to the results here, but glancing at Figure 5E (not shown here) it seems like sCIP-TMT is able to reproduce previous results and identify reactive cysteines with high confidence (log2 ratio >1). I also glanced at the peer review comments and one reviewer suggested that this type of tag could be further elaborated to target other amino acids (e.g. replacing the azide with an ACR-based probe to target histidine residues!) which is definitely a very exciting prospect to think about.
Some pros: The authors estimate that using sCIP-TMT could save more than 4 hours on a 10-channel experiment which...depending on when you start the whole prep process, could save you a whole day's worth of dead time between working hours (being able to finish prep at 4:00pm vs having to pause your prep and finish up the next day). The sCIP-TMT method also uses a cost-efficient amount of TMT reagent (paper here — a great read!) which reduces costs to 1/8th of what you would spend using Thermo Fisher's recommendations.
I was also very interested in something that was mentioned very briefly but that I thought addressed a pretty common issue in proteomics: false discovery rates. The authors noted that they observed a desulfurization ion in >97% of sCIP-TMT labelled peptide spectra, and that the presence or absence of the desulfurization ion correlated well with high-confidence or low-confidence spectra, respectively. Monitoring for this ion could be used to drastically reduce the number of false positives, making sCIP-TMT a more robust identification method. Another massive pro is that this new tag does not need custom software or code, which will save days of troubleshooting for those who are looking to use this new labelling chemistry ASAP! Being able to keep the same data analysis workflow for new chemistries is so underrated!
Some cons: The reality is that there are a lot of ways of cutting down on time during sample prep, and sCIP-TMT is just one of them. The time-saving results could probably be replicated by automating the whole sample prep process and this would also likely decrease sample-to-sample variability — a robot is almost always more precise than the human hand! The authors acknowledge this and it's good that they do. However, this type of thinking is definitely a knee-jerk reaction of well-funded, industry labs (because time is money, money, money). Smaller academic labs simply won't have access to a liquid handling robot for their personal use, and so sCIP-TMT does provide a genuine improvement to their manual workflows.
Conclusion
I hope that by highlighting these two papers you have been able to see some nice examples of technical advancements made across the three areas I outlined before. The Junmin Peng paper introduced a very simple way to increase our multiplexing capabilities (Goal 3) by combining two sets of TMT labels, while the Keriann Backus paper introduced a new type of covalent tag (Goal 1) that targets cysteines through an early labeling step (Goal 2) while retaining the multiplexing powers of TMT. Ultimately, I thought the approaches taken in both these papers were simple and elegant — hopefully you did, too! Let me know if you have any thoughts on this post, TMT, or anything proteomics related.
AS 11/09/24