Accessible single-cell proteomics

Recently, single-cell mass-spectrometry analysis has made it possible to quantify thousands of proteins in single mammalian cells. Yet, these technologies have been adopted in relatively few mass-spectrometry laboratories. Increasing their adoption can help reveal biochemical mechanisms that underpin health and disease, and it requires robust methods that can be deployed in any mass-spectrometry laboratory.

This aim for a “model T” of single-cell proteomics has been the guiding philosophy in the development of Single Cell ProtEomics by Mass Spectrometry (SCoPE-MS) and its second version (SCoPE2). We aimed to make every step easy to reproduce, from sample preparation and the optimization of experimental parameters to an open-source data analysis pipeline. The emphasis has been on accuracy and accessibility, which has facilitated replication (video) and adoption of SCoPE2. Yet, we still found that some groups adopting these single-cell technologies fail to make quantitatively accurate protein measurements because they skip important quality controls during sample preparation (such as negative controls and labeling efficiency) and during mass-spectrometry analysis (such as apex sampling and purity of MS2 spectra).

These observations motivated us to write a detailed protocol for multiplexed single-cell proteomics. The protocol emphasizes quality controls that are required for accurately quantifying protein abundance in single cells and scaling up the analysis to thousands of single cells. The protocol and its associated video and web resources should make single-cell proteomics accessible to the wider research community.

When do you need single-cell analysis?

Single-cell analysis is trendy for good reasons: It has enabled asking and answering important questions. Of course, the substantive reasons are surrounded by much hype. Sometimes colleagues tell me they want to add single-cell RNA-seq analysis since it will help them publish their paper in a more prestigious journal, and sadly there is perhaps more truth to that than I want to believe.

On the other end of the spectrum, some colleagues from the mass-spec community are puzzled by our efforts to develop methods for single-cell mass-spec analysis. At HUPO, I have been repeatedly asked: “Why analyze single cells when you can identify more peptides in bulk samples?”

So, when do we need single-cell analysis? Can’t we just FACS sort cells based on markers and analyze the sorted cells? Indeed, that may be a good strategy when the cells we analyze fall into relatively homogeneous clusters (they will never be perfectly homogeneous) and we have a reliable marker for each cluster. If these assumptions hold, averaging out the differences between individual cells will give us a very useful coarse-grained picture. Unfortunately, bulk analysis of the sorted cells cannot validate the assumption of homogeneity. For example, we can easily sort B-cells and T-cells from blood samples because we have well-defined markers for each cell type. However, the bulk analysis of the sorted cells will not provide any information on the homogeneity of the sorted T-cells. Yet, a wealth of single-cell analysis has demonstrated the existence of multiple states within T-cell subpopulations, states for which we rarely have well-defined markers that allow efficient FACS sorting and follow-up bulk analysis.

FACS sorting is especially inadequate when the cell heterogeneity is not easily captured by discrete subpopulations or clusters of cells. An example is the continuous gradient of macrophage states that we recently observed in our SCoPE2 data:

To explore the heterogeneity within the macrophage-like cells, we sorted them based on the Laplacian vector. See Specht et al., 2019 for details.

In some cases, e.g., analysis of small clonal populations, the benefits of single-cell analysis may be too small to justify the increased cost. Sometimes, we can gain single-cell information from analyzing small groups of cells, e.g., Shaffer et al., 2018. Sometimes, nobody can be sure whether single-cell analysis is needed. If we assume it is needed and perform it, the data can refute our assumption and show us that there is not much heterogeneity, at least at the level of what we could measure. If we assume that there is no heterogeneity and thus no need for single-cell analysis, e.g., we FACS sort T-cells, the bulk analysis of the sorted cells will not correct our assumption. We can feel the assumption is validated while being blind to what might be the most meaningful cellular diversity in the system. So, single-cell analysis is not always needed, but it is much better at correcting our assumptions and teaching us whether it is needed or not.
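The asymmetry described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two populations, their state means (10 versus a mix of 5 and 15), the noise level, and the helper `simulate_population` are all hypothetical and are not taken from any SCoPE2 data.

```python
import random
import statistics


def simulate_population(n_cells, means, sd, seed):
    """Draw one protein level per cell; each cell picks one of the state means at random."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice(means), sd) for _ in range(n_cells)]


# Hypothetical sorted population A: a single homogeneous state centered at 10.
homogeneous = simulate_population(5000, means=[10.0], sd=1.0, seed=1)
# Hypothetical sorted population B: two hidden states (5 and 15) in equal proportion.
heterogeneous = simulate_population(5000, means=[5.0, 15.0], sd=1.0, seed=2)

# A bulk measurement reports only the average over all sorted cells.
bulk_a = statistics.fmean(homogeneous)
bulk_b = statistics.fmean(heterogeneous)

# Single-cell measurements also expose the spread across cells.
spread_a = statistics.stdev(homogeneous)
spread_b = statistics.stdev(heterogeneous)

print(f"bulk means:      {bulk_a:.2f} vs {bulk_b:.2f}")
print(f"cell-to-cell SD: {spread_a:.2f} vs {spread_b:.2f}")
```

In this toy setting the two bulk averages are nearly identical, so the bulk data cannot flag the hidden states, while the cell-to-cell spread differs several-fold.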

Direct causal mechanisms

Understanding biological systems: In search of direct causal mechanisms

The advent of DNA microarrays spurred a vigorous effort to reverse engineer biological networks. Recently, these efforts have been reinvigorated by the availability of RNA-seq data from perturbed and unperturbed single cells. In the talk below, I discuss the opportunities and limitations of using such data for inferring networks of direct causal interactions, with emphasis on the distinctions between models based on direct and indirect interactions. This discussion motivates the need to model proteins, since most biological interactions involve proteins. I then introduce key ideas and technological capabilities of the high-throughput single-cell proteomics methods that we have developed, and focus on the opportunities of using such data for inferring direct causal mechanisms in biological systems.

Some of the ideas that I discuss above involve doing single-cell mass-spec measurements with SCoPE-MS. Thus, if you do not have a strong background in quantitative mass-spec, you may first want to learn some of the key ideas that make SCoPE-MS possible from this primer by Harrison Specht. Below is its summary.

Quantifying proteins by mass-spec

Mass spectrometry-based proteomics is a suite of high-throughput and sensitive approaches for identifying and quantifying proteins in biological samples. These methods allow for quantifying >10,000 proteins in bulk samples. However, these techniques have not yet been widely applied to single cells despite the fact that modern mass spectrometers can detect single ions. To explain why, this primer talk will explore core concepts of mass spectrometry-based proteomics with emphasis on developing intuition for the physical processes underpinning peptide sequencing and quantification. In particular, I will cover what is called “shotgun” or “discovery” proteomics using isobaric barcoding, a technology used by Single Cell Proteomics by Mass Spectrometry (SCoPE-MS). The primer talk will outline the obstacles that have limited the broad application of quantitative mass spectrometry to single-cell analysis and how SCoPE-MS overcomes these obstacles to enable profiling thousands of proteins across thousands of single cells.

Single-cell analysis

Imaging is the most widely used method for single-cell analysis

The success of imaging technologies

The molecular and functional differences among the cells making up our bodies have been appreciated for many decades. Yet, the tools to study them were very limited. In the last couple of decades, we have begun developing increasingly powerful technologies for molecular single-cell measurements. Currently, the most widely used high-throughput methods for molecular single-cell analysis have two things in common: (1) they quantify nucleic acids and (2) they are based on imaging. The imaging can be done in situ (e.g., fluorescent in situ hybridization, FISH) or in vitro (e.g., single-cell RNA-seq based on next-generation DNA sequencing). Imaging has been applied to single-cell protein analysis as well, though most applications have been hampered by their dependence on antibodies. A recent departure from this antibody dependence is the single-molecule Edman degradation developed by the group of Edward Marcotte. If this is developed further, imaging could become a workhorse for single-cell protein analysis as well.

Emerging mass-spec methods

Efforts to apply mass-spectrometry to single-cell analysis started in the 1990s. As comprehensively reviewed by Rubakhin et al., these efforts focused on ionizing biological molecules via Secondary Ion MS (SIMS) or via Matrix Assisted Laser Desorption/Ionization (MALDI). These methods ionize biological molecules with minimal processing and losses but remain rather limited in their quantification accuracy and in identifying the chemical composition of the analyzed ions. In contrast, the methods that afford robust high-throughput identification (based on analyte separation and tandem MS analysis, e.g., LC-MS/MS or CE-MS/MS) have been very challenging to apply to small samples. Still, the typical mammalian cell contains thousands of metabolites and proteins whose abundance is well above the sensitivity limits of mass-spec instruments. Based on this realization, we outlined directions for multiplexed analysis of single cells by LC-MS/MS that can enable quantifying thousands of proteins across many thousands of single cells. We recently published a proof of principle that has since been superseded by a higher-throughput single-cell proteomics method. These initial steps need much further development, both experimental and computational, before they reach the transformative potential that single-cell mass-spec could have.

Understanding biology

Single-cell analysis is not merely about measurements. It’s about understanding them. Our progress in understanding single-cell data has been limited, even for the data coming from the more mature technologies. Conceptual progress has been much slower than technological progress. So, how do we make sense of the data?

I will reserve my musings on this question for a forthcoming post. For now, I’ll just say that I like an idea articulated by Munsky et al., 2012 and Padovan-Merhar and Raj, 2013: using the variability between single cells as a natural perturbation for studying gene regulation. I think that this approach can be very powerful. More thoughts on that coming soon.

Single-cell proteomics

Ever since my lab posted the SCoPE-MS preprint, I have been repeatedly asked about the future potential and the cost of quantifying proteins by high-throughput mass-spectrometry in single cells. I will summarize a few thoughts that hopefully will be helpful and will reduce email traffic.

Why quantify proteins and PTMs in single cells?

Single-cell RNA-seq has made great strides and has become a widely available and preferred method for high-throughput single-cell measurements. That is great! These measurements are very useful, and their usefulness will continue to grow as we invent new ways to think about these data and reduce their noise. Yet, measuring transcript levels alone is insufficient for studying and understanding many physiological and pathological processes, not least because the changes of protein levels across human tissues and during cell differentiation are poorly predicted by the corresponding changes in mRNA levels.

The usefulness of mRNA levels as a surrogate for signaling activity mediated by post-translational modifications (PTMs, e.g., phosphorylation, ubiquitylation) is even more limited.

What is the history of single-cell proteomics by mass-spectrometry?

Quantifying proteins in single cells directly, without relying on antibodies, has been a long-standing aim and dream for many scientists. There have been over a dozen reports of doing so over the last decade, but they all used cells whose volume is over 1000-fold larger than that of the typical mammalian cell, e.g., muscle cells and oocytes, and quantified only a few proteins in a few cells. To my knowledge, SCoPE-MS is the first method to quantify over a thousand proteins across hundreds of mammalian cells with typical cell sizes, i.e., a diameter of 10 – 15 μm.

How expensive is it to do SCoPE-MS?

This question comes up frequently. The answer depends a lot on what we factor into the price. If you own a suitable high-resolution MS instrument/system, the current cost is about a dollar per cell, but very soon that will drop significantly; stay tuned for our next preprint. If you do not own a suitable high-resolution MS instrument, the price depends on the service charges of your preferred MS facility. The cost of a suitable instrument ranges from ~$100k (a low-end refurbished instrument) to ~$700k (a new high-end benchtop instrument).

How easy is it to do SCoPE-MS?

For our lab, quite easy. I am proud of the fact that SCoPE-MS is enabled by a simple idea and not by access to the newest corporate technology with limited accessibility. We used an old, low-end instrument for developing SCoPE-MS. We are writing up more detailed protocols and hope to release a robust data processing pipeline soon. In any case, there is nothing particularly tricky in the method, and I expect that any good lab should be able to quantify single-cell proteomes by SCoPE-MS.

How noisy are the data?

Like all methods using tandem mass tags, SCoPE-MS measurements are affected by co-isolation interference, which means that about 5 – 10 % of the reporter ion signal for a typical peptide comes from other peptides. This undesirable contribution can be reduced by using newer instruments with better mass filters that allow for smaller ion isolation windows. It can also be reduced by simply filtering out peptides with more co-isolation and focusing on those with very limited co-isolation, or by computationally compensating for it.
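The filtering option mentioned above could look roughly like the following sketch. It assumes that each identified peptide comes with a `purity` score, i.e., the fraction of the isolated ion signal attributed to the target peptide. The record type, the field names, the example values, and the 0.9 cutoff are all hypothetical and chosen only for illustration; they are not the settings of any SCoPE-MS pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeptideMeasurement:
    sequence: str
    purity: float  # assumed: fraction of isolated signal from the target peptide
    reporter_intensities: List[float]  # one value per mass-tag channel


def filter_by_purity(measurements, min_purity=0.9):
    """Keep only peptides whose co-isolation is limited (purity at or above the cutoff)."""
    return [m for m in measurements if m.purity >= min_purity]


# Made-up example records.
measurements = [
    PeptideMeasurement("PEPTIDEA", 0.97, [120.0, 95.0, 101.0]),
    PeptideMeasurement("PEPTIDEB", 0.72, [80.0, 85.0, 79.0]),
    PeptideMeasurement("PEPTIDEC", 0.91, [40.0, 62.0, 55.0]),
]

kept = filter_by_purity(measurements)
print([m.sequence for m in kept])
```

A stricter cutoff trades fewer quantified peptides for less interference, so the threshold is a judgment call that depends on the data.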

There is, of course, also nonsystematic (random) noise. In our current data (Supplemental Figure 2c), the reliability of the measurements is over 50 % for the proteins with the smallest fold changes and about 80 % for those with the largest fold changes. The reliability is higher for data acquired on the newer instruments that use high-quality quadrupole mass filters, i.e., Q Exactive Orbitraps.

Can you measure post-translational modifications (PTMs)?

Yes, we can. Stay tuned for the preprint.

What is the future potential for building upon SCoPE-MS?

That is my favorite question! We have outlined ideas and technologies that can advance single-cell proteomics methods by several orders of magnitude. In short:

  • Throughput: The throughput will grow as we increase the number of mass tags. These should go up to 16 in the fall. As the demand for single-cell proteomics increases, Thermo or the community will come up with much higher plex. Since MS does the measurements on groups of identical ions (not individual molecules as in the case of next-generation DNA sequencing), higher multiplexing will increase the number of quantified samples without affecting the depth of coverage. Higher multiplexing will also reduce the need for the carrier channel: first by reducing the number of carrier cells and ultimately by eliminating the need for them.
  • Accuracy: SCoPE-MS does minimal processing of the samples, and the measurement is based on hundreds, even thousands, of ions for the quantification of each peptide in each cell. There are no fundamental limits to achieving very high accuracy. Since proteins are much more abundant than mRNAs (on average over 1000 protein molecules per mRNA), the counting of low-copy-number molecules or ions is much less problematic than in single-cell RNA sequencing. As we improve our ability to deliver and capture all ions, we should be able to measure even the least abundant proteins and expand the depth of coverage tremendously. This is not just a distant promise. I think it is an imminent possibility.
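The counting argument in the accuracy point can be made concrete with a rough calculation. The sketch below assumes that counting noise is purely Poisson, so the relative error of a count with mean N is 1/sqrt(N); it ignores every other noise source. The count of 5 is a hypothetical stand-in for a low-copy-number measurement, while 100 and 1000 correspond to the "hundreds, even thousands" of ions mentioned above.

```python
import math


def poisson_cv(n_counted):
    """Relative counting error (coefficient of variation) of a Poisson count with mean n_counted."""
    return 1.0 / math.sqrt(n_counted)


for n in (5, 100, 1000):
    print(f"{n:>5} counted copies -> CV of about {poisson_cv(n):.1%}")
```

Under this assumption, going from a handful of counted copies to hundreds or thousands shrinks the counting error from tens of percent to a few percent.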

CSHL Meeting: Single Cell Analyses 2017