Direct causal mechanisms

Understanding biological systems: In search of direct causal mechanisms

The advent of DNA-microarrays spurred a vigorous effort to reverse engineer biological networks. Recently, these efforts have been reinvigorated by the availability of RNA-seq data from perturbed and unperturbed single cells. In the talk below, I discuss the opportunities and limitations of using such data for inferring networks of direct causal interactions, with emphasis on the distinctions between models based on direct and indirect interactions. This discussion motivates the need to model proteins since most biological interactions involve proteins. Then I introduce key ideas and technological capabilities of high-throughput single-cell proteomics methods that we have developed, and focus on the opportunities of using such data for inferring direct causal mechanisms in biological systems.

Some of the ideas that I discuss above involve doing single-cell mass-spec measurements with SCoPE-MS. Thus, if you do not have a strong background in quantitative mass-spec, you may first want to learn some of the key ideas that make SCoPE-MS possible from this primer by Harrison Specht. Below is its summary.

Quantifying proteins by mass-spec

Mass spectrometry-based proteomics is a suite of high-throughput and sensitive approaches for identifying and quantifying proteins in biological samples. These methods allow for quantifying >10,000 proteins in bulk samples. However, these techniques have not yet been widely applied to single cells despite the fact that modern mass spectrometers can detect single ions. To explain why, this primer talk will explore core concepts of mass spectrometry-based proteomics with emphasis on developing intuition for the physical processes underpinning peptide sequencing and quantification. In particular, I will cover what is called “shotgun” or “discovery” proteomics using isobaric barcoding, a technology used by Single Cell Proteomics by Mass Spectrometry (SCoPE-MS). The primer talk will outline the obstacles that have limited the broad application of quantitative mass spectrometry to single-cell analysis and how SCoPE-MS overcomes these obstacles to enable profiling thousands of proteins across thousands of single cells.

Evaluating preprints

I am hugely enthusiastic about communicating research by preprints. So naturally, I am happy to see the president and strategic advisers of one of the most elite funding institutes embrace preprints:

For centuries, publishing a scientific article was just about sharing the results. More recently, publishing research articles in a journal has served two distinct functions: (i) Public disclosure and (ii) Partial validation by peer-review (Vale & Hyman, 2016). The partial validation is sometimes followed up by strong validation: (iii) Independent reproduction and building upon the published work.

Preprints clearly can serve the first function, public disclosure. It has been less clear to me how to validate and curate the highly heterogeneous research that is published as preprints. I think this question remains open, though I have seen signs that some preprints are strongly validated (independently reproduced & built upon) even before the more conventional partial validation by peer-review.

For example, the methods and ideas underlying Single Cell ProtEomics by Mass Spectrometry (SCoPE-MS) were independently validated by multiple laboratories. Some presented their results at conferences before our preprint was peer-reviewed:

Several groups published their results after our preprint was published in a peer-reviewed journal, crediting the preprint for the ideas:

More (that I know of) are underway, all inspired by a preprint. I see this as a datapoint that preprints can get strong validation even outside of the boundaries of the peer-review system that has dominated our field for the last few decades. It’s not a complete solution for evaluating all preprints, but I think it’s very encouraging evidence that preprints can be strongly validated even before the weak validation of peer-review!

Single-cell proteomics

Ever since my lab posted the SCoPE-MS preprint, I have been repeatedly asked about the future potential and the cost of quantifying proteins by high-throughput mass-spectrometry in single cells. I will summarize a few thoughts that hopefully will be helpful and will reduce email traffic.

Why quantify proteins and PTMs in single cells?

Single-cell RNA-seq has made great strides and become a widely available and preferred method for high-throughput single-cell measurements. That is great! These measurements are very useful and their usefulness will continue to grow as we invent new ways to think about these data and reduce their noise. Yet, measuring transcript levels alone is insufficient for studying and understanding many physiological and pathological processes, not least because the changes of protein levels across human tissues and cell differentiation are poorly predicted by the corresponding changes in mRNA levels:

 

The usefulness of mRNA levels as a surrogate for signaling activity by post-translational modifications (PTMs, e.g., phosphorylation, ubiquitylation) is even more limited.

 

What is the history of single-cell proteomics by mass-spectrometry?

Quantifying proteins in single cells directly, without relying on antibodies, has been a long-standing aim and dream for many scientists. There are over a dozen reports of doing so over the last decade, but they all used cells that have over 1000-fold larger volume than the typical mammalian cell, e.g., muscle cells and oocytes, and quantified only a few proteins in a few cells. To my knowledge, SCoPE-MS is the first method to quantify over a thousand proteins across hundreds of mammalian cells with typical cell sizes, i.e., a diameter of 10 – 15 μm.

 

How expensive is it to do SCoPE-MS?

This question comes up frequently. The answer depends a lot on what we factor into the price. If you own a suitable high-resolution MS instrument/system, the current cost is about a dollar per cell, but very soon that will drop significantly; stay tuned for our next preprint. If you do not own a suitable high-resolution MS instrument, the price depends on the service charges of your preferred MS facility. The cost for a suitable instrument ranges from ~ $100k (low-end refurbished instrument) to ~ $700k (the high-end benchtop instruments on the market, new).

 

How easy is it to do SCoPE-MS?

For our lab, quite easy. I am proud of the fact that SCoPE-MS is enabled by a simple idea and not by access to the newest corporate technology with limited accessibility. We used an old, low-end instrument for developing SCoPE-MS. We are writing up more detailed protocols and hoping to release a robust data processing pipeline soon. Anyway, there is nothing particularly tricky in the method, and I expect that any good lab should be able to quantify single-cell proteomes by SCoPE-MS.

 

How noisy are the data?

Like all methods using tandem mass tags, SCoPE-MS measurements are affected by co-isolation interference, which means that about 5 – 10 % of the reporter ion signal for a typical peptide comes from other peptides. This undesirable contribution can be reduced by using newer instruments with better mass-filters that allow for smaller ion isolation windows. It can also be reduced by simply filtering out peptides with more co-isolation and focusing on those with very limited co-isolation, or by computationally compensating for it.
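
The filtering option can be sketched in a few lines of Python. This is only an illustration and not the SCoPE-MS data pipeline: the `Peptide` record, its `purity` field (the fraction of the isolated signal that comes from the identified peptide), the example values, and the 0.9 cutoff are all hypothetical and chosen just for this example.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    sequence: str   # made-up peptide labels, not real identifications
    purity: float   # hypothetical: fraction of isolated signal from this peptide (0 to 1)


def filter_by_purity(peptides, min_purity=0.9):
    """Keep peptides whose estimated co-isolation is at most 1 - min_purity."""
    return [p for p in peptides if p.purity >= min_purity]


# Illustrative values only; real purity estimates would come from the search software.
peptides = [
    Peptide("PEPTIDEA", 0.97),
    Peptide("PEPTIDEB", 0.82),
    Peptide("PEPTIDEC", 0.91),
]

kept = filter_by_purity(peptides)
print([p.sequence for p in kept])  # ['PEPTIDEA', 'PEPTIDEC']
```

A stricter cutoff trades coverage for cleaner reporter ion signal: fewer peptides remain, but those that do carry less signal from other peptides.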

There is, of course, also nonsystematic (random) noise. In our current data (Supplemental Figure 2c), the reliability of the measurements for the proteins with the smallest fold changes is over 50 % and for those with the largest fold changes, about 80 %. The reliability is higher for data acquired on the new instruments that use high-quality quadrupole mass-filters, i.e., Q Exactive Orbitraps.

 

Can you measure post-translational modifications (PTMs)?

Yes, we can. Stay tuned for the preprint.

 

What is the future potential for building upon SCoPE-MS?

That is my favorite question! We have outlined ideas and technologies that can advance single cell proteomics methods by several orders of magnitude. In short:

  • Throughput: The throughput will grow as we increase the number of mass-tags. These should go up to 16 in the fall. As the demand for single-cell proteomics increases, Thermo or the community will come up with much higher plex. Since MS does the measurements on groups of identical ions (not individual molecules as in the case of next-generation DNA sequencing), higher multiplex will increase the number of quantified samples without affecting the depth of coverage. The higher multiplex will also reduce the need for the carrier channel: first by reducing the number of carrier cells and ultimately by eliminating the need for them.
  • Accuracy: SCoPE-MS does minimal processing of the samples and the measurement is based on hundreds, even thousands of ions for the quantification of each peptide in each cell. There are no fundamental limits to achieving very high accuracy. Since proteins are much more abundant than mRNAs (on average over 1000 protein molecules per mRNA), the counting of low copy number molecules or ions is much less problematic compared to single-cell RNA sequencing. As we improve our ability to deliver and capture all ions, we should be able to measure even the least abundant proteins and expand the depth of coverage tremendously. This is not just a distant promise. I think it is an imminent possibility.
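
The counting argument in the Accuracy point can be made concrete with a rough sketch. It assumes that counting n copies of a molecule or ion behaves like Poisson sampling, so that the relative error (coefficient of variation) is about 1/sqrt(n). The mRNA copy number below is made up for illustration; only the ratio of about 1000 proteins per mRNA comes from the text. The sketch also ignores losses in delivering and capturing ions, so it is an idealized comparison rather than a prediction for any instrument.

```python
import math


def poisson_cv(n_counted):
    """Relative counting error (CV) when n_counted copies are sampled, assuming Poisson statistics."""
    return 1.0 / math.sqrt(n_counted)


PROTEINS_PER_MRNA = 1000   # average ratio quoted in the text
mrna_copies = 10           # hypothetical copy number of one transcript in one cell
protein_copies = mrna_copies * PROTEINS_PER_MRNA

cv_mrna = poisson_cv(mrna_copies)
cv_protein = poisson_cv(protein_copies)

print(f"mRNA CV:    {cv_mrna:.3f}")     # 0.316
print(f"protein CV: {cv_protein:.3f}")  # 0.010
```

Under these assumptions, a 1000-fold higher copy number lowers the counting error by a factor of sqrt(1000), or roughly 30-fold.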

CSHL Meeting: Single Cell Analyses 2017