Direct causal mechanisms

Understanding biological systems: In search of direct causal mechanisms

The advent of DNA microarrays spurred a vigorous effort to reverse engineer biological networks. Recently, these efforts have been reinvigorated by the availability of RNA-seq data from perturbed and unperturbed single cells. In the talk below, I discuss the opportunities and limitations of using such data for inferring networks of direct causal interactions, with emphasis on the distinctions between models based on direct and indirect interactions. This discussion motivates the need to model proteins, since most biological interactions involve proteins. I then introduce key ideas and technological capabilities of the high-throughput single-cell proteomics methods that we have developed, and focus on the opportunities of using such data for inferring direct causal mechanisms in biological systems.
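The distinction between direct and indirect interactions can be illustrated with a toy simulation. The sketch below is only an illustration under stated assumptions, not an analysis from the talk: it assumes three hypothetical molecules in a linear chain A -> B -> C with Gaussian noise, and it uses partial correlation as one simple stand-in for "conditioning on the intermediate". The names `pearson` and `partial_corr` are helpers defined here, not functions from any library.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical chain A -> B -> C: A affects C only through B.
rng = random.Random(0)
n_cells = 5000
a = [rng.gauss(0, 1) for _ in range(n_cells)]
b = [ai + rng.gauss(0, 0.5) for ai in a]   # B depends directly on A
c = [bi + rng.gauss(0, 0.5) for bi in b]   # C depends directly on B only

r_ac = pearson(a, c)                 # strong, but the interaction is indirect
r_ac_given_b = partial_corr(a, c, b)  # near zero once B is accounted for

print(f"corr(A, C)     = {r_ac:.2f}")
print(f"corr(A, C | B) = {r_ac_given_b:.2f}")
```

A and C are strongly correlated even though they never interact directly; the association largely disappears once B is accounted for. The catch is that this only works when the intermediate is measured: if B were an unmeasured protein, A and C would look directly connected.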

Some of the ideas that I discuss above involve single-cell mass-spec measurements with SCoPE-MS. Thus, if you do not have a strong background in quantitative mass-spec, you may first want to learn some of the key ideas that make SCoPE-MS possible from this primer by Harrison Specht. Below is its summary.

Quantifying proteins by mass-spec

Mass spectrometry-based proteomics is a suite of high-throughput and sensitive approaches for identifying and quantifying proteins in biological samples. These methods allow for quantifying >10,000 proteins in bulk samples. However, these techniques have not yet been widely applied to single cells despite the fact that modern mass spectrometers can detect single ions. To explain why, this primer talk will explore core concepts of mass spectrometry-based proteomics with emphasis on developing intuition for the physical processes underpinning peptide sequencing and quantification. In particular, I will cover what is called “shotgun” or “discovery” proteomics using isobaric barcoding, a technology used by Single Cell Proteomics by Mass Spectrometry (SCoPE-MS). The primer talk will outline the obstacles that have limited the broad application of quantitative mass spectrometry to single-cell analysis and how SCoPE-MS overcomes these obstacles to enable profiling thousands of proteins across thousands of single cells.

Evaluating preprints

I am hugely enthusiastic about communicating research by preprints. So naturally, I am happy to see the president and strategic advisers of one of the most elite funding institutes embrace preprints:

For centuries, publishing a scientific article was just about sharing the results. More recently, publishing research articles in a journal has served two distinct functions: (i) Public disclosure and (ii) Partial validation by peer-review (Vale & Hyman, 2016). The partial validation is sometimes followed up by strong validation: (iii) Independent reproduction and building upon the published work.

Preprints clearly can serve the first function, public disclosure. It has been less clear to me how to validate and curate the highly heterogeneous research that is published as preprints. I think this question remains open, though I have seen signs that some preprints are strongly validated (independently reproduced & built upon) even before the more conventional partial validation by peer-review.

For example, the methods and ideas underlying Single Cell ProtEomics by Mass Spectrometry (SCoPE-MS) were independently validated by multiple laboratories. Some presented their results at conferences before our preprint was peer-reviewed:

Several groups published their results after our preprint was published in a peer-reviewed journal, crediting the preprint for the ideas:

More (that I know of) are underway, all inspired by a preprint. I see this as a data point that preprints can get strong validation even outside the boundaries of the peer-review system that has dominated our field for the last few decades. It’s not a complete solution for evaluating all preprints, but I think it’s very encouraging evidence that preprints can be strongly validated even before the weak validation of peer-review!

Single-cell analysis

Imaging is the most widely used method for single-cell analysis

The success of imaging technologies

The molecular and functional differences among the cells making up our bodies have been appreciated for many decades. Yet, the tools to study them were very limited. In the last couple of decades, we have begun developing increasingly powerful technologies for molecular single-cell measurements. Currently, the most widely used high-throughput methods for molecular single-cell analysis have two things in common: (1) they quantify nucleic acids and (2) they are based on imaging. The imaging can be done in situ (e.g., fluorescent in situ hybridization, FISH) or in vitro (e.g., single-cell RNA-seq based on next-generation DNA sequencing). Imaging has been applied to single-cell protein analysis as well, though most applications have been hampered by their dependence on antibodies. A recent break away from this antibody dependence is the single-molecule Edman degradation developed by the group of Edward Marcotte. If this is developed further, imaging could become a workhorse for single-cell protein analysis as well.

Emerging mass-spec methods

Efforts to apply mass spectrometry to single-cell analysis started in the 1990s. As comprehensively reviewed by Rubakhin et al., these efforts focused on ionizing biological molecules via Secondary Ion MS (SIMS) or via Matrix Assisted Laser Desorption/Ionization (MALDI). These methods can ionize biological molecules with minimal processing and losses, but they remain rather limited in their quantification accuracy and in identifying the chemical composition of the analyzed ions. In contrast, the methods that afford robust high-throughput identification (based on analyte separation and tandem MS analysis, e.g., LC-MS/MS or CE-MS/MS) have been very challenging to apply to small samples. Still, the typical mammalian cell contains thousands of metabolites and proteins whose abundance is well above the detection limits of mass-spec instruments. Based on this realization, we outlined directions for multiplexed analysis of single cells by LC-MS/MS that can enable quantifying thousands of proteins across many thousands of single cells. We recently published a proof of principle that has been superseded by a higher-throughput single-cell proteomics method. These initial steps need much further development, both experimental and computational, before they reach the transformative potential that single-cell mass-spec could have.

Understanding biology

Single-cell analysis is not merely about measurements. It’s about understanding them. Our progress in understanding single-cell data has been limited, even for the data coming from the more mature technologies. Conceptual progress has been much slower than technological progress. So, how do we make sense of the data?

I will reserve my musings on this question for a forthcoming post. For now, I’ll just say that I like an idea articulated by Munsky et al., 2012 and Padovan-Merhar and Raj, 2013: using the variability between single cells as a natural perturbation for studying gene regulation. I think that this approach can be very powerful. More thoughts on that coming soon.
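To make the idea of variability as a natural perturbation concrete, here is a toy sketch. It is not the analysis from either paper. It assumes a hypothetical regulator whose level varies from cell to cell, a hypothetical target that responds linearly with an assumed gain (`true_gain`), and an unrelated gene as a control. The helper `ols_slope` is defined here for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n_cells = 2000
true_gain = 0.8  # assumed response of the target to the regulator

# Natural cell-to-cell variability in the regulator acts as the perturbation.
regulator = [rng.gauss(10, 2) for _ in range(n_cells)]
target = [true_gain * r + rng.gauss(0, 1) for r in regulator]
unrelated = [rng.gauss(5, 1) for _ in range(n_cells)]  # not regulated

slope_target = ols_slope(regulator, target)
slope_unrelated = ols_slope(regulator, unrelated)

print(f"estimated gain for target:    {slope_target:.2f}")
print(f"estimated gain for unrelated: {slope_unrelated:.2f}")
```

Without any experimental perturbation, the spread of regulator levels across cells is enough to recover the assumed gain for the target and a slope near zero for the unrelated gene. In real data the picture is harder: shared confounders such as cell size or cell-cycle stage can produce similar slopes without any regulatory link.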