Understanding Quality Control in Single-Cell RNA Sequencing: Part II - Detecting Empty Droplets
In our first blog post, we discussed the importance of detecting and filtering out low UMI cells to ensure high-quality single-cell RNA sequencing (scRNA-seq) data. In this second part of our series on scRNA-seq quality control (QC), we will focus on detecting empty droplets. We'll use the 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺 toolkit to illustrate this process.
What Are Empty Droplets?
In droplet-based scRNA-seq platforms, such as 10x Genomics, individual cells are encapsulated in tiny droplets along with barcoded beads. Ideally, each droplet contains a single cell. In practice, many droplets contain no cell at all and capture only ambient RNA, that is, transcripts released into the suspension by lysed or damaged cells. If these empty droplets are not identified and removed, they add noise and distort downstream analyses.
Why Detect Empty Droplets?
Empty droplets typically contain very few RNA molecules, mainly ambient RNA from the cell suspension. If they are kept, they can be mistaken for low-RNA cells, form spurious clusters, and obscure true biological signals. Detecting and removing them is therefore crucial to maintaining data integrity.
Step-by-Step Guide to Detecting Empty Droplets with 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺
𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺 provides a straightforward approach to identifying empty droplets. Here's how to do it:
Step 1: Load the Data
First, load your scRNA-seq data into R. 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺 supports various data formats, including 𝚂𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝙴𝚡𝚙𝚎𝚛𝚒𝚖𝚎𝚗𝚝 objects and 𝚂𝚎𝚞𝚛𝚊𝚝 objects.
Step 2: Detect Empty Droplets
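Use the 𝚛𝚞𝚗𝙴𝚖𝚙𝚝𝚢𝙳𝚛𝚘𝚙𝚜 function, which wraps the EmptyDrops method of Lun AT, et al. 2019, as implemented in the DropletUtils package. The method estimates an ambient RNA profile from barcodes with very low total UMI counts, then tests whether the expression profile of each remaining barcode deviates significantly from that ambient profile. Barcodes that deviate significantly are called as cells; the rest are treated as empty droplets.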
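To make the idea behind the test more concrete, here is a toy sketch in base R. It estimates an ambient profile from low-count droplets, scores larger droplets by their multinomial log-likelihood under that profile, and turns the scores into Monte Carlo p-values and Benjamini-Hochberg FDR values. This is a simplified illustration under made-up assumptions, not the DropletUtils implementation (which uses a Dirichlet-multinomial model and a smoothed ambient estimate). All data, variable names, and parameter values below are hypothetical.

```r
set.seed(42)
n_genes <- 100

# Hypothetical gene-proportion profiles for the ambient pool and for one cell type
ambient_true <- prop.table(rgamma(n_genes, shape = 0.5))
cell_true    <- prop.table(rgamma(n_genes, shape = 0.5))

# Simulated counts: genes in rows, droplets in columns
low_count <- rmultinom(500, size = 30,  prob = ambient_true)  # small droplets, assumed empty
big_empty <- rmultinom(10,  size = 200, prob = ambient_true)  # larger droplets that are still empty
big_cells <- rmultinom(10,  size = 200, prob = cell_true)     # droplets that contain a cell
tested    <- cbind(big_empty, big_cells)
truth     <- rep(c("empty", "cell"), each = 10)

# 1. Estimate the ambient profile from the low-count droplets
#    (the pseudo-count of 1 avoids zero probabilities)
ambient_hat <- prop.table(rowSums(low_count) + 1)

# 2. Score each tested droplet by its log-likelihood under the ambient profile
loglik <- function(x) dmultinom(x, prob = ambient_hat, log = TRUE)
observed <- apply(tested, 2, loglik)

# 3. Monte Carlo null: scores of simulated ambient-only droplets with the same total (200)
n_iter <- 1000
null_scores <- apply(rmultinom(n_iter, size = 200, prob = ambient_hat), 2, loglik)
p_value <- sapply(observed, function(o) (sum(null_scores <= o) + 1) / (n_iter + 1))

# 4. Correct for multiple testing; a LOW FDR means "unlike ambient", i.e. a likely cell
fdr <- p.adjust(p_value, method = "BH")
is_cell <- fdr <= 0.01
print(table(truth, is_cell))
```

Because every tested droplet in this toy example has the same total count, a single null distribution is enough. Real data have a different total per barcode, which is one reason to rely on the package implementation rather than a hand-rolled test.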
Step 3: Examine the Results
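The 𝚛𝚞𝚗𝙴𝚖𝚙𝚝𝚢𝙳𝚛𝚘𝚙𝚜 function adds columns to the cell metadata (𝚌𝚘𝚕𝙳𝚊𝚝𝚊) of the object. For each barcode, these report the total UMI count, the log-probability of its profile under the ambient model, a Monte Carlo p-value, whether that p-value is limited by the number of iterations, and the false discovery rate (FDR). A low FDR means the droplet's profile differs from the ambient profile, so the droplet likely contains a cell.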
Step 4: Filter Out Empty Droplets
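To filter out droplets that are likely empty, keep only the barcodes whose FDR is at or below a chosen cutoff and drop the rest, including barcodes that were not tested and therefore have a missing FDR. Examining the distribution of the FDR values can help you judge whether the cutoff is reasonable for your data.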
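As a small illustration of that choice, the base-R sketch below counts how many barcodes would be retained at a few candidate cutoffs and then applies one of them. The data frame is invented for the example; in a real analysis the FDR values would come from the cell metadata.

```r
# Hypothetical per-barcode results; NA means the barcode was not tested
results <- data.frame(
  barcode = c("bc1", "bc2", "bc3", "bc4", "bc5", "bc6"),
  total   = c(5200, 1900, 840, 310, 60, 12),
  fdr     = c(0.0001, 0.0008, 0.004, 0.35, NA, NA),
  stringsAsFactors = FALSE
)

# How many barcodes would be retained at a few candidate cutoffs?
cutoffs <- c(0.001, 0.01, 0.05)
n_kept <- sapply(cutoffs, function(k) sum(results$fdr <= k, na.rm = TRUE))
names(n_kept) <- cutoffs
print(n_kept)

# Apply the chosen cutoff; untested (NA) barcodes are treated as empty
fdr_cutoff <- 0.01
keep <- !is.na(results$fdr) & results$fdr <= fdr_cutoff
cells <- results[keep, ]
print(cells$barcode)
```

Handling the missing values explicitly matters: subsetting with a logical vector that contains NA would otherwise introduce NA rows instead of dropping the untested barcodes.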
Practical Example
Let's apply this to the 𝚜𝚌𝙴𝚡𝚊𝚖𝚙𝚕𝚎 dataset included in 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺.
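A minimal sketch of the workflow might look like the following. It assumes that 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺 is installed, that loading the example data creates an object named sce, and that the FDR column carries the name shown in the code. The column name is an assumption, so the code checks for it before filtering.

```r
library(singleCellTK)

# Load the example data; this is expected to create a SingleCellExperiment named `sce`
data(scExample, package = "singleCellTK")

# Run EmptyDrops; the results are appended to the cell metadata of `sce`
sce <- runEmptyDrops(inSCE = sce)

# Assumed name of the FDR column; inspect colnames(colData(sce)) if this check fails
fdr_col <- "dropletUtils_emptyDrops_fdr"
cell_meta <- SummarizedExperiment::colData(sce)
stopifnot(fdr_col %in% colnames(cell_meta))
fdr <- cell_meta[[fdr_col]]

# Keep barcodes called as cells; a missing FDR means the barcode was not tested
keep <- !is.na(fdr) & fdr <= 0.01
sce_cells <- sce[, keep]

# Visualize the calls with the same FDR cutoff
plotEmptyDropsResults(inSCE = sce, fdrCutoff = 0.01)
```

The example dataset is very small, so the calls it produces are only a demonstration of the mechanics. On your own data, run the function on the raw (unfiltered) count matrix, since a matrix that has already been filtered to cells no longer contains the low-count barcodes needed to estimate the ambient profile.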
In this example, droplets with FDR values at or below 0.01 are called as cells and retained. Droplets with higher or missing FDR values are treated as empty and removed from the dataset, so that only droplets likely to contain real cells remain.
The resulting figure highlights the identified empty droplets.
Conclusion
Detecting and filtering out empty droplets is an essential step in scRNA-seq quality control. By using 𝚜𝚒𝚗𝚐𝚕𝚎𝙲𝚎𝚕𝚕𝚃𝙺, researchers can efficiently identify these droplets, reducing noise and improving the accuracy of their analyses. In the next part of this series, we will explore methods to detect doublets—droplets that contain more than one cell. Stay tuned!
By following these guidelines, you can enhance the quality of your scRNA-seq data, paving the way for more reliable and insightful biological discoveries.
References
Hong R, Koga Y, Bandyadka S, Leshchyk A, Wang Y, Akavoor V, Cao X, Sarfraz I, Wang Z, Alabdullatif S, Jansen F, et al. Comprehensive generation, visualization, and reporting of quality control metrics for single-cell RNA sequencing data. Nature Communications. 2022 Mar 30;13(1):1688.
Lun AT, Riesenfeld S, Andrews T, Dao TP, Gomes T, Participants in the 1st Human Cell Atlas Jamboree, Marioni JC. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biology. 2019 Dec;20:1-9.