Single‐cell RNA sequencing analysis


Single-cell RNA sequencing (scRNA-seq) is becoming a powerful tool for understanding the heterogeneity of cell populations. Single-cell data can be generated using a variety of technologies, such as CEL-seq2, Drop-seq, Smart-seq, Smart-seq2, and 10x Genomics, which is currently the dominant commercial scRNA-seq platform. These technologies differ in several aspects, including the way they capture ribonucleic acid (RNA) sequences, the number of cells they can process, and the types of biological questions they are best suited to answer (Ziegenhain et al. 2017).

scRNA-seq is a variation of RNA-seq used to analyse the transcriptomes of individual cells rather than a mixture of cells in a tissue sample, as is the case in bulk RNA sequencing. It allows the study of gene expression patterns in individual cells, which can provide important insights into the differences between cells in a tissue and the heterogeneity of cell populations (Slovin et al. 2021).

Data upload and preparation

Input fastq folder

Upload a single folder containing all the sequencing data, and select the following format: Fastq folder.


STEP 1 - Checking read quality

This step (task: gws_scomix - Quality Control) assesses the quality of the reads in a sequencing dataset.

The output of this first task is passed to a second task (task: gws_scomix - Aggregated Quality Control), which aggregates the quality control results from multiple analysis tools and displays them in a single, unified report.
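To make "read quality" concrete, here is a minimal sketch of one common metric, the mean Phred score per read, computed directly from FASTQ text. It assumes the standard 4-line FASTQ record layout and Phred+33 encoding; the function name and the two example reads are made up for illustration, and this is not the implementation used by the Quality Control task.

```python
import io
from statistics import mean


def mean_read_qualities(handle, phred_offset=33):
    """Yield (read_id, mean Phred score) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of file
        handle.readline()  # sequence line (not needed for this metric)
        handle.readline()  # '+' separator line
        quality = handle.readline().strip()
        # Each quality character encodes one Phred score (ASCII code - offset).
        scores = [ord(ch) - phred_offset for ch in quality]
        yield header[1:].split()[0], mean(scores)


# Two made-up reads: 'I' encodes Phred 40, '!' encodes Phred 0.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n!!II\n"
)
qualities = dict(mean_read_qualities(example))
print(qualities)
```

For a real gzip-compressed file, the same function can be given a handle opened in text mode with Python's gzip module.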

Files:

Information:

STEP 2 - STAR index: Genome indexing

This step (task: gws_scomix - Building a genome index) generates the genome index files.
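For readers who want to see what genome indexing looks like outside the platform, the sketch below assembles a typical STAR genomeGenerate command line. The file names, read length, and thread count are placeholder assumptions, and the options the task actually passes to STAR may differ.

```python
import shlex


def star_index_command(genome_dir, fasta, gtf, read_length=100, threads=4):
    """Build the argument list for a STAR genome-index run."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,        # output folder for the index files
        "--genomeFastaFiles", fasta,      # reference genome sequence
        "--sjdbGTFfile", gtf,             # gene annotation
        "--sjdbOverhang", str(read_length - 1),  # usual choice: read length - 1
        "--runThreadN", str(threads),
    ]


# Placeholder file names, for illustration only.
cmd = star_index_command("genome_index", "genome.fa", "annotation.gtf")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run once STAR is installed; building it as a list avoids shell-quoting problems with unusual file names.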

Files:

Information:

STEP 3 - STAR solo: scRNA-seq data quantification

This step (task: gws_scomix - STAR solo) quantifies per-cell gene expression by counting the number of reads per gene.

Files:

Information:

- The 10X Chromium whitelist file can be found inside the CellRanger distribution. Please make sure that the whitelist is compatible with the specific version of the 10X chemistry (V1, V2, V3, etc.).
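The chemistry version matters because it determines the barcode and UMI layout that STARsolo must be told about. The sketch below builds a STARsolo command for two common 10x 3' chemistries (16-base cell barcode, with a 10-base UMI for V2 and a 12-base UMI for V3). File names are placeholders, gzip-compressed input is assumed, and the exact options used by the task are not documented here, so treat this as an illustration only.

```python
import shlex

# Barcode/UMI layout per chemistry (assumption: 10x 3' gene expression kits).
CHEMISTRY = {
    "V2": {"cb_len": 16, "umi_len": 10},
    "V3": {"cb_len": 16, "umi_len": 12},
}


def star_solo_command(genome_dir, cdna_fastq, barcode_fastq, whitelist,
                      chemistry="V3", threads=4):
    """Build the argument list for a STARsolo quantification run."""
    layout = CHEMISTRY[chemistry]
    return [
        "STAR",
        "--genomeDir", genome_dir,
        # cDNA read first, then the read holding cell barcode + UMI.
        "--readFilesIn", cdna_fastq, barcode_fastq,
        "--readFilesCommand", "zcat",     # assumes .fastq.gz input
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,   # must match the chemistry version
        "--soloCBstart", "1",
        "--soloCBlen", str(layout["cb_len"]),
        "--soloUMIstart", str(layout["cb_len"] + 1),
        "--soloUMIlen", str(layout["umi_len"]),
        "--soloFeatures", "Gene",
        "--runThreadN", str(threads),
    ]


# Placeholder file names, for illustration only.
cmd_v3 = star_solo_command("genome_index", "sample_R2.fastq.gz",
                           "sample_R1.fastq.gz", "whitelist_v3.txt", "V3")
cmd_v2 = star_solo_command("genome_index", "sample_R2.fastq.gz",
                           "sample_R1.fastq.gz", "whitelist_v2.txt", "V2")
print(shlex.join(cmd_v3))
```

Using a whitelist or UMI length that does not match the chemistry typically results in very few reads being assigned to valid cell barcodes.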

STEP 4 - scRNA-seq downstream analysis

1- Load count matrices

This step (task: gws_scomix - Load count matrices) is the first step of the scRNA-seq downstream analysis: it loads and pre-processes the count matrices using scanpy.

Files:

Information:

2- scRNA-seq data filtration


This step (task: gws_scomix - Data filtration) removes low-quality cells once the matrices have been loaded with the "Load count matrices" task, and generates a combined.h5ad file. Cells and genes are filtered according to the following criteria: min_cells, min_genes, min_n_genes_by_counts, max_n_genes_by_counts, max_pct_counts_mt and max_pct_counts_ribo.
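The following sketch shows how thresholds of this kind act on a count matrix. It uses a made-up 4 x 5 matrix and plain NumPy, and it assumes the usual scanpy-style meaning of the parameters (min_genes: minimum number of detected genes per cell; min_cells: minimum number of cells in which a gene is detected; mitochondrial genes start with "MT-", ribosomal genes with "RPS" or "RPL"). Only a subset of the criteria is shown, the threshold values are arbitrary, and the task's own implementation may differ.

```python
import numpy as np

# Toy count matrix: rows are cells, columns are genes (made-up values).
counts = np.array([
    [5, 3, 0, 2, 0],   # cell 0: healthy
    [0, 0, 0, 1, 9],   # cell 1: mostly mitochondrial counts
    [4, 6, 0, 3, 1],   # cell 2: healthy
    [0, 1, 0, 0, 0],   # cell 3: almost empty
])
genes = np.array(["GENE_A", "GENE_B", "GENE_C", "RPS6", "MT-CO1"])


def filter_counts(counts, genes, min_genes=2, min_cells=2,
                  max_n_genes_by_counts=2500,
                  max_pct_counts_mt=20.0, max_pct_counts_ribo=50.0):
    """Return (filtered matrix, kept-cell mask, kept gene names)."""
    total = np.maximum(counts.sum(axis=1), 1)   # avoid division by zero
    n_genes = (counts > 0).sum(axis=1)          # detected genes per cell
    is_mt = np.char.startswith(genes, "MT-")
    is_ribo = (np.char.startswith(genes, "RPS")
               | np.char.startswith(genes, "RPL"))
    pct_mt = 100 * counts[:, is_mt].sum(axis=1) / total
    pct_ribo = 100 * counts[:, is_ribo].sum(axis=1) / total

    keep_cells = ((n_genes >= min_genes)
                  & (n_genes <= max_n_genes_by_counts)
                  & (pct_mt <= max_pct_counts_mt)
                  & (pct_ribo <= max_pct_counts_ribo))
    kept = counts[keep_cells]
    # Gene filter is applied after the cell filter in this sketch.
    keep_genes = (kept > 0).sum(axis=0) >= min_cells
    return kept[:, keep_genes], keep_cells, genes[keep_genes]


filtered, keep_cells, kept_genes = filter_counts(counts, genes)
print(filtered.shape, keep_cells, kept_genes)
```

Here cell 1 is dropped for its high mitochondrial fraction and cell 3 for having too few detected genes; the genes detected in fewer than two of the remaining cells are then removed.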

Files:

Information:

3- scRNA-seq data integration

This step (task: gws_scomix - Data integration) removes batch effects and then determines clusters with the Leiden algorithm, using the scVI library.

A Clusters_resolution value must also be specified. This parameter controls the granularity of the clustering algorithm, influencing the number and size of the resulting clusters.
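To give an intuition for the resolution parameter, the sketch below scores two candidate partitions of a tiny graph with a modularity-style quality function that includes a resolution term, which is the kind of objective Leiden implementations commonly optimise. It assumes that Clusters_resolution plays this role; the graph, the function names, and the two partitions are made up, and no actual Leiden optimisation is performed.

```python
def modularity(edges, membership, resolution=1.0):
    """Modularity with a resolution term for an unweighted, undirected graph."""
    m = len(edges)
    internal = {}   # edges with both ends in the same cluster
    degree = {}     # summed node degrees per cluster
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - resolution * (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Two triangles (0-1-2 and 3-4-5) joined by a single edge (2-3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
coarse = {node: 0 for node in range(6)}            # one big cluster
fine = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}        # one cluster per triangle


def preferred(resolution):
    """Name of the partition with the higher quality at this resolution."""
    q_fine = modularity(edges, fine, resolution)
    q_coarse = modularity(edges, coarse, resolution)
    return "fine" if q_fine > q_coarse else "coarse"


for resolution in (0.1, 1.0):
    print(resolution, preferred(resolution))
```

With a low resolution the single large cluster scores higher, while a higher resolution favours splitting the graph into two clusters, which mirrors the "fewer, larger clusters" versus "more, smaller clusters" behaviour described above.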

Files:

Information:


4- Automated scRNA-seq cluster annotation

This step (task: gws_scomix - Decoupler clusters annotation) automatically assigns a cell type to each cluster from its marker genes, using the decoupler tool and the PanglaoDB database.
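One way marker-based annotation can work is over-representation analysis: for each candidate cell type, test whether its marker genes are enriched among a cluster's top genes. The sketch below implements that test with a hypergeometric tail probability using only the standard library. It is an illustration of the idea, not the decoupler API; the marker table is a tiny hand-written stand-in for a PanglaoDB-style resource, the background size is arbitrary, and the method the task actually applies may be different.

```python
from math import comb


def ora_pvalue(cluster_genes, marker_genes, n_background):
    """P(overlap >= observed) under the hypergeometric distribution."""
    cluster_genes, marker_genes = set(cluster_genes), set(marker_genes)
    observed = len(cluster_genes & marker_genes)
    n_markers, n_cluster = len(marker_genes), len(cluster_genes)
    upper = min(n_markers, n_cluster)
    favourable = sum(
        comb(n_markers, i) * comb(n_background - n_markers, n_cluster - i)
        for i in range(observed, upper + 1)
    )
    return favourable / comb(n_background, n_cluster)


# Hand-written marker table for illustration (not taken from PanglaoDB).
markers = {
    "T cells": {"CD3D", "CD3E", "CD2"},
    "B cells": {"CD79A", "MS4A1", "CD19"},
}
cluster_top_genes = ["CD3D", "CD3E", "CD2", "IL7R"]  # hypothetical cluster

# Arbitrary background of 2000 genes, assumed for the example.
scores = {
    cell_type: ora_pvalue(cluster_top_genes, genes, n_background=2000)
    for cell_type, genes in markers.items()
}
best = min(scores, key=scores.get)
print(best, scores)
```

The cell type with the smallest p-value is taken as the cluster label; in practice the p-values would also be corrected for multiple testing across cell types.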

Files:

Information:


Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I, Enard W. 2017. Comparative Analysis of Single-Cell RNA Sequencing Methods. Mol Cell 65:631-643.e4.

Slovin S, Carissimo A, Panariello F, Grimaldi A, Bouché V, Gambardella G, Cacchiarelli D. 2021. Single-Cell RNA Sequencing Analysis: A Step-by-Step Overview. Methods Mol Biol 2284:343–365.


Wolf FA, Angerer P, Theis FJ. 2018. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19(1). doi:10.1186/s13059-017-1382-0.

Badia-i-Mompel P, Vélez Santiago J, Braunger J, Geiss C, Dimitrov D, Müller-Dott S, Taus P, Dugourd A, Holland CH, Ramirez Flores RO, Saez-Rodriguez J. 2022. decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinformatics Advances.