Quality control analysis for 10X snRNA-seq

Dinh H Diep, Daniel Jacobsen

Published: 2022-07-20 DOI: 10.17504/protocols.io.261genbqjg47/v2

Abstract

Here we describe a computational protocol for performing quality control analysis on shallow sequencing data obtained from 10X snRNA-seq experiments. The workflow starts with raw MiSeq run folders and uses cellranger to generate count matrices. The raw count matrices are then analyzed to generate sequencing saturation plots. These plots are compared against plots from a reference set of libraries of varying quality (bad, fair, good, great), which allows both the sequencing requirements and the overall quality of each 10X snRNA-seq experiment to be assessed.

Attachments

10X_snRNA_preseq_analysis_v1.0.tar.gz

Steps

Install cellranger using instructions from https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation

Note

Make sure that cellranger is in the environment's path, otherwise modify commands to include the full path to cellranger.

Download the tar.gz file from this protocol.

Extract the tar.gz file from this protocol.

<FILENAME> is the name of the downloaded tar.gz file.

#Extract file (linux)
tar -xzf <FILENAME>
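
An incomplete download is a common reason for extraction to fail, so it can help to test the archive before unpacking it. The following is a minimal sketch, not part of the original protocol: the function name extract_checked and the optional destination argument are hypothetical and used only for illustration.

```shell
# Hypothetical helper: verify a tar.gz archive is readable, then extract it.
# Usage: extract_checked <FILENAME> [destination_folder]
extract_checked() {
    archive="$1"
    dest="${2:-.}"   # extract into the current folder unless told otherwise

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # Listing the contents reads the whole archive, so a truncated or
    # corrupt download fails here, before anything is written to disk.
    tar -tzf "$archive" >/dev/null || return 1

    mkdir -p "$dest"
    tar -xzf "$archive" -C "$dest"

    # Show what was extracted.
    ls "$dest"
}

# Example call, using the attachment name listed in this protocol:
# extract_checked 10X_snRNA_preseq_analysis_v1.0.tar.gz
```

The function only wraps the tar command shown above; running tar -xzf directly remains equivalent when the download is known to be intact.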

Install anaconda or miniconda Python distributions following given instructions.

Get Anaconda from https://www.anaconda.com/products/distribution, or

get Miniconda from https://docs.conda.io/en/latest/miniconda.html

Install preseq using given instructions from http://smithlabresearch.org/software/preseq/.

Note

preseq must be in the environment's path.

Preseq requires the GSL libraries. Install GSL using the instructions from https://www.gnu.org/software/gsl/.

Create a symbolic link so that preseq can find the required gsl library.

#Create a symbolic link to GSL library (linux)
sudo ln -s /usr/local/lib/libgsl.so /usr/lib/libgsl.so.0

Install samtools using given instructions from http://www.htslib.org/download/.

Use conda to install bcl2fastq with the following terminal command:

#install bcl2fastq (linux)
conda install -c dranew bcl2fastq

Use conda to install required python packages with the following terminal command:

#Install python packages for 10X_snRNA_preseq_analysis package (linux)
conda install -c conda-forge numpy seaborn matplotlib pandas
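
Several of the steps above require that a tool be in the environment's path. A quick check before running the analysis might look like the sketch below. The function name check_tools is hypothetical, and the tool list simply mirrors the programs installed in the preceding steps; adjust it if your installation uses different executable names.

```shell
# Hypothetical helper: report which command-line tools are visible in PATH.
# Returns the number of missing tools (0 means everything was found).
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Tools installed in the steps above.
check_tools cellranger bcl2fastq preseq samtools \
    || echo "Install the missing tools or add them to PATH before continuing."
```

If a tool is reported as missing even though it is installed, either add its folder to PATH or use the full path to the executable in the commands that follow.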

Run cellranger mkfastq to generate fastq files. Make sure that the following placeholders are set to the correct paths and desired names.

<FASTQ_OUT> is the name of the output folder

<RUN> is the path to the MiSeq run folder

<CSV> is the path to the sample-sheet.csv file

#Generate fastq files from raw run folders (linux)
cellranger mkfastq --id=<FASTQ_OUT> --run=<RUN> --sample-sheet=<CSV>

Run cellranger count. Make sure that the following placeholders are set to the correct paths and desired names.

<ID> is the name of the output folder for the sample

<SAMPLE> is the sample name used for the sample in the sample-sheet.csv file

<REF> is the path to the cellranger reference data folder

<NUM> is the number of expected cells from the experiment

#Generate the count matrix (linux)
cellranger count --id <ID> --fastqs <FASTQ_OUT> --sample <SAMPLE> --transcriptome <REF> --include-introns --expect-cells <NUM>

Run the preseq script in the folder extracted from the tar.gz file downloaded from this protocol. Make sure that the following placeholders are set to the correct paths and names.

<PATH_TO_FOLDER> is the path to the folder that was extracted from the tar.gz file.

<ID> is the name of the output folder for the sample generated with cellranger.

#Generate preseq plots from 10X snRNA output folder (linux)
<PATH_TO_FOLDER>/scripts/loop.preseq.r.sh <ID>
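
The three commands above (mkfastq, count, and the preseq script) can be chained in one small wrapper so that each placeholder is set only once. This is a sketch under stated assumptions rather than part of the original protocol: all default values below are hypothetical examples, the run helper and the DRY_RUN switch are illustrative additions, and the commands themselves are copied unchanged from the steps above. With DRY_RUN=1 (the default here) the commands are only printed, so the placeholders can be checked before anything is executed.

```shell
#!/bin/sh
set -eu

# Placeholder values (hypothetical examples; replace with your own).
RUN="${RUN:-/path/to/miseq_run_folder}"          # <RUN>
CSV="${CSV:-/path/to/sample-sheet.csv}"          # <CSV>
FASTQ_OUT="${FASTQ_OUT:-fastq_out}"              # <FASTQ_OUT>
ID="${ID:-sample1_count}"                        # <ID>
SAMPLE="${SAMPLE:-sample1}"                      # <SAMPLE>
REF="${REF:-/path/to/cellranger_reference}"      # <REF>
NUM="${NUM:-5000}"                               # <NUM>
PATH_TO_FOLDER="${PATH_TO_FOLDER:-/path/to/extracted_folder}"  # <PATH_TO_FOLDER>

# Set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print each command, and run it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Generate fastq files from the raw run folder.
run cellranger mkfastq --id="$FASTQ_OUT" --run="$RUN" --sample-sheet="$CSV"

# Generate the count matrix.
run cellranger count --id "$ID" --fastqs "$FASTQ_OUT" --sample "$SAMPLE" \
    --transcriptome "$REF" --include-introns --expect-cells "$NUM"

# Generate preseq plots from the cellranger output folder.
run "$PATH_TO_FOLDER/scripts/loop.preseq.r.sh" "$ID"
```

Because set -e is active, a failure in mkfastq or count stops the script before the later steps run on incomplete output.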

View outputs.

<ID>.lc_extrap_log.txt contains the preseq statistics

<ID>.lc_extrap_output.png contains the sequencing saturation plots

<ID>/outs/web_summary.html contains the cellranger analyses

<ID>/outs/summary.csv contains the quality statistics generated by cellranger
