Characterization of human immune cell subpopulations in cerebrospinal fluid using mass cytometry.

Gerardina Gallaccio, Meng Wang, Stephan Schlickeiser, Desiree Kunkel, chotima.boettcher, Camila Fernández-Zapata

Published: 2024-01-30 DOI: 10.17504/protocols.io.36wgqjp4ovk5/v1

Abstract

Phenotypic and compositional changes of immune cells in cerebrospinal fluid (CSF) can be used as biomarkers to help diagnose and track disease activity for neuroinflammatory and neurodegenerative diseases. Here, we describe an end-to-end workflow to perform high-dimensional immune profiling at single-cell resolution using Cytometry by Time-of-Flight (CyTOF) on cells isolated from the CSF of patients with neuroinflammation. We include protocols for sample collection and preparation, barcoding to allow for multiplexing, and downstream data analysis using R.

For complete details on the use and execution of this protocol, please refer to Fernández-Zapata, C. et al.¹

Before start

The CSF hosts a subset of blood immune cells predominantly composed of memory T cells, but also B cells, monocytes, natural killer (NK) cells, unconventional T cells and antigen-presenting cells (APCs).

Routine CSF collection is integral to diagnosing various CNS disorders and offers a practical way to assess the CNS. Immunophenotyping of CSF cells is therefore feasible and can provide better insight into the immune-driven/mediated pathophysiology of many neurological disorders, including neuroinflammation ⁷. Moreover, cellular biomarkers can potentially be used as a diagnostic tool and as a measure of disease activity, thereby facilitating more personalized treatment approaches and improving our understanding of the factors that contribute to disease heterogeneity.

With minimal spillover between channels and no interference from autofluorescence, Cytometry by Time-of-Flight (CyTOF) is a powerful tool for comprehensive characterization of multiple immune cell populations in different body compartments. It allows simultaneous evaluation of over 40 protein markers at the single-cell level. The combination of a comprehensive array of protein markers and unsupervised data analysis provides a powerful strategy for profiling the heterogeneity of human immune cells in health and disease ⁸,⁹.

However, the implementation of CyTOF-based immune phenotyping studies in neuroimmunology research is limited by complex experimental workflows and by variation in sample composition between batches. Therefore, it is important to standardize and streamline the experimental and analysis workflows starting from sample preparation, especially in longitudinal studies with a large cohort of patients. We describe here a standardized and streamlined workflow from sample collection and processing to data acquisition and analysis of small numbers of immune cells in CSF. Historically, it has been technically challenging to perform immune profiling of CSF cells due to small cell numbers (~5,000-15,000 cells/ml), the susceptibility of CSF immune cells (which limits the possibility of cryopreservation), and the limited availability of patient CSF samples. Furthermore, our protocol can also be applied to single cells isolated from other tissues, such as brain, or from peripheral blood.

Of note, this protocol does not describe antibody panel design and titration in detail; these have previously been well described by Thrash et al., STAR protocols (2020) ¹⁰.

Steps

Sample collection and storage

Prepare the anchor sample.

An anchor sample consists of peripheral blood mononuclear cells (PBMCs) used as an internal reference across different measurements/batches to facilitate signal normalization⁸.

Cell types other than PBMCs can also be used as an anchor sample, provided they properly express all the markers of the desired panel(s).

a. Mix isolated PBMCs (1x10⁶) with 250 µL albumin (BSA) (in PBS).

b. Add 350 µL of proteomic stabilizer (PROT1) buffer, mix gently and incubate at room temperature (RT) for 12 min.

c. Immediately store at -80°C.

Note

The number of anchor samples to prepare depends on the number of planned batches. One anchor sample will be added to each batch of pooled samples.

CRITICAL: Centrifugation speed for living cells should not exceed 300 x g. Be careful not to spin too fast during the isolation of PBMCs from whole blood. This precaution helps prevent the high cell loss caused by increased cell lysis and membrane deformation.

CSF sample collection and processing (for 3ml CSF)

CSF samples must be kept on ice and should be processed (i.e., cell isolation and aliquoting) within one hour at most after lumbar puncture. The CSF cell pellet is commonly invisible (due to low cell numbers, commonly about 5,000-10,000 cells/ml), so sample processing must be performed with care to avoid disturbing the cell pellet (see also Troubleshooting Problem 1).

a. Place 3 mL of sample in a 15 mL Falcon polypropylene (PP) tube.

b. Centrifuge at 300 x g, 4°C.

c. Carefully remove the supernatant. To avoid disturbing the cell pellet, leave about 100 µL of sample in the tube. The CSF supernatant may be aliquoted, e.g., for proteomics or metabolomics analysis.

d. Add 400 µL (in PBS) to the CSF cell pellet and mix gently by pipetting up and down about 4-5 times.

e. Add 700 µL of PROT1 buffer, mix gently and incubate at room temperature for 12 min.

f. Immediately store at -80°C.

Note

After the centrifugation (after b.), the CSF sample must be clear and without any blood contamination (indicated by the presence of an erythrocyte pellet). CSF contaminated with blood cells should be excluded from the study (see Troubleshooting Problem 2).

Antibody panel preparation.

The antibody titration and antibody panel validation steps have been described in depth in the previous STAR protocol by Thrash et al.¹⁰ For the steps regarding the validation and conjugation of antibodies, comprehensive information can be accessed through the Standard Biotools website:

https://fluidigm.my.salesforce.com/sfc/p/#700000009DAw/a/4u0000019jXU/6wyoqHHEDHl5D5e0cLOsylAsnfB0hdiCEprKHI9aFj88 ¹¹

a. Preparation of antibody cocktail and storage:

For example, for five batches of pooled samples, a final volume of 500 µl of an antibody cocktail is prepared, consisting of a combination of all antibodies in each panel. Each antibody is diluted in staining buffer according to a validated dilution. The final antibody cocktail is subsequently divided into 5 aliquots (100 µl each) and stored at -80°C. Each pooled sample (i.e., one batch containing max. of 20 individual samples) will be stained with a 100 µl frozen antibody cocktail.
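The cocktail volumes described above can be worked out with a few lines of R. This is only a sketch: the antibody names and dilution factors below are placeholders chosen for illustration, not the validated dilutions of the authors' panel.

```r
# Hypothetical panel: names and validated dilution factors (1:x) are placeholders
dilution_factor <- c(CD3 = 100, CD19 = 200, CD14 = 400)

n_batches  <- 5     # planned batches of pooled samples
aliquot_ul <- 100   # cocktail volume used per batch (uL)
final_ul   <- n_batches * aliquot_ul

# Volume of each antibody stock needed to reach its dilution in the final volume
antibody_ul <- final_ul / dilution_factor

# Staining buffer fills the remainder of the final volume
buffer_ul <- final_ul - sum(antibody_ul)

print(antibody_ul)
print(buffer_ul)
```

Preparing one master cocktail and splitting it into per-batch aliquots means every batch is stained with an identical mixture.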

Barcoding and Staining

For sample barcoding, we use the Cell-ID 20-Plex Pd Barcoding Kit (Standard Biotools), containing a total of 20 different metal combinations, using a 6-choose-3 barcoding scheme (combinations of any 3 of the palladium (Pd) isotopes ¹⁰²Pd, ¹⁰⁴Pd, ¹⁰⁵Pd, ¹⁰⁶Pd, ¹⁰⁸Pd, or ¹¹⁰Pd). Barcoding can also be used as a tool to identify cell-cell doublets formed between cells from different samples. To minimize variability of inter-sample staining and acquisition, we pool all samples (max. of 20 samples) into one batch prior to staining with the frozen antibody cocktail¹², as described below.
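To see why six Pd isotopes yield exactly 20 barcodes, the 6-choose-3 scheme can be enumerated in base R. This is an illustrative sketch; the row order produced here is arbitrary and is not the kit's official barcode key.

```r
# Masses of the six palladium isotopes used for barcoding
pd_isotopes <- c(102, 104, 105, 106, 108, 110)

# Every combination of three isotopes: one row per barcode
barcode_masses <- t(combn(pd_isotopes, 3))

# Binary key: 1 if the isotope is part of the barcode, 0 otherwise
barcode_key <- t(apply(barcode_masses, 1,
                       function(m) as.integer(pd_isotopes %in% m)))
colnames(barcode_key) <- pd_isotopes
rownames(barcode_key) <- paste0("barcode_", seq_len(nrow(barcode_key)))

print(barcode_key)
```

Because every barcode is positive for exactly three isotopes, an event positive for four or more isotopes cannot be a single cell from one sample, which is what makes inter-sample doublets detectable.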

Note

CRITICAL: One limitation of this method is that samples need to be fixed prior to panel staining. Because fixation can alter epitopes, each antibody should be confirmed to still recognize its intended epitope on fixed cells.

4.1.

A maximum of nineteen samples and one anchor sample (all PROT1-fixed samples) are transferred from -80°C on dry ice.

4.2.

Thaw samples at 4°C until completely thawed (takes approximately 20 min).

4.3.

Transfer cells into a 15 ml Falcon PP tube, containing 10 mL (Standard Biotools).

4.4.

Centrifuge at 600 x g. Discard the supernatant.

4.5.

Resuspend cell pellet in 1 mL and transfer the cell suspension into a 1.5 ml Eppendorf tube. Centrifuge at 600 x g, 4°C. Discard the supernatant.

4.6.

Resuspend cell pellet in 1 mL.

Add additional 4 mL.

Pipette gently up and down.

4.7.

Incubate at room temperature for 30 min.

4.8.

Wash twice with 1 mL. Centrifuge at 600 x g. Discard the supernatant.

4.9.

Pool all twenty samples in a new 1.5 mL Eppendorf tube.

Centrifuge at 600 x g. Discard the supernatant.

4.10.

Resuspend the pellet in 20 µL (stock solution = 160 mg/ml, diluted 1:160 in staining buffer).

Incubate for 10 min at 4°C.

4.11.

Add 900 µL (frozen at -80°C) antibody cocktail, resuspend and incubate at 4°C for 30 min.

4.12.

Wash twice with 1 mL and centrifuge at 600 x g, 4°C.

Aspirate the supernatant carefully.

4.13.

Resuspend cell pellet in 500 µL (Pierce, freshly diluted from stock solution in PBS) (max. of 1 million cells per 100 µL; rotate with a volume bigger than 500 µL) and incubate at 4°C.

4.14.

On day 2

Wash cell suspension once with 1 mL.

Centrifuge at 800 x g, 4°C and discard the supernatant.

4.15.

Thaw the frozen (-80 °C) intracellular antibody master mix on ice.

Mix 50 µL with 50 µL (eBioscience).

4.16.

Resuspend the cell pellet in 90 µL of the diluted antibody cocktail. Incubate at room temperature for 30 min.

4.17.

Wash twice in 1 mL, and centrifuge at 800 x g, 4°C.

4.18.

Add 500 µL (diluted 1:1000 in PBS containing 2% FA).

Incubate at room temperature for 20 min.

4.19.

Wash twice with 1 mL.

800 x g, 4°C. Centrifuge at 700 x g at 4°C for 5 min. Discard the supernatant.

4.20.

Keep at 4°C in a maximum of 80 µL at this point until ready for CyTOF measurement.

4.21.

Transfer cells in a maximum of 80 µL onto the strip for washing with MilliQ H2O using the Laminar Wash Mini-1000 system (Curiox Biosystems). Let the cells settle for 30 min, then wash immediately before measurement (9 cycles, FR 5).

4.22.

CyTOF acquisition

Acquisition

In this section, we highlight good practices for achieving optimal data acquisition with the CyTOF instrument. For detailed information, please refer to “Mass Cytometry, Methods and Protocols”, Helen M. McGuire and Thomas M. Ashhurst, 2019¹³. Before each use, a quality control of the instrument should be performed and documented for performance tracking. This quality control should at least include:

5.1.

Contamination check: before the instrument is tuned according to the manufacturer’s instructions, a short preview with ultrapure water will reveal any remaining metal contamination.

5.2.

Quality control: EQ Four Element Calibration Beads from Fluidigm can be run for a defined period of time (e.g., 2 min) to control for yield and sensitivity, and can serve as a quality control before each experiment.

5.3.

Samples check: It is advisable to resuspend samples one by one and run samples with a large number of cells in aliquots of no more than 1 h runs (approx. one million cells), as the cells tend to disintegrate over time if they are kept in water, even if fixed well. It is important to adjust the cell concentration directly before running the sample.

5.4.

After a sample has been acquired, the data are digitized as a raw integrated mass data (IMD) file, which represents a matrix of ion counts for each selected mass channel for every push.

The Fluidigm software then converts the IMD to a flow cytometry standard (FCS) file. The resulting FCS file contains total integrated ion counts for every selected channel for every event and can be analyzed using FlowJo, Cytobank, or other available third-party cytometry analysis software.

Note

Optimal sample concentration might differ depending on the source of the cells or cell type. Isolated cells from liquid biopsies (e.g., blood, CSF, urine) are less prone to form clumps than cells isolated from tissue and can be run at higher concentrations. The maximum acquisition rate is around 300-500 events/s.

CRITICAL: Good cell fixation is extremely important to allow cells to withstand the hypotonic stress associated with the final water washes prior to sample acquisition; all stained samples should be post-fixed with freshly diluted 1-4% formaldehyde in PBS. Inadequate fixation will result in sample degradation, which may manifest as significant loss of cells during the water wash. Even in appropriately fixed samples, exposure to water for prolonged periods of time will cause cellular degradation.

Cell signal intensity may decrease during acquisition as result of instrument performance or sample degradation. Changes in instrument performance can occur between samples and even within a single acquisition of a sample. This can be due to gradual loss of detector sensitivity, or changes in plasma ionization efficiency. Fluidigm’s EQ Four Element Calibration Beads are polymer beads that contain known standards of four elements at natural isotopic abundance (cerium, europium, holmium, and lutetium). Several metrics can be evaluated using EQ bead-derived signals to track daily instrument performance and to monitor changes during sample acquisition using conventional cytometry analysis software or automated tools. While instrument performance can be tracked using EQ beads, cell-specific degradation cannot. Once a sample has been acquired and undergone initial quality control several steps must be taken to process the data for analysis. As mentioned previously, instrument performance can vary within a single sample. To address this issue, EQ bead normalization must be performed to account for this technical variability in order to better represent real biological differences between samples. These beads allow for monitoring of instrument performance and for normalization of signal intensity to account for fluctuations over time or variations between instruments.

Data pre-processing

Debarcoding

In this section we describe the gating strategy to identify individual samples. Raw data (.FCS files) are generated after CyTOF acquisition. Each raw data file (.FCS file) comprises 20 barcoded samples, which are then analyzed using FlowJo software to remove beads, clogs and dead cells. For debarcoding, Boolean gating is used in FlowJo to deconvolute individual samples according to their barcode combination. All de-barcoded samples are then exported as individual FCS files for further analysis¹,⁸.
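The logic behind Boolean debarcoding can be illustrated in base R. This is a toy sketch under stated assumptions, not the FlowJo gating itself: the channel names, the four example events and the positivity threshold of 20 are all hypothetical, and the barcode order is arbitrary rather than the kit's official key.

```r
# Hypothetical names for the six Pd barcode channels
pd_channels <- c("Pd102", "Pd104", "Pd105", "Pd106", "Pd108", "Pd110")

# Toy events (rows) with made-up intensities in the barcode channels
events <- rbind(
  c(120,  95, 110,   2,   1,   3),  # positive for Pd102/Pd104/Pd105
  c(  1,   2, 130,   3, 140, 125),  # positive for Pd105/Pd108/Pd110
  c(150, 140, 135, 128,   2,   1),  # four positive channels: likely a doublet
  c(  3,   2,   1,   2,   4,   1)   # no barcode signal: debris
)
colnames(events) <- pd_channels

# Boolean "gate": a channel is positive above a hypothetical threshold
threshold <- 20
positive  <- events > threshold

# All valid barcodes (6-choose-3), written as labels such as "Pd102+Pd104+Pd105"
key        <- t(combn(pd_channels, 3))
key_labels <- apply(key, 1, paste, collapse = "+")

# An event is assigned only if exactly three channels are positive
event_labels <- apply(positive, 1, function(p) {
  if (sum(p) != 3) return(NA_character_)
  paste(pd_channels[p], collapse = "+")
})

# Index of the matching barcode, or NA for unassigned events
sample_index <- match(event_labels, key_labels)
print(sample_index)
```

Events that are positive for more or fewer than three Pd channels stay unassigned, which mirrors how doublets and debris fall outside every Boolean gate.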


Figure 1. Representative plots of sample barcoding, staining and gating strategy. Samples are barcoded, pooled and stained with a panel of metal-conjugated antibodies and acquired on the CyTOF instrument. Prior to the data analysis each individual sample is de-barcoded on FlowJo. Figure adapted with BioRender.com.

Install tools/software, packages and libraries.

This protocol utilizes the R environment for statistical computing and data visualization. Depending on your operating system, download the necessary software from the CRAN repositories (https://cran.r-project.org/) and the RStudio website. Install the required R tools accordingly.

After installing the latest software version, install all the R packages specific to CyTOF data analysis, by executing the following command on the RStudio console. Here we show the libraries used to execute our workflow. The key resources table lists the R packages used.

7.1.

Install the packages

# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

# Install the Bioconductor packages used in this workflow
BiocManager::install(c("CATALYST", "flowCore", "flowWorkspace", "CytoML",
                       "flowFP", "diffcyt", "scater"))

Note

Make sure to install the latest version of the R environment; previous versions could return errors.

Normalization:

Because sample availability is limited, human immunology studies often involve sample collection over the course of months to years. To collect a dataset large enough for powerful statistical testing, it is necessary to process and run samples in multiple batches over a period of time. A barcoding approach allows multiple samples to be stained together in one tube, reducing the intra-barcode technical variability and optimizing data acquisition speed and efficiency (resulting in decreased cell loss), as the pool constitutes a single sample run on the instrument. However, only 20 samples can be measured per barcode set, so multiple barcode sets (batches) are still required to address questions in robustly powered study designs. To improve the integration of data from different batches, and thus minimize the batch effect, we integrate a signal-normalization step in the data analysis workflow. For detailed information, please refer to the related paper “Minimizing Batch Effects in Mass Cytometry Data, Schuyler et al 2019”¹⁴.

8.1.

Before starting the Normalization step, create a folder containing:

The script “Normalization_R”;

The ChannelsToAdjust_example.txt file containing a list of the channel names used in the experiment;

The function BatchAdjust_R, which can be downloaded from https://github.com/CUHIMSR/CytofBatchAdjust.

The compensated FCS files in a subfolder named “Files”.

8.2.

Rename your FCS files:

The FCS files will be divided into batch groups, e.g., Batch1, Batch2, Batch3, etc. Each batch group must contain an anchor sample. Rename the FCS files by adding the keyword “Batch” followed by the number of the batch group they belong to, and add the word “anchor” to every anchor sample.

Note

Pause point: Take a look at this explanatory example before proceeding. You have a series of FCS files that belong to Batch 1 and 2. You could rename the files as follows.

Sample_number_batchNumber_SampleName:

Sample1_Batch1_013.fcs
Sample2_Batch1_014.fcs
Sample3_Batch1_015.fcs
Sample4_Batch2_033.fcs
Sample5_Batch2_034.fcs

Do the same for the anchor samples. You must have only one anchor sample for each batch.

Number of anchor sample_number of batch_anchor:

Sample01_Batch1_anchor.fcs
Sample02_Batch2_anchor.fcs
Sample03_Batch3_anchor.fcs

(Sample01, Sample02, Sample03 if they belong to Batch1, Batch2, Batch3)
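Before running the normalization, it can help to verify the naming convention programmatically. The following base R sketch checks that each batch contains exactly one anchor file; the file names are the example names from this step, and in practice the vector could be obtained with something like list.files("Normalization/Files", pattern = "\\.fcs$") (the folder path is an assumption).

```r
# Example file names following the Sample_Batch_Name convention
fcs_files <- c("Sample1_Batch1_013.fcs", "Sample2_Batch1_014.fcs",
               "Sample3_Batch1_015.fcs", "Sample4_Batch2_033.fcs",
               "Sample5_Batch2_034.fcs", "Sample01_Batch1_anchor.fcs",
               "Sample02_Batch2_anchor.fcs")

# Extract the batch label (e.g. "Batch1") from each file name
batch_id <- sub(".*_(Batch[0-9]+)_.*", "\\1", fcs_files)

# Flag the anchor samples by keyword
is_anchor <- grepl("anchor", fcs_files, fixed = TRUE)

# Count anchors per batch; every batch must have exactly one
anchors_per_batch <- tapply(is_anchor, batch_id, sum)
print(anchors_per_batch)
stopifnot(all(anchors_per_batch == 1))
```

A batch with zero or several anchors makes the final stopifnot() fail, which flags the problem before any file is normalized.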

8.3.

Run the script.

# Change the directory accordingly!
# Note: you must have a basedir with your original files ONLY; the function will create an empty output folder called "Out".
# You need a .txt file with your channels to adjust; check the example and adapt it to your own panel.
# The batch keyword and anchor keyword are very important; your file names must contain them.

library(flowCore)

# Load the BatchAdjust function
source("Normalization/BatchAdjust_.R")

# Directory containing the original files
filedir <- "Normalization/Files/"

# Normalize the files according to the batches and anchor samples
# Choose the percentile method, 95th or 80th (written as "95p" or "80p" in CytofBatchAdjust)
BatchAdjust(basedir = filedir,
            outdir = "C:/Users/admin/Desktop/Normalization/Out",
            channelsFile = "Normalization/ChannelsToAdjust_example.txt",
            batchKeyword = "Batch", anchorKeyword = "anchor",
            method = "95p")

Note

Use the 95th percentile as the high end for our normalization target point to avoid outliers, and 80th percentile as the low end.
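The idea behind percentile-based batch adjustment can be sketched in base R for a single channel. This is a simplified illustration with simulated intensities, not the BatchAdjust implementation: it assumes the anchor of batch 1 is the reference and that batch 2 was acquired with 30% lower signal.

```r
set.seed(1)

# Simulated intensities of one channel in the anchor samples of two batches
anchor_batch1 <- rexp(5000, rate = 1 / 50)        # reference batch
anchor_batch2 <- rexp(5000, rate = 1 / 50) * 0.7  # batch with lower signal

# A study sample acquired in batch 2 (same 30% signal loss)
sample_batch2 <- rexp(2000, rate = 1 / 50) * 0.7

# Scaling factor: ratio of the 95th percentiles of the two anchors
p <- 0.95
scale_factor <- unname(quantile(anchor_batch1, p) / quantile(anchor_batch2, p))

# Apply the anchor-derived factor to every sample of batch 2
sample_batch2_adj <- sample_batch2 * scale_factor

print(scale_factor)
```

After scaling, the 95th percentile of the batch 2 anchor matches that of the reference anchor, and the same factor is applied to all samples of that batch.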

8.4.

Obtain the output graphs:

Take a look at the pre- and post-normalization variance and verify that the signal intensity of markers in each channel is correctly normalized.

CRITICAL: You might encounter an error that prevents you from obtaining the normalized output files and the corresponding plots. This issue could be related to the name of the channel-markers listed in the text file.


Figure 2. (A) Scaling factor plots. (B) Pre- and post-normalization variance plots. (C) Pre- and post-normalization variance for each marker.

8.5.

Obtain the output of normalized FCS files.

Use the normalized FCS files to proceed with the Compensation step.

Compensation:

This section dives into the evaluation and compensation of signal spillover. Although signal spillover in CyTOF is minimal compared to fluorescence-based technologies, there is still signal crosstalk between channels that can interfere with the interpretation of the data. This spillover is mainly due to natural isotopic impurity (m + 1, m + 2, etc.) and oxidation of elements during measurement (m + 16). The spillover is correlated with the original signal in an approximately linear manner and can be corrected via a process called compensation¹⁶. In parallel to multiplexed sample staining, single stains are generated by staining polystyrene antibody-capture beads (SS-beads). After staining, the beads are pooled and run as a single sample in the mass cytometer. Each bead is assigned to a specific population based on its dominant signal, and the purity of the bead populations is further increased by automatically applying estimated sample-specific cutoffs. In a second step, the spillover matrix is calculated based on the spillover observed for the single-stained populations¹⁶. This workflow primarily relies on the CATALYST package¹⁵ (https://github.com/HelenaLC/CATALYST), which is necessary for performing CyTOF data analysis.

Before starting: Create a folder that contains the SS-beads FCS file, unzipped FCS files, and the script needed for the analysis.

9.1.

Load the libraries needed for this step.

library(CATALYST)
library (flowCore)
library(SingleCellExperiment)

CATALYST performs compensation via a two-step approach comprising identification of single positive populations via single-cell debarcoding (SCD) of single-stained beads (or cells) and estimation of a spillover matrix (SM) from the populations identified, followed by compensation via multiplication of measurement intensities by its inverse, the compensation matrix (CM).
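The relationship between the spillover matrix and the compensation matrix can be made concrete with a toy example in base R. This sketch uses three channels and made-up spillover values; it illustrates the linear model only and does not reproduce CATALYST's estimation or its "nnls" method.

```r
# Toy spillover matrix (rows: emitting channel, columns: receiving channel)
# The values are hypothetical and chosen only for illustration.
chs <- c("Nd142Di", "Nd143Di", "Nd144Di")
sm <- matrix(c(1, 0.03, 0,
               0, 1,    0.02,
               0, 0,    1),
             nrow = 3, byrow = TRUE, dimnames = list(chs, chs))

# True (spillover-free) signal of three cells
true_signal <- matrix(c(100,  0,  0,
                          0, 50,  0,
                         20, 30, 40),
                      nrow = 3, byrow = TRUE, dimnames = list(NULL, chs))

# Measured signal: the true signal spread across channels by spillover
observed <- true_signal %*% sm

# Compensation matrix: the inverse of the spillover matrix
cm <- solve(sm)

# Compensated signal recovers the true signal
compensated <- observed %*% cm
print(round(compensated, 6))
```

In this linear model, cell 1 shows an apparent signal of 3 in Nd143Di before compensation although it carries no Nd143 label, and multiplying by the inverse removes it.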

9.2.

Data organization

Load the single stains data and make sure to have SS_Beads FCS file in the working directory. Data are organized into an object called SingleCellExperiment (SCE)¹⁵ which can be constructed from a directory housing a single or set of FCS files. FCS files are read into R with read.FCS function of the flowCore package and are represented as an object of class flowFrame ¹⁵.

# Load the single-stained beads (SS_Beads) and address the parameters

single_stains <- "SS_Beads_01.FCS"
ss_exp <- read.FCS(single_stains, transformation = FALSE, truncate_max_range = FALSE)

# Extract the masses of the single-stain channels from the parameter descriptions
bc_ms <- as.numeric(gsub("[[:alpha:]]", "", sapply(strsplit(parameters(ss_exp)$desc, "_"), "[[", 1)))
bc_ms <- bc_ms[!is.na(bc_ms)]
# Exclude the masses that do not carry a single stain
bc_ms <- bc_ms[!(bc_ms %in% c(89, 113, 115, 140, 190, 191, 193, 195))]

9.3.

Debarcoding

The debarcoding process commences by assigning a preliminary barcode ID to each event.

a. The assignPrelim function takes either a binary barcoding scheme or a vector of numeric masses as input, and accordingly assigns each event the appropriate row name or mass as ID.

b. Final assignment will be made by applyCutoffs function.

c. plotYields, shows the distribution of barcode separations and yields upon debarcoding as a function of separation cutoff.

#Prepare the data
re <- prepData(ss_exp)

#Assign the preliminary barcode ID
re <- assignPrelim(re, bc_ms, verbose = FALSE)

#Estimate the separation cutoffs and apply them
re <- applyCutoffs(estCutoffs(re))
re <- estCutoffs(x = re)
sep_cutoffs <- metadata(re)$sep_cutoffs
re <- applyCutoffs(x = re, sep_cutoffs = sep_cutoffs)

#Visualize the single-stained bead deconvolution
plotYields(x = re, which = 0)

9.4.

Compensation:

These steps are relevant to the compensation of FCS files.

a. Extract the spillover matrix: The following functions, computeSpillmat and plotSpillmat, provide an estimation and a visualization of the spillover matrix for the channel signal intensities.

re <- computeSpillmat(re)

#Check the channels and metals
sm <- metadata(re)$spillover_matrix
chs <- channels(re)
ss_chs <- chs[rowData(re)$is_bc]
all(diag(sm[ss_chs, ss_chs]) == 1)
all(sm >= 0 & sm <= 1)
custom_isotope_list <- c(CATALYST::isotope_list, list(BCKG = 190))

#Get the spillover matrix plot before the compensation of the datasets
plotSpillmat(re, isotope_list = custom_isotope_list)

Note

The SM is stored in the SCE object as well as the custom_isotope list.

b. The compCytof function compensates mass cytometry-based experiments using a provided spillover matrix.

# Use the "flow" method
re_c <- compCytof(re, sm, method = "flow", isotope_list = custom_isotope_list)
fs <- sce2fcs(re_c)
exp_dat <- exprs(fs)
set.seed(25)
# Subsample 5000 events and apply an arcsinh transformation with cofactor 5
exp_dat <- asinh(exp_dat[sample.int(nrow(exp_dat), 5000), c(7:50)]/5)
# Obtain the first scatter plot matrix
pairs(exp_dat, pch = ".")

c. Select a random sample, for instance “sample1”, within the dataset used.

#Randomly chosen sample: sample1
#Load sample1

sample1 <- read.FCS("sample_01.fcs", transformation = FALSE, truncate_max_range = FALSE)
#Check the info stored in sample1
sample1

#Address all the parameters to sample1 before performing the compensation
sce <- prepData(sample1)
sce <- assignPrelim(sce, bc_ms, verbose = FALSE)

#Look at the information stored in sample1 (desc function) and select the numbers corresponding to the right channels.

exp_dat <- exprs(sample1)
exp_dat <- asinh(exp_dat[sample.int(nrow(exp_dat), 5000), c(1,9:18,20:24,28:35,43,50:61)]/5)

#Get the diagnostic scatter plot before compensation
pairs(exp_dat, pch = ".")

#Perform compensation
sce_c <- compCytof(sce, sm, method = "flow", isotope_list = custom_isotope_list)
fs <- sce2fcs(sce_c)
exp_dat <- exprs(fs)
exp_dat <- asinh(exp_dat[sample.int(nrow(exp_dat), 5000), c(1,9:18,20:24,28:35,43,50:61)]/5)

#Get the scatter plot after compensation
pairs(exp_dat, pch = ".")

Note

This step will generate a scatterplot matrix for the signal in all non-compensated channels. It is important to carefully examine the plot to proceed effectively with the final result of compensation.

Take a close look at this example! The figure depicts an “over-compensated” channel. The spill value on the matrix needs to be decreased.

Figure 3. Example of correct compensation of Spillover Matrix and Scatterplots. Have a look at the first Spillover matrix obtained and try to decrease the values for each channel where needed. The rows represent the x-axis, and the columns represent the y-axis. Manually adjust the values for individual channels in the upper triangle by decreasing them.

d. Modify the spillover matrix, compensate, and plot again:

# e.g.
# change comp matrix for individual channels
 
sm[ "Pr141Di" , "Nd142Di" ]<-0.000 
sm[ "Nd143Di" , "Nd142Di" ]<-0.001 # new value
sm[ "Nd142Di" ,"Nd143Di" ] <-0.001
sm[ "Nd142Di" ,"Gd158Di" ] <-0.002
sm[ "Gd158Di" ,"Nd142Di" ] <-0.002
sm[ "Nd142Di" ,"Nd144Di" ] <-0.002
sm[ "Nd143Di" ,"Nd144Di" ] <-0.001
sm[ "Nd143Di" ,"Sm147Di" ] <-0.000
sm[ "Sm147Di" ,"Nd143Di" ] <-0.000
sm[ "Nd143Di" ,"Tb159Di" ] <-0.002
sm[ "Nd143Di" ,"Nd145Di" ] <-0.001

e. Obtain the Spillover matrix with the new values.

# New spillover matrix with corrected values

metadata(re)$spillover_matrix <- sm
plotSpillmat(re, isotope_list = custom_isotope_list)

# Compensate again with the corrected spillover matrix
sce_c <- compCytof(sce, sm, method = "flow", isotope_list = custom_isotope_list)
fs <- sce2fcs(sce_c)
exp_dat <- exprs(fs)
exp_dat <- asinh(exp_dat[sample.int(nrow(exp_dat), 5000), c(1,9:18,20:24,28:35,43,50:61)]/5)

# Obtain the new scatter plot
pairs(exp_dat, pch = ".")

f. Based on the new spillover matrix compensate all FCS files, previously uploaded in the working directory.

# Files to compensate
files <- dir(pattern = "\\.fcs$")

# You may remove the FCS file with the single stains
#files <- files[!files %in% single_stains]

# Compensate each file with the "nnls" method and save it under a new name
for (file in files) {
  ff_exp <- flowCore::read.FCS(file, transformation = FALSE, truncate_max_range = FALSE)
  ff_exp <- prepData(ff_exp)
  ff_exp <- compCytof(ff_exp, sm, method = "nnls", isotope_list = custom_isotope_list)
  ff_exp <- sce2fcs(ff_exp)
  # Replace the ".fcs" extension with "_comped.fcs" for the output file name
  write.FCS(ff_exp, sub("\\.fcs$", "_comped.fcs", file))
}

Data analysis

10.

Clustering and UMAP visualization

This section explores how to generate a self-organizing map (SOM) where cells are assigned to clusters according to their similarities in marker expression. Here, we show how to perform an unsupervised analysis to generate metaclusters using the FlowSOM and ConsensusClusterPlus algorithms, along with the Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction.

10.1.

Load the package dependencies required for the clustering and UMAP visualization:

library(CATALYST)
library(flowCore)
library(flowWorkspace)
library(CytoML)
library(flowFP)
library(parallel)
library(diffcyt)
library(scater)
library(ggplot2)
library(RColorBrewer)
library(readxl)

10.2.

Load the flowset

# Make sure that the folder "Files" is in the working directory

path <- "Files"

# Get the name of each FCS file in the folder
(fcs.files <- dir(path = path, full.names = FALSE))

# Read all files into a flowSet without transforming the data
fs <- read.flowSet(paste0(path, "/", fcs.files), transformation = FALSE, truncate_max_range = FALSE)

10.3.

Data organization and metadata object:

Create your metadata matrix (called “md” in our script) as an Excel file containing all the features you want to visualize in the analysis.

The “md” must contain the following columns: file_name, sample_id and a condition column (condition_id in our script). Define your “condition” in this step (e.g., group, body compartment, diagnosis, treatment). Save the “md” in the folder that contains the FCS files and the script.

Load the “md” in the R environment.

#Load metadata stored in an Excel file.
md <- read_excel("md.xlsx")
md$sample_id <- factor(md$sample_id)
md$condition_id <- factor(md$condition_id)
md$diagnosis_id <- factor(md$diagnosis_id)

sample_id	condition_id	diagnosis_id	file_name
001	Non-Neuroinflammatory diseases	CON	CON_001.fcs
002	Non-Neuroinflammatory diseases	CON	CON_002.fcs
003	Neuroinflammatory diseases	AD	AD_003.fcs
004	Neuroinflammatory diseases	MS	MS_004.fcs
005	Neuroinflammatory diseases	DEM	DEM_005.fcs

Table 2. Example of a possible metadata matrix (md) used to read the flowset
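For readers who prefer to build or check the metadata directly in R, the following base R sketch recreates the example table as a data.frame and verifies the required columns. The values are the illustrative entries of Table 2; the checks are a suggestion, not part of the original script.

```r
# Example metadata, mirroring the illustrative entries of Table 2
md <- data.frame(
  file_name    = c("CON_001.fcs", "CON_002.fcs", "AD_003.fcs",
                   "MS_004.fcs", "DEM_005.fcs"),
  sample_id    = c("001", "002", "003", "004", "005"),
  condition_id = c(rep("Non-Neuroinflammatory diseases", 2),
                   rep("Neuroinflammatory diseases", 3)),
  diagnosis_id = c("CON", "CON", "AD", "MS", "DEM"),
  stringsAsFactors = FALSE
)

# The columns used later in the workflow must be present
required_cols <- c("file_name", "sample_id", "condition_id")
stopifnot(all(required_cols %in% colnames(md)))

# Each FCS file should be listed exactly once
stopifnot(!anyDuplicated(md$file_name))

# Convert the grouping variables to factors, as in the script
md$sample_id    <- factor(md$sample_id)
md$condition_id <- factor(md$condition_id)
md$diagnosis_id <- factor(md$diagnosis_id)

# Number of samples per condition
print(table(md$condition_id))
```

Keeping sample_id as text in the spreadsheet preserves leading zeros such as "001", which would be lost if the column were read as a number.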

10.4.

Look at the description of the parameters stored in the flowset and extract the information. In this section, you will dive into the preliminary information stored in the flowset (the FCS data used). Therefore, carefully examine them, as this step is essential to understand the dataset and will be used in following analysis.

# Keep the parameter description
fs.desc <- parameters(fs[[1]])@data[,1:2]

# Select channels of interest
umap.ch.idx <- c(6,13:27,31:38,45,51:56,58:62)

# Extract the marker names without their attributes
p.desc <- unname(parameters(fs[[1]])$desc)

# Update flowSet with marker names
for (f in 1:length(fs)) {
  parameters(fs[[f]])$desc <- p.desc
}

# Update the parameter description
fs.desc <- cbind(fs.desc, p.desc, umap = logical(nrow(fs.desc)))
fs.desc$umap[umap.ch.idx] <- TRUE
fs.desc

fsApply(fs, colnames)

# Get an overview and an estimate of the....
(n.fr <- length(fs)) # ...number of samples
(v.events <- fsApply(fs, nrow)) # ...number of events per sample
(min.events <- min(v.events)) # ...minimum number of events
sum(v.events)
cbind(md, v.events)

Note

The isotope, metal and antigen information is stored in the flowSet object (the container for multiple samples).

10.5.

Create the Panel and address the marker information.

In this step, we explore how to create the Panel, which is a data.frame containing columns with the name for each marker present in the input raw data, the targeted protein markers and the marker class (“type” or “state”).

Note

It is important to double-check the marker class specification to achieve a robust clustering analysis. FlowSOM/ConsensusClusterPlus will use the “type” markers to perform the clustering. The “state” markers will be considered as functional markers. Markers referred to as "Type" mainly determine phenotypic differences between cell clusters and are typically the lineage markers. The rest of the markers are listed as "State" and are then used to analyse differential marker expression of each cluster between conditions.

a. Depending on your experiment, change the channel indices assigned to each marker_class.

# Channels and marker names
fcs_colname <- unname(fs.desc$name)
antigen <- unname(fs.desc$p.desc)

#Define the marker classes
#Note: all "type" markers will be used for clustering
marker_class <- rep("none", nrow(fs.desc))

# Select here the markers needed to be named as "type"
marker_class[c(6,13:27,31:38,45,51:56,58:62)] <- "type"
marker_class <- factor(marker_class, levels = c("type", "state", "none"))

#Create the Panel
panel <- data.frame(fcs_colname, antigen, marker_class, stringsAsFactors = FALSE)

b. Additional information:

# If needed, switch "type" markers to "state" markers
# (run this after the sce object has been created in step 10.6)
rowData(sce)$marker_class[c(seq(10, 43), 49)] <- "state"
# Double check!
rowData(sce)$marker_class
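The index-based selection in step a depends on the column order of the panel, so it is easy to misassign a marker when the panel changes. As a minimal sketch, the same classification can be written by antigen name in base R. The antigen names and the type/state split below are hypothetical placeholders, not the panel used in this protocol.

```r
# Hypothetical panel excerpt; in the protocol this would be fs.desc$p.desc
antigen <- c("CD45", "CD3", "CD19", "CD14", "CD69", "Ki-67", "DNA1")

# Hypothetical assignment: lineage markers as "type", functional markers as "state"
type_markers  <- c("CD45", "CD3", "CD19", "CD14")
state_markers <- c("CD69", "Ki-67")

# Fail early if a requested marker is not in the panel (e.g. because of a typo)
missing_markers <- setdiff(c(type_markers, state_markers), antigen)
if (length(missing_markers) > 0) {
  stop("Not in panel: ", paste(missing_markers, collapse = ", "))
}

# Everything not listed explicitly stays "none"
marker_class <- rep("none", length(antigen))
marker_class[antigen %in% type_markers]  <- "type"
marker_class[antigen %in% state_markers] <- "state"
marker_class <- factor(marker_class, levels = c("type", "state", "none"))

# Cross-check the assignment before building the Panel
print(data.frame(antigen, marker_class))
print(table(marker_class))
```

Selecting by name keeps the assignment valid when channels are added or reordered, and the printed table gives a quick count of markers per class.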

10.6.

Create a SingleCellExperiment (sce).

This section shows how to store all data used and returned throughout the differential analysis in an object of the SingleCellExperiment (sce) class.

Be aware:

The function prepData() requires the filenames listed in the md$file_name column to match those in the flowSet.

# Prepare the data
md$file_name <- c(keyword(fs, "FILENAME"))

# Construct SingleCellExperiment
sce <- prepData(fs, panel, md, features = panel$fcs_colname,
                md_cols = list(file = "file_name", id = "sample_id",
                               factors = c("sample_id", "condition_id", "diagnosis_id")),
                panel_cols = list(channel = "fcs_colname", antigen = "antigen",
                                  class = "marker_class"))
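Because the first line of the block above overwrites md$file_name with the names stored in the flowSet, the rows of md must already be in the same order as the samples in fs. Otherwise the metadata would be attached to the wrong files without any error. A minimal base-R sketch of a check that could be run before the overwrite is shown below. The table md_example and the vector fs_file_names are hypothetical stand-ins for md and for the names returned by keyword(fs, "FILENAME").

```r
# Hypothetical metadata and FCS file names (stand-ins for md and the flowSet)
md_example <- data.frame(
  file_name = c("sample_01.fcs", "sample_02.fcs", "sample_03.fcs"),
  sample_id = c("S1", "S2", "S3"),
  stringsAsFactors = FALSE
)
fs_file_names <- c("sample_01.fcs", "sample_02.fcs", "sample_03.fcs")

# File names that appear on one side only
only_in_md <- setdiff(md_example$file_name, fs_file_names)
only_in_fs <- setdiff(fs_file_names, md_example$file_name)

# Same set of names, no duplicates, and identical order
same_set   <- length(only_in_md) == 0 && length(only_in_fs) == 0
no_dups    <- anyDuplicated(md_example$file_name) == 0
same_order <- same_set && identical(md_example$file_name, fs_file_names)

if (!same_set)   warning("File names differ between metadata and flowSet")
if (!no_dups)    warning("Duplicated file names in metadata")
if (!same_order) warning("Metadata rows are not in the same order as the flowSet")
```

If the sets match but the order differs, reordering the metadata with match(fs_file_names, md_example$file_name) is one way to align the two.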

10.7.

Visualization of the results with CATALYST package:

This section explores how to obtain results using the CATALYST pipeline¹⁵.

Details of the procedures are available at https://github.com/HelenaLC/CATALYST.

Overall, the pipeline provides a comprehensive exploratory view of the dataset through plots such as the multidimensional scaling (MDS) plot and the non-redundancy score (NRS) plot. After the clustering step, FlowSOM heatmaps and UMAP plots show how the immune cell populations are distributed across meta-clusters based on their similarity (Figure 4).

# MDS plot
p_mds <- pbMDS(sce, color_by = "condition_id", label_by = NULL, features = "type")

# Marker ranking based on the NRS
p_nrs <- plotNRS(sce, features = "type", color_by = "condition_id")

# Perform the clustering
sce <- cluster(sce, features = "type", xdim = 10, ydim = 10, maxK = 20, seed = 1234)

# Visualize the marker expression per cluster with a FlowSOM heatmap
plotExprHeatmap(sce, features = "type", by = "cluster_id", k = "meta20", bars = TRUE, perc = TRUE)

# Dimensionality reduction
# Run UMAP on at most 1,000 cells per sample (set cells = 500 to use fewer)
sce <- runDR(sce, "UMAP", cells = 1e3, features = "type")

# UMAP plot stratified by clusters
plotDR(sce, "UMAP", color_by = "meta20")

Figure 4. (A) MDS plot of 7 samples. (B) NRS plot showing the observed variance of marker expression in each sample. Each dot represents a per-sample NR score. Whiskers show the minimum and maximum values, the line in the box denotes the median, and the empty black circles are mean NR scores. (C) Cell population abundance, comparing the proportions of cell types across the two conditions to highlight populations present at different ratios. Bars are coloured by cluster ID, and the size of a given stripe reflects the proportion of the corresponding cell type in a given sample. (D) UMAP projection stratified by condition_id. (E) UMAP projection coloured by cluster (clusters 1-10). Each dot represents one cell.

Expected outcomes

11.

Here, we show the workflow strategy to characterize the immune cell populations in CSF samples. Successful completion of the protocol (Figure 5) should enable the generation of different plots for data visualization using the CATALYST pipeline¹⁵.

Figure 5. Workflow strategy to characterize immune cell populations by using CyTOF. Figure adapted with BioRender.com.

Figure 4 presents results and the corresponding data analysis from a small cohort of patients with non-neuroinflammatory disease (CON, n=3) and neuroinflammatory disease (n=4).

The MDS plot shows similarities between samples in an unsupervised manner, whereas the NRS plot ranks markers by their ability to explain the observed variance in each sample. Differences in cell composition between CON and neuroinflammatory disease can be seen in the UMAP plots. To further evaluate the phenotypic differences of immune cells between the two conditions, we performed clustering analysis using the FlowSOM and ConsensusClusterPlus algorithms. A total of ten clusters were identified. Phenotypic differences in CSF B cells (Cluster 1) were detected between the two conditions. The proportion of CD19⁺ B cells was higher in the neuroinflammatory patients than in CON (Figure 4).

Before proceeding further with the analysis, it is good practice to thoroughly examine the features of the dataset. As explained earlier in this protocol, it is important to consider which samples to include in the analysis. For instance, some samples may be outliers or contain only a small number of cells. In our analysis, we required a minimum of 10 cells per sample per cluster for the clustering step. Each operator must choose this threshold according to the type of dataset and the experiment they are working on.
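The threshold of 10 cells per sample per cluster can be checked with a simple cross-tabulation. The sketch below uses simulated per-cell labels in base R. With a clustered sce object, the labels could instead be taken from sce$sample_id and from CATALYST's cluster_ids(sce, "meta20"). The sample names, cluster frequencies and object names are assumptions made for illustration.

```r
set.seed(1)

# Simulated per-cell labels (hypothetical): 4 samples, 5 clusters, 2000 cells
sample_id  <- factor(sample(paste0("S", 1:4), 2000, replace = TRUE))
cluster_id <- factor(
  sample(1:5, 2000, replace = TRUE, prob = c(0.5, 0.3, 0.15, 0.045, 0.005)),
  levels = 1:5
)

# Cells per cluster (rows) and sample (columns)
counts <- table(cluster_id, sample_id)
print(counts)

# Threshold taken from the text: at least 10 cells per sample per cluster
min_cells <- 10

# For each cluster, the number of samples that reach the threshold
n_samples_ok <- rowSums(counts >= min_cells)
print(n_samples_ok)

# Cluster/sample combinations below the threshold, for manual inspection
low <- as.data.frame(counts, responseName = "n_cells")
low <- low[low$n_cells < min_cells, ]
print(low)
```

Clusters that reach the threshold in only a few samples are candidates for merging, exclusion, or cautious interpretation in the downstream tests.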

Statistical analysis

12.

Our method of choice is the edgeR test, which is well suited to datasets with low cell numbers.

edgeR is a Bioconductor package for differential expression analyses. The package implements exact statistical methods for multigroup experiments developed by Robinson and Smyth. It also implements statistical methods based on generalized linear models (GLMs), suitable for multifactor experiments of any complexity.

# Create the design matrix depending on your experiment
# Choose the features you want to test. In this example, diagnosis_id is selected
design <- createDesignMatrix(md, cols_design = "diagnosis_id")

# Create the contrast depending on the experiment and the groups chosen for the comparison (e.g., CON vs MS)
contrast <- createContrast(c(0, 1))

# Perform the test
# Set "min_samples" according to the number of samples in your dataset:
# it refers to the size of the smaller of the two groups being compared.
res_DA_E <- diffcyt(sce, design = design, contrast = contrast,
                    analysis_type = "DA", method_DA = "diffcyt-DA-edgeR",
                    clustering_to_use = "meta20", min_cells = 3, min_samples = 3)

# Show the p-values
(top_DA_E <- topTable(res_DA_E, format_vals = TRUE, all = TRUE, show_counts = TRUE, show_props = TRUE))
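To make the meaning of the design matrix, the contrast and min_samples more concrete, the following base-R sketch rebuilds them for a cohort of the same size as the one described above (3 CON, 4 neuroinflammatory). It uses model.matrix() as a stand-in for a one-factor design and does not call diffcyt itself. The table md_example and the group label "NID" are hypothetical.

```r
# Hypothetical metadata: 3 CON and 4 neuroinflammatory disease ("NID") samples
md_example <- data.frame(
  sample_id    = paste0("S", 1:7),
  diagnosis_id = factor(c(rep("CON", 3), rep("NID", 4)), levels = c("CON", "NID"))
)

# Size of the smallest group: a candidate value for min_samples
group_sizes <- table(md_example$diagnosis_id)
min_samples <- min(group_sizes)
print(group_sizes)
print(min_samples)

# One-factor design: an intercept (reference level CON) plus an indicator for NID
design_example <- model.matrix(~ diagnosis_id, data = md_example)
print(colnames(design_example))

# The contrast c(0, 1) selects the second column,
# i.e. it tests the NID group against the CON reference
contrast_example <- matrix(c(0, 1), ncol = 1)
tested_column <- colnames(design_example)[contrast_example[, 1] == 1]
print(tested_column)
```

The reference level is the first factor level, so reordering the levels of diagnosis_id changes which group the contrast compares against.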

Limitations

13.

Sample staining and preparation for CyTOF can cause a high rate of cell loss. Typically, only 50-70% of the sample can be recovered in the data. Therefore, studies involving rare cell populations require a larger starting sample size for adequate rigor than studies investigating prevalent subsets. The panel design is key to the success of mass cytometry. However, CyTOF experiments are still practically limited to around 60 markers, meaning that researchers must still focus on particular types or functions of cells. Nevertheless, normalization methods and the use of anchor samples can be implemented to improve the reproducibility and comparability of CyTOF results across experiments and study sites¹. Finally, CSF studies usually fall short in providing longitudinal data because repeated lumbar punctures are difficult to justify. Moreover, defining appropriate control groups is another crucial point because CSF samples from strictly healthy participants are usually not available⁶. However, together with magnetic resonance imaging measures, CSF cell analysis can improve diagnostic accuracy and help to estimate individual prognosis.
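The effect of cell loss on rare populations can be estimated with simple arithmetic: the number of starting cells needed is the target number of events divided by the product of the population frequency and the recovery rate. The sketch below applies this to the 50-70% recovery range quoted above. The function name, the target of 100 events and the 1% frequency are hypothetical values chosen for illustration.

```r
# Hypothetical helper: starting cells needed to record a target number of
# events from a population of a given frequency at a given recovery rate
required_input_cells <- function(target_events, frequency, recovery) {
  stopifnot(all(frequency > 0 & frequency <= 1),
            all(recovery > 0 & recovery <= 1))
  ceiling(target_events / (frequency * recovery))
}

# Recovery range taken from the text (50-70%)
recovery_range <- c(low = 0.5, high = 0.7)

# Example: 100 events wanted from a population making up 1% of the cells
needed <- required_input_cells(target_events = 100,
                               frequency = 0.01,
                               recovery = recovery_range)
print(needed)
```

Planning with the lower recovery bound gives the more conservative estimate of the starting material required.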

Troubleshooting

14.

Problem 1:

Due to the small number of cells, the pellet may be very small and difficult to see (before you begin).

Potential solution:

We recommend being careful during this step and always double-checking the supernatant and the pellet. Leave a small amount of the supernatant with the pellet in the Falcon tube.

14.1.

Problem 2:

Contamination of the CSF sample (to begin). After centrifugation, the sample may contain blood droplets.

Potential solution:

Using a blood-contaminated CSF sample will affect the analysis and the reliability of the results. Therefore, we highly recommend excluding any contaminated samples from the analysis.
