CITE-seq

muon features a module to work with protein measurements:

from muon import prot as pt

CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) is a single-cell method that measures, for each cell, transcriptome-wide gene expression together with surface protein levels, typically for a few dozen proteins. The method is described in Stoeckius et al., 2017 and on the cite-seq.com website.

Normalisation

dsb

Various methods can be used to normalise protein counts in CITE-seq data. muon brings one of the methods developed specifically for CITE-seq, dsb (denoised and scaled by background), to Python CITE-seq workflows. This method uses background droplets, defined by low RNA content, to estimate the background protein signal and remove it from the data. The method is described in Kotliarov, Sparks et al., 2020 and its original implementation is available on GitHub.

pt.pp.dsb(adata_prot, adata_prot_raw, empty_counts_range=...)
# will use cell calling from the filtered matrix

# or

adata_prot = pt.pp.dsb(adata_prot_raw, cell_counts_range=..., empty_counts_range=...)
# will use provided cell_counts_range for cell calling
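
To give an intuition for the "scaled by background" part, here is a minimal NumPy sketch. It is not muon's implementation. It assumes that counts are log-transformed with a pseudocount and that each protein is then centred and scaled by the mean and standard deviation of its log counts in the empty droplets. The function name background_scale, the pseudocount of 10 and the simulated counts are illustrative assumptions, and the denoising step of dsb is left out entirely.

```python
import numpy as np


def background_scale(cell_counts, empty_counts, pseudocount=10.0):
    """Centre and scale log counts per protein using empty-droplet statistics.

    Hypothetical helper for illustration only; rows are droplets,
    columns are proteins. The pseudocount value is an assumption.
    """
    cells_log = np.log(np.asarray(cell_counts, dtype=float) + pseudocount)
    empty_log = np.log(np.asarray(empty_counts, dtype=float) + pseudocount)

    # Per-protein background estimate from the empty droplets
    mu = empty_log.mean(axis=0)
    sd = empty_log.std(axis=0, ddof=1)

    return (cells_log - mu) / sd


# Simulated counts for three proteins (made-up rates, not real data)
rng = np.random.default_rng(0)
empty = rng.poisson(lam=[5, 8, 3], size=(500, 3))   # background droplets
cells = rng.poisson(lam=[50, 9, 3], size=(100, 3))  # cell-containing droplets

scaled = background_scale(cells, empty)
```

In this sketch, a protein whose counts in cells look like the background ends up close to zero, while a protein that is clearly present in cells gets large positive values, expressed in units of background standard deviations.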

CLR

The centered log ratio (CLR) transformation is one of the strategies to normalise protein counts (see e.g. Stoeckius et al., 2017):

pt.pp.clr(mdata['prot'])
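
For reference, the textbook form of the transformation can be sketched in a few lines of NumPy. This is a simplified illustration and not necessarily what pt.pp.clr computes: the pseudocount of 1 used to handle zero counts and the choice to centre within each cell (row) are assumptions, and implementations differ on both points.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log ratio per row: log counts minus their per-row mean.

    Subtracting the mean of the logs is the same as dividing the
    (pseudocount-shifted) counts by their geometric mean before taking the log.
    """
    log_counts = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return log_counts - log_counts.mean(axis=1, keepdims=True)


# Two cells (rows) by four proteins (columns), made-up counts
counts = np.array([[10, 200, 0, 35],
                   [3, 50, 1, 8]])
normalised = clr(counts)
```

Each row of the result sums to zero, so the values describe how abundant a protein is relative to the other proteins measured in the same cell.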