Molecular Medicine Israel

Visualizing Cluster-specific Genes from Single-cell Transcriptomics Data Using Association Plots

Highlights

Visualization of genes highly expressed in a cell cluster is a challenge.

Existing data embedding methods do not easily depict genes characterizing a cluster.

Association Plots visualize gene-cluster associations from single-cell transcriptomics data.

The R package APL implements this concept and allows for interactive visualization.

Abstract

Visualizing single-cell transcriptomics data in an informative way is a major challenge in biological data analysis. Clustering of cells is a prominent analysis step and the results are usually visualized in a planar embedding of the cells using methods like PCA, t-SNE, or UMAP. Given a cluster of cells, one frequently searches for the genes highly expressed specifically in that cluster. At this point, visualization is usually replaced by studying a list of differentially expressed genes.

Association Plots are derived from correspondence analysis and constitute a planar visualization of the features which characterize a given cluster of observations. We have adapted Association Plots to address the challenge of visualizing cluster-specific genes in large single-cell data sets. Our method is made available as a free R package called APL.
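To make the idea of a planar, cluster-specific view of genes more concrete, the following is a minimal sketch and not the APL implementation. It assumes that genes and cells have already been embedded in a common space by correspondence analysis, and that each gene is then described by two numbers: the length of its projection onto the direction of the cluster centroid (x) and its distance from that direction (y). The function name, the gene labels, and the toy coordinates are all hypothetical and serve only as illustration.

```python
import math


def association_coords(gene_coords, cluster_cell_coords):
    """Map each gene to (x, y) relative to a cluster centroid direction.

    Assumption: gene_coords and cluster_cell_coords are coordinates in the
    same space, e.g. as produced by a correspondence analysis that is not
    shown here.
    x = length of the gene vector's projection onto the centroid direction
    y = orthogonal distance of the gene from the centroid direction
    """
    n_cells = len(cluster_cell_coords)
    dim = len(cluster_cell_coords[0])

    # Centroid of the cells that belong to the cluster of interest.
    centroid = [sum(cell[k] for cell in cluster_cell_coords) / n_cells
                for k in range(dim)]
    norm = math.sqrt(sum(v * v for v in centroid))
    if norm == 0.0:
        raise ValueError("cluster centroid lies at the origin")
    unit = [v / norm for v in centroid]

    coords = {}
    for gene, vec in gene_coords.items():
        x = sum(a * b for a, b in zip(vec, unit))
        squared_length = sum(a * a for a in vec)
        # Pythagoras; clamp at zero to guard against rounding error.
        y = math.sqrt(max(squared_length - x * x, 0.0))
        coords[gene] = (x, y)
    return coords


# Toy, made-up coordinates (hypothetical, not real expression data).
cluster_cells = [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
genes = {
    "gene_a": (5.0, 1.0, 0.0),   # far along the centroid direction
    "gene_b": (0.0, 3.0, 4.0),   # orthogonal to the centroid direction
    "gene_c": (-2.0, 0.0, 0.0),  # points away from the cluster
}
plot_coords = association_coords(genes, cluster_cells)
```

Under these assumptions, a gene with a large x and a small y (here the hypothetical gene_a) would be a candidate cluster-specific gene, while genes near the origin or with a large y would not be.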

We demonstrate the application of APL and Association Plots to single-cell RNA-seq data on two example data sets. First, we present how to delineate novel marker genes using Association Plots with the example of Peripheral Blood Mononuclear Cell data. Second, we show how to annotate cell clusters with known cell types using Association Plots and a predefined list of marker genes, with data from the human cell atlas of fetal gene expression. We also compare results from Association Plots to methods for deriving differentially expressed genes, and we show the integration of APL with Gene Ontology enrichment.
