Molecular Medicine Israel

A global view of aging and Alzheimer’s pathogenesis-associated cell population dynamics and molecular signatures in human and mouse brains

Abstract

Conventional methods fall short in unraveling the dynamics of rare cell types related to aging and diseases. Here we introduce EasySci, an advanced single-cell combinatorial indexing strategy for exploring age-dependent cellular dynamics in the mammalian brain. Profiling approximately 1.5 million single-cell transcriptomes and 400,000 chromatin accessibility profiles across diverse mouse brains, we identified over 300 cell subtypes, uncovering their molecular characteristics and spatial locations. This comprehensive view elucidates rare cell types expanded or depleted upon aging. We also investigated cell-type-specific responses to genetic alterations linked to Alzheimer’s disease, identifying associated rare cell types. Additionally, by profiling 118,240 human brain single-cell transcriptomes, we discerned cell- and region-specific transcriptomic changes tied to Alzheimer’s pathogenesis. In conclusion, this research offers a valuable resource for probing cell-type-specific dynamics in both normal and pathological aging.

Main

Progressive changes in brain cell populations, which can occur during aging, may contribute to functional decline and increased risks for neurodegenerative diseases such as Alzheimer’s disease (AD)1,2,3,4. Although the recent advances in single-cell genomics have created unprecedented opportunities to explore the cell-type-specific dynamics across the entire mammalian brain5,6,7,8, most prior studies relied on a relatively shallow sampling of the brain cell populations and failed to reveal rare aging or AD-associated cell types. Additionally, they were technically limited in several ways, including failing to recover isoform-level gene expression patterns and the associated chromatin landscape that regulates cell-type-specific alterations across aging stages.

Here, we introduced EasySci, a cost-effective single-cell profiling strategy based on extensive optimization of single-cell RNA sequencing (RNA-seq) by combinatorial indexing9. While the original method has been widely used to study embryonic and fetal tissues10,11, it remains restricted to gene quantification proximal to the 3’ end and limited in efficiency and cell recovery rate11. EasySci provided improved conditions for cell lysis, fixation, sample preservation, enzymatic reaction, oligonucleotide design, and purification methodologies (Supplementary Table 1). Several test conditions were inspired by optimizations described in recently developed or optimized single-cell techniques12,13. The major features of EasySci are as follows: (i) low cost, with 1 million single-cell transcriptomes prepared for ~US $700 (library preparation cost only, not including personnel or sequencing cost; Fig. 1a–c); (ii) reverse transcription (RT) with both indexed oligo-dT and random hexamer primers, which recovers cell-type-specific gene expression with full gene body coverage (Fig. 1d); (iii) a substantially improved cell recovery rate and number of transcripts detected per cell, achieved through optimized nuclei storage, enzymatic reactions and primer design (Fig. 1e and Extended Data Fig. 1); and (iv) an extensively improved single-cell data processing pipeline for both gene counting and exonic counting using paired-end single-cell RNA-seq data (Methods).

Leveraging the technical innovations during the development of EasySci-RNA, we further optimized the single-cell chromatin accessibility profiling method by combinatorial indexing (sci-ATAC-seq3)14,15. The key optimizations include (i) a tagmentation reaction with indexed Tn5 that is fully compatible with the indexed ligation primers of EasySci-RNA; and (ii) a modified nuclei extraction and cryostorage procedure to further increase the library complexity. (A comprehensive quality comparison with other single-cell sequencing assay for transposase-accessible chromatin (scATAC) protocols is shown in Extended Data Fig. 2.) It is noteworthy that the assay for transposase-accessible chromatin with sequencing (ATAC-seq) signal specificity of EasySci-ATAC parallels that of the original sci-ATAC-seq14,15, albeit lower than that of 10x ATAC-seq, potentially due to the indexed Tn5 used in single-cell combinatorial indexing. The detailed protocols for EasySci are included as supplementary files (Supplementary Protocols 1 and 2) to help individual laboratories cost-efficiently generate gene expression and chromatin accessibility profiles from millions of single cells.

Results

A single-cell catalog of the mouse brain in aging and AD

We first applied EasySci to characterize cell-type-specific gene expression and chromatin accessibility profiles across the entire mouse brain, sampled at different ages, sexes and genotypes (Fig. 1f). We collected C57BL/6 wild-type (WT) mouse brains at 3 months (n = 4), 6 months (n = 4) and 21 months (n = 4). To gain insight into the early molecular changes associated with the pathophysiology of AD, two mutants from the same C57BL/6 background at 3 months were included: an early-onset AD (EOAD) model (5xFAD) that overexpresses mutant human amyloid-beta precursor protein and human presenilin 1 harboring multiple AD-associated mutations16; and a late-onset AD (LOAD) model (APOE*4/Trem2*R47H) that carries two of the highest risk factor mutations of LOAD, including a humanized ApoE knock-in allele and missense mutations in the mouse Trem2 gene17,18.

In brief, nuclei were extracted from the whole brain and then deposited to different wells for indexed RT (RNA) or transposition (ATAC), such that the first index indicated the originating sample and assay type of any given well. The resulting EasySci libraries (RNA and ATAC) were sequenced separately, yielding a total of 20 billion paired-end reads. After filtering out low-quality cells and doublets, we recovered gene expression profiles in 1,469,111 single nuclei (a median of 70,589 nuclei per brain sample; Extended Data Fig. 3a) and chromatin accessibility profiles in 376,309 single nuclei (a median of 18,112 nuclei per brain sample, Extended Data Fig. 3b) across conditions. Despite shallow sequencing depth (~4,340 and ~16,000 raw reads per cell for RNA and ATAC, respectively), we recovered an average of 1,788 unique molecular identifiers (UMIs) (RNA, median of 935 UMIs) and 5,515 unique fragments (ATAC, median of 3,918) per nucleus (Extended Data Fig. 3c–f), comparable to other published datasets10,11,14.

With UMAP visualization19 and Louvain clustering20, we identified 31 main cell types by gene expression clusters (a median of 16,370 cells per cell type; Fig. 1g), annotated based on cell-type-specific gene markers2. Each cell type was present in nearly all individuals, except for rare pituitary cells (0.09% of the population), which were absent in 3 out of 20 individuals (Extended Data Fig. 3g). The cell-type-specific fractions in the global cell population ranged from 0.05% (inferior olivary nucleus neurons) to 32.5% (cerebellum granule neurons) (Fig. 1h). An average of 74 marker genes were identified for each main cell type (defined as at least a twofold expression difference between first- and second-ranked cell types; false discovery rate (FDR) of 5%; and transcripts per million (TPM) > 50 in the target cell type; Supplementary Table 2). In addition to the established marker genes, we identified novel markers that were not previously associated with the respective cell types, such as markers for microglia (e.g., Arhgap45 and Wdfy4), astrocytes (e.g., Celrr and Adamts9) and oligodendrocytes (e.g., Sec14l5 and Galnt5) (Extended Data Fig. 3h).
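The marker definition above (at least a twofold difference between the first- and second-ranked cell types, and TPM > 50 in the target cell type) can be illustrated with a minimal sketch. The function name, gene names and TPM values below are hypothetical, and the FDR filter is omitted because it requires a statistical test on the underlying counts that is not described in this section.

```python
def find_markers(tpm, min_fold=2.0, min_tpm=50.0):
    """Return {cell_type: [genes]} for genes passing the fold and TPM cutoffs.

    tpm maps gene -> {cell_type: TPM}. This is an illustrative filter only;
    the FDR criterion from the text is not implemented here.
    """
    markers = {}
    for gene, by_type in tpm.items():
        # Rank cell types by expression of this gene, highest first.
        ranked = sorted(by_type.items(), key=lambda kv: kv[1], reverse=True)
        if len(ranked) < 2:
            continue
        (top_type, top_value), (_, second_value) = ranked[0], ranked[1]
        # Keep the gene if the top cell type is expressed above the TPM cutoff
        # and at least min_fold higher than the second-ranked cell type.
        if top_value > min_tpm and top_value >= min_fold * second_value:
            markers.setdefault(top_type, []).append(gene)
    return markers


# Hypothetical TPM values for three placeholder genes.
example_tpm = {
    "GeneA": {"microglia": 120.0, "astrocytes": 10.0, "oligodendrocytes": 5.0},
    "GeneB": {"microglia": 80.0, "astrocytes": 60.0, "oligodendrocytes": 2.0},
    "GeneC": {"microglia": 1.0, "astrocytes": 30.0, "oligodendrocytes": 4.0},
}
markers = find_markers(example_tpm)
print(markers)
```

In this toy input only GeneA qualifies: GeneB fails the twofold criterion and GeneC fails the TPM cutoff.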

Several integration analyses were performed to validate the recovered cell types across different layers. First, we applied a deep-learning-based strategy21 to integrate transcriptome and chromatin accessibility profiles, yielding 31 main cell types (Fig. 1g). The gene body accessibility and expression of marker genes across cell types were highly correlated (Fig. 1i), as was the fraction of each cell type (Pearson correlation r = 0.95, P = 6.68 × 10^−16) (Fig. 1j). We further investigated the epigenetic controls of the diverse brain cell types through differential accessibility analysis (Extended Data Fig. 4a). We identified a median of 474 differentially accessible peaks per cell type (FDR of 5%, TPM > 20 in the target cell type; Extended Data Fig. 4b,c and Supplementary Table 3). Key cell-type-specific transcription factor (TF) regulators were discovered by correlation analysis between motif accessibility and expression patterns, such as Spi1 in microglia22, Nr4a2 in cortical projection neurons 3 (ref. 23) and Pou4f1 in inferior olivary nucleus neurons24 (Extended Data Fig. 4d).
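The comparison of cell-type fractions between the RNA and ATAC datasets can be sketched as follows. This is a minimal illustration assuming per-cell-type nuclei counts are available for each assay; the cell-type labels and counts are made up, and the P value reported in the text is not computed here.

```python
import math


def cell_type_fractions(counts):
    """Convert {cell_type: number of nuclei} into {cell_type: fraction}."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical nuclei counts per cell type in each assay.
rna_counts = {"type_a": 500, "type_b": 300, "type_c": 200}
atac_counts = {"type_a": 240, "type_b": 160, "type_c": 100}

rna_frac = cell_type_fractions(rna_counts)
atac_frac = cell_type_fractions(atac_counts)

# Compare fractions in a fixed cell-type order shared by both assays.
cell_types = sorted(rna_frac)
r = pearson_r([rna_frac[c] for c in cell_types],
              [atac_frac[c] for c in cell_types])
print(round(r, 3))
```

A value of r close to 1 indicates that the two assays recover the cell types in similar proportions.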

We next integrated our dataset with a 10x Visium spatial transcriptomics dataset through a modified NNLS approach (Methods). As expected, specific brain cell types were mapped to distinct anatomical locations (Fig. 1k,l), especially for region-specific cell types such as cortical projection neurons (clusters 6–8), cerebellum granule neurons (cluster 3) and hippocampal dentate gyrus neurons (cluster 9). These integration analyses confirmed the annotations and spatial locations of main cell types in our single-cell datasets.
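To illustrate the general idea behind non-negative least squares (NNLS) deconvolution of a spatial spot, the sketch below estimates non-negative cell-type weights whose signature mixture best matches a spot's expression. It is plain NNLS solved by projected gradient descent, not the modified approach described in Methods; the function name, signature matrix and spot values are hypothetical.

```python
def nnls_projected_gradient(signatures, spot, n_iter=500):
    """Minimize ||S w - y||^2 subject to w >= 0 by projected gradient descent.

    signatures: list of rows (one per gene), each with one value per cell type.
    spot: expression of the same genes in one spatial spot.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its inverse is a safe step size for the half-squared-error gradient.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    weights = [0.0] * n_types
    for _ in range(n_iter):
        residual = [
            sum(row[k] * weights[k] for k in range(n_types)) - spot[g]
            for g, row in enumerate(signatures)
        ]
        gradient = [
            sum(signatures[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step * d) for w, d in zip(weights, gradient)]
    return weights


# Hypothetical signatures: 4 genes (rows) x 2 cell types (columns).
signatures = [
    [10.0, 0.0],
    [8.0, 1.0],
    [0.0, 9.0],
    [1.0, 7.0],
]
# Spot simulated as 70% of cell type 0 and 30% of cell type 1.
spot = [7.0, 5.9, 2.7, 2.8]

weights = nnls_projected_gradient(signatures, spot)
proportions = [w / sum(weights) for w in weights]
print([round(p, 3) for p in proportions])
```

Normalizing the fitted weights to sum to one gives an estimate of cell-type proportions per spot, which is one common way to map single-cell clusters onto spatial coordinates.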

In-depth view of cellular subtypes in the mammalian brain

Rather than performing subclustering analysis with the gene expression alone, we exploited the unique feature of EasySci-RNA (that is, full gene body coverage) by incorporating both gene counts and exonic counts for principal-component analysis followed by unsupervised clustering. The approach substantially increased the clustering resolution, as shown in a microglia subtype example (Fig. 2a,b). Leveraging this subclustering strategy, we identified a total of 359 subclusters, with a median of 1,038 cells in each group (Fig. 2c). All subclusters were contributed by multiple individuals, with a median of nine exonic markers enriched in each subcluster (Extended Data Fig. 5a,b and Supplementary Table 4). Some subtype-specific exonic markers were not detected by conventional differential gene analysis (for example, Map2-ENSMUSE00000443205.3 in microglia-8; Extended Data Fig. 5c). Notably, our strategy favors detecting extremely rare cell types, such as rare pinealocytes (choroid plexus epithelial cells 7, 21 cells, marked by Tph1 and Ddc25) and tanycytes (vascular leptomeningeal cells-2, 35 cells, marked by Fndc3c1 and Scn7a26) (Extended Data Fig. 5d–g).
