Molecular Medicine Israel

Transcriptional linkage analysis with in vivo AAV-Perturb-seq

Abstract

The ever-growing compendium of genetic variants associated with human pathologies demands new methods to study genotype–phenotype relationships in complex tissues in a high-throughput manner1,2. Here we introduce adeno-associated virus (AAV)-mediated direct in vivo single-cell CRISPR screening, termed AAV-Perturb-seq, a tuneable and broadly applicable method for transcriptional linkage analysis as well as high-throughput and high-resolution phenotyping of genetic perturbations in vivo. We applied AAV-Perturb-seq using gene editing and transcriptional inhibition to systematically dissect the phenotypic landscape underlying 22q11.2 deletion syndrome3,4 genes in the adult mouse brain prefrontal cortex. We identified three 22q11.2-linked genes involved in known and previously undescribed pathways orchestrating neuronal functions in vivo that explain approximately 40% of the transcriptional changes observed in a 22q11.2-deletion mouse model. Our findings suggest that the 22q11.2-deletion syndrome transcriptional phenotype found in mature neurons may in part be due to the broad dysregulation of a class of genes associated with disease susceptibility that are important for dysfunctional RNA processing and synaptic function. Our study establishes a flexible and scalable direct in vivo method to facilitate causal understanding of biological and disease mechanisms with potential applications to identify genetic interventions and therapeutic targets for treating disease.

Main

Advances in single-cell CRISPR screening methods are making it possible to study complex genotype–phenotype landscapes in a high-throughput manner5,6. The combination of pooled CRISPR libraries, lentiviral delivery and single-cell omics has been applied in vitro to study protein misfolding7, gene regulation6 and immunity8 as well as in vivo to study mouse neurodevelopment9. Although these efforts have fundamentally changed our ability to investigate the genetic networks underlying complex cellular processes, current methods are restricted to in vitro applications or a very narrow range of developmental timepoints, tissues and cell types conducive to lentiviral infection in vivo. A general framework for broadly applicable direct in vivo single-cell screens is urgently needed to enable the systematic interrogation of the growing catalogue of disease-associated risk alleles in disease-relevant cells and tissues10, understand their causality, function and pathology, as well as develop new diagnostics and therapeutics1.

To address this challenge, we developed AAV-Perturb-seq, an AAV-based single-cell or single-nucleus CRISPR screening method that is simple to implement, tuneable and broadly applicable for in vivo functional genomics studies. We achieved this by creating a recombinant AAV vector for efficient guide RNA (gRNA) expression and detection within single-cell libraries as well as optimizing delivery and transgene expression for obtaining large numbers of single nuclei infected by single viruses from complex tissues. The use of AAVs for in vivo delivery offers many advantages over previous lentivirus-based screening approaches that are commonly used in vitro11, including the possibility of systemic delivery through intravenous injections leading to the targeting of a wide range of tissues and cell types in animals of any age in a tuneable manner. We applied AAV-Perturb-seq, using either gene editing in LSL-Cas9 mice12 or transcriptional inhibition in dCas9-KRAB mice, to systematically investigate the genotype–phenotype landscape of individual genes linked to 22q11.2 deletion syndrome (hereafter, 22q11.2DS)—a complex genetic disorder that affects numerous organs, including the brain, in which dysfunction is typically clinically expressed as schizophrenia or autism spectrum disorder (ASD)3,4. Using our data-analysis pipeline, we extracted high-quality transcriptomes spanning perturbations and brain cell types, enabling us to highlight previously underappreciated genetic contributions and identify new cellular phenotypes that may contribute to 22q11.2DS pathology. Our results establish AAV-Perturb-seq as a robust methodology for transcriptional linkage analysis and systematic transcriptional profiling of genotype–phenotype landscapes in vivo.

In vivo single-nucleus CRISPR screening

Towards creating a robust and broadly applicable direct in vivo single-cell CRISPR screening platform, we reasoned that it must have the following features: (1) simple to apply in mouse models; (2) relevant to a broad range of tissues and cell types yet also tuneable to subsets of interest; (3) capable of inducing efficient genetic perturbations and recovering this information with a transcriptomic readout; and (4) delivery enabling low multiplicity of infection such that single cells receive single perturbations (Fig. 1a). We hypothesized that systemic AAV-mediated delivery may offer each of these features and we therefore set out to establish and characterize this approach in vivo in the mouse brain.

To test whether AAVs permit infection of many cells at a low multiplicity of infection, we performed an in vivo titration experiment. We prepared three AAV transfer plasmids to independently express mTagBFP, Venus or mCherry under the control of a ubiquitous CBh promoter (Extended Data Fig. 1a). Each fluorescent protein was additionally fused to a KASH domain, which physically attaches proteins to the nuclear membrane, therefore enabling nucleus sorting. In each case, we used the AAV.PHP.B capsid13 to achieve brain-wide infection after systemic delivery in LSL-Cas9 mice. We injected an equal mixture of the three viruses through the tail vein at a low (2.5 × 10⁹), medium (5.0 × 10⁹) or high (2.5 × 10¹⁰) dose of total virus particles (Extended Data Fig. 1b). Flow cytometry analysis of nuclei isolated from brain tissue revealed a direct correlation between the viral dose, the number of infected cells and the multiplicity of infection (Extended Data Fig. 1c,d). For subsequent experiments, we selected the dose of 5.0 × 10⁹ AAV particles per mouse as it maximized the total number of cells infected with a single AAV (Fig. 1b and Extended Data Fig. 1d–f).
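The trade-off behind the dose choice can be illustrated with a simple Poisson model of infection. This model is an assumption made here for illustration and is not described in the source: if each nucleus receives on average λ viral genomes, the fraction receiving exactly one is λe^(−λ). The λ values below are hypothetical and are not measurements from the study.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a nucleus receives exactly k viral genomes."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def infection_summary(lam: float) -> dict:
    """Fractions of uninfected, singly and multiply infected nuclei."""
    none = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {
        "uninfected": none,
        "single": single,
        "multiple": 1.0 - none - single,
        # Among infected nuclei, the share carrying exactly one virus.
        "single_among_infected": single / (1.0 - none),
    }


# Hypothetical mean genomes per nucleus (assumed values, not from the study).
for lam in (0.1, 0.5, 1.0, 2.0):
    s = infection_summary(lam)
    print(
        f"lambda={lam:.1f} single={s['single']:.3f} "
        f"multiple={s['multiple']:.3f} "
        f"single_among_infected={s['single_among_infected']:.3f}"
    )
```

Under this model, the singly infected fraction peaks at λ = 1, whereas the share of infected nuclei that carry only one virus falls steadily as λ rises. A titration such as the one described above locates a workable point on this trade-off empirically.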

We next focused on establishing a method to capture both mRNA and CRISPR gRNA molecules from the same AAV-infected nucleus. The use of nuclei rather than cells permits the study of complex, mature tissues from which good-quality single-cell suspensions are challenging to obtain14. We designed two different strategies in which the gRNA expression cassette was either embedded within an mRNA (pAS006) or expressed independently (pAS088), enabling either 3′ (CROP-seq5) or 5′ (ECCITE-seq15) capture sequencing methods, respectively (Extended Data Fig. 1g,h). We injected AAV.PHP.B containing small pools of ten distinct gRNAs for each construct. Four weeks after injection, we isolated single nuclei and prepared single-nucleus RNA-sequencing (snRNA-seq) libraries using either the 5′ or 3′ capture method for cells infected with pAS088 or pAS006, respectively. The percentage of total nuclei with a gRNA detected was 65% (around 25 unique molecular identifiers (UMIs) per gRNA) and 20% (around 3 UMIs per gRNA) for the 5′- and 3′-based approaches, respectively, and most infected nuclei contained a unique gRNA (Extended Data Fig. 1i–k). Taken together, we established that the 5′-based approach, combining independent gRNA expression (pAS088) with 5′ capture sequencing, best captures mRNA and gRNA information from AAV-infected nuclei and we therefore proceeded with this method for the subsequent experiments (Fig. 1a).

AAV-Perturb-seq of genes at the 22q11.2 locus

We used AAV-Perturb-seq to examine 22q11.2-locus genes in mature somatic cells in the prefrontal cortex of adult mice. Heterozygous deletion of the 22q11.2 locus is one of the most common chromosomal deletions in humans and results in a complex spectrum of phenotypes, including altered neuronal development and function3, but the function(s) of individual 22q11.2 genes in the adult brain are poorly understood. To identify candidate genes that are important for brain function in adult mice, we analysed DropViz16 data to measure the expression of the mouse homologues of 22q11.2 genes. This analysis revealed that 29 of the 37 genes in the locus are expressed in the adult mouse prefrontal cortex, a region that is thought to underlie many of the 22q11.2DS neuropsychiatric manifestations3 (Fig. 1c, Extended Data Fig. 2a and Supplementary Table 1). We designed a CRISPR gRNA library to target each of the 29 adult-expressed genes with two independent gRNAs and included five control gRNAs targeting mouse safe-harbour17 (SH) loci (Supplementary Table 2). We then used this library, packaged within AAV.PHP.B, to perform an AAV-Perturb-seq screen in vivo (Extended Data Fig. 1h).

We first focused our analysis on cell type identification and perturbation assignment (Extended Data Fig. 2b). Clustering analysis using Seurat18 identified expected neuronal and non-neuronal brain cells, highlighting our ability to infect and recover transcriptional information from a broad range of cell types (Fig. 1d and Extended Data Fig. 2c–e). Expression of gRNA molecules was detected in all cell types, most of which contained a single gRNA (Extended Data Fig. 3a–c). Furthermore, we detected all gRNAs in the library, with average numbers of nuclei per gRNA ranging from around 10 in microglia to 400 in interneurons (Fig. 1e and Extended Data Fig. 3d). We did not observe a change in the composition of cell types (Extended Data Fig. 3e). Taken together, our direct in vivo screen experiment resulted in a single-nucleus transcriptomic dataset containing about 60,000 nuclei spanning 6 brain cell types and perturbation of all 22q11.2 genes expressed in the adult prefrontal cortex.
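The perturbation-assignment step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each nucleus comes with a dictionary of gRNA UMI counts, uses a hypothetical detection threshold (`MIN_UMIS`), and sets aside nuclei in which more than one gRNA passes the threshold. All barcodes and gRNA names are invented.

```python
from collections import Counter

MIN_UMIS = 2  # hypothetical detection threshold, not taken from the study


def assign_grna(umi_counts: dict, min_umis: int = MIN_UMIS) -> str:
    """Return the gRNA name, "none" or "multiple" for one nucleus."""
    detected = [g for g, n in umi_counts.items() if n >= min_umis]
    if not detected:
        return "none"
    if len(detected) > 1:
        return "multiple"
    return detected[0]


# Toy input: nucleus barcode -> gRNA UMI counts (all names are invented).
nuclei = {
    "AAAC": {"Dgcr8_g1": 25},
    "AAAG": {"SH_g3": 18, "Comt_g2": 1},  # second gRNA falls below threshold
    "AACT": {"Ufd1l_g1": 12, "Gnb1l_g2": 9},  # two gRNAs pass the threshold
    "AAGT": {},  # no gRNA captured
}

assignments = {barcode: assign_grna(counts) for barcode, counts in nuclei.items()}
summary = Counter(assignments.values())
print(assignments)
print(summary)
```

Only nuclei assigned a single gRNA would carry forward into a perturbation-level analysis under this scheme; the threshold would need tuning against the observed UMI distribution.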

Gene- and cell-specific phenotypes

We developed a data analysis pipeline to associate gRNAs, and therefore genetic perturbations, with cell-type-specific transcriptional phenotypes (Fig. 2a and Extended Data Fig. 2b). For each cell type separately, we created pseudobulk profiles by aggregating nuclei with the same perturbation and used edgeR19 to calculate the pairwise differential expression between the control and each perturbation. We used pseudobulk rather than single-cell-specific methods given that it is less biased towards highly expressed genes and less prone to false-positives20 (Extended Data Fig. 3f). Using our pseudobulk approach, we found substantial transcriptional phenotypes in four perturbations across all neuron types (Dgcr8, Dgcr14, Gnb1l and Ufd1l) (Fig. 2b). Transcriptional phenotype scoring analysis using the Hotelling’s T² statistic21 confirmed the identity of the genes with strong transcriptional phenotypes when perturbed in neurons (Extended Data Fig. 3g). We also observed that all four genes are present within the 1.5 Mb minimal region that is believed to be critical in 22q11.2-related disorders3 (Fig. 1c).
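The aggregation step of a pseudobulk analysis can be sketched as below. This is an illustrative sketch only: it sums raw counts over nuclei that share a cell type and a perturbation and applies a simple counts-per-million normalisation. The record layout, gene names and counts are invented, and the differential expression testing (performed with edgeR in the source) is not reproduced here.

```python
from collections import defaultdict


def pseudobulk(nuclei):
    """Sum per-gene counts over nuclei sharing (cell_type, perturbation)."""
    profiles = defaultdict(lambda: defaultdict(int))
    for nucleus in nuclei:
        key = (nucleus["cell_type"], nucleus["perturbation"])
        for gene, count in nucleus["counts"].items():
            profiles[key][gene] += count
    return {key: dict(genes) for key, genes in profiles.items()}


def cpm(profile):
    """Counts per million, a simple library-size normalisation."""
    total = sum(profile.values())
    return {gene: 1e6 * count / total for gene, count in profile.items()}


# Toy nuclei (invented values); "SH" marks safe-harbour control gRNAs.
nuclei = [
    {"cell_type": "interneuron", "perturbation": "SH",
     "counts": {"GeneA": 3, "GeneB": 7}},
    {"cell_type": "interneuron", "perturbation": "SH",
     "counts": {"GeneA": 1, "GeneB": 9}},
    {"cell_type": "interneuron", "perturbation": "Dgcr8",
     "counts": {"GeneA": 6, "GeneB": 4}},
    {"cell_type": "microglia", "perturbation": "Dgcr8",
     "counts": {"GeneA": 2}},
]

profiles = pseudobulk(nuclei)
for key, profile in sorted(profiles.items()):
    print(key, profile, cpm(profile))
```

Keeping the cell type in the aggregation key mirrors the per-cell-type comparison described above: each perturbation profile is compared only with the control profile from the same cell type.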

Next, we characterized the transcriptional phenotypes resulting from perturbation of Dgcr8, Dgcr14, Gnb1l and Ufd1l across neuron types. The overarching result was that perturbation of each gene led to a largely distinct transcriptional phenotype that was mostly shared across neuron types. Support for this came from: (1) clustering the top 20 (Fig. 2c) or all (Extended Data Fig. 3h) upregulated genes for each perturbation; (2) Augur score analysis, which scores cells on the basis of their dissimilarity to the control condition (Extended Data Fig. 3i); (3) correlation analysis using all differentially expressed genes (DEGs) (Extended Data Fig. 3j); and (4) two-dimensional uniform manifold approximation and projection (UMAP) embeddings, which directly segregated nuclei with different perturbations from each other and from SH control cells (Fig. 2d). Taken together, these observations demonstrate that AAV-Perturb-seq retrieves both mutation- and cell-type-specific signatures and indicate that perturbation of Dgcr8, Dgcr14, Gnb1l and Ufd1l affects specific subsets of unique genes across neuron types.

In vivo gene editing efficiency

To assess whether underperforming gRNAs were confounding our ability to robustly identify perturbed cells, we prepared eight individual AAV.PHP.B viruses expressing gRNAs targeting the four genes with a strong transcriptional change (Dgcr8, Dgcr14, Gnb1l and Ufd1l) and four randomly chosen genes with no apparent transcriptional phenotype (Comt, Med15, Ranbp1 and Pi4ka), and then individually injected these viruses into distinct mice. Analysis of Cas9-mediated mutations (indels) revealed that the percentage of mutated cells was similar across all tested gRNAs, with the majority of edited cells containing frame-shifting loss-of-function mutations in the targeted gene (Fig. 2e and Extended Data Fig. 4a), indicating that gene editing efficiency is not confounding our analysis.
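The distinction between frame-shifting and in-frame indels can be illustrated with a short sketch. It assumes each sequenced allele has already been reduced to a net indel size (positive for insertions, negative for deletions, zero for unedited), and it applies the standard rule that a net change not divisible by three shifts the reading frame. The read values are invented, and effects such as in-frame stop codons or large deletions are not considered.

```python
from collections import Counter


def classify_allele(net_indel: int) -> str:
    """Classify an allele by its net indel size in base pairs."""
    if net_indel == 0:
        return "unedited"
    # A net length change that is a multiple of 3 preserves the reading frame.
    return "in-frame" if net_indel % 3 == 0 else "frameshift"


# Invented net indel sizes for ten reads at one target site.
reads = [0, 0, -1, -1, 1, -3, -7, 2, 0, -6]

calls = Counter(classify_allele(n) for n in reads)
edited = calls["frameshift"] + calls["in-frame"]
print(calls)
print(f"edited fraction: {edited / len(reads):.2f}")
print(f"frameshift among edited: {calls['frameshift'] / edited:.2f}")
```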

While our analysis revealed efficient gene editing and DEGs across perturbations and neuron types, we set out to examine the possibility of another confounding factor—gene editing mosaicism. As not all nuclei expressing a gRNA necessarily carry a loss-of-function mutation, and merging perturbed and non-perturbed transcriptomes into a pseudobulk profile could dampen and/or confound transcriptional phenotypes, we focused on identifying and filtering non-perturbed nuclei from the analysis (Extended Data Fig. 2b). Using the previously detected DEGs for each perturbation as variables (Fig. 2b), we used linear discriminant analysis (LDA) to identify gRNA-containing nuclei with a transcriptional phenotype that is significantly distinct from SH control nuclei. This analysis revealed that, on average, around 50% of nuclei containing a particular gRNA were perturbed (Extended Data Fig. 4b), consistent with our observed gene editing efficiency (Fig. 2e) and expected non-loss-of-function genotypes (Extended Data Fig. 4a). After discarding non-perturbed nuclei and repeating the pseudobulk differential expression analysis, we observed that nuclei filtering increased our sensitivity to detect DEGs without biasing the transcriptional phenotype (Extended Data Fig. 4c)…
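The idea of using LDA to separate perturbed from non-perturbed nuclei can be sketched with a two-class Fisher discriminant. This is a simplified stand-in and not the authors' procedure: it uses two invented DEG expression features, trains on control nuclei versus all nuclei carrying a given gRNA, and calls a nucleus perturbed when its projection lies beyond the midpoint between the projected class means (an assumed threshold; the source describes a significance-based criterion whose details are not given here).

```python
def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def pooled_cov(a, b, mean_a, mean_b):
    """Pooled within-class 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((a, mean_a), (b, mean_b)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[v / dof for v in row] for row in s]


def fisher_direction(cov, mean_a, mean_b):
    """w = cov^-1 (mean_b - mean_a) for the 2x2 case."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    diff = [mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]]
    return [
        (cov[1][1] * diff[0] - cov[0][1] * diff[1]) / det,
        (-cov[1][0] * diff[0] + cov[0][0] * diff[1]) / det,
    ]


def project(w, x):
    return w[0] * x[0] + w[1] * x[1]


# Invented expression of two DEGs (one up, one down on perturbation).
control = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0)]
# Nuclei carrying the same gRNA: a mosaic of edited and unedited nuclei.
guide = [(2.0, 0.2), (2.1, 0.1), (1.9, 0.3), (1.0, 1.0), (1.1, 1.05), (0.95, 0.9)]

m_ctrl, m_guide = mean(control), mean(guide)
w = fisher_direction(pooled_cov(control, guide, m_ctrl, m_guide), m_ctrl, m_guide)
threshold = (project(w, m_ctrl) + project(w, m_guide)) / 2  # assumed cut-off
perturbed = [project(w, x) > threshold for x in guide]
print(perturbed)
print(f"perturbed fraction: {sum(perturbed) / len(perturbed):.2f}")
```

Nuclei flagged as non-perturbed would be dropped before the pseudobulk profiles are rebuilt, which is the filtering step the paragraph above describes.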
