Abstract
Chromatin modifications are linked with regulating patterns of gene expression, but their causal role and context-dependent impact on transcription remain unresolved. Here we develop a modular epigenome editing platform that programs nine key chromatin modifications, or combinations thereof, to precise loci in living cells. We couple this with single-cell readouts to systematically quantitate the magnitude and heterogeneity of transcriptional responses elicited by each specific chromatin modification. Among these, we show that installing histone H3 lysine 4 trimethylation (H3K4me3) at promoters can causally instruct transcription by hierarchically remodeling the chromatin landscape. We further dissect how DNA sequence motifs influence the transcriptional impact of chromatin marks, identifying switch-like and attenuative effects within distinct cis contexts. Finally, we examine the interplay of combinatorial modifications, revealing that co-targeting H3K27 trimethylation (H3K27me3) and H2AK119 monoubiquitination (H2AK119ub) maximizes silencing penetrance across single cells. Our precision-perturbation strategy unveils the causal principles of how chromatin modification(s) influence transcription and dissects how quantitative responses are calibrated by contextual interactions.
Main
Regulation of eukaryotic transcription is guided by a complex interplay between transcription factors (TFs), cis regulatory elements and epigenetic mechanisms. The latter includes chromatin-based systems, most prominently post-translational histone and DNA modifications. Such ‘chromatin modifications’ influence transcription activity by directly altering chromatin compaction, by acting as specific docking sites for ‘reader’ proteins and/or by influencing TF access to cognate motifs1,2,3. As a result, chromatin marks are thought to play a central regulatory role in deploying and propagating gene expression programs during development, while, conversely, aberrant chromatin profiles are linked with gene mis-expression and pathology4,5,6.
Major initiatives have mapped genome-wide chromatin modifications across healthy and diseased cell types, revealing correlations with genomic features and transcription activity7,8,9,10,11,12. For example, H3K4me3 is enriched at active gene promoters, and H3K9 dimethylation (H3K9me2), H3K9me3, H3K27me3 and H2AK119ub are correlated with transcriptional repression, while active enhancers are comarked by H3K4 monomethylation (H3K4me1) and H3K27 acetylation (H3K27ac)13. However, whether the observed correlations indicate causation remains unresolved14,15,16,17. To interrogate the nature of functional relationships, perturbation strategies have been widely deployed, often by manipulating chromatin-modifying enzymes or histone residues5,18,19,20. While insightful, such approaches affect the entire (epi)genome simultaneously and thus render it challenging to distinguish direct from indirect effects. Indeed, chromatin-modifying enzymes also have multiple non-histone substrates21,22 and non-catalytic roles23,24, which further complicates interpretation of their loss of function. Thus, the extent to which chromatin modifications per se causally instruct gene expression states remains an open question.
A deeper understanding of the functional role of epigenetic modifications on DNA-templated processes would be facilitated by the development of tools for precision chromatin perturbations. Epigenome editing technologies that enable manipulation of specific chromatin states at target loci have recently emerged, primarily based around programmable dead Cas9 (dCas9)-fusion systems25,26. For example, p300 and histone deacetylase 3 (HDAC3) have been fused to dCas9 to reciprocally modulate histone acetylation, while other systems have aimed to edit DNA methylation, H3K27me3, H3K4me3 and H3K79me2 (refs. 27,28,29,30,31,32,33,34,35,36). Such pioneering studies provided proof of principle that altering the epigenome can induce at least some changes in gene expression. However, the transcriptional responses to specific marks are generally modest, if detectable at all, and register at only a restricted set of target genes. This may partly reflect technical limitations of current approaches in depositing physiological levels of chromatin modifications, but also implies that their functional impact varies depending on context-dependent influences. Indeed, there is increasing appreciation that factors such as underlying DNA motifs and variants, and the cell type-specific repertoire of TFs, will all modulate the precise impact of a chromatin modification at a given locus37,38. Thus, beyond the principle of causality, it is also important to deconvolve the degree to which each chromatin mark affects transcription levels quantitatively (as opposed to an ON–OFF toggle), how DNA sequence context influences this and the hierarchical relationships involved.
Here, we develop a suite of modular epigenome editing tools to systematically program nine biologically important chromatin modifications to target loci at physiological levels. By coupling this with single-cell readouts, we capture the causal and quantitative impact of specific modification(s) on transcription. We further show that epigenetic marks are linked to each other by hierarchical interplays, act combinatorially, and are functionally influenced by underlying sequence motifs.
Results
A toolkit for precision epigenome editing at endogenous loci
We sought to engineer a modular epigenome editing system that can program de novo chromatin modification(s) to target loci at physiological levels. To achieve this, we exploited a catalytically inactive dCas9 fused with an optimized tail array of GCN4 motifs (dCas9GCN4)39,40. This tethers five scFV-tagged epigenetic ‘effectors’ to genomic targets, thereby amplifying editing activity (Fig. 1a). To program a broad range of chromatin modifications, we built a library of effectors, each comprising the catalytic domain (CD) of a DNA- or histone-modifying enzyme linked with scFV (collectively, CDscFV). By isolating the CD, we can exclude confounding effects of tethering entire chromatin-modifying proteins, which can exert non-catalytic regulatory activity. The toolkit includes catalytic cores that deposit H3K4me3 (Prdm9-CDscFV), H3K27ac (p300-CDscFV), H3K79me2 (Dot1l-CDscFV), H3K9me2 (G9a-CDscFV), H3K36me3 (Setd2-CDscFV), DNA methylation (Dnmt3a3l-CDscFV), H2AK119ub (Ring1b-CDscFV) and full-length (FL) enzymes that write H3K27me3 (Ezh2-FLscFV) and H4K20me3 (Kmt5c-FLscFV) (Fig. 1a). As further controls, we generated catalytic point mutants for each CDscFV effector (mut-CDscFV) that specifically abrogate their enzymatic activity (Extended Data Fig. 1a). Our strategy therefore enables direct assessment of the functional role of the deposited chromatin mark per se.
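For readers who wish to handle the toolkit programmatically, the effector library described above can be summarized as a small lookup structure. The following is a minimal sketch only: the effector names and their marks are taken from the text, whereas the `Effector` class, the `TOOLKIT` list and the `effector_for` helper are hypothetical names introduced here for illustration and are not part of the published system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effector:
    """One scFV-tagged effector from the toolkit (illustrative record type)."""
    name: str          # effector name as given in the text
    mark: str          # chromatin modification it deposits
    full_length: bool  # True for full-length (FL) enzymes, False for isolated catalytic domains (CD)


# Effector-to-mark assignments as listed in the text.
TOOLKIT = [
    Effector("Prdm9-CDscFV", "H3K4me3", False),
    Effector("p300-CDscFV", "H3K27ac", False),
    Effector("Dot1l-CDscFV", "H3K79me2", False),
    Effector("G9a-CDscFV", "H3K9me2", False),
    Effector("Setd2-CDscFV", "H3K36me3", False),
    Effector("Dnmt3a3l-CDscFV", "DNA methylation", False),
    Effector("Ring1b-CDscFV", "H2AK119ub", False),
    Effector("Ezh2-FLscFV", "H3K27me3", True),
    Effector("Kmt5c-FLscFV", "H4K20me3", True),
]


def effector_for(mark: str) -> Effector:
    """Return the effector that writes the given mark (hypothetical helper)."""
    for effector in TOOLKIT:
        if effector.mark == mark:
            return effector
    raise KeyError(f"no effector in the toolkit deposits {mark!r}")


if __name__ == "__main__":
    print(effector_for("H3K27me3").name)
```

Such a table makes explicit that seven of the nine effectors are isolated catalytic domains, while the H3K27me3 and H4K20me3 writers are full-length enzymes.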
We engineered the system to be doxycycline (DOX) inducible for dynamic epigenetic editing and used an enhanced guide RNA (gRNA) scaffold for targeting41. Moreover, all CDscFV effectors were tagged with superfolder green fluorescent protein (GFP) to monitor protein stability, to track dynamics and to isolate epigenetically edited populations (Extended Data Fig. 1b–d). Finally, up to three nuclear localization sequences were incorporated into effectors, as fewer often precluded nuclear accumulation, for example, for Dot1l-CDscFV (Extended Data Fig. 1e).
To test for epigenome editing, we introduced dCas9GCN4 and each CDscFV into mouse embryonic stem cells (ESCs) with the piggyBac system and targeted the endogenous Hbb-y locus with a single gRNA. Following DOX induction, each effector directed significant deposition of its chromatin modification relative to recruitment of GFPscFV, as judged by cleavage under targets and release using nuclease coupled with quantitative PCR (CUT&RUN–qPCR). This includes de novo establishment of H3K27ac (P = 0.0003), H3K4me3 (P = 0.011), H3K79me2 (P = 0.029), H4K20me3 (P = 0.001), H3K27me3 (P = 0.0006), H2AK119ub (P = 0.0002), H3K36me3 (P = 0.001), H3K9me2/3 (P = 0.0002) (Fig. 1b) and DNA methylation (P < 0.0001) (Fig. 1c).
To determine the quantitative level and genomic spreading of installed chromatin marks, we independently assessed enrichment across the entire Hbb-y locus. We observed a peak around the gRNA-binding site, with programmed domains extending >2 kb on either side. Enrichment of targeted histone modifications ranged from sevenfold to >20-fold over background (Fig. 1d–i) and, importantly, was quantitatively comparable to strong positive peaks in most cases. For example, H3K4me3 installation at Hbb-y was equivalent to that at highly marked Pou5f1 (Oct4) and Nanog promoters (Fig. 1d), while de novo H3K27me3 and H2AK119ub were similar to those at Polycomb targets Zic4 and Wnt10a (Fig. 1e,f). Moreover, de novo H3K36me3, H3K79me2 and H4K20me3 were equivalent to endogenous peaks, while H3K9me2/3 and H3K27ac were deposited at moderately lower levels (Fig. 1g–i and Extended Data Fig. 1f). Finally, up to 60% DNA methylation was installed at previously unmethylated promoters (Fig. 1j).
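To illustrate what a fold enrichment over background means for a qPCR-based readout, the sketch below applies the standard relative-quantification relationship, in which each cycle of difference in threshold cycle (Ct) corresponds to one fold of amplification efficiency. This is an assumption-laden illustration rather than the normalization used in this study, which is not specified in this section: the function name `fold_enrichment`, the default efficiency of 2.0 (perfect doubling per cycle) and the Ct values are all hypothetical.

```python
def fold_enrichment(ct_target: float, ct_background: float, efficiency: float = 2.0) -> float:
    """Relative enrichment of a target amplicon over a background amplicon.

    Assumes equal input material for both reactions and a constant
    amplification efficiency per cycle (2.0 = perfect doubling).
    A lower Ct at the target means more template, hence enrichment > 1.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_background - ct_target)


if __name__ == "__main__":
    # Hypothetical Ct values, chosen only to illustrate the arithmetic.
    print(fold_enrichment(ct_target=24.0, ct_background=27.0))  # 2 ** 3 = 8.0
```

Under these assumptions, a three-cycle difference corresponds to eightfold enrichment, which falls within the sevenfold to >20-fold range reported above.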
We did not detect OFF-target chromatin mark deposition at negative (nontargeted) loci with most effectors (Fig. 1d–i and Extended Data Fig. 1f). Indeed, analysis of the highly active Prdm9-CDscFV effector revealed robust H3K4me3 installation at ON-target Hbb-y but only six other de novo sites genome wide, implying that our recruitment strategy largely facilitates ON-target chromatin editing (Extended Data Fig. 2a,b). We further tested for indirect and OFF-target effects at the functional level by performing RNA-seq following induction of each epigenome editing system. We observed no toxicity and only minor changes in global gene expression (Fig. 1k). An exception is p300-CDscFV, which elicited indirect expression changes and reduced cell viability. To mitigate this, we limited p300-CDscFV induction by using DOX at a concentration 20-fold lower (Extended Data Fig. 2c,d). Overall, the data suggest that OFF-target and/or indirect effects are minimized with our modular CDscFV recruitment design.
Thus, we developed a flexible epigenome editing toolkit capable of programming high levels of nine key chromatin modifications to specific endogenous loci. The system includes multiple controls to isolate the causal function of chromatin modifications per se, is compatible with combinatorial targeting, and can track temporally resolved responses and epigenetic memory.
Chromatin modifications can instruct transcriptional outputs
To investigate the direct regulatory role of chromatin modifications on transcription, we initially engineered a reporter system that facilitates quantitative single-cell readouts. We embedded the endogenous Ef1a (Eef1a1) core promoter (212 bp) into a contextual DNA sequence (~3 kb) selected from the human genome to be feature neutral: it carries no transposable elements, has ~50% GC content and has minimal TF motifs (Fig. 2a). We inserted the sequences for this ‘reference’ (REF) reporter into two genomic locations, chosen to be either permissive (chromosome (chr)9) or nonpermissive (chr13) for transcriptional activity (Fig. 2a). Consistently, knock-in to the permissive locus supported strong expression (ON), whereas the nonpermissive landing site resulted in minimal activity (OFF), which partially reflects acquisition of Polycomb silencing (Fig. 2b and Extended Data Fig. 2e,f). These identical reporters residing within distinct genomic locations thus enable assessment of both activating and repressive activity of induced chromatin modifications on the same underlying DNA sequence….