Highlights
- hnRNPM is a major RBP that occupies deep introns to repress cryptic splicing
- hnRNPM binding is enriched in LINEs, repressing pseudo splice sites
- LINE/Alu-containing cryptic exons can form dsRNAs, triggering IFN-I responses
- hnRNPM-low tumors show increased cryptic splicing, IFN-I, and immune infiltration
Summary
RNA splicing is pivotal in post-transcriptional gene regulation, yet the exponential expansion of intron length in humans poses a challenge for accurate splicing. Here, we identify hnRNPM as an essential RNA-binding protein that suppresses cryptic splicing by binding to deep introns, thereby maintaining human transcriptome integrity. Long interspersed nuclear elements (LINEs) in introns harbor numerous pseudo splice sites. hnRNPM preferentially binds at intronic LINEs and represses the use of these pseudo splice sites in cryptic splicing. Remarkably, cryptic exons can generate long dsRNAs through base-pairing of inverted Alu transposable elements interspersed among LINEs and consequently trigger an interferon response, a well-known antiviral defense mechanism. Significantly, hnRNPM-deficient tumors show upregulated interferon-associated pathways and elevated immune cell infiltration. These findings unveil hnRNPM as a guardian of transcriptome integrity that represses cryptic splicing and suggest that targeting hnRNPM in tumors may be used to trigger an inflammatory immune response, thereby boosting cancer surveillance.
Introduction
In eukaryotic evolution, the expansion of genome size increases genome complexity and biological diversity. A significant portion of this expansion is accounted for by the net gain of introns. In humans, introns constitute approximately 25% of the genome, representing an extensive expansion compared with lower eukaryotes and other mammals.1,2,3 Transposition by intronic retrotransposable elements, such as long interspersed nuclear elements (LINEs), short interspersed elements (SINEs), and endogenous retroviruses (ERVs), is a major driver of intron expansion across various species,4,5,6 expanding average intron length in human genes to several kilobases.7,8 Long introns harbor numerous pseudo splice sites that are highly similar to annotated splice sites.9,10,11,12 Misuse of pseudo splice sites results in cryptic splicing, which may be detrimental to cell viability and lead to disease.13,14,15,16 Despite the presence of many pseudo splice sites, RNA splicing occurs accurately and precisely. This suggests that potent protective mechanisms exist to repress cryptic splicing and ensure transcriptome integrity.
RNA-binding proteins (RBPs) regulate RNA processing, including splicing, localization, polyadenylation, and translation, through interaction with RNAs and other proteins.17,18,19,20 RBP binding to pseudo splice sites may conceivably suppress cryptic splicing by preventing their usage. One example is hnRNPC, which suppresses aberrant Alu exonization by preventing spurious U2AF65 binding at cryptic splice sites.21 Furthermore, expression of cryptic exons (CryEx) can be deleterious to normal gene expression, a mechanism partly responsible for the contribution of TDP-43 to amyotrophic lateral sclerosis (ALS).22 Conversely, CryEx may evolve into tissue-specific exons regulated by RBPs.23 Despite these findings, the field of cryptic splicing is still in its infancy.
Here, we identify heterogeneous nuclear ribonucleoprotein M (hnRNPM) as a key RBP that represses cryptic splicing. We develop a bioinformatic pipeline that nominates CryEx from RNA sequencing (RNA-seq) datasets. We show that hnRNPM represses a large number of CryEx. These CryEx are enriched in LINEs, and some can form cytoplasmic double-stranded RNAs (dsRNAs) that mimic the viral dsRNA known to elicit interferon (IFN) responses.24,25,26 We further link LINE-associated dsRNAs derived from CryEx to IFN-response-induced tumor immunity.