Molecular Medicine Israel

Metazoan MicroRNAs

MicroRNAs (miRNAs) are ∼22 nt RNAs that direct posttranscriptional repression of mRNA targets in diverse eukaryotic lineages. In humans and other mammals, these small RNAs help sculpt the expression of most mRNAs. This article reviews advances in our understanding of the defining features of metazoan miRNAs and their biogenesis, genomics, and evolution. It then reviews how metazoan miRNAs are regulated, how they recognize and cause repression of their targets, and the biological functions of this repression, with a compilation of knockout phenotypes that shows that important biological functions have been identified for most of the broadly conserved miRNAs of mammals.
Introduction
MicroRNAs (miRNAs) are small regulatory RNAs that are processed from stem-loop regions of longer RNA transcripts. Hundreds of different miRNAs have been identified in humans, many of which are conserved in other animals, and these conserved miRNAs have preferentially conserved interactions with most human mRNAs (Friedman et al., 2009). This inferred regulation of most human mRNAs suggests that miRNAs influence essentially all developmental processes and diseases. Indeed, loss-of-function studies disrupting miRNA genes in mice have revealed diverse phenotypes, including defects in the development of the skeleton, teeth, brain, eyes, neurons, muscle, heart, lungs, kidneys, vasculature, liver, pancreas, intestine, skin, fat, breast, ovaries, testes, placenta, thymus, and each hematopoietic lineage, as well as cellular, physiological, and behavioral defects. Many of these developmental and physiological defects affect embryonic or postnatal viability or cause other severe conditions, such as epilepsy, deafness, retinal degeneration, infertility, immune disorders, or cancer. In addition, some miRNA-knockout strains have altered susceptibility to infections, and many have differential responses to mouse models of diseases or injuries.

As with many discoveries of fundamental importance to mammalian development, physiology, and disease, the first known miRNA was not found in humans or other mammals but was instead found in an invertebrate model organism. Molecular geneticists studying the lin-4 and let-7 genes, which are each required for the proper timing of C. elegans development, found that instead of producing mRNAs, these genes produce noncoding RNAs (ncRNAs), including short RNAs ∼22 nt in length (Lee et al., 1993, Reinhart et al., 2000). The lin-4 and let-7 RNAs both had imperfect complementarity to conserved sites within the 3′ UTRs of genetically identified regulatory targets, which led to a model in which these small RNAs mediate translational repression through antisense interactions (Lee et al., 1993, Wightman et al., 1993, Moss et al., 1997, Olsen and Ambros, 1999, Reinhart et al., 2000). The let-7 RNA was subsequently recognized in humans and other bilaterian animals, with temporal expression resembling that observed in C. elegans (Pasquinelli et al., 2000). This discovery showed that these regulatory RNAs were not mere curiosities of worms and led to the idea that additional “small temporal RNAs” might exist to regulate the timing of other developmental transitions (Pasquinelli et al., 2000). Soon thereafter, molecular searches for endogenous small RNAs refined the identities of the lin-4 and let-7 RNAs and revealed that these RNAs were actually part of a much larger class of small RNAs (Lagos-Quintana et al., 2001, Lau et al., 2001, Lee and Ambros, 2001). Members of this class all resembled lin-4 and let-7 RNAs in their small size and potential to be processed from hairpin precursors, but most were not expressed in a temporal manner. Because they were identified without the help of genetics, their functions were not known; what was known is that they were small, and so they were called “microRNAs” (Lagos-Quintana et al., 2001, Lau et al., 2001, Lee and Ambros, 2001).

At this point, interest in these small regulatory RNAs surged. It was clear that there were hundreds of different miRNAs in humans, flies, nematodes, and presumably other animals, which implied that in each of these species there were hundreds of different mRNAs that were regulated. Moreover, these potential regulatory targets were not just mRNAs involved in the timing of development—any mRNA that a biologist was studying might be regulated by one or more miRNAs. In this review, I touch on what has since been learned about the miRNAs of animals, particularly with respect to their biogenesis, genomics, regulation, mechanisms of action, target recognition, and biological functions.

miRNA Biogenesis, Genomics, and Evolution
The metazoan miRNA pathway derived from a more basal RNA-silencing pathway known as RNA interference (RNAi), which appears to have been present in the last common ancestor of eukaryotes and continues to defend against viruses and transposons in many extant eukaryotes (Shabalina and Koonin, 2008). The hallmark innovation of the miRNA pathway is the use of short hairpins to produce defined guide RNAs appropriate for directing the silencing machinery to specific cellular mRNAs, whereas the ancestral RNAi pathway starts with longer double-stranded RNA (dsRNA) precursors that each produce a large diversity of small interfering RNAs (siRNAs) (Bartel, 2004). Given the intrinsic advantages of short hairpins for generating defined guide RNAs appropriate for enlisting RNA silencing for endogenous gene regulation, miRNAs might have been expected to have arisen more than once in eukaryotic evolution. Indeed, miRNAs or miRNA-like RNAs have emerged independently in other diverse eukaryotic lineages, including land plants (Jones-Rhoades et al., 2006), green algae (Molnár et al., 2007, Zhao et al., 2007), brown algae (Cock et al., 2010), filamentous fungi (Lee et al., 2010), and slime mold (Avesson et al., 2012).

Biogenesis of Canonical miRNAs
In animals, canonical miRNAs are transcribed by RNA polymerase II (Pol II) as part of much longer RNAs called “pri-miRNAs” (Lee et al., 2002, Cai et al., 2004, Lee et al., 2004) (Figure 1). Each pri-miRNA has at least one region that folds back on itself to form a hairpin substrate for Microprocessor, a heterotrimeric complex containing one molecule of the Drosha endonuclease and two molecules of its partner protein, DGCR8 (named Pasha in flies and nematodes) (Nguyen et al., 2015). Drosha has two RNase III domains that each cut one strand of the stem of the pri-miRNA hairpin with a 2 bp offset, which liberates a ∼60 nt stem-loop called a “pre-miRNA” (Lee et al., 2003) (Figure 1A). Note that although all canonical pri-miRNAs have a 5′ cap, as expected for Pol II transcripts, they do not necessarily have a poly(A) tail because cotranscriptional processing by Microprocessor can sometimes trigger transcription termination in a manner that preempts normal 3′-end maturation (Ballarino et al., 2009).

After export to the cytoplasm through the action of Exportin 5 and RAN–GTP (Yi et al., 2003, Bohnsack et al., 2004, Lund et al., 2004), the pre-miRNA is further processed by Dicer (Grishok et al., 2001, Hutvágner et al., 2001). Like Drosha, Dicer is an endonuclease with two RNase III domains (Bernstein et al., 2001, Zhang et al., 2004), and like Drosha, it associates with a partner protein, although the Dicer partner protein (named TRBP in mammals and Loquacious in flies) is essential for pre-miRNA processing in flies but not mammals (Ha and Kim, 2014). Dicer cuts both strands near the loop to generate the miRNA duplex, which contains the miRNA paired to its passenger strand (often called the “miRNA∗,” pronounced “miRNA star”). This duplex has a ∼2 nt 3′ overhang on each end, resulting from the offset cuts made by both Drosha and Dicer (Lee et al., 2003, Zhang et al., 2004) (Figure 1A). Once formed, the miRNA duplex is loaded into an Argonaute protein with assistance from chaperone proteins (HSC70/HSP90), which use ATP to help Argonaute assume a high-energy, open conformation suitable for binding the rigid miRNA duplex (Iwasaki et al., 2010). Following loading of the duplex, relaxation of Argonaute back to its ground-state conformation is thought to promote expulsion of the miRNA∗ to form the mature silencing complex (Kawamata and Tomari, 2010).

Which strand of the duplex usually becomes the miRNA (i.e., the guide strand of the silencing complex) and which one is usually discarded and degraded as the miRNA∗ depends on the preferred orientation in which the duplex binds Argonaute. This orientation in turn depends on which strand has the 5′ terminus most suitable for loading into the pocket within Argonaute that binds the 5′-nucleoside monophosphate of guide RNAs. This pocket prefers a 5′-terminal pU or pA (Frank et al., 2010, Suzuki et al., 2015), and it also favors the 5′-nucleoside monophosphate of the strand with the less stable 5′-terminal pairing (Khvorova et al., 2003, Schwarz et al., 2003). Because the preferred loading orientation is independent of the duplex orientation within the pre-miRNA, the strand from either arm of the hairpin can be retained to become the miRNA (Figure 1B). Once loaded into the silencing complex, the miRNA pairs to sites within mRNAs and other transcripts to direct their posttranscriptional repression (Figure 1A).
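
The two strand-selection preferences described above (a 5′-terminal U or A, and weaker pairing at the 5′ end) can be illustrated with a small Python sketch. This is a toy model, not a published predictor: it assumes a gap-free duplex with 2 nt 3′ overhangs, uses hydrogen-bond counts as a crude stand-in for pairing stability, and combines the two preferences with an arbitrary weight. The function names and the example duplex are hypothetical.

```python
# Toy scoring of guide-strand choice in a miRNA duplex.
# Assumptions (not from the text): gap-free duplex with 2-nt 3' overhangs,
# hydrogen-bond counts as a proxy for pairing stability, arbitrary weights.

PAIR_STRENGTH = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def end_stability(strand, partner, n_pairs=4, overhang=2):
    """Sum pair strengths over the first n_pairs positions at the 5' end of strand."""
    total = 0
    for i in range(n_pairs):
        # Index in partner that pairs with strand[i], skipping partner's 3' overhang.
        j = len(partner) - 1 - overhang - i
        if j < 0:
            break
        total += PAIR_STRENGTH.get((strand[i], partner[j]), 0)
    return total


def loading_score(strand, partner):
    """Higher score means more favorable as the guide under this toy model."""
    five_prime_bonus = 4 if strand[0] in "UA" else 0  # arbitrary weight
    return five_prime_bonus - end_stability(strand, partner)


def pick_guide(strand_a, strand_b):
    """Return the strand with the higher toy loading score (ties go to strand_a)."""
    score_a = loading_score(strand_a, strand_b)
    score_b = loading_score(strand_b, strand_a)
    return strand_a if score_a >= score_b else strand_b


# Hypothetical 10-nt duplex: one end is A:U rich, the other G:C rich.
guide = pick_guide("UAAUGCGCUU", "GCGCAUUAUU")
```

In this made-up duplex, the first strand begins with U and sits at the weakly paired end, so the sketch selects it as the guide. Real duplexes contain mismatches and bulges, and a realistic treatment would use nearest-neighbor thermodynamic parameters rather than hydrogen-bond counts.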

Genomics and Evolution of Canonical miRNAs
High-throughput sequencing of small RNAs has transformed miRNA gene discovery (Lu et al., 2005, Ruby et al., 2006). Of course, not every small RNA sequenced from the cell is a miRNA. Some are other types of small RNAs, such as piwi-interacting RNAs (piRNAs) and endogenous siRNAs, which derive from related RNA-silencing pathways (Malone and Hannon, 2009), and others are degradation fragments of longer RNAs. To prevent these other types of small RNAs from being misannotated as miRNAs, stringent analysis pipelines require that the annotated miRNA have a consistent 5′ terminus and map to a potential hairpin supported by reads corresponding to both strands of the miRNA duplex, with its diagnostic ∼2 nt 3′ overhangs (Ruby et al., 2006). Using these criteria, recent analyses have supported the authenticity of many previously annotated genes and identified some new genes, bringing tallies of confidently identified canonical miRNA genes to 147 in C. elegans (Jan et al., 2011), 164 in Drosophila (Fromm et al., 2015), 475 in mouse (Chiang et al., 2010), and 519 in human (Fromm et al., 2015).
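
As a rough illustration of the stringent criteria just described, the following Python sketch checks a candidate hairpin for reads from both strands, a consistent 5′ terminus, and ∼2 nt 3′ overhangs. It is only a sketch under stated assumptions: the input layout (reads as start/end coordinates on the hairpin, plus a position-to-partner pairing map), the function name, and the thresholds `min_5p_fraction` and `overhang_slack` are hypothetical choices, not values taken from the cited pipelines.

```python
from collections import Counter


def passes_stringent_criteria(reads_5p, reads_3p, pair_of,
                              min_5p_fraction=0.5, overhang_slack=1):
    """Apply three simplified annotation checks to one candidate hairpin.

    reads_5p, reads_3p: lists of (start, end) hairpin coordinates
        (0-based, inclusive), one tuple per sequenced read from each arm.
    pair_of: dict mapping a hairpin position to its base-paired position.
    The thresholds are illustrative assumptions.
    """
    # Criterion 1: reads corresponding to both strands of the duplex.
    if not reads_5p or not reads_3p:
        return False

    # Criterion 2: consistent 5' terminus on the more abundant strand.
    major = max(reads_5p, reads_3p, key=len)
    starts = Counter(start for start, _ in major)
    if starts.most_common(1)[0][1] / len(major) < min_5p_fraction:
        return False

    # Criterion 3: ~2-nt 3' overhangs on the dominant duplex.
    (s5, e5), _ = Counter(reads_5p).most_common(1)[0]
    (s3, e3), _ = Counter(reads_3p).most_common(1)[0]
    for own_end, other_start in ((e5, s3), (e3, s5)):
        partner = pair_of.get(other_start)  # position paired with the other strand's 5' end
        if partner is None or abs((own_end - partner) - 2) > overhang_slack:
            return False
    return True


# Hypothetical, perfectly paired 70-nt hairpin: position i pairs with 69 - i.
toy_pairs = {i: 69 - i for i in range(70)}
toy_5p = [(5, 26)] * 90 + [(6, 26)] * 10
toy_3p = [(45, 66)] * 8
accepted = passes_stringent_criteria(toy_5p, toy_3p, toy_pairs)
```

The toy hairpin passes because 90% of the reads from the abundant strand share one 5′ end and each 3′ end extends 2 nt past the partner strand's 5′ end. A real pipeline would also need read mapping, structure prediction, and handling of unpaired loop and bulge positions.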

Although lin-4, let-7, and a couple of other metazoan miRNAs were named based on their mutant phenotypes, the other miRNAs, which were identified before a mutant phenotype was known, were named with numbers, in the order of their discovery (Ambros et al., 2003). Some, albeit imperfect, attempt was also made to give orthologs from different species the same name. For example, 11 of the 12 human orthologs of C. elegans let-7 bear the let-7 name (the other one being mir-98). Likewise, an attempt was made to assign similar names to paralogs within a species, with letter suffixes (a, b, c, . . .) distinguishing genes producing similar mature miRNAs and number suffixes (-1, -2, -3, . . .) distinguishing those producing identical mature miRNAs.
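
The suffix convention can be made concrete with a small parser. The sketch below is an assumption-laden illustration in Python: the regular expression, the function name, and the field names are hypothetical, and it handles only simple gene names such as let-7a-1 or mir-1-2, not species prefixes or the arm suffixes used for mature miRNAs.

```python
import re

# Hypothetical pattern for simple gene names: stem, number,
# optional letter suffix (similar mature miRNAs),
# optional numeric suffix (identical mature miRNAs).
NAME_PATTERN = re.compile(
    r"(?P<stem>[a-z]+)-(?P<number>\d+)(?P<letter>[a-z])?(?:-(?P<copy>\d+))?"
)


def parse_gene_name(name):
    """Split a miRNA gene name into stem, number, letter suffix, and copy suffix."""
    match = NAME_PATTERN.fullmatch(name.lower())
    if match is None:
        raise ValueError(f"unrecognized miRNA gene name: {name!r}")
    return {
        "stem": match["stem"],
        "number": int(match["number"]),
        "letter": match["letter"],  # None when there is no letter suffix
        "copy": int(match["copy"]) if match["copy"] else None,
    }


parsed = parse_gene_name("let-7a-1")
```

Under this reading, let-7a-1 and a gene named let-7a-2 would be expected to produce identical mature miRNAs, whereas genes differing only in the letter suffix would produce similar ones. Because the naming attempt was imperfect (mir-98, for example, is a let-7 ortholog), names alone should not be used to infer relatedness.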

Of the more than 100 studies that have sought to annotate miRNAs, not all have imposed stringent criteria for gene annotation, leading to many additional gene annotations. For example, miRBase v21 lists 1,193 miRNA gene annotations in mouse and 1,881 in human (Kozomara and Griffiths-Jones, 2014), raising the question of how many of these additional annotations represent authentic miRNAs missed by the stringent annotation criteria (which might happen, for instance, if a miRNA was not expressed at sufficient levels in the sequenced samples), and how many are false-positive annotations. Experimental evaluation of a subset of these additional annotations indicates that a large majority are false positives (Chiang et al., 2010), implying that biologists interested in exploring miRNA functions should focus only on miRNA annotations that satisfy the stringent criteria, based on either existing or newly acquired high-throughput sequencing data.

MicroRNAs are grouped into families based on their targeting properties, which depend primarily on the identity of their extended seed region (miRNA nucleotides 2–8) (Bartel, 2009). For example, mice and humans each have three members of the miR-1/206 seed family (miR-1-1, miR-1-2, and miR-206), which are paralogous miRNAs that arose through duplication of an ancestral gene inherited from a common ancestor of all bilaterian animals. Indeed, members of the same seed family are usually evolutionarily related, and evolutionarily related miRNAs are usually members of the same seed family. However, the use of the term “family” does not strictly denote common ancestry. For example, miR-32, which is not related to other members of the miR-25/32/92/363/367 seed family, is nonetheless an adopted member because it has converged on the same extended seed and thus has the same targeting preferences. Other miRNAs, such as miR-200a and miR-200b, which are clearly related, have a single-nucleotide difference in their seed regions that places them into different families because of their divergent targeting preferences. As with paralogous proteins, members of the same seed families often have at least partially redundant functions, with severe loss-of-function phenotypes apparent only after multiple family members are disrupted (Tables 1 and 2). However, phenotypes are often observed after disruption of a single member, particularly in contexts in which that member is preferentially expressed (Tables 1 and 2, e.g., miR-206, miR-7a-2, miR-9-3, etc.).
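
Grouping by extended seed is simple to express in code. The Python sketch below extracts nucleotides 2–8 of each mature miRNA and collects names that share that heptamer. The function names are hypothetical, and the three sequences are invented toy examples chosen to show one shared seed and one single-nucleotide seed difference; they are not real miRNA sequences.

```python
from collections import defaultdict


def extended_seed(mature_sequence):
    """Return miRNA nucleotides 2-8 (1-based), written as RNA."""
    rna = mature_sequence.upper().replace("T", "U")
    return rna[1:8]


def group_by_seed(mirnas):
    """Map each extended seed to the list of miRNA names that carry it."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[extended_seed(sequence)].append(name)
    return dict(families)


# Invented toy sequences (hypothetical, for illustration only).
toy_mirnas = {
    "toy-a": "UACGGAUCAAAAAAAAAAAAAA",
    "toy-b": "AACGGAUCGGGGGGGGGGGGGG",  # same seed as toy-a, different elsewhere
    "toy-c": "UACGAAUCAAAAAAAAAAAAAA",  # one seed nucleotide differs from toy-a
}
families = group_by_seed(toy_mirnas)
```

Here toy-a and toy-b fall into one family despite differing at position 1 and throughout their 3′ regions, whereas toy-c, which differs from toy-a at a single seed position, is placed in a separate family. This mirrors the point that seed families reflect targeting properties rather than strictly common ancestry.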
