Molecular Medicine Israel

Distinct and shared genetic architectures of gestational diabetes mellitus and type 2 diabetes

Abstract

Gestational diabetes mellitus (GDM) is a common metabolic disorder affecting more than 16 million pregnancies annually worldwide [1,2]. GDM is related to an increased lifetime risk of type 2 diabetes (T2D) [1,2,3], with over a third of women developing T2D within 15 years of their GDM diagnosis. The diseases are hypothesized to share a genetic predisposition [1,2,3,4,5,6,7], but few studies have sought to uncover the genetic underpinnings of GDM. Most studies have evaluated the impact of T2D loci only [8,9,10], and the three prior genome-wide association studies of GDM [11,12,13] have identified only five loci, limiting the power to assess to what extent variants or biological pathways are specific to GDM. We conducted the largest genome-wide association study of GDM to date in 12,332 cases and 131,109 parous female controls in the FinnGen study and identified 13 GDM-associated loci, including nine new loci. Genetic features distinct from T2D were identified both at the locus and genomic scale. Our results suggest that the genetics of GDM risk falls into the following two distinct categories: one part conventional T2D polygenic risk and one part predominantly influencing mechanisms disrupted in pregnancy. Loci with GDM-predominant effects map to genes related to islet cells, central glucose homeostasis, steroidogenesis and placental expression.

Main

Gestational diabetes mellitus (GDM) is a common disorder of pregnancy that has substantially increased in prevalence across diverse population groups in the last 15 years [14]. Despite conferring substantial morbidity to both mother and child, relatively little is known about the genetics of GDM outside of a proposed shared genetic etiology with type 2 diabetes (T2D). The largest existing genome-wide association study (GWAS) of GDM revealed five genome-wide significant loci, all but one previously associated with T2D [13]. Although the results seem to broadly support the hypothesis of shared etiology, none of the existing GWAS were sufficiently powered to fully assess the degree to which genetic risk is shared between GDM and T2D. The one prior GDM locus not associated with T2D, while intriguing, is insufficient to identify mechanisms or biological pathways specific to, or with differential effects in, GDM.

To elucidate the genetic underpinnings of GDM, we conducted a GWAS [15] of GDM in 12,332 cases and 131,109 parous female controls. Participants were of Finnish ancestry from the FinnGen study [16]. Cases were identified using Finnish health and population registry sources, including registry data from inpatient hospitalizations, outpatient specialty clinics and the birth registry. Cases were confirmed to have a diagnosis within a pregnancy window, and those with diagnoses of diabetes before the index pregnancy were excluded (Methods; Supplementary Note).

Our GWAS nearly tripled the number of known GDM loci, identifying 13 distinct associated chromosomal regions (Fig. 1 and Supplementary Figs. 1–13). Significant variants include 4 of 5 previously reported GWAS loci. We observe a modest effect at the fifth, the HKDC1 locus (rs9663238; β = 0.05, P = 0.0024), proposed as a unique GDM contributor, and note that, intriguingly, the only genome-wide significant finding for this variant in FinnGen is for intrahepatic cholestasis of pregnancy (FinnGen R11: β = 0.24, P = 1.6 × 10^−14; Supplementary Table 2 and Supplementary Note).
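
To put the two effect sizes quoted above on a more familiar scale, the sketch below converts them to odds ratios. It assumes that the reported β values are natural-log odds ratios per effect allele, as is usual for case-control GWAS; the text itself does not state the scale, and the function name is illustrative.

```python
import math


def beta_to_odds_ratio(beta: float) -> float:
    """Convert a log-odds effect size (assumed natural log) to an odds ratio."""
    return math.exp(beta)


# Effect sizes quoted in the text for rs9663238 at the HKDC1 locus.
gdm_or = beta_to_odds_ratio(0.05)  # gestational diabetes mellitus
icp_or = beta_to_odds_ratio(0.24)  # intrahepatic cholestasis of pregnancy

print(f"GDM odds ratio: {gdm_or:.2f}")
print(f"ICP odds ratio: {icp_or:.2f}")
```

Under that assumption, the variant corresponds to an odds ratio of roughly 1.05 for GDM and roughly 1.27 for intrahepatic cholestasis of pregnancy.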

To confirm the robustness of these findings, we performed replication studies using samples newly recruited to FinnGen after the data freeze and a large sample from the Estonian Biobank (EstBB; a combined 8,931 cases and 170,809 controls; Supplementary Table 3). Eleven of 13 associations replicated; the two exceptions, the well-established T2D and previously observed GenDIP hit [13] at CDKN2B and the association at CMIP, were not significant but were directionally consistent. Notably, the two new Finnish-enriched findings at ESR1 and MAP3K15 were both strongly confirmed (replication P values of 3.5 × 10^−5 and 4.3 × 10^−6, respectively).

Fine-mapping [17] of the 13 loci pinpointed 14 independent signals (the region near CDKN2B containing two independent signals), of which nine regions had a 95% credible set containing five or fewer SNPs (Table 1 and Supplementary Table 1; Methods). Nine regions represented new GDM associations not reported in previous GDM GWAS. We characterized the 13 current confirmed GDM GWAS loci through annotation and colocalization of credible sets with >3,800 GWAS (Supplementary Tables 4 and 5), quantitative trait loci (QTLs) for gene expression, biomarkers and metabolites (Supplementary Tables 6–11) and chromatin interactions (Supplementary Table 11), along with tests of enrichment by functional consequence, gene expression or canonical gene sets (Supplementary Tables 13–16 and Supplementary Figs. 14–17; Methods). Given the consistency of the replication, we also include top results and fine-mapping of a joint GWAS of the FinnGen discovery and holdout samples (18,474 cases and 171,349 controls), which nominates additional new loci for further investigation (Supplementary Table 17).
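
As an illustration of what a 95% credible set is, the sketch below builds one from per-variant posterior probabilities by ranking variants and accumulating probability until the target coverage is reached. This is a simplified, single-signal version of the general idea and not the fine-mapping method used in the study; the variant names and probabilities are hypothetical.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.95) -> List[str]:
    """Return the smallest set of variants whose normalized posterior
    probabilities sum to at least `coverage`."""
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        cumulative += probability / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for five variants in one region.
example = {"var_a": 0.62, "var_b": 0.25, "var_c": 0.09, "var_d": 0.03, "var_e": 0.01}
print(credible_set(example))
```

In this toy region the three most probable variants already cover 96% of the posterior mass, so the 95% credible set contains three SNPs. A small credible set of this kind is what makes a region attractive for functional follow-up.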

We next performed analyses to evaluate the shared genetic etiology with T2D. Assessment of genome-wide significant signals using our algorithm Significant Cross-trait Outliers and Trends in Joint York regression (SCOUTJOY; Methods) indicated that the 13 GDM-associated loci showed significant heterogeneity in their relationship to T2D (P < 0.001; Supplementary Table 18). Five of the 13 GDM-associated loci were not significantly associated (P < 5 × 10^−8) with T2D in either a previously published large T2D meta-analysis [18] or in FinnGen, while the remaining loci are established T2D hits (Table 1 and Supplementary Fig. 18). At the genomic level, GDM and T2D were genetically correlated (rg = 0.71, s.e. = 0.06); the correlation is significantly greater than zero (P = 6.8 × 10^−37) but less than 1 (P = 1.2 × 10^−7; Methods). Significant genetic correlations were also seen with 12 diseases or traits and eight blood laboratory values in cases where the disorder or value was phenotypically related to GDM (Fig. 1, Supplementary Figs. 19 and 20 and Supplementary Tables 19–23). In both the genomic correlation and top hits comparison, GDM was significantly associated with fasting glucose (FG), hemoglobin A1c (HbA1C) and 2-h glucose results on oral glucose tolerance testing but was not associated with fasting insulin level. None of these glycemic traits or related disorders, however, appeared to stratify the 13 GDM-associated loci into distinct groups similar to T2D (Supplementary Fig. 21 and Supplementary Table 24). Comparison of the effect of GDM- and T2D-associated loci across sex and across pregnancy history indicated that the relationship was not generally mediated by pregnancy effects or sex differences (Supplementary Figs. 22 and 23 and Supplementary Tables 18, 25 and 26).
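
The two tests on the genetic correlation can be sketched as simple Wald z-tests against the null values 0 and 1. This is an approximation under stated assumptions (a normal sampling distribution, a two-sided test against 0 and a one-sided test against 1) and not necessarily the procedure described in Methods. Because it uses the rounded estimate and standard error printed in the text, it will not reproduce the reported P values exactly.

```python
import math


def normal_upper_tail(z: float) -> float:
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def wald_tests_for_rg(rg: float, se: float) -> dict:
    """Wald tests of a genetic correlation against 0 and against 1."""
    z_vs_zero = rg / se
    z_vs_one = (1.0 - rg) / se
    return {
        "z_vs_zero": z_vs_zero,
        # Two-sided test of rg = 0 (assumed sidedness).
        "p_vs_zero": 2.0 * normal_upper_tail(abs(z_vs_zero)),
        "z_vs_one": z_vs_one,
        # One-sided test of rg = 1 against rg < 1 (assumed sidedness).
        "p_vs_one": normal_upper_tail(z_vs_one),
    }


# Rounded values quoted in the text.
results = wald_tests_for_rg(rg=0.71, se=0.06)
for name, value in results.items():
    print(f"{name}: {value:.3g}")
```

Both tests reject their null hypothesis by a wide margin, which is the quantitative basis for describing the genetic architectures as substantially shared but not identical.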

We then explored the relationship between GDM and T2D effects in more detail by applying a Bayesian classification algorithm [19] to the top associations for GDM and top associations for T2D selected to have comparable statistical evidence for association (13 loci for GDM and 15 loci for T2D; Methods; Supplementary Note). Initial assessment was performed using T2D effect sizes from a GWAS of male FinnGen participants (27,607 cases and 118,687 controls) to prevent the Bayesian algorithm from being affected by sample overlap. We then performed the same analysis in men and in women from a large external meta-analysis of T2D for comparison.

The shared variants analysis suggested that the genetics of GDM risk falls into two categories, one shared with T2D risk and the other predominantly gestational (Fig. 2 and Supplementary Table 27). Specifically, the comparison of effect sizes between GDM and T2D does not support the existence of a single, consistent relationship between GDM and T2D across loci, but instead proposes two distinct classes of significant variants in this scan (Fig. 2): class G, with GDM-predominant effects, and class T, with T2D-predominant effects. The two-class model of relationship between GDM and T2D fits the observed distribution of odds ratios (ORs) significantly better than a single-class model (log10(Bayes factor (BF)) = 29.41). Class G contains 8 of the 13 GDM-associated loci that have GDM-predominant SNP effects, with effect sizes roughly three times greater in GDM than in T2D on average (Fig. 2 and Table 1). The majority of class G loci had a positive effect in T2D, but it was proportionately less than their effect in GDM. In comparison, the GDM-associated SNPs contained in class T had effects in the two disorders that were consistent with those of T2D signals significantly associated with diabetes only in the T2D GWAS, namely, a reduced effect size in GDM versus T2D. This pattern of effects was observed for all SNPs in class T. Variant classes were maintained whether comparing to T2D in men or women, with no evidence of a sex-specific classification (Supplementary Figs. 24 and 25 and Supplementary Tables 18 and 28). Stratification patterns were also consistent in T2D regardless of pregnancy history (Supplementary Fig. 25 and Supplementary Table 18) or inclusion of extended GDM results from GWAS including the FinnGen holdout set (Supplementary Figs. 26 and 27 and Supplementary Tables 29 and 30). Fasting plasma glucose associations occur in all classes, specifically with 5 of 8 class G loci, 2 of 3 class T loci and 2 of 3 unclassified loci.

The existence of the GDM-predominant class of effects, class G, distinct from those traditionally seen in T2D, raises the possibility of physiologic mechanisms of glycemic control with different actions or regulations during pregnancy (Supplementary Note, Supplementary Table 31 and Supplementary Fig. 28). As presented in Table 1, the eight class G loci have a peak SNP that is either intronic to a protein-coding gene, a missense mutation or a 5′-UTR variant. Although the effects of a locus do not always operate through the nearest gene, several of the loci implicate genes involved in plausible cellular processes, for example, signal transduction and hormone processing. Examples of such genes [20,21,22,23,24,25,26,27] are presented in Box 1.

Finally, to gain further insight into potential functional differences between GDM and T2D, we examined the cell-type specific expression patterns associated with the GWAS summary statistics [28] (Methods; Fig. 3, Supplementary Tables 32–35 and Supplementary Figs. 30–32). We evaluated cell-type specific enrichment despite the lack of significant tissue-level enrichment because pregnancy induces major adaptive changes to specific cell populations within maternal tissues that might not be reflected in bulk tissue expression. Analyses integrating multiple large single-cell RNA expression datasets indicated that pancreatic β cells are significantly associated both with GDM and T2D. However, only GDM had significant associations with the hypothalamus, that is, hypothalamic GABAergic neurons (GABA2), hypothalamic glutamatergic neurons (GLU7) and neurons in the ventromedial hypothalamus (VMH) arcuate nucleus (Nr5a1_Adcyap1; Fig. 3 and Supplementary Table 34).

Taken together, we present data from the largest GDM GWAS to date, identifying 14 independent signals in 13 associated chromosomal regions. This study replicates all five of the loci previously associated with GDM, albeit with indications of a weaker effect of HKDC1 than previously reported, and discovers nine new loci. Our key finding is that GDM has a partially distinct genetic etiology, that is, while GDM and T2D in part share a polygenic predisposition, there is a second category of GDM genetic risk factors that are predominantly gestational contributors to disease. This contextualizes the substantial effect of the MTNR1B locus, which had been reported previously as an outlier [9]; our data now show that MTNR1B is representative of a whole group of GDM-predominant loci, characterized by a larger effect on GDM than on T2D.

Further studies will be required to characterize the precise GDM-predominant molecular effects, but our current results suggest plausible mechanisms related to maternal adaptive physiological responses to pregnancy. Broadly, pregnancy increases circulating gestational hormones (for example, human placental lactogen, progesterone and estrogen), altering normal homeostatic glycemic pathways in the brain and pancreas and impairing insulin sensitivity in maternal peripheral tissues. The brain and pancreas both show clear enrichment of signal in our cell-type specificity analysis of GDM, with our results in the brain showing specific associations with hypothalamic and arcuate (ARC) neurons in GDM that are not seen in T2D (Fig. 3 and Supplementary Tables 32–34). The hypothalamus and ARC are connected by multiple neural pathways [29], and both regions have been implicated in adaptive glycemic response during pregnancy [30]. In that context, our ESR1 locus is particularly interesting given that the VMH contains glucose-sensing neurons that express the estrogen receptor-α (ERα, encoded by ESR1) and act to regulate glucose levels [31]. Moreover, in mice, ERα knockout or perturbation of estrogen levels (which occurs in pregnancy) alters the expression of multiple class G genes (for example, PCSK1, MTNR1B and SPC25/G6PC2) in ARC neurons that arise in the VMH [32]. Our cell-type specificity results particularly highlight Nr5a1_Adcyap1 in ARC, which projects from the VMH [33] (Supplementary Fig. 31, Supplementary Table 34 and Supplementary Note). Given the complexity of GDM, however, much larger studies will be required to reach a comprehensive view of the molecular underpinnings of GDM susceptibility.

The current study design in the rather homogeneous Finnish population carries specific strengths and weaknesses. On one hand, GWAS discovery is enhanced by population homogeneity [16], and the linkage of national birth, inpatient and outpatient medical registries enables robust phenotyping (Methods). The generalizability of the results may suffer, however, as some detected loci may be for rare alleles specifically enriched in the Finnish population. In our analyses of GDM, two loci mapped to rare alleles enriched in Finland, which may be difficult to replicate elsewhere, while 70% of the loci correspond to variants that are common (minor allele frequency (MAF) > 10%) in non-Finnish European ancestry individuals (Table 1). Nonetheless, additional studies prioritizing ancestrally diverse populations are needed for a better understanding of the genetic underpinnings of GDM in all populations at risk.

In summary, we discovered nine new loci associated with GDM and demonstrated that GDM genetic risk is distinct from T2D both at the locus and genomic scale. Our results suggest that the genetics of GDM risk falls into the following two categories: one part T2D risk and one part predominantly gestational contributors to disease. Tissue characterization of GDM genetics further implicates tissues previously identified in adaptive pregnancy responses, raising hypotheses regarding genetic effects in these tissues during pregnancy. Broadly, this work underscores the benefits of focusing resources on pregnancy disorders as pregnancy is a natural perturbation that offers leverage to discover loci with new physiologic mechanisms of glycemic or homeostatic control….
