Molecular Medicine Israel

Inherited polygenic effects on common hematological traits influence clonal selection on JAK2V617F and the development of myeloproliferative neoplasms

Abstract

Myeloproliferative neoplasms (MPNs) are chronic cancers characterized by overproduction of mature blood cells. Their causative somatic mutations, for example, JAK2V617F, are common in the population, yet only a minority of carriers develop MPN. Here we show that the inherited polygenic loci that underlie common hematological traits influence JAK2V617F clonal expansion. We identify polygenic risk scores (PGSs) for monocyte count and plateletcrit as new risk factors for JAK2V617F positivity. PGSs for several hematological traits influenced the risk of different MPN subtypes, with low PGSs for two platelet traits also showing protective effects in JAK2V617F carriers, making them two to three times less likely to have essential thrombocythemia than carriers with high PGSs. We observed that extreme hematological PGSs may contribute to an MPN diagnosis in the absence of somatic driver mutations. Our study showcases how polygenic backgrounds underlying common hematological traits influence both clonal selection on somatic mutations and the subsequent phenotype of cancer.

Main

Myeloproliferative neoplasms (MPNs) are rare chronic hematological cancers characterized by the overproduction of mature blood cells leading to elevated blood cell parameters. They are typically driven by somatically mutated JAK2-mediated, calreticulin (CALR)-mediated or MPL-mediated clonal expansion1. JAK2 mutations are found in both polycythemia vera (PV) and essential thrombocythemia (ET), which are distinct but overlapping MPNs characterized by increased numbers of red blood cells and platelets, respectively. Mutant JAK2 is commonly detectable in 0.1–3% of the healthy population as clonal hematopoiesis (CH)2,3,4,5,6,7, with the vast majority of carriers not meeting or going on to develop disease-defining characteristics of MPN. Little is understood about why only a minority of individuals with mutated JAK2 develop more severe hematological manifestations of MPN or about the factors that influence blood count heterogeneity in MPNs.

The 46/1 haplotype near JAK2 is a known germline risk factor for MPNs in the population8. Genome-wide association studies (GWAS) have identified additional disease-associated germline risk loci, estimating the liability-scale heritability of MPNs based on common single-nucleotide polymorphisms (SNPs) to be ~6.5% (refs. 9,10,11). However, these germline risk loci insufficiently explain the phenotypic heterogeneity observed within MPNs and in JAK2-mutated healthy carriers.

Blood cell traits vary widely in the healthy population. The genetic architecture underlying these traits is highly polygenic, with more than 11,000 independently associated genetic variants discovered so far12,13,14. These genome-wide associated variants, when combined in polygenic scores (PGSs), explain a large proportion of phenotypic variance among healthy individuals (from 2.5% for basophil count to 27.3% for mean platelet volume) and are associated with multiple common diseases and rare hematological disorders14. We hypothesized that a genetic burden of germline variants associated with extreme hematological traits could influence phenotypic heterogeneity in association with mutated JAK2, by influencing the clonal dynamics of mutant JAK2 and/or modifying its downstream consequences. In this study, we integrate information on somatic driver mutations, germline genetic variants associated with MPNs, and CH and hematological trait PGSs to study how inherited polygenic variation underlying blood cell traits influences clonal selection on mutated JAK2 and MPN disease phenotypes (Supplementary Fig. 1).

Results

Inherited polygenic contribution to JAK2 V617F positivity

One in 30 healthy individuals reportedly harbors JAK2V617F in their blood, as determined using sensitive assays6. The majority of such individuals have low levels of JAK2V617F and do not meet clinical criteria for MPN due to the absence of elevated blood cell parameters. We wished to understand whether inherited polygenic loci that underlie blood cell traits influence the strength of clonal selection on JAK2V617F.

We studied the germline characteristics of individuals in UK Biobank (UKBB) with and without JAK2V617F. From 162,534 genetically unrelated individuals of European ancestry within the UKBB whole-exome sequencing cohort (‘200k UKBB-WES cohort’; Methods), we identified 540 individuals with one or more mutant reads for JAK2V617F (0.3%, median variant allele frequency (VAF) = 0.056, range = 0.019–1; Supplementary Fig. 2; ‘UKBB-JAK2V617F cohort’). The lower rate of JAK2V617F in the UKBB-WES cohort compared to other population studies6,7 could be explained by its low sequencing coverage (21.5× depth), as also reported previously15 (Supplementary Fig. 3). As expected, there was some overlap among individuals with JAK2V617F and those with a diagnosis of MPN. Of the 423 individuals labeled with a diagnosis of MPN (156 with ET, 161 with PV and 106 with myelofibrosis (MF)), 72 were positive for JAK2V617F (Supplementary Table 1).
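The effect of sequencing depth on clone detection can be illustrated with a simple binomial model. This is a sketch rather than the study's method: it assumes reads are sampled independently, that every read carrying the mutation is called correctly, and that depth is fixed at 21 reads. The function name `detection_probability` is hypothetical.

```python
from math import comb

def detection_probability(vaf, depth, min_reads=1):
    """Probability of seeing at least `min_reads` mutant reads at a site.

    Assumes each of `depth` reads independently carries the mutant allele
    with probability `vaf` (a simplification, not the study's model).
    """
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1 - p_fewer

# Illustrative values: a small clone (VAF 0.02) and a large clone (VAF 0.2).
for vaf in (0.02, 0.2):
    p1 = detection_probability(vaf, depth=21, min_reads=1)
    p2 = detection_probability(vaf, depth=21, min_reads=2)
    print(f"VAF={vaf}: P(>=1 read)={p1:.2f}, P(>=2 reads)={p2:.2f}")
```

Under these assumptions, small clones are frequently missed at this depth, which is consistent with the lower JAK2V617F rate reported for the exome cohort.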

We built PGSs for 29 blood cell traits covering a wide range of hematopoietic parameters (Supplementary Table 2). Each blood cell trait-specific PGS was computed as the effect size-weighted sum of all common (minor allele frequency (MAF) > 0.01) variants that were independently associated with that trait at genome-wide significance (P < 5 × 10−8) in UKBB (Methods)14. To assess the association between hematological PGSs and small (VAF < 0.1, n = 397) or large (VAF ≥ 0.1, n = 143) JAK2V617F clones, we used multinomial logistic regression including PGSs for each hematological trait (units of s.d.), together with previously reported germline sites associated with MPN9 and CH16 (PGSMPN and PGSCH) as covariates. To account for the recognized predisposition risk for MPN driven by the JAK2 46/1 haplotype8, we computed two PGSMPN scores, separating rs1327494 (tagging the JAK2 46/1 haplotype; PGSMPN-46/1) from nontagging JAK2 variants (PGSMPN-other). We found a negative association between the PGSs for both mean reticulocyte volume (PGSMRV) and immature reticulocyte fraction (PGSIRF) and small JAK2V617F clones (P = 6.2 × 10−4 and 0.0018, false discovery rate (FDR) < 0.05; Supplementary Table 3). We also found significant positive associations with small JAK2V617F clones for the PGSs of plateletcrit (PGSPCT) and monocyte count (PGSMONO) (P = 9.5 × 10−4 and 0.0036, FDR < 0.05). Germline predisposition to high MONO and PCT values was also positively associated with large JAK2V617F clones at modest significance (P = 0.033 and 0.0022, FDR-adjusted P = 0.31 and 0.064; Fig. 1a). Repeating the analysis above excluding MPN cases still demonstrated a significant association between PGSPCT or PGSMONO and small JAK2V617F clones (P < 0.013, Bonferroni corrected; Supplementary Table 4), suggesting that the inherited effects on JAK2V617F were not driven by the subset of MPN cases. These associations were independent of the known germline risk loci associated with MPN and CH (Supplementary Table 3). Validating these associations in the full UKBB-WES dataset (n = 799 and 326 for small and large clones, respectively, and n = 338,919 for controls), we again replicated the associations between PGSPCT and small JAK2V617F clones and between PGSMONO and large JAK2V617F clones at FDR < 0.05 (PCT: odds ratio (OR) = 1.15 (change in odds per increase of 1 s.d. in PGS), 95% confidence interval (CI) = 1.07–1.24, P = 1.4 × 10−4; MONO: OR = 1.20, 95% CI = 1.07–1.34, P = 0.0014; Supplementary Table 5).
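A PGS of the kind described here is, in its simplest form, a sum of effect-allele dosages weighted by per-variant effect sizes, then rescaled so that regression coefficients are expressed per s.d. The sketch below illustrates that arithmetic only; the weights, dosages and function names are hypothetical, and the study's actual pipeline (variant selection, imputation handling) is described in its Methods and not reproduced here.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Effect size-weighted sum of effect-allele dosages for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per variant is required")
    return sum(d * w for d, w in zip(dosages, weights))

def standardize(scores):
    """Rescale raw scores to mean 0 and s.d. 1 across the cohort."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Toy data (hypothetical): three variants, four individuals.
weights = [0.10, -0.05, 0.20]                 # per-allele effect sizes
cohort = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 0]]  # dosages 0, 1 or 2

raw = [polygenic_score(d, weights) for d in cohort]
pgs_sd = standardize(raw)                     # PGS in units of s.d.
```

With scores in s.d. units, an OR such as 1.15 is read as the multiplicative change in odds for each 1 s.d. increase in the PGS.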

To understand the causal relationship among these associations, we undertook Mendelian randomization (MR) analyses with GWAS estimates for the exposure (blood traits) and the outcome (JAK2V617F positivity; Supplementary Fig. 4) obtained from two independent sources. We used genetic instruments for hematological traits identified from UKBB, with effect size estimates from INTERVAL17 (n = 30,305), an external independent cohort. MRV was excluded due to a lack of data in INTERVAL. Both PCT and MONO showed significant causality on the presence of a JAK2V617F clone based on inverse variance-weighted (IVW)18 MR and demonstrated consistent effect estimates using two other MR methods (simple median and weighted median), suggesting that higher MONO and higher PCT values cause a detectable JAK2V617F clone (Supplementary Table 6).
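For readers unfamiliar with IVW MR, the estimate can be computed from summary statistics alone. The following is a minimal sketch of the standard fixed-effect IVW estimator, assuming one effect estimate per instrument on the exposure and on the outcome; the input values and the function name `ivw_estimate` are hypothetical, and the study's exact implementation may differ.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance-weighted MR estimate.

    Equivalent to a regression of outcome effects on exposure effects
    through the origin, weighted by 1 / se_outcome**2.
    Returns (causal effect on the log-odds scale, standard error).
    """
    num = sum(bx * by / se**2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx**2 / se**2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

# Toy summary statistics (hypothetical): three instruments.
bx = [0.10, 0.20, 0.30]      # SNP effects on the blood trait
by = [0.04, 0.08, 0.12]      # SNP effects on JAK2V617F positivity (log odds)
se = [0.01, 0.02, 0.02]      # standard errors of the outcome effects

beta, beta_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta)  # OR per unit increase in the exposure
```

The median-based methods mentioned above replace this weighted average with a (weighted) median of the per-instrument ratios `by / bx`, which is more robust when some instruments are invalid.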

Extending this analysis to the full UKBB-WES cohort (JAK2V617F, n = 1,125; controls, n = 338,919) validated these causal associations with greater estimation accuracy (PCT: ORIVW = 1.52, 95% CI = 1.29–1.78, P = 3.0 × 10−7; MONO: ORIVW = 1.3, 95% CI = 1.15–1.49, P = 4.6 × 10−5; Fig. 1b and Supplementary Table 7). The IVW method of MR (Methods) assumes that the germline loci that drive MONO and PCT have no direct causal effect on driving a JAK2V617F clone (that is, there are no direct causal effects of the genetic instruments on the outcome). We found no evidence of pleiotropy using the MR-Egger19 test; the estimated intercept was not significantly different from zero with P = 0.84 and P = 0.90 for PCT and MONO, respectively. The causal relationship was also significant for PCT and MONO (P < 0.05; Supplementary Table 7 and Supplementary Fig. 5). Additionally, the estimates were not biased by any potential pleiotropic outlier variants and were highly consistent with outlier-corrected causal estimates (Supplementary Table 7 and Methods). Lastly, to ensure the results were not confounded by the possibility that the genetic loci used as instruments for MR directly promoted the outcome (that is, JAK2V617F positivity), we repeated the analysis excluding genetic instruments associated with JAK2V617F positivity (Passociation < 10−6), as well as those that correlated with JAK2V617F variants (that is, those variants and JAK2V617F variants are in linkage disequilibrium (LD) r2 > 0.01) or were in proximity to JAK2V617F variants (in the 10-Mb region centered on each variant), and found no major changes (Supplementary Table 8). Importantly, any reverse causal effect we detected for MONO and PCT was subtle and with pleiotropic effects (PEgger > 0.05 and PEgger-intercept < 0.05; Supplementary Table 9 and Supplementary Fig. 6).
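The MR-Egger test referred to above differs from IVW in that it fits an intercept: a non-zero intercept indicates that instruments affect the outcome through paths other than the exposure (directional pleiotropy). The sketch below computes only the weighted regression point estimates; it assumes instruments are oriented so that exposure effects are positive, uses hypothetical toy values, and omits the standard errors and P values that a full analysis would report.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted linear regression of outcome effects on exposure effects.

    Weights are 1 / se_outcome**2. Returns (intercept, slope): the
    intercept estimates average directional pleiotropy and the slope is
    the pleiotropy-adjusted causal estimate.
    """
    w = [1.0 / se**2 for se in se_outcome]
    total = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, beta_exposure)) / total
    y_bar = sum(wi * y for wi, y in zip(w, beta_outcome)) / total
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, beta_exposure, beta_outcome))
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, beta_exposure))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Toy data (hypothetical) built with a pleiotropic offset of 0.01.
bx = [0.10, 0.20, 0.30, 0.40]
by = [0.01 + 0.4 * x for x in bx]
se = [0.01, 0.02, 0.01, 0.02]

intercept, slope = mr_egger(bx, by, se)
```

In this toy example the regression recovers the built-in offset as the intercept; an intercept statistically indistinguishable from zero, as reported for PCT and MONO, is what supports the IVW assumption.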

Overall, the association results combined with MR suggest that higher PCT and MONO are causal for the presence of a JAK2V617F clone. This would also explain why individuals with germline predisposition to high PCT and MONO are also more likely to harbor a JAK2V617F clone. Given that acquisition of somatic mutations in blood is largely stochastic in healthy populations20, our data suggest that genetically predicted PCT and MONO influence clonal selection on nascent JAK2V617F cells to promote mutation acquisition.

Germline contribution to blood cell count variation in MPNs

Having shown that polygenic germline loci can predispose to JAK2 clone positivity through their influence on blood cell trait levels, we next studied the contribution of these inherited sites to clinical phenotypes of MPN. We first considered the four blood cell traits that are used to define MPN subcategories clinically21 as follows: hemoglobin concentration (HGB) (g dl–1 divided by 10), hematocrit (HCT) (%), platelet count (PLT) (×109 divided by 1,000) and white blood cell count (WBC) (×109 divided by 100). We used SNP arrays to measure genome-wide polymorphism in an MPN cohort of 761 patients (PV, n = 112; ET, n = 581; MF, n = 68), in whom diagnostic blood cell counts were available and mutation status for a panel of cancer-associated genes (Fig. 2a) had previously been characterized22.

We built PGSs for the four blood cell traits in both patients with MPN and a cohort of healthy blood donors from the INTERVAL study (n = 30,305; Methods). For each trait, we built a linear regression model with predictors that included the corresponding PGS (for example, PGSHGB), demographic variables (age and sex), JAK2 46/1 haplotype status (due to its influence on hematological traits23,24) and somatic mutation status for genes frequently (>20 patients) mutated in the MPN patient cohort (Fig. 2b). Following stepwise regression, we identified three somatic mutations (JAK2 mutation, CALR mutation and chromosome 9 aberration resulting in homozygous JAK2 mutation), PGS and sex as significant explanatory variables for at least one of the four traits (P < 0.001, Bonferroni corrected; Fig. 2c). We then estimated the phenotypic variation in blood cell traits explained by each variable conditional on the others (Fig. 2d and Methods). As a benchmark, we estimated PGS-explained hematological trait variation in controls from the INTERVAL cohort based on similar linear regression models with covariates such as age, sex and ten principal components (PCs) controlling for population stratification. Blood cell trait phenotypes were inverse-normal transformed, and only genetically unrelated individuals were included (Supplementary Fig. 7 and Methods).
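The inverse-normal transformation mentioned above maps each phenotype value to the normal quantile of its rank, which limits the influence of the extreme blood counts seen in MPN. A minimal sketch follows, assuming the common Blom rank offset of 3/8 and average ranks for ties; the excerpt does not state which variant the authors used, and the function name is hypothetical.

```python
from statistics import NormalDist

def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse-normal transform (Blom offset assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]

# Toy platelet counts (hypothetical), including one extreme value and a tie.
platelets = [250, 900, 310, 275, 310]
z = inverse_normal_transform(platelets)
```

The extreme count of 900 is mapped to the same quantile it would receive if it were only slightly above 310, so a single outlier cannot dominate the regression.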

The estimated PGS-explained phenotypic variance in blood cell traits in INTERVAL was as follows: 6.8% (95% CI = 6.2–7.4%) for HGB; 6.7% (95% CI = 6.1–7.2%) for HCT; 10% (95% CI = 9.7–11%) for WBC; and 25% (95% CI = 24–26%) for PLT, highly consistent with previously published results14. In the MPN patient cohort, when taking into account the effects of somatic mutations, there remained significant but smaller PGS effects on HGB, HCT and WBC (P < 0.001, Bonferroni corrected; n = 380 to 577; Fig. 2c and Methods), explaining only 2.0% (95% CI = 0.62–4.2%), 3.0% (95% CI = 0.79–7.0%) and 2.8% (95% CI = 1.0–5.3%) of trait variance, respectively (Fig. 2d), while PGS had no significant effect on PLT (P > 0.05; Fig. 2c).

To validate these findings, we analyzed the UKBB-WES cohort including patients with MPN (‘UKBB-MPN cohort’; patients, n = 423; healthy controls, n = 161,872; Supplementary Table 1) with hematological PGS and JAK2V617F somatic mutation as the main explanatory variables (Fig. 2e). Around 60% of MPNs are expected to be positive for JAK2V617F. However, only 44 of 423 individuals (10.4%) with labels of ET, PV or MF in UKBB had two or more mutant reads for JAK2V617F at a median sequencing depth of 21×, and only 72 of 423 individuals (17.0%) had one or more mutant reads for JAK2V617F (Supplementary Table 1), suggesting that, despite the low depth of exome sequencing coverage, this cohort of individuals may represent a mixture of those with true somatic mutation-driven MPN and those with high blood counts driven by other causes. Both PGS and JAK2V617F in the UKBB-MPN cohort captured a substantial proportion of the phenotypic variation for all four traits (PGS, 9.8–14.3%; JAK2V617F, 2.8–12.0%; Methods and Fig. 2f). Interestingly, the contribution of PGS to blood cell traits was notably higher in the UKBB-MPN cohort than in the MPN patient cohort (Fig. 2d,f). This may reflect an ascertainment bias in estimation of the genetic weights of the PGSs from the UKBB. However, it is also possible that some individuals with labels of MPN included in the UKBB-MPN cohort had elevated blood counts driven by a high PGS. This could also contribute to the low prevalence of JAK2V617F in the UKBB-MPN cohort compared to the more strictly defined clinical MPN patient cohort.

Germline polygenic impact on MPN subtype at diagnosis

We next explored whether polygenic loci underlying hematological traits also affect initial disease classification, severity and subsequent disease evolution. MPNs can be classified into chronic phase conditions (ET and PV) and advanced phase MF. To assess whether PGSs for hematological traits influence MPN classification at diagnosis, we used multinomial logistic regression to explore the associations with standardized PGSs, including age, sex and ten PCs as covariates. We performed this analysis in genetically unrelated individuals across PGSs for the 29 different blood cell traits, including the 4 tested previously (Supplementary Table 2; ET, n = 581; PV, n = 112; control, n = 30,305).

We found that PGSs for multiple MPN-relevant blood traits showed significant associations with ET at FDR < 0.05. High PGSPLT, PGSPCT and PGSWBC were associated with increased risk while high PGSHCT, PGSHGB and PGSPDW (platelet distribution width) were associated with decreased risk of having an ET diagnosis (Supplementary Fig. 8a and Supplementary Table 12). Increased risk of PV diagnosis was modestly associated with an increased PGS for several red blood cell traits (PGSHGB, PGSHCT and PGSRBC), PGSPCT and PGSs for white blood cell traits (eosinophil count (EO) and MONO). PGSMRV showed a risk-decreasing effect for PV (P < 0.05; Supplementary Fig. 8a and Supplementary Table 12). We repeated this analysis in the UKBB-MPN cohort and healthy controls (ET, n = 156; PV, n = 161; control, n = ~161,000; Supplementary Table 1), taking into account the VAF of JAK2V617F (Supplementary Fig. 2) because it can influence blood count parameters (Fig. 2d,f). We found that the PGSs for two platelet traits (PGSPCT and PGSPLT) were significant risk factors for an ET diagnosis while those for four red blood cell traits (PGSHGB, PGSHCT, PGSRBC and PGSMCHC (mean corpuscular hemoglobin concentration)) were significant risk factors for PV diagnosis at FDR < 0.05 (Supplementary Fig. 8b and Supplementary Table 13). Thus, we replicated the significant polygenic germline risk effects of PGSPLT and PGSPCT for ET and PGSHGB, PGSHCT and PGSRBC for PV in both the MPN patient and UKBB-MPN cohorts. These results provide evidence of a strong polygenic germline predisposition for one hematological malignancy over the other, in this case ET versus PV, irrespective of somatic driver mutation status and driven by inherited variants implicated in basic hematopoietic processes.
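Because 29 trait PGSs are tested in parallel, significance is reported at an FDR threshold. The excerpt does not name the FDR procedure; the sketch below assumes the widely used Benjamini–Hochberg adjustment, with hypothetical P values and a hypothetical function name.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR-adjusted P values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy P values (hypothetical), one per tested PGS.
pvals = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in fdr]
```

Each adjusted value is the smallest FDR level at which that test would be declared significant, which is how a nominal P of 0.033 can correspond to an FDR-adjusted P well above 0.05 when many traits are tested.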

We next asked whether an individual such as a JAK2V617F carrier might be protected from developing an MPN by inheriting a low PGS for relevant blood cell traits. Using enrichment tests in the full UKBB-WES cohort across the PGSs of six hematological traits that were identified to be either putative causal factors for JAK2V617F clones or associated factors for MPN diagnosis (MONO, PCT, PLT, HGB, HCT and RBC; Methods), we found that healthy JAK2V617F carriers were enriched in the low-PGS group for the two platelet traits and monocytes, with enrichment ORs of roughly 2 to 3 (PGSPCT: OR = 2.8, 95% CI = 1.51–5.42, P = 3.8 × 10−4; PGSPLT: OR = 2.35, 95% CI = 1.29–4.43, P = 0.0027; PGSMONO: OR = 1.99, 95% CI = 1.11–3.67, P = 0.015), indicating a protective effect that makes low-PGS individuals around two to three times less likely to have ET than those in the high-PGS group. This is interpreted as a relative risk, which is very close to OR given the low incidence of ET (~1.6 per 100,000; Supplementary Table 14). Importantly, this indicates that an individual’s PGS for several hematological traits also influences the risk of developing subsequent disease from JAK2V617F CH. The association of low PGS with healthy JAK2V617F carriers was also confirmed in a logistic regression analysis (PGSPCT: OR = 2.32, 95% CI = 1.27–4.25, P = 0.0065; PGSPLT: OR = 2.48, 95% CI = 1.36–4.54, P = 0.0032; PGSMONO: OR = 2.08, 95% CI = 1.15–3.77, P = 0.016) with covariates included (Methods).
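The statement that the OR approximates a relative risk when the outcome is rare can be checked numerically. The sketch below computes an OR with a Wald-type 95% CI and the corresponding relative risk from a 2 × 2 table; the counts are hypothetical, the function names are illustrative, and the study's own enrichment test is described in its Methods rather than here.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Wald CI from a 2 x 2 table.

    a, b: cases and non-cases in group 1; c, d: cases and non-cases in group 2.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

def relative_risk(a, b, c, d):
    """Ratio of the outcome proportions in group 1 versus group 2."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts for a rare outcome (20 vs 10 cases per 100,000).
a, b, c, d = 20, 99_980, 10, 99_990
or_, lower, upper = odds_ratio_ci(a, b, c, d)
rr = relative_risk(a, b, c, d)
```

With outcome frequencies this low the two measures agree to within a fraction of a percent, which is the basis for reading the reported ORs as relative risks.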

Of note, only 10–17% of the UKBB-PV cohort had a JAK2V617F mutation (requiring ≥2 or ≥1 mutant reads, respectively; Supplementary Table 1), although mutated JAK2 is expected to be found in >99% of PV cases. This raises the possibility that polygenic germline predisposition to high red blood cell indices may also contribute to other causes of clinical polycythemia not driven by JAK2V617F mutation25. However, we cannot exclude the possibility that some JAK2 mutations were missed due to the length of time between UKBB blood sampling and diagnosis (Supplementary Fig. 9) and the low sequencing coverage of JAK2, although these factors would equally affect the PV subgroups positive and negative for JAK2V617F in the UKBB cohort (Supplementary Fig. 3).

Combined germline impact on MPN classification at diagnosis

Because several SNPs and germline loci have been found to be associated with both MPN and CH9,15,26,27,28,29, we assessed whether germline predisposition to an ET versus PV diagnosis through PGSs for blood cell traits was independent of previously reported germline risk loci. To this end, we estimated the independent germline effects of each of the five hematological traits significant to MPN (for example, PGSPCT), taking into account previously reported genetic loci associated with MPN9 and CH16 (PGSMPN-46/1, PGSMPN-other and PGSCH). Across both the MPN patient cohort and the UKBB-MPN cohort, the strongest (P < 0.05 in both cohorts) germline risk factors for a diagnosis of ET were PGSPLT and PGSPCT, followed by MPN-specific risk loci (that is, PGSMPN-other). In PV, PGSHGB and PGSMPN-other were the strongest risk factors (Fig. 3a, Supplementary Table 15 and Methods). Our data confirmed strong risk effects for these five hematological PGSs independent of all currently known genetic loci predisposing to risk of MPN and CH. As a sensitivity analysis, we confirmed these associations after excluding variants associated with MPN and CH and their proxies (LD r2 > 0.6) from the hematological PGSs (Methods). Interestingly, the 46/1 haplotype, which was most strongly associated with PV in the MPN patient cohort, was not significant for PV in the less well-defined UKBB-PV cohort (Fig. 3a), in which the majority of individuals were JAK2V617F negative, suggesting that this locus is not a risk factor for developing high red blood cell indices independently of mutant JAK2, such as in secondary or apparent polycythemia….
