Molecular Medicine Israel

Androgen receptor binding sites enabling genetic prediction of mortality due to prostate cancer in cancer-free subjects

Abstract

Prostate cancer (PrCa) is the second most common cancer worldwide in males. Although strongly warranted, predicting the risk of death from PrCa, especially before the cancer develops, is challenging. Here, we address this issue by maximizing the statistical power of genetic data through multi-ancestry meta-analysis and by focusing on binding sites of the androgen receptor (AR), which has a critical role in PrCa. Taking advantage of the largest Japanese sample to date, a multi-ancestry meta-analysis comprising more than 300,000 subjects in total identifies 9 unreported loci, including ZFHX3, a tumor suppressor gene, and narrows down the statistically fine-mapped variants compared with European-only studies; these variants are strongly enriched in AR binding sites. A polygenic risk score (PRS) analysis restricted to statistically fine-mapped variants in AR binding sites shows that, among cancer-free subjects, individuals with a PRS in the top 10% have a markedly higher risk of future death from PrCa (HR: 5.57, P = 4.2 × 10⁻¹⁰). Our findings demonstrate the potential utility of leveraging large-scale genetic data and advanced analytical methods in predicting mortality from PrCa.

Introduction

Prostate cancer (PrCa) is the most common cancer in Europe and North America and the second most common cancer worldwide in males, accounting for an estimated 6.7% of cancer mortality in males [1]. Given its high mortality, predicting the incidence of and death from PrCa is of great interest from a public health perspective, as early detection and intervention for individuals with high prospective risk could benefit both patients and health providers. Genetic components are an excellent basis for such predictive models, as PrCa has an estimated heritability of up to 58%, the highest among all cancers [2]. Family history has been used to identify at-risk subjects, but this alone is not sufficient for precise risk stratification. More recently, polygenic risk scores (PRS) based on genetic variants identified in genome-wide association studies (GWAS) of PrCa [3-14] have been developed [14-21]. However, predicting the incidence of PrCa has limited clinical utility because PrCa can be latent, and autopsy studies have shown a high prevalence of asymptomatic PrCa in older men. Predicting death due to PrCa would provide more value for clinical management but remains to be investigated.

Compared with conventional PRSs built from GWAS variants at different P value thresholds, PRSs based on fine-mapped variants or variants with functional relevance to disease have been shown to have superior predictive performance [22,23]. Aside from simply expanding the GWAS, there are two ways to identify variants that are informative for the PRS. One is statistical fine-mapping, which may pinpoint potentially causal variants within the implicated association regions. The other is to prioritize variants in cell-type-specific regulatory elements relevant to the target phenotype. Considering the nature of PrCa as a male-specific cancer, in the current study we focused on the androgen receptor (AR), a key factor in PrCa development and progression and also a therapeutic target of PrCa [24]. AR is a transcription factor. Upon binding of the active androgen dihydrotestosterone [25,26], AR is translocated into the nucleus, binds to hormone response elements in DNA, and subsequently regulates the expression of various genes related to proliferation and differentiation [27]. AR was overexpressed and dysregulated in 56% of primary lesions and almost all metastatic lesions of PrCa [28]. Thus, we hypothesized that treating statistically fine-mapped variants within AR-binding sites as putative causal variants may improve the accuracy of predicting the incidence of and death from PrCa, which could be useful in clinical settings.
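The variant-restriction idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` fields, the posterior-probability cutoff of 0.1, the variant identifiers, and all numeric values are hypothetical and chosen only for illustration. The score itself is the usual weighted sum of risk-allele dosages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    var_id: str        # hypothetical identifier, not a real rsID
    beta: float        # per-allele effect size (log odds ratio), made-up value
    pip: float         # fine-mapping posterior probability, made-up value
    in_ar_site: bool   # True if the variant overlaps an AR binding site


def select_variants(variants, pip_threshold=0.1):
    """Keep variants that lie in AR binding sites and pass the posterior cutoff.

    The cutoff of 0.1 is an assumption; the text does not state one.
    """
    return [v for v in variants if v.in_ar_site and v.pip >= pip_threshold]


def polygenic_score(dosages, variants):
    """Weighted sum of risk-allele dosages (0 to 2) over the given variants."""
    return sum(v.beta * dosages.get(v.var_id, 0.0) for v in variants)


def top_fraction_ids(scores, fraction=0.10):
    """Return the subject IDs whose score falls in the top `fraction`."""
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_top])


# Toy example with invented numbers.
variants = [
    Variant("var_a", 0.20, 0.90, True),
    Variant("var_b", 0.50, 0.95, False),  # dropped: outside AR binding sites
    Variant("var_c", 0.10, 0.02, True),   # dropped: low posterior probability
    Variant("var_d", -0.15, 0.40, True),
]
selected = select_variants(variants)
score = polygenic_score({"var_a": 2, "var_b": 1, "var_c": 1, "var_d": 1}, selected)
```

In this toy example only `var_a` and `var_d` contribute to the score. Subjects flagged by `top_fraction_ids` would correspond to the top-10% PRS group used for risk stratification.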

Since European populations are still the major source of genetic association studies in PrCa [14], non-European populations would be useful for finding unreported associations. While the Japanese population has a relatively low prevalence of PrCa compared with European populations and African Americans [29], previous studies have shown substantial genetic overlap among populations [14].

In the present study, we conducted a multi-ancestry meta-analysis for PrCa and identified 171 loci associated with PrCa, including 9 unreported loci. Furthermore, the fine-mapping analysis showed that variants with high posterior probability were enriched in AR-binding sites, and a PRS based on these variants predicted PrCa mortality in cancer-free subjects. These findings provide insights into the biological basis of PrCa and clues for genetic prediction of the development of and death from PrCa, potentially enabling early detection and therapeutic intervention.

Results

A genome-wide association study of Biobank Japan

The overall design of the GWAS is shown in Supplementary Fig. 1. First, we conducted a GWAS of Biobank Japan (BBJ) samples consisting of 8645 cases and 89,536 controls (Supplementary Table 1). We identified 32 significant loci, including one unreported locus. The unreported signal at 16q22.2-16q22.3 peaks at rs8052683, an intronic variant of the tumor suppressor gene ZFHX3 [30,31]. In the GTEx data [32], rs8052683 is an expression quantitative trait locus (eQTL) for ZFHX3 in the prostate, with the risk allele lowering ZFHX3 expression. In line with possible regulation of ZFHX3 expression by this variant, rs8052683 lies in an H3K27ac-marked region in the prostate [33].

We confirmed a relatively high SNP-heritability estimate of 26.2% (SE of 4.3%). As expected, we observed a strong genetic correlation of PrCa susceptibility between BBJ and Europeans (genetic effect correlation = 0.88; P = 0.36 with the Popcorn software, indicating that the genetic correlation is not significantly different from 1; see “Methods”). Genetic correlation analyses also revealed a significant positive correlation (FDR < 0.05) of PrCa with breast cancer (Supplementary Table 2, “Methods”). This correlation was also observed in Europeans (Supplementary Table 3). These findings are consistent with family studies showing that men with a family history of breast or prostate cancer had elevated prostate cancer risks [34,35,36]. Significant negative genetic correlations were found with cardiovascular-related phenotypes, and similar trends were observed in Europeans (peripheral artery disease and chronic heart failure; Supplementary Table 3), supported by enrichment of SNP heritability in the cardiovascular cell group (Supplementary Table 4, “Methods”).

Multi-ancestry meta-analysis identified 171 significant loci including nine unreported loci

Next, we conducted a multi-ancestry meta-analysis, using the results of BBJ and the summary statistics of a previous GWAS [14] for prostate cancer that included European, African, and Hispanic ancestries, assuming a random effect (“Methods”, Table 1, Supplementary Fig. 2). The combined dataset consisted of 107,218 cases and 197,733 controls. A total of 6,720,553 variants were tested, and 171 independent loci reached the genome-wide significance threshold (log10 [Bayes factor (BF)] > 6 and fixed-effect P value < 1 × 10⁻⁵; for further details, see “Methods”). The 171 loci contained nine loci not previously reported for PrCa, including the ZFHX3, ARHGEF28-LINC01334, and GINS1 regions (Table 1), which showed relevance to PrCa or suggested functional mechanisms by which variants affect susceptibility to PrCa. rs4704108, located in the intergenic region between ARHGEF28 and LINC01334, is an eQTL for ENC1 in prostate tissue in the GTEx data [32] and is in high LD with the lead eQTL SNP for ENC1 (rs17636369), suggesting that rs4704108 (or a tightly linked variant) is associated with PrCa by altering the expression of ENC1 in the prostate. The risk allele of rs4704108 decreases the expression of ENC1. rs11087515 is an intronic variant of GINS1, located at 20p11.21; GINS1 is expressed in high-grade prostate cancer and thus may be involved in the mechanism by which cancer cells become invasive or metastatic [37]…
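As a rough illustration of the two-part significance criterion quoted above, the following Python sketch combines per-cohort effect estimates and applies both thresholds. It rests on two assumptions: the fixed-effect P value is computed by inverse-variance weighting, which the text does not specify, and the log10 Bayes factor is supplied as an input from a separate tool, since its calculation is not described here. The function names and numbers are hypothetical.

```python
import math

LOG10_BF_MIN = 6.0   # threshold on log10(Bayes factor), as quoted in the text
FIXED_P_MAX = 1e-5   # threshold on the fixed-effect P value, as quoted in the text


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis (assumed method).

    Returns the pooled effect, its standard error, and a two-sided P value
    from the normal approximation.
    """
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, p_value


def is_genome_wide_significant(log10_bf, fixed_effect_p):
    """Both conditions must hold: log10(BF) > 6 and fixed-effect P < 1e-5."""
    return log10_bf > LOG10_BF_MIN and fixed_effect_p < FIXED_P_MAX


# Toy example: two cohorts with invented effect sizes and standard errors.
beta, se, p = fixed_effect_meta([0.10, 0.10], [0.02, 0.02])
significant = is_genome_wide_significant(6.5, p)
```

Requiring both conditions means a variant with a large Bayes factor but weak fixed-effect support is not declared significant.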
