Molecular Medicine Israel

Integrated analysis of cervical squamous cell carcinoma cohorts from three continents reveals conserved subtypes of prognostic significance

Abstract

Human papillomavirus (HPV)-associated cervical cancer is a leading cause of cancer deaths in women. Here we present an integrated multi-omic analysis of 643 cervical squamous cell carcinomas (CSCC, the most common histological variant of cervical cancer), representing patient populations from the USA, Europe and Sub-Saharan Africa, and identify two CSCC subtypes (C1 and C2) with differing prognosis. C1 and C2 tumours can be driven by either of the two most common HPV types in cervical cancer (16 and 18) and, while HPV16 and HPV18 are overrepresented among C1 and C2 tumours respectively, the prognostic difference between groups is not due to HPV type. C2 tumours, which comprise approximately 20% of CSCCs across these cohorts, display distinct genomic alterations, including loss or mutation of the STK11 tumour suppressor gene, increased expression of several immune checkpoint genes, and differences in the tumour immune microenvironment that may explain the shorter survival associated with this group. In conclusion, we identify two therapy-relevant CSCC subtypes that share the same defining characteristics across three geographically diverse cohorts.

Introduction

Despite screening and the introduction of prophylactic human papillomavirus (HPV) vaccination in developed countries, cervical cancer continues to be one of the leading worldwide causes of cancer-related deaths in women [1]. Prognosis for patients with metastatic disease remains poor; new treatments and effective molecular markers for patient stratification are therefore urgently required. Cervical cancer is caused by at least 14 high-risk human papillomaviruses (hrHPVs); HPV16 and HPV18 together account for over 70% of cases worldwide, with some variation by region [1,2,3,4]. Cervical squamous cell carcinoma (CSCC) is the most common histological subtype of cervical cancer, accounting for approximately 60–70% of cases, again with some variation across populations [2]. Adeno- and adenosquamous histology are both associated with poor prognosis [5,6,7,8], while the relationship, if any, between HPV type and cervical cancer prognosis remains unclear [9]. HPV type is also associated with histology: worldwide, between 1990 and 2010, HPV16 and HPV18 were reported in 59.3% and 13.2% of CSCCs and in 36.3% and 36.8% of adenocarcinomas, respectively [2]. Previous landmark studies described the genomic landscape of cervical cancer in different populations [10,11,12,13] and in some cases identified subtypes based on gene expression, DNA methylation and/or proteomic profiles [10,11]. The Cancer Genome Atlas (TCGA) network identified clusters based on RNA, micro-RNA, protein/phospho-protein, DNA copy number alterations and DNA methylation patterns, and combined data from multiple platforms to define integrated iClusters [10].
In their analysis, only clustering based on the expression levels and/or phosphorylation state of 192 proteins, as measured by reverse-phase protein array (RPPA), was associated with outcome: significantly shorter overall survival (OS) was observed for a cluster of cervical cancers exhibiting increased expression of Yes-associated protein (YAP) and features associated with epithelial-to-mesenchymal transition (EMT) and a reactive tumour stroma. TCGA’s RPPA analysis was restricted to 155 tumours, including SCCs, adeno- and adenosquamous carcinomas. We therefore set out to test the hypothesis that, with data from more samples, we could identify a set of transcriptional and epigenetic features associated with prognosis within CSCC, and to establish whether these features are also present in independent patient cohorts representing different geographical locations and ethnicities. To identify molecular subtypes and prognostic correlates, we assembled a set of 643 CSCCs (all HPV-positive) for which clinico-pathological data and genome-wide DNA methylation profiles were either publicly available or generated in this study, and for which, in most cases, matched gene expression and somatic mutation data were also available (Table 1). Here, we show that CSCC samples from TCGA’s cohort can be classified into one of two subgroups (C1 or C2) using gene expression or DNA methylation profiles, and that the key features of these subgroups are conserved across validation cohorts from different continents. We identify differences in the tumour immune microenvironment and genetic alterations between subgroups, and use detailed disease-specific survival data from our European cohort to validate the prognostic significance of the C1/C2 classification.

Results

Identification of two gene expression-based clusters in cervical squamous cell carcinoma
