Abstract
While the genomes of normal tissues undergo dynamic changes over time, little is understood about the temporal-spatial dynamics of genomes in premalignant tissues that progress to cancer compared to those that remain cancer-free. Here we use whole genome sequencing to contrast genomic alterations in 427 longitudinal samples from 40 patients with stable Barrett’s esophagus and 40 patients with Barrett’s esophagus who progressed to esophageal adenocarcinoma (ESAD). We show that the same somatic mutational processes are active in Barrett’s tissue regardless of outcome, with high levels of mutation, alterations in ESAD genes, focal chromosomal alterations, and similar mutational signatures. The critical distinction between patients with stable Barrett’s esophagus and those who progress to cancer is the acquisition and expansion of TP53−/− cell populations having complex structural variants and high-level amplifications, which are detectable up to six years prior to a cancer diagnosis. These findings reveal the timing of common somatic genome dynamics in stable Barrett’s esophagus and define key genomic features specific to progression to esophageal adenocarcinoma, both of which are critical for cancer prevention and early detection strategies.
Introduction
Normal tissues have recently been shown to harbor surprisingly extensive somatic mutations, the vast majority of which have little clinical consequence1,2,3,4,5,6. Barrett’s esophagus (BE), a predominantly benign metaplasia that arises in the esophagus in response to chronic gastric reflux7, also develops somatic mutations, but can further evolve extensive genomic alterations which confer significantly increased risk of progression to esophageal adenocarcinoma (ESAD)8,9,10,11. While important advances have been made in understanding the genomics of BE and ESAD, a key remaining question is defining molecular features, including somatic genomic dynamics, that can be used to stratify patients with BE at the highest risk of a cancer outcome (CO) vs. those likely to remain cancer free (noncancer outcome, NCO), and target those requiring aggressive treatment (ablation, endoscopic resection, surgery) vs. conservative monitoring for early detection of cancer7.
Cancer-only studies have uncovered a vast array of genomic alterations in cancer, but are unable to provide a direct comparison of somatic genome evolution in benign neoplastic tissue from non-progressing patients with that in tissue from patients who were ultimately diagnosed with cancer. BE is an excellent in vivo model in which to study these genome dynamics. While both BE and ESAD have very high point mutation loads, very few genes are commonly mutated across patients10,12,13,14, and a large number of low-frequency gene alterations affect critical biological processes in ESAD15,16,17. ESAD is characterized by frequent somatic TP53 mutation and chromosomal copy number alterations (genome doubling [GD], aneuploidy, chromosomal instability)9,12,13,18,19,20. Complex structural chromosomal features are frequently detected in ESAD13,21,22,23 and some of these events can be detected years before ESAD diagnosis9,11,14,19,23,24. However, the targeted, exome, and low-pass whole-genome sequencing approaches applied to date have been unable to resolve these genome-wide mutational processes and complex structural variant features in sufficient detail. To address this gap, we conducted a large-scale deep whole-genome sequencing (WGS) study of BE with a validated cancer outcome based on a longitudinal cohort. This is a unique case-control WGS study of multi-region, well-annotated, longitudinal, purified endoscopic biopsies from patients with BE who have been followed without endoscopic therapeutic interventions (e.g. ablation, mucosal resection). Our WGS data, spanning 427 samples across 80 patients, allowed us to compare 40 controls with stable BE who never progressed to ESAD with 40 cases who progressed to an early, endoscopically detected incident cancer.
Here we show genomic states characteristic of BE and identify chromosomal structural dynamics common to all BE genomes. The study design allows comprehensive assessment of ESAD genes of interest in NCO and CO patients, revealing TP53 dynamics and genomic features specific to cancer risk. These results support emerging evidence that many somatic alterations detected in cancer are also detected in benign tissues and thus are not obligate for cancer progression. This study provides a valuable genomic resource and serves as a template for future precancer atlas efforts25.
Results
Longitudinal multi-sample study of cases with BE who progressed to an ESAD outcome compared to controls with BE who did not progress
We sequenced whole genomes of 340 purified biopsies at high depth (median 76X [range 40X–106X]) as well as 62 blood and 25 normal gastric control samples at medium depth (median 39X [range 29X–64X]) across 80 individuals with BE (Supplementary Data File 1). These patients included 40 BE cases who progressed to ESAD (“cancer outcome”, CO) and 40 who did not progress to ESAD (“non-cancer outcome”, NCO) during a median follow-up period of 17.47 years (range 4.46–29.63 years) (Fig. 1, Supplementary Data Fig. 1). For each patient, we assessed two spatially mapped samples from each of two timepoints, “T1” and “T2” (mean time between T1 and T2 was 2.9 years in CO and 3.4 years in NCO), where T2 in the CO patients was the endoscopy at which cancer was first diagnosed (Supplementary Data File 2). We matched NCO to CO using baseline total somatic chromosomal alterations (SCA – copy gains, losses and copy neutral loss of heterozygosity (cnLOH))19, age at T1 (T1 = first endoscopy with sufficient sample availability), and time between T1 and T2 (T2 in NCO = follow-up endoscopy randomly selected such that the distribution of time between T1 and T2 was similar in CO and NCO populations). In 10 NCO, we sequenced a third time point, T3, sampled a mean of 13.2 years after T1. We purified each biopsy to separate BE epithelium from the stroma and extracted DNA from purified epithelium for WGS and 2.5 M Illumina SNP array analyses (Supplementary Data File 3).
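The sample counts in this design can be cross-checked with simple arithmetic. The following is a minimal sketch using only the numbers reported above; the variable names are illustrative, and it assumes that T3 contributed two biopsies per patient, as T1 and T2 did.

```python
# Sample accounting from the counts reported in the text.
patients = 80                # 40 CO + 40 NCO
biopsies_per_timepoint = 2   # two spatially mapped samples per timepoint
timepoints = 2               # T1 and T2
nco_with_t3 = 10             # NCO patients with a third time point (T3)

# Assumption: T3 was sampled with two biopsies per patient, like T1 and T2.
biopsies = (patients * timepoints * biopsies_per_timepoint
            + nco_with_t3 * biopsies_per_timepoint)

controls = 62 + 25           # blood + normal gastric control samples
total_samples = biopsies + controls

print(biopsies, total_samples)
```

Under that assumption the totals agree with the 340 biopsies and 427 samples stated in the text.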
High “trunk” mutation load (those shared by all four biopsies per patient) was observed in a majority of both NCO (24 patients, mean 4,439 mutations in trunk [range 938–13,762]) and CO (26 patients, mean 7,741 in trunk [range 834–32,558]), indicative of the early expansion of a highly mutant clone before T1 and a single clonal origin of the BE segment in most patients. In contrast, the remaining 30 patients had low (<289) trunk mutations, consistent with a multiclonal origin or early divergence of clones during the establishment of the BE segment. Across these 24 NCO and 26 CO patients, trunk mutations were detected in 1,254 genes in at least one patient (Supplementary Data File 7), with functional trunk mutations significantly less frequent in NCO (mean 15.5 genes/patient [range 1–61]) compared to CO (mean 39.7 genes/patient [range 3–177]) (P = 0.040). Functional trunk mutations included 39 ESAD-associated genes (in 12 NCO and 21 CO)10,13,27 (Supplementary Data File 8) and 17 out of 66 gastroesophageal driver genes (in 7 NCO and 17 CO) identified in Dietlein et al.28. This indicates that highly selected mutations can arise long before the onset of ESAD, as well as early in BE tissue that does not progress to ESAD over long follow-up.
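The trunk definition used here (mutations shared by all four biopsies of a patient) amounts to a set intersection. The sketch below illustrates that idea on invented toy data; the mutation tuple layout, the biopsy labels, and the function name `trunk_mutations` are hypothetical and do not describe the study's actual pipeline.

```python
from typing import Dict, Set, Tuple

# Hypothetical mutation key: (chromosome, position, reference base, alternate base).
Mutation = Tuple[str, int, str, str]


def trunk_mutations(biopsies: Dict[str, Set[Mutation]]) -> Set[Mutation]:
    """Return the mutations present in every biopsy of one patient."""
    if not biopsies:
        return set()
    return set.intersection(*biopsies.values())


# Toy patient with four biopsies (two per timepoint); the values are invented.
m1 = ("chr17", 7_675_000, "C", "T")
m2 = ("chr9", 21_970_000, "G", "A")
m3 = ("chr1", 1_000_100, "A", "G")
m4 = ("chr2", 2_000_200, "T", "C")
m5 = ("chr3", 3_000_300, "G", "T")

patient = {
    "T1_a": {m1, m2, m3},
    "T1_b": {m1, m2},
    "T2_a": {m1, m2, m4},
    "T2_b": {m1, m2, m5},
}

trunk = trunk_mutations(patient)
print(len(trunk))  # number of trunk mutations in the toy patient
```

In this toy case only `m1` and `m2` are in the trunk. A patient whose biopsies share few or no mutations would yield a small or empty intersection, which corresponds to the low-trunk group described above.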
Despite evidence for the spatial spread of clones, mutations private to single biopsies were more numerous than shared mutations in both NCO and CO (Fig. 2d), indicating ongoing mutational processes during evolution. The pairwise divergence between biopsies was highly variable across patients and did not significantly distinguish NCO and CO, regardless of mutation class (i.e., functional/nonfunctional, clonal/subclonal). We directly compared total mutation load with both patient age at the time of biopsy and time between T1 and T2 and found no significant association in either NCO or CO. Using an EM algorithm to infer the average change in mutation load between T1 and T2 (see Methods), we found no significant change in NCO (P = 0.9), but a small, significant average increase of 947 mutations/year per biopsy (approximately 6.5% of the median mutation load of a CO biopsy) between T1 and T2 in CO (P = 0.012, 95% CI 128–1,766). Taken together, these results suggest that in most patients with BE, independent of ESAD progression, there is early clonal expansion of a highly mutant progenitor, including mutations in ESAD-associated genes, with continued localized mutation accumulation.
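The notions of private mutations and pairwise divergence can be made concrete with a small sketch. The text here does not specify the divergence metric, so the example assumes a Jaccard distance (the fraction of the combined mutations of two biopsies that is not shared); the function names and toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, Set, Tuple


def private_counts(biopsies: Dict[str, Set[Hashable]]) -> Dict[str, int]:
    """Count mutations seen in exactly one biopsy, per biopsy."""
    counts = {}
    for name, muts in biopsies.items():
        # Union of the mutations found in all other biopsies of the patient.
        others = set().union(*(m for n, m in biopsies.items() if n != name))
        counts[name] = len(muts - others)
    return counts


def pairwise_divergence(
    biopsies: Dict[str, Set[Hashable]],
) -> Dict[Tuple[str, str], float]:
    """Jaccard distance between every pair of biopsies (assumed metric)."""
    result = {}
    for a, b in combinations(sorted(biopsies), 2):
        union = biopsies[a] | biopsies[b]
        shared = biopsies[a] & biopsies[b]
        result[(a, b)] = 1 - len(shared) / len(union) if union else 0.0
    return result


# Toy data: integers stand in for mutation identifiers.
toy = {"A": {1, 2, 3, 4}, "B": {1, 2, 5}, "C": {1, 6, 7, 8}}

private = private_counts(toy)
divergence = pairwise_divergence(toy)
print(private)
print(divergence)
```

A value of 0 means two biopsies carry identical mutation sets and a value of 1 means they share none. Other distance definitions would be equally reasonable for this comparison.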
Mutation signatures reflect the combination of intrinsic mutagenic processes and extrinsic exposure to mutagens over the history of a tissue29,30. To determine whether BE biopsies from CO show evidence of distinct mutagenic processes relative to NCO, we used SigProfiler to extract single-base (SB) signatures across all biopsies. We detected 10 COSMIC SB signatures, nine of which were previously identified at similar proportions in ESAD29 (Fig. 2e, Supplementary Data File 9). By patient, we found no significant difference between CO and NCO in the detection of any of the ten signatures (adjusted p values all >0.05); by biopsy, we found SBS18 was detected marginally more frequently in NCO (P = 0.041) and SBS40 marginally more frequently in CO (P = 0.041). SBS17a/b (unknown etiology; common in stomach and esophageal adenocarcinoma) and SBS5 (unknown etiology; correlated with age) were nearly ubiquitous across patients and samples, and typically had high numbers of assigned mutations in all four biopsies per patient. Previous studies have consistently detected SBS17a/b in both BE and ESAD10,12,14,27,31,32,33. Our results further demonstrate that this mutation signature was not associated with cancer outcome, but rather with the tissue environment in which BE develops. The proportion of both SBS17a and SBS17b increased significantly with increasing single base mutation load in both NCO and CO (P < 0.0001), as has also been previously observed in ESAD10.
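Single-base signatures such as the COSMIC SBS set are defined over 96 mutation classes: six substitution types on the pyrimidine strand, each with 16 flanking-base contexts. The sketch below shows how a single-base change could be assigned to one of these classes. It illustrates the classification scheme only and is not SigProfiler's code; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string made of A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs96_channel(trinucleotide: str, alt: str) -> str:
    """Map a reference trinucleotide (mutated base in the middle) and the
    alternate base to a 96-class label such as 'A[C>T]G'."""
    tri = trinucleotide.upper()
    alt = alt.upper()
    if len(tri) != 3 or any(base not in COMPLEMENT for base in tri):
        raise ValueError("trinucleotide must be three bases from A, C, G, T")
    if alt not in COMPLEMENT or alt == tri[1]:
        raise ValueError("alt must be a base different from the reference")
    if tri[1] in "AG":
        # Purine reference: report the change on the opposite strand so the
        # mutated base is always a pyrimidine (C or T).
        tri = reverse_complement(tri)
        alt = COMPLEMENT[alt]
    return f"{tri[0]}[{tri[1]}>{alt}]{tri[2]}"


# Enumerate every possible single-base change in every trinucleotide context.
all_channels = {
    sbs96_channel("".join(tri), alt)
    for tri in product("ACGT", repeat=3)
    for alt in "ACGT"
    if alt != tri[1]
}
print(len(all_channels))
```

Counting mutations per class for each biopsy gives the 96-element profile that signature extraction tools take as input.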
We examined whether combinations of mutation signatures separated patients based upon their progression status. Hierarchical clustering using cosine similarity of mutation signatures resulted in three clusters of patients (Fig. 2f): i) ubiquitous SBS5 and SBS17a/b at high counts, ii) ubiquitous SBS1, SBS5, SBS17a/b, plus one or more biopsies with SBS40, and iii) ubiquitous signatures plus a combination of SBS18, SBS2/SBS13, and SBS34. However, the count of NCO vs. CO patients in each cluster was not significantly different (P = 0.999, 0.076, 0.144 for clusters 1, 2, and 3, respectively). In both NCO and CO, roughly half of the patients had variable mutation signatures between biopsies, consistent with ongoing clone-specific or localized mutagenic insults within individual patients (Supplementary Data Fig. 2). Of the eight signatures with sufficient numbers of mutations for evaluation, SBS1, SBS40, SBS5, and SBS17a/b had similar mean proportions of mutations per patient in trunk (0.21–0.24), compared to lower mean proportions for SBS18, SBS2 and SBS13 (0.05–0.12) (Fig. 2g, Supplementary Data File 10), suggesting SBS18, SBS2 and SBS13 arise more often as localized or “later” events during the evolution of the BE tissue. Overall, mutation signatures are very similar between NCO and CO, suggesting the exposures that cause them are characteristic of BE rather than specific to ESAD development…
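Cosine similarity compares the relative mix of signatures between two patients independently of their total mutation counts. The following is a minimal sketch of the similarity measure alone, with invented signature counts. The clustering step itself, the signature order in each vector, and the patient labels are not taken from the study.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two signature-count vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


# Hypothetical per-patient counts, ordered as
# [SBS1, SBS5, SBS17a, SBS17b, SBS18, SBS40]; the values are invented.
patient_x = [500, 4000, 2500, 6000, 0, 0]
patient_y = [1000, 8000, 5000, 12000, 0, 0]   # same mix, twice the load
patient_z = [400, 3000, 1000, 2000, 1500, 2500]

sim_xy = cosine_similarity(patient_x, patient_y)
sim_xz = cosine_similarity(patient_x, patient_z)

# A distance of 1 - similarity is a common input to hierarchical clustering.
dist_xy = 1 - sim_xy
dist_xz = 1 - sim_xz
print(sim_xy, sim_xz)
```

Because `patient_y` has the same signature mix as `patient_x` at twice the mutation load, their similarity is 1 up to rounding, whereas `patient_z` is less similar because it carries additional SBS18 and SBS40 counts.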