Molecular Medicine Israel

A gut microbial signature for combination immune checkpoint blockade across cancer types

Abstract

Immune checkpoint blockade (ICB) targeting programmed cell death protein 1 (PD-1) and cytotoxic T lymphocyte protein 4 (CTLA-4) can induce remarkable, yet unpredictable, responses across a variety of cancers. Studies suggest that there is a relationship between a cancer patient’s gut microbiota composition and clinical response to ICB; however, defining microbiome-based biomarkers that generalize across cohorts has been challenging. This may relate to previous efforts quantifying microbiota to species (or higher taxonomic rank) abundances, whereas microbial functions are often strain specific. Here, we performed deep shotgun metagenomic sequencing of baseline fecal samples from a unique, richly annotated phase 2 trial cohort of patients with diverse rare cancers treated with combination ICB (n = 106 discovery cohort). We demonstrate that strain-resolved microbial abundances improve machine learning predictions of ICB response and 12-month progression-free survival relative to models built using species-rank quantifications or comprehensive pretreatment clinical factors. Through a meta-analysis of gut metagenomes from a further six comparable studies (n = 364 validation cohort), we found cross-cancer (and cross-country) validity of strain–response signatures, but only when the training and test cohorts used concordant ICB regimens (anti-PD-1 monotherapy or combination anti-PD-1 plus anti-CTLA-4). This suggests that future development of gut microbiome diagnostics or therapeutics should be tailored according to ICB treatment regimen rather than according to cancer type.

Main

The past decade has seen an ‘immuno-oncology revolution’ largely driven by the rapid uptake of immune checkpoint blockade (ICB) agents targeting cytotoxic T lymphocyte protein 4 (CTLA-4), programmed cell death protein 1 (PD-1) or programmed death ligand 1 (PD-L1, the ligand of PD-1). Combination ICB (CICB) targeting both PD-1 and CTLA-4 has demonstrated synergistic antitumor activity preclinically1 and is now an approved standard of care for patients with diverse cancers, including melanoma2, clear-cell renal cell carcinoma3, non-small cell lung cancer (NSCLC)4, mesothelioma5 and hepatocellular carcinoma6. However, this success is tempered by the unpredictable nature of responses (seen in only 20–60% of patients across these cancer indications7) and the more frequent severe immune-related adverse effects experienced with CICB when compared to anti-PD-1 or anti-PD-L1 monotherapy8. Thus, despite the promise it offers, the judicious use of CICB is paramount. Additionally, predictive biomarkers for tumor response and/or toxicity would be highly valuable to guide patient management.

Currently approved tumor-agnostic biomarkers for PD-1 blockade include tumor mutational burden and mismatch repair deficiency9; however, both have limitations and rely on available, contemporaneous tumor tissue. A promising ‘tumor-extrinsic’ avenue for predicting ICB response and/or toxicity a priori is assessing a patient’s baseline gut microbiome composition, referring to the community of microbiota (predominantly bacteria) resident within the gastrointestinal tract. Culture-free methods to taxonomically profile fecal microbiomes have progressed from low-resolution 16S rRNA gene sequencing to high-resolution shotgun metagenomics, with studies of clinical cohorts finding associations between baseline Akkermansia muciniphila (lung cancer)10,11,12,13 and Faecalibacterium prausnitzii (melanoma)14,15,16 fecal abundances and tumor responses among anti-PD-1 recipients. Unfortunately, previous meta-analyses across metagenomic studies have found limited reproducibility of these candidate microbial biomarkers for ICB response17,18,19,20. Although this poor reproducibility may be partly attributable to methodological or geographic differences between studies, we hypothesize that species-level taxonomic biomarkers may lack the precision necessary to capture the specific microbial traits associated with ICB response or nonresponse. For example, there is growing awareness of the diversity of intraspecies (strain) variation among commensal bacteria (such as A. muciniphila and F. prausnitzii), with diverging functional potentials and differing associations with host phenotypes21,22.

Here, we performed deep shotgun metagenomic sequencing of baseline fecal samples from 106 patients with diverse rare cancers enrolled in the CA209-538 clinical trial of ipilimumab (anti-CTLA-4) and nivolumab (anti-PD-1) (our discovery cohort). Using a bespoke, genome-resolved metagenomics approach, we discovered baseline subspecies (strain-level) gut microbial abundance signatures of response that reproduce between cancer subtypes and externally to published CICB cohorts despite marked cohort heterogeneity. Notably, we found that the predictiveness of signatures trained on CICB cohorts does not extend to anti-PD-1 monotherapy cohorts. This suggests that, although tumor agnostic, different microbiota–host relationships are relevant to distinct ICB regimens.

Results

Clinical characteristics of the CA209-538 cohort

The CA209-538 clinical trial, titled A phase 2 trial of ipilimumab and nivolumab for the treatment of rare cancers, is a prospective, multicenter clinical trial (NCT02923934) that enrolled 120 patients with histologically confirmed advanced rare solid-organ cancers across five Australian hospital networks (Methods). Notably, patients had diverse tumor histologies grouped into three prespecified cohorts: upper gastrointestinal and biliary cancers (UGB), neuroendocrine neoplasms (NEN) and rare gynecological tumors (GYN). Most patients (n = 108) had received prior systemic anticancer therapies (median of one line (range 0–6 lines)). All participants were treated on trial with combination nivolumab and ipilimumab for up to four doses (induction), followed by nivolumab maintenance for up to 2 years or until progressive disease (PD) or unacceptable toxicity (Fig. 1a). The prespecified secondary endpoint of the trial was to develop ‘tumor-agnostic’ biomarkers for CICB response by leveraging the unique clinical trial design of CA209-538, which included patients with diverse cancers, but with highly standardized clinical and experimental procedures. Therefore, a pretreatment fecal sample was collected from most (n = 106) participants (Table 1). No major clinical differences were observed between microbiome-evaluable patients and those who were not sampled (Supplementary Table 1).

The clinical efficacy and safety outcomes for subgroups from CA209-538 have been published previously23,24,25,26. As expected, overall survival (OS) significantly differed by histology (Extended Data Fig. 1a); however, progression-free survival (PFS) was more consistent (Extended Data Fig. 1b). Notably, the percentage of patients with an objective response (complete response (CR) or partial response (PR)) was remarkably stable across histological cohorts (24–25%) (Fig. 1b), with the Response Evaluation Criteria in Solid Tumors (RECIST) 1.1 best overall response (BOR) being strongly associated with PFS and OS (Fig. 1c,d). Using univariable statistical testing, we found a strong positive monotonic association between albumin and BOR (Kendall P = 0.0056) and a negative monotonic association between neutrophil-to-lymphocyte ratio (NLR) and BOR (Kendall P = 0.0033) (Extended Data Fig. 1c). This was particularly driven by patients with rapid clinical progression (clinical PD (cPD)) having significantly lower albumin and higher NLR, both responses to inflammation shown to be strongly prognostic across cancer types and treatment settings27,28.
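
As a rough illustration of the univariable test described above, the sketch below computes Kendall's rank correlation between a continuous laboratory value and an ordinally coded BOR. The ordinal coding (cPD < PD < SD < PR < CR), the variable names and all values are assumptions made for illustration; they are not trial data and may not match the authors' exact procedure.

```python
from scipy.stats import kendalltau

# Hypothetical ordinal coding of best overall response, ordered from
# worst to best outcome. This ordering is an assumption for illustration.
BOR_RANK = {"cPD": 0, "PD": 1, "SD": 2, "PR": 3, "CR": 4}

# Made-up illustrative values, not data from the CA209-538 trial.
bor = ["cPD", "PD", "PD", "SD", "SD", "PR", "PR", "CR"]
albumin_g_per_l = [28.0, 33.0, 31.0, 36.0, 38.0, 40.0, 39.0, 43.0]

# Convert response categories to ranks, then test for a monotonic association.
bor_ranks = [BOR_RANK[b] for b in bor]
tau, p_value = kendalltau(albumin_g_per_l, bor_ranks)

print(f"Kendall tau = {tau:.2f}, P = {p_value:.4f}")
```

A positive tau indicates that higher albumin tends to accompany a better response category; a variable such as NLR would be expected to give a negative tau under the association reported above.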

Microbiome profiling of baseline fecal samples

To understand the composition of patient gut microbiomes, we performed deep shotgun metagenomic sequencing of the 106 available baseline fecal samples (median 20.4 million paired-end reads per sample). For precise taxonomic quantification, we used a genome-resolved approach of first assembling a study-specific strain reference database using metagenome-assembled genomes (MAGs), supplemented with relevant Genome Taxonomy Database (GTDB) species reference genomes (SRGs) (Methods). Ultimately, this database included 1,397 strain genomes covering 904 known species and additionally included 34 ‘new’ strains that could be taxonomically classified only to the genus level. The Bowtie 2 alignment rates to our tailored strain reference library were high (median 88.4%), with a median of 10.2 million mapped paired-end reads (50%) passing stringent quality control and used for precise strain quantification (Supplementary Fig. 1 and Methods).

We first evaluated whether there were gross compositional differences based on the patients’ BOR. Notably, we found a positive monotonic association between BOR and the fecal Shannon diversity index, a common alpha diversity metric (Fig. 1e). Associations between alpha diversity and cancer patient outcomes have been found in the setting of patients receiving hematopoietic cell transplant29 or cervical cancer chemoradiation30 but not in anti-PD-1 recipients with metastatic melanoma16,18; thus, such associations may be treatment regimen specific. We then assessed intersample beta diversity using the Aitchison distance and also found gross microbial compositional differences by BOR group (permutational multivariate analysis of variance (PERMANOVA) P = 0.0319) (Fig. 1f). Indeed, among the 23 pretreatment clinical and technical metadata tested, BOR group was the metadata variable explaining the most microbial variance (Extended Data Fig. 1d). By contrast, patient PFS at 12 months (PFS12) or OS at 12 months was associated with little microbial variance. A PERMANOVA of baseline microbial variance versus a moving PFS threshold revealed a peak association at <4 months (Extended Data Fig. 1e), indicating that, in our cohort, patients with rapid progression had the most distinct gross baseline microbial compositions.
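
The two diversity measures named above can be sketched in a few lines. This is a minimal illustration, assuming raw per-strain counts as input, the natural logarithm for the Shannon index and a pseudocount of 0.5 to handle zeros before the log-ratio transform; the function names and the pseudocount are hypothetical choices, not the authors' settings. The PERMANOVA step itself is not shown.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def clr(counts, pseudocount=0.5):
    """Centered log ratio: log of each value minus the mean log in the sample."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(sample_a, sample_b, pseudocount=0.5):
    """Aitchison distance: Euclidean distance between CLR-transformed samples."""
    clr_a = clr(sample_a, pseudocount)
    clr_b = clr(sample_b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr_a, clr_b)))


# Made-up strain counts for two samples (illustrative only).
sample_1 = [120, 30, 0, 5, 45]
sample_2 = [10, 80, 25, 0, 60]

print(shannon_index(sample_1), shannon_index(sample_2))
print(aitchison_distance(sample_1, sample_2))
```

Because the CLR subtracts the per-sample mean log, the Aitchison distance is insensitive to differences in sequencing depth when no pseudocount is needed, which is one reason it is commonly used for compositional data.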

Strain–response signatures are valid across cancer types

Given the gross compositional differences, we hypothesized that specific strains may allow for prediction of CICB efficacy in our cohort. We assessed objective response versus progression (RvsP), defined as a RECIST BOR of CR or PR versus PD or cPD, as our primary endpoint. In doing so, we excluded patients with a BOR of stable disease (SD) (n = 29), given its ambiguity in a pan-cancer cohort, in which it may represent disease control or simply indolent cancer behavior. As a sensitivity analysis, we also evaluated PFS12, with responders and those with PFS12 largely overlapping given the durability of CICB efficacy (Extended Data Fig. 2a).
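
The RvsP endpoint definition above amounts to a simple labeling rule, sketched here. The function name and the 1/0 encoding are hypothetical; only the grouping of BOR categories comes from the text.

```python
def rvsp_label(bor):
    """Map a RECIST best overall response to the RvsP endpoint.

    Returns 1 for responders (CR or PR), 0 for progressors (PD or cPD),
    and None for stable disease, which is excluded from the analysis.
    """
    if bor in {"CR", "PR"}:
        return 1
    if bor in {"PD", "cPD"}:
        return 0
    if bor == "SD":
        return None
    raise ValueError(f"Unknown best overall response: {bor!r}")


# Made-up example patients (illustrative only).
patients = {"P01": "PR", "P02": "SD", "P03": "cPD", "P04": "CR", "P05": "PD"}

# Keep only patients with an unambiguous RvsP label.
labels = {
    pid: rvsp_label(bor)
    for pid, bor in patients.items()
    if rvsp_label(bor) is not None
}
print(labels)
```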

We used a supervised machine learning (ML) workflow (Fig. 2a). As input features (predictors), we tested the 15 potentially relevant clinical factors (Methods) and the microbial factors (centered log ratio (CLR)-transformed strain abundances) separately and combined to assess their relative and synergistic performance, respectively. In addition to strain-level rank, we tested microbial abundances aggregated to higher taxonomic ranks (species, genus and family levels) to determine the influence of taxonomic resolution on predictive performance. For each feature set, we performed a thorough random hyperparameter search across 1,000 iterations of a 20 times repeated fivefold cross-validation (Methods). For predictions, we used a random forest (RF) classifier, previously shown to generally outperform other classical ML algorithms for microbiome–host predictions31….
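
The general shape of such a workflow can be sketched with scikit-learn. This is a scaled-down illustration on synthetic data, not the authors' pipeline: the stratified splitting, the ROC AUC scoring metric, the hyperparameter grid, the pseudocount and the much smaller number of iterations and repeats are all assumptions chosen so that the example runs quickly.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, RepeatedStratifiedKFold

rng = np.random.default_rng(0)

# Synthetic stand-in data: 60 samples by 30 strains, with balanced labels.
n_samples, n_strains = 60, 30
counts = rng.poisson(lam=20, size=(n_samples, n_strains)).astype(float)
y = np.array([0, 1] * (n_samples // 2))


def clr_transform(count_matrix, pseudocount=0.5):
    """Row-wise centered log ratio transform (the pseudocount is an assumption)."""
    logged = np.log(count_matrix + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)


X = clr_transform(counts)

# Repeated fivefold cross-validation; 2 repeats here instead of 20 for speed.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

# Hypothetical search space; the real search covered far more iterations.
param_distributions = {
    "n_estimators": [50, 100],
    "max_features": ["sqrt", "log2"],
    "min_samples_leaf": [1, 2, 5],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=3,
    scoring="roc_auc",
    cv=cv,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Because the labels here are unrelated to the synthetic counts, the cross-validated score should hover near chance. Swapping `X` for abundances aggregated to species, genus or family rank is how the effect of taxonomic resolution could be compared under the same search procedure.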
