Molecular Medicine Israel

High-confidence cancer patient stratification through multiomics investigation of DNA repair disorders

Abstract

Multiple cancer types have limited targeted therapeutic options, in part due to incomplete understanding of the molecular processes underlying tumorigenesis and significant intra- and inter-tumor heterogeneity. Identification of novel molecular biomarkers that stratify cancer patients by survival outcome may provide new opportunities for target discovery and the subsequent development of tailored therapies. Here, we applied the artificial intelligence-driven PandaOmics platform (https://pandaomics.com/) to explore gene expression changes in rare DNA repair-deficient disorders and identify novel cancer targets. Our analysis revealed that CEP135, a scaffolding protein associated with early centriole biogenesis, is commonly downregulated in DNA repair diseases with high cancer predisposition. Further screening of survival data for 33 cancers available in the TCGA database identified sarcoma as a cancer type in which lower survival was significantly associated with high CEP135 expression. Stratification of cancer patients based on CEP135 expression enabled us to examine therapeutic targets that could be used to improve existing therapies against sarcoma. This examination combined the PandaOmics target-ID algorithm with in vitro studies, which revealed polo-like kinase 1 (PLK1) as a potential therapeutic candidate in sarcoma patients with high CEP135 levels and poor survival. While further target validation is required, this study demonstrates the potential of in silico studies for rapid biomarker discovery and target characterization.

Introduction

Maintenance of genomic integrity plays a pivotal role in preventing the development of age-associated diseases such as cancer and neurodegeneration. Exposure of somatic cells to multiple endogenous and exogenous stressors results in the accumulation of unrepaired DNA lesions and rearrangements, leading to overall genome instability, which is a hallmark of cellular transformation and cancer progression [1,2]. Molecular mechanisms underlying this condition include alterations in the DNA repair machinery, replication stress, altered transcriptional responses and changes in cell cycle regulation [3,4,5]. Multiple types of solid and bone marrow malignancies display distinct defects in certain pathways of the DNA damage response (DDR), and several therapeutic strategies targeting repair mechanisms have been developed and validated in clinical settings [6,7]. For example, higher levels of genome instability are seen in breast cancer cells carrying mutations in BRCA genes, which play a critical role in double-strand break repair. BRCA-deficient cells with defective homologous recombination rely on the more error-prone non-homologous end joining repair and are sensitive to PARP inhibitors, providing a strategy for selectively inducing synthetic lethality in cancer cells [8]. Additionally, defects in the DNA mismatch repair genes MLH1 and MSH2, associated with a subset of colorectal tumors with microsatellite instability, may lead to abundant mutation-derived neoantigens that trigger a robust immune response to checkpoint inhibitor therapy [9,10]. Bone marrow-derived cancers are also characterized by mutations in key DNA damage response and DNA repair genes. For instance, mutations in ATM, a key gene for DDR activation, and in TP53 have frequently been detected in several types of lymphomas [11,12].

Importantly, several premature aging diseases caused by genetic impairments in the DNA repair machinery are also associated with increased cancer risk [13,14]. For example, inherited mutations in the ATM gene lead to ataxia-telangiectasia (A-T), a rare premature aging disease with features of neurodegeneration and an increased risk of developing lymphomas and various solid malignancies, including breast and digestive tract cancers [14,15,16,17]. Another example is Nijmegen breakage syndrome (NBS), in which mutations in the NBS1 gene, which encodes a member of the MRE11-RAD50-NBS1 (MRN) complex that serves as a sensor of DNA damage, lead to immunodeficiency and a higher risk of developing cancer [13,18,19,20,21,22]. Furthermore, mutations in the RecQL DNA helicase WRN may lead to Werner syndrome, which is characterized by an increased incidence of cardiovascular disease and cancer, in particular sarcomas and skin and thyroid malignancies. While heritable diseases with impaired DNA repair function show considerable genetic and phenotypic variability among them [13,14], increased cancer risk is a clinical phenotype shared across multiple DNA repair disorders. Since not all patients with DNA repair disorders develop malignant diseases, identification of altered “cancer-prone” genes associated with tumorigenic processes could lead to the discovery of novel cancer risk stratification biomarkers and, subsequently, therapeutic targets.

Identification of therapeutic targets is a crucial step in the drug discovery process. Erroneous targets selected at an early stage of drug development may result in a costly drug discovery program and failed clinical trials. While the development of automated approaches for drug target discovery is critical for maximizing the success rate, it remains a challenging task due to a number of limitations, such as the complexity of the data and batch effects. These challenges cannot be resolved by traditional methods, such as gene expression arrays, but artificial intelligence (AI)-driven approaches have recently demonstrated their efficacy in this setting across multiple biological contexts, including embryonic-fetal transition [23] and muscle aging [24]. Advanced pathway analysis and AI algorithms applied to multiomics data are capable of identifying novel targets and biomarkers even when prior evidence is insufficient. This holds for the most frequently available dynamic omics data, including gene expression and proteomics [25,26,27], as well as for less abundant data types such as the phosphorylome [28] and even the microbiome [29]. Moreover, AI has been successfully applied to existing targets for which crystal structures are not available [30].

In the current paper, we applied a three-tier approach in which (1) knowledge about diseases with cancer prevalence enabled (2) identification of biomarker genes and (3) subsequent discovery of possible therapeutic targets (Fig. 1). We took advantage of the cancer-prone phenotype shared by diverse DNA repair diseases with distinct predominant phenotypes (neurodegeneration, immunodeficiency and cardiovascular disease), and applied differential gene expression analysis driven by the AI-based PandaOmics platform to identify genes that are commonly perturbed among the selected diseases and that could be associated with cancer progression. The most significantly dysregulated gene, CEP135, was then found to serve as a novel biomarker that, among the 33 cancer types in the TCGA database, stratifies sarcoma patients into groups with better and poorer survival outcomes (Fig. 1). Furthermore, using PandaOmics-based TargetID, we revealed gene candidates that could be used as drug discovery targets for more efficient elimination of cancer cells in sarcoma patients with high expression of CEP135 and lower survival probability….
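
The first two tiers described above can be illustrated with a minimal Python sketch. It is not the PandaOmics implementation, whose internals are not described here. The cutoffs (log fold change of -1.0, p-value of 0.05), the median split, the use of a Kaplan-Meier estimate, and all function names, gene labels, patient IDs and numbers are assumptions made purely for illustration.

```python
from statistics import median


def commonly_downregulated(datasets, lfc_cutoff=-1.0, p_cutoff=0.05):
    """Genes passing both (assumed) cutoffs in every disease-vs-control comparison."""
    hits = None
    for table in datasets.values():
        passing = {
            gene
            for gene, (log_fc, p_value) in table.items()
            if log_fc <= lfc_cutoff and p_value < p_cutoff
        }
        hits = passing if hits is None else hits & passing
    return hits or set()


def split_by_median(expression):
    """Split patient IDs into (high, low) groups around the median expression."""
    cutoff = median(expression.values())
    high = sorted(p for p, value in expression.items() if value > cutoff)
    low = sorted(p for p, value in expression.items() if value <= cutoff)
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as (time, survival) pairs at each observed event time."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy values invented for illustration only; they are NOT data from the study.
# Each entry maps a hypothetical gene label to (log fold change, p-value).
toy_datasets = {
    "disease_1": {"GENE_A": (-1.8, 0.001), "GENE_B": (-1.2, 0.20), "GENE_C": (0.9, 0.01)},
    "disease_2": {"GENE_A": (-1.5, 0.004), "GENE_B": (-1.4, 0.01), "GENE_C": (-1.1, 0.03)},
    "disease_3": {"GENE_A": (-2.1, 0.0005), "GENE_B": (-0.3, 0.60), "GENE_C": (-1.3, 0.02)},
}
shared = commonly_downregulated(toy_datasets)
print(sorted(shared))

# Hypothetical patients: expression of the candidate biomarker, then
# (months of follow-up, 1 = death observed, 0 = censored).
toy_expression = {"P1": 2.0, "P2": 9.5, "P3": 3.1, "P4": 8.7, "P5": 1.4, "P6": 7.9}
toy_followup = {
    "P1": (60, 0), "P2": (10, 1), "P3": (48, 1),
    "P4": (14, 1), "P5": (72, 0), "P6": (30, 0),
}
high, low = split_by_median(toy_expression)
for label, group in (("high", high), ("low", low)):
    times = [toy_followup[p][0] for p in group]
    events = [toy_followup[p][1] for p in group]
    print(label, kaplan_meier(times, events))
```

In a real analysis the two survival curves would also be compared with a formal test such as the log-rank test, and the expression cutoff would be chosen and justified rather than fixed at the median.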
