Molecular Medicine Israel

Host methylation predicts SARS-CoV-2 infection and clinical outcome

Abstract

Background

Since the onset of the SARS-CoV-2 pandemic, most clinical testing has focused on RT-PCR [1]. Host epigenome manipulation post coronavirus infection [2,3,4] suggests that DNA methylation signatures may differentiate patients with SARS-CoV-2 infection from uninfected individuals, and help predict COVID-19 disease severity, even at initial presentation.

Methods

We customized Illumina’s Infinium MethylationEPIC array to enhance immune response detection and profiled peripheral blood samples from 164 COVID-19 patients with longitudinal measurements of disease severity and 296 patient controls.

Results

Epigenome-wide association analysis revealed 13,033 genome-wide significant methylation sites for case-vs-control status. Genes and pathways involved in interferon signaling and viral response were significantly enriched among differentially methylated sites. We observe highly significant associations at genes previously reported in genetic association studies (e.g. IRF7, OAS1). Machine learning models built with sparse regression were highly predictive: the best-fit cross-validated AUC was 93.6% for case-vs-control status, and 79.1%, 80.8%, and 84.4% for hospitalization, ICU admission, and progression to death, respectively.
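
The sparse-regression modeling and cross-validated AUC evaluation described above can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses synthetic data in place of methylation measurements, takes L1-penalized (lasso) logistic regression fitted by proximal gradient descent as one common form of sparse regression, and relies on hypothetical names and settings (`fit_l1_logistic`, `cross_validated_auc`, `lam`, the fold count, the sample and feature counts) chosen purely for illustration.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=150):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def scores(X, w, b):
    """Linear scores; their ranking is all that AUC depends on."""
    return [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in X]

def auc(y_true, s):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [v for v, t in zip(s, y_true) if t == 1]
    neg = [v for v, t in zip(s, y_true) if t == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0
               for a in pos for c in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(X, y, k=5, lam=0.1, seed=0):
    """Mean AUC over k held-out folds; each model is fit on the other folds only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    fold_aucs = []
    for f in range(k):
        test = idx[f::k]
        held = set(test)
        train = [i for i in idx if i not in held]
        w, b = fit_l1_logistic([X[i] for i in train], [y[i] for i in train], lam=lam)
        fold_aucs.append(auc([y[i] for i in test],
                             scores([X[i] for i in test], w, b)))
    return sum(fold_aucs) / k

# Synthetic stand-in for methylation data (an assumption, not study data):
# 200 samples, 15 features, of which only the first 3 differ between groups.
rng = random.Random(42)
n, p, informative = 200, 15, 3
y = [i % 2 for i in range(n)]  # 1 = case, 0 = control
X = [[rng.gauss(0.0, 1.0) + ((yi - 0.5) if j < informative else 0.0)
      for j in range(p)] for yi in y]

mean_auc = cross_validated_auc(X, y)
w_full, _ = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(w_full) if wj != 0.0]
print(f"mean cross-validated AUC: {mean_auc:.3f}")
print(f"features with nonzero weight: {selected}")
```

Because the L1 penalty drives the weights of uninformative features exactly to zero, the fitted model also acts as a feature selector. In a real analysis the penalty strength (`lam` here) would typically be tuned within the cross-validation loop rather than fixed in advance.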

Conclusions

In summary, the strong COVID-19-specific epigenetic signature in peripheral blood, driven by key immune-related pathways and related to infection status, disease severity, and clinical deterioration, provides insights useful for diagnosis and prognosis of patients with viral infections.

Plain language summary

Viral infections affect the body in many ways, including via changes to the epigenome, the sum of chemical modifications to an individual’s collection of genes that affect gene activity. Here, we analyzed the epigenome in blood samples from people with and without COVID-19 to determine whether we could find changes consistent with SARS-CoV-2 infection. Using a combination of statistical and machine learning techniques, we identify markers of SARS-CoV-2 infection as well as of severity and progression of COVID-19 disease. These signals of disease progression were already present in the initial blood draw, taken when patients first arrived at the hospital. Together, these approaches demonstrate the potential of measuring the epigenome for monitoring SARS-CoV-2 status and severity.

Introduction

Coronaviruses (CoV) comprise a large group of human and animal pathogens, including the novel enveloped RNA betacoronavirus referred to as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [5]. This pathogen is associated with coronavirus disease 2019 (COVID-19), first identified in Wuhan, China in 2019 [6] and declared a pandemic on March 11, 2020 [7]. Since the onset of the pandemic, multiple tests for diagnosing COVID-19 have been launched, including real-time reverse transcriptase–polymerase chain reaction (RT-PCR), specific antibody detection, and next-generation sequencing assays that query for current or past infections [1]. With the exception of next-generation sequencing, which can discern viral subtypes, most diagnostic tests are viral strain dependent, can carry a high false negative rate, do not discern if the virus is viable and replicating, and do not predict clinical outcomes of infection [1,8,9]. For example, pre-symptomatic patients may test negative [10,11] while patients who have recovered may continue to test positive though they are no longer infectious [12]. Accurate diagnostics are urgently required to control continued communal spread, to better understand host response, and for the development of vaccines and antivirals [13].

Individuals infected with SARS-CoV-2 have a variable course of infection, ranging from asymptomatic to death. Although the fatality rate varies tremendously according to demographic characteristics and co-morbidities [14], the U.S. ranks as one of the countries with the highest COVID-19 mortality rates [15]. Identification of which SARS-CoV-2-infected patients are most likely to develop severe disease would enable clinicians to triage patients via augmented clinical decision support. Having more information on disease severity has recently become critical due to widespread lack of hospital and intensive care unit (ICU) capacity, necessitating difficult decisions about resource triage. To our knowledge, no test can predict COVID-19 clinical course or severity, although cytokine abundance ratios measured after hospitalization have been proposed as a prognostic indicator of severe outcomes [16].
