Molecular Medicine Israel

Mutations in the monkeypox virus replication complex: Potential contributing factors to the 2022 outbreak

Highlights

L108F in MPXV DNA polymerase emerged in the 2022 outbreak.

L108F can enhance DNA binding affinity, processivity and drug sensitivity of F8L.

Mutation L108F can change the fidelity and sensitivity to nucleoside inhibitors.

G9R mutations that emerged in 2022 are likely to affect the interaction of G9R with E4R.

Abstract

Attributes contributing to the current monkeypox virus (MPXV) outbreak remain unknown. It has been established that mutations in viral proteins may alter phenotype and pathogenicity. To assess if mutations in the MPXV DNA replication complex (RC) contribute to the outbreak, we conducted a temporal analysis of available MPXV sequences to identify mutations, and built a model of the RC using structures of related viral and eukaryotic proteins and the structure prediction method AlphaFold. Ten mutations within the RC were identified and mapped onto the RC model to infer their roles. Two mutations in F8L (RC catalytic subunit) and two in G9R (a processivity factor) were ∼100% prevalent in the 2022 sequences. F8L mutation L108F emerged in 2022, whereas W411L emerged in 2018 and persisted in 2022. L108 is topologically located to enhance DNA binding affinity of F8L. Therefore, mutation L108F can change the fidelity, sensitivity to nucleoside inhibitors, and processivity of F8L. Surface-exposed W411L likely affects the binding of regulatory factor(s). G9R mutations S30L and D88N emerged in 2022 and may impact the interaction of G9R with E4R (uracil DNA glycosylase). The remaining six mutations, which appeared in 2001, reverted to the sequence of the first (1965 Rotterdam) isolate. Two nucleoside inhibitors, brincidofovir and cidofovir, have been approved for MPXV treatment. Cidofovir resistance in vaccinia virus is achieved by A314T and A684V mutations. Both A314 and A684 are conserved in MPXV. Therefore, resistance to these drugs in MPXV may arise through similar mechanisms.

1. Introduction

According to the Centers for Disease Control and Prevention (CDC), the recent monkeypox virus (MPXV) outbreak has spread to 106 countries (as of September 26, 2022), of which 99 had not previously reported monkeypox (MPX) cases (https://www.cdc.gov/poxvirus/monkeypox/response/2022/world-map.html). A total of 65,415 confirmed cases had been reported worldwide as of September 26, 2022. Due to the rapid spread of MPXV, the FDA has granted emergency use authorization of the Jynneos smallpox vaccine and of the drugs tecovirimat, brincidofovir, and cidofovir for MPXV. The Jynneos vaccine, like all smallpox vaccines, is an attenuated vaccinia virus (VACV) [1,2].

MPXV (genus Orthopoxvirus; family Poxviridae) has a double-stranded DNA genome of ∼196,858 base pairs that encodes ∼200 proteins [3], including F8L, which is a family B DNA polymerase (DNA pol). DNA polymerases are critical enzymes for the replication and repair of genomic DNA in organisms ranging from archaea to mammals. These enzymes synthesize DNA in template-dependent and -independent manners. DNA polymerases have been classified into seven families: A, B, C, D, X, Y, and RT (reverse transcriptase), based on amino acid conservation and structural homology [4,5,6]. DNA polymerases that showed sequence homology with E. coli DNA pols I, II, and III were classified into families A, B, and C, respectively. Family D pols are unique to archaea, whereas eukaryotic DNA pols β, λ, μ, and TdT (terminal transferases) belong to family X. DNA pols sharing sequence homology with E. coli pols IV/V belong to family Y. The RT family encompasses DNA pols from retroviruses and eukaryotic telomerase. Regardless of the family, all DNA pols share a conserved active site and template-primer (TP) binding mode [7,8]. Family B DNA pols are found in a wide range of organisms (eukaryotes, prokaryotes, bacteriophages, viruses, etc.) [5]. In eukaryotes, at least four DNA pols (α, δ, ε, and ζ) belong to family B. Additionally, viruses such as herpes simplex virus 1 (HSV1), human cytomegalovirus (HCMV), and poxviruses encode family B DNA pols.

Mutations in DNA pols can alter the phenotype of the virus. For example, a point mutation, N752D, in the equine herpes virus 1 (EHV-1) polymerase causes inflammation in the central nervous system and poor muscle control, leading to uncoordinated movements termed ataxia [9]. In VACV E9 (DNA polymerase catalytic subunit), mutations A498V, A684V, and S851Y confer an increased mutation frequency in forward-mutagenesis screens [10,11]. The three residues A498, A684 and S851 are conserved between MPXV F8L (DNA polymerase) and VACV E9. To assess if such mutations exist in the MPXV DNA RC, and whether these mutations are contributing to the current outbreak, we conducted a temporal sequence analysis of available MPXV isolates to identify mutations. We leveraged existing data, including the crystal structure of VACV E9 protein (family B pol) [12], which shares ∼98% homology with MPXV F8L, to generate a molecular model of MPXV F8L. Additionally, a number of solved structures comprising pols α, δ, ε, ζ, and HSV1 pol in complex with TP and inhibitor [13,14,15,16,17], together with a state-of-the-art protein structure prediction method, AlphaFold [18], were used to build a DNA RC, to infer the impact of mutations in 2022 MPXV viruses, and to predict the efficacy of existing vaccines and antivirals in mitigating the 2022 MPXV outbreak.

2. Materials and methods

2.1. Sequence acquisition

The sequences used in the analysis were acquired from the NCBI nucleotide sequence database through the NCBI Virus portal. As of June 28, 2022, all available and complete MPXV nucleotide sequences were downloaded from the portal (n = 205). The sequence dataset included sequences from over 20 countries, during several outbreaks. This included the earliest MPXV sequence in 1965 from Rotterdam, Netherlands, and several sequences from the 2022 outbreak (n = 122).

2.2. Sequence processing and analysis

Using Python, an in-house bioinformatics pipeline was developed to rapidly process and align MPXV sequences [19]. All sequences were compared to a monkeypox virus sequence that was isolated at a Rotterdam zoo in 1965 (referred to as the 1965 isolate hereafter) (NCBI: KJ642614.1). The sequences were cropped to the gene in focus using coordinates provided by NCBI and translated to protein sequences automatically through Expasy [20]. After translation, each sequence was compared with the reference protein sequence using the Biopython library [21]. Comparisons were deposited into a data file for further interpretation, and the process was applied to each of the genes studied (F8L, A22R, E5R, E4R, G9R). The pipeline code is available at GitHub (https://github.com/bluesk1/RapidSequenceAnalysis-MPXV). Using the resultant data files from the pipeline, mutations in the 2001 (NC_003310.1) and 2018 (NC_063383.1) sequences were mapped. For the 2022 outbreak, the most prevalent mutations (i.e., present in ≥50% of the 2022 sequences) were included.
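The comparison step can be illustrated with a minimal sketch. This is not the authors' pipeline code (that is in the repository linked above). The function name `call_substitutions` and the toy sequences are hypothetical, and the sketch assumes that the query and reference protein sequences are already aligned to the same length, with the reference free of gaps so that alignment columns equal reference residue numbers.

```python
def call_substitutions(reference: str, query: str) -> list[str]:
    """Return substitutions such as 'L108F' using 1-based positions.

    Assumption: both protein sequences are pre-aligned and of equal
    length. Columns containing a gap ('-') or an unknown residue ('X')
    are skipped rather than reported as substitutions.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for position, (ref_aa, qry_aa) in enumerate(zip(reference, query), start=1):
        if ref_aa == qry_aa:
            continue
        if ref_aa in "-X" or qry_aa in "-X":
            continue
        substitutions.append(f"{ref_aa}{position}{qry_aa}")
    return substitutions


# Toy sequences for illustration only (not real F8L fragments).
toy_reference = "MKLVDS"
toy_query = "MKFVDN"
print(call_substitutions(toy_reference, toy_query))  # ['L3F', 'S6N']
```

Reporting each difference as reference residue, position, and observed residue matches the notation used for mutations throughout this paper (for example, L108F).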

Sequences with collection dates in our dataset were divided into four time periods: 1965–1999 (n = 11), 2000–2009 (n = 26), 2010–2020 (n = 17), and 2021–2022 (n = 126). The sequences from each time period were independently processed through the above-mentioned pipeline to identify mutation prevalence. An in-house circos configuration script (available upon request) was used to generate the circos representation of mutations within each time period.
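A minimal sketch of how per-period prevalence could be computed is given here. It is an illustration under stated assumptions, not the authors' implementation: the function names, the record layout (collection year paired with a set of substitutions), and the toy records are all hypothetical. Only the period boundaries and the ≥50% cutoff come from the text.

```python
from collections import Counter, defaultdict

# Period boundaries follow the four time periods described in the text.
PERIODS = [(1965, 1999), (2000, 2009), (2010, 2020), (2021, 2022)]


def period_label(year: int) -> str:
    """Map a collection year to its time-period label."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year} is outside the studied periods")


def mutation_prevalence(records):
    """Return {period: {substitution: fraction of sequences carrying it}}.

    records: iterable of (collection_year, set_of_substitutions).
    """
    totals = Counter()
    counts = defaultdict(Counter)
    for year, substitutions in records:
        label = period_label(year)
        totals[label] += 1
        for substitution in set(substitutions):
            counts[label][substitution] += 1
    return {
        label: {m: n / totals[label] for m, n in counts[label].items()}
        for label in totals
    }


# Hypothetical toy records, for illustration only (not the study data).
toy_records = [
    (2022, {"L108F", "W411L"}),
    (2022, {"L108F"}),
    (2018, {"W411L"}),
    (1965, set()),
]
prevalence = mutation_prevalence(toy_records)
# Keep substitutions present in at least 50% of the 2021-2022 sequences.
prevalent = sorted(m for m, f in prevalence["2021-2022"].items() if f >= 0.5)
print(prevalent)  # ['L108F', 'W411L']
```

Counting each substitution at most once per sequence keeps the fraction interpretable as the share of isolates carrying that mutation within a period.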

2.3. Construction of MPXV replication complex

A low-resolution structure of VACV DNA replication machinery consisting of E9, A20, D4, and D5 has been reported [22]. A PCNA ortholog in VACV has also been identified [23]. Since PCNA is an integral part of the eukaryotic DNA replication machinery, the PCNA ortholog in MPXV (G9R) is likely a part of the MPXV replication fork. Therefore, G9R was added as a fifth component of the MPXV replication machinery, and a minimal MPXV replication fork was constructed in multiple steps as detailed below.

2.3.1. Modeling of the structures of replication complex components

Due to high homology between VACV E9 and MPXV F8L, a homology-derived structure of F8L was constructed by Modeller software [24] using the crystal structure of VACV E9 DNA pol (PDB entry 5N2E) [12] as a template. Similarly, the structure of MPXV E4R in complex with A22R (1-50) was modeled using the crystal structure of the VACV D4/A20 N-terminal (1-50) complex [25]. The structure of the A22R homolog in VACV (A20) is only partially solved. Therefore, we used AlphaFold [18], a state-of-the-art deep-learning molecular modeling program, to generate a molecular model of A22R. To generate a trimeric structure of G9R, we used ColabFold [26], which is based on AlphaFold. Since a high-resolution structure of VACV helicase is not known, and AlphaFold is computationally expensive for generation of a hexamer of VACV/MPXV helicase, we omitted MPXV E5R in our analyses.

2.3.2. Assembly of the RC

To assemble a reliable RC, we first superposed the VACV E9-insert 3 peptide fused to the C-terminal domain of VACV A20 (PDB entry 6ZXP) [27] onto the helix of insert 3 in the modeled structure of MPXV F8L. Next, AlphaFold-generated A22R was superposed onto the A20/E9-insert 3 peptide (PDB entry 6ZXP). Following this step, the crystal structure of VACV D4 in complex with the A20 N-terminus (PDB entry 4OD8) [25] was superposed on A22R. To this structure, the molecular model of MPXV E4R in complex with the N-terminus of A22R was superposed. Thus, we obtained a complex consisting of F8L, A22R and E4R. To obtain the position of TP and G9R, we superposed the palm subdomain of the S. cerevisiae pol δ holoenzyme consisting of TP, accessory subunits (Pol31 and Pol32), and the PCNA clamp (PDB entry 7KC0) [13] onto the palm subdomain of modeled MPXV F8L. We then superposed the modeled structure of G9R on the PCNA structure. This superposition provided an approximate position of G9R in the MPXV RC. Minor adjustments were made to relieve bad contacts of G9R with A22R. A similar guide for the helicase position in VACV, and therefore in MPXV, was not available. Hence, MPXV E5R was not included in the MPXV RC.

3. Results

3.1. Mutations in the MPXV RC

We conducted a temporal analysis of available MPXV sequences (n = 204) and identified mutations within MPXV RC components F8L, E5R, A22R, E4R, and G9R (see below) with respect to the 1965 isolate (Table 1), which was considered here as a reference isolate. A total of 10 residue positions at which mutations emerged in different time periods were identified (Table 1). Four mutations (2 in F8L and 2 in G9R) were most prevalent in the 2022 isolates. F8L mutation L108F emerged in 2022, whereas mutation W411L emerged in 2018 and persisted in 2022 isolates. G9R mutations S30L and D88N emerged in 2022 isolates. Six mutations that appeared in the 2001 outbreak reverted to the 1965 sequence, and revertant mutations persisted in 2022 isolates (Table 1, Fig. 1)….
