Genome Landscapes: A Window into the Evolution of Human Viruses

Student thesis: Doctor of Sciences


The genomic sequences of human viruses are the product of long-term host-virus coevolution. Exploring the nucleotide composition of these genomes offers an opportunity to unravel the dynamics of their evolution, which can be shaped by the cellular environment, for instance by the APOBEC3 innate immune effectors.
The APOBEC3 cytidine deaminase family plays a crucial role in the human innate immune system by restricting the life cycle of viruses through viral genome editing at 5'-TC-3' dinucleotides. The resulting selective pressure can be observed in the genomes of human viruses: a constant but incomplete restriction by APOBEC3 leads to an underrepresentation of the 5'-TC-3' target site. To identify the viral species targeted by APOBEC3, we searched for the APOBEC3 footprint (i.e. TC depletion) among 33,500 human virus sequences. This extensive investigation revealed that at least 22% of the tested human virus species are targeted by APOBEC3. Importantly, we observed a strong APOBEC3 footprint in a wide range of virus species, including ssDNA, dsDNA, ssRNA+, and reverse-transcribing viruses. Additionally, by exploring the footprint at the gene level, we found previously unreported APOBEC3-mediated editing in the herpesvirus EBV (Epstein-Barr virus) and in mastadenoviruses. This investigation highlights the significant evolutionary constraints that innate immune factors impose on the genomes of numerous human viruses. It provides a comprehensive view of the broad range of action of APOBEC3 and of its importance in shaping viral evolution.
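As an illustration of what a TC-depletion footprint can look like in practice, the following minimal sketch computes an observed/expected ratio for a dinucleotide. It assumes a simple null model in which neighbouring bases are independent; the abstract does not state which statistic the thesis actually uses, and the function name `dinucleotide_obs_exp` is hypothetical.

```python
from collections import Counter


def dinucleotide_obs_exp(seq: str, dinuc: str = "TC") -> float:
    """Observed/expected ratio of a dinucleotide in one sequence.

    Assumption (not from the thesis): the expected count is derived from
    single-base frequencies, treating adjacent bases as independent.
    A ratio below 1 indicates depletion of the dinucleotide.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA genomes as DNA
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    base_counts = Counter(seq)
    # Count overlapping occurrences over the n - 1 dinucleotide positions.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n) * (n - 1)
    if expected == 0:
        return float("nan")  # ratio undefined when a base is absent
    return observed / expected
```

For example, `dinucleotide_obs_exp("TTTTCCCC")` gives a ratio below 1 (TC occurs once where 1.75 occurrences are expected), whereas `dinucleotide_obs_exp("TCTCTCTC")` gives a ratio above 1. A real analysis would also need to account for codon usage and sequence sampling, which this sketch ignores.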
APOBEC3 editing is one of many mutational processes shaping virus evolution. We next set out to identify the other mutational processes using an approach free of a priori assumptions. Although this task has not yet been completed, we laid the foundations for such an analysis. In practice, we reconstructed 487 phylogenetic trees from 55 viral species spanning 23 families and all 7 Baltimore groups. Ancestral sequences were predicted for each node of the phylogenetic trees. By systematically comparing each sequence to its ancestor, we generated a collection of over 2.4 million substitutions. For each of the 12 substitution types, the immediate 5' and 3' bases were taken into account, dividing the substitution types into 192 subclasses (12 types × 4 possible 5' bases × 4 possible 3' bases), which together form the so-called substitution landscape. For most viruses, we observed a high degree of symmetry within the substitution landscapes, where each substitution class appears to be canceled out by its opposite (e.g. C>T substitutions are as frequent as T>C substitutions). Recent zoonotic viruses, such as MERS-CoV, display an asymmetric landscape, suggesting that their sequences have not yet reached equilibrium. We also observed that a significant proportion of the substitutions are back-and-forth, i.e. a mutation followed by its reversion at a later time point along the same branch. Surprisingly, the sole feature that distinguishes back-and-forth substitutions from the others (called uncompensated substitutions) is their substitution rate, which is higher for back-and-forth substitutions. We propose that reversion is a frequent phenomenon in viral history and may contribute to long-term viral sequence stability.
Taken together, these investigations contribute to a better understanding of the forces driving virus evolution and pave the way for the identification of yet unknown mutational processes.
Date of award: 7 Jul 2023
Original language: English
Awarding institution
  • Universite de Namur
Supervisor: Nicolas Gillet (Promoter), Benoit Muylkens (President), Simon Dellicour (Jury), Jean-François Flot (Jury), Michael Herfs (Jury) & Philippe Lemey (Jury)
