Abstract
RNA genomes are vulnerable to corruption by a range of activities, including inaccurate replication by the error-prone replicase, damage from environmental factors, and attack by nucleases and other RNA-modifying enzymes that comprise the cellular intrinsic or innate immune response. Damage to coding regions and loss of critical cis-acting signals inevitably impair genome fitness; as a consequence, RNA viruses have evolved a variety of mechanisms to protect their genome integrity. These include mechanisms to promote replicase fidelity, recombination activities that allow exchange of sequences between different RNA templates, and mechanisms to repair the genome termini. In this article, we review examples of these processes from a range of RNA viruses to showcase the diverse approaches that viruses have evolved to maintain their genome sequence integrity, focusing first on mechanisms that viruses use to protect their entire genome, and then concentrating on mechanisms that allow protection of the genome termini, which are especially vulnerable. In addition, we discuss examples in which it might be beneficial for a virus to ‘lose’ its genomic termini and reduce its replication efficiency.
Maintenance mechanisms to protect the entire virus genome
RNA virus genome replication is performed by the viral replicase complex, which for most viruses is likely to be an assembly of multiple viral and cellular proteins. The catalytic subunit of this complex is referred to as the RNA-dependent RNA polymerase (RdRp). RdRps are notoriously error-prone and so RNA virus genomes are subject to potentially catastrophic alteration from their own error-prone RNA-synthesis machinery. In addition, some non-segmented, negative-strand RNA viruses might be susceptible to insertions within the coding region of the genome as a consequence of their editing programme. Viral genomes may also be susceptible to physical or chemical damage by environmental factors (Fig. 1). Below, we describe examples of mechanisms that viruses have evolved to overcome these insults.
Fig. 1. Summary of the mechanisms involved in genome maintenance and repair. Mutations (indicated by red lightning symbols) can occur either internally on a viral RNA genome or at the extreme termini. The figure shows mechanisms by which mutations at these two regions on the genome could occur and possible mechanisms by which the different types of mutation could be repaired. Details are described in the text.
RdRp proofreading
The error rate of the viral RdRp is estimated to be between 1.5×10⁻³ bp⁻¹ (Qβ phage) and 7.2×10⁻⁵ bp⁻¹ (influenza virus) (Drake, 1993). This relaxed polymerase fidelity is an important facet of RNA virus biology, providing a source of sequence diversity that can allow virus quasispecies to form, enabling the virus to adapt successfully to changing environments. Indeed, it has been shown that a virus carrying a mutant RdRp with increased fidelity, whilst able to replicate efficiently in cell culture, was unable to replicate efficiently in the complex environment of an animal host because of its inability to yield virus variants (Vignuzzi et al., 2006). However, polymerase error can also lead to the generation of non-viable templates that reduce overall viral fitness. The term ‘error catastrophe’ has been coined to describe the outcome of an RdRp error rate at which too many non-viable templates are generated and a virus population becomes unsustainable, and it is believed that many viral RdRps operate close to this threshold (Crotty & Andino, 2002). Conventional wisdom holds that the RdRp error rate constrains viral RNA genome sizes to a relatively low upper limit. However, there is a significant range in RNA virus genome length, with the filoviruses and some paramyxoviruses having genome lengths of 18–19 kb, more than 7-fold longer than the genomes of the smallest RNA viruses, and the coronaviruses being even larger, with maximum genome sizes between 27 and 32 kb. Viruses with longer genome lengths probably require RdRps with inherently greater fidelity, which could be conferred passively by RdRp structure, with other factors such as genome structure also playing a role (Castro et al., 2005; Duffy et al., 2008).
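The link between error rate and genome length can be made concrete with a short calculation. The Python sketch below is illustrative only: it assumes that errors occur independently and at a uniform rate at every position, which is a simplification and not a model taken from the studies cited above. The two error rates and the two genome lengths are the figures quoted in the text; pairing a rate measured for one virus with the genome length of another is purely hypothetical, and the function names are our own.

```python
# Per-nucleotide error rates quoted in the text (Drake, 1993).
ERROR_RATES = {"Q-beta phage": 1.5e-3, "influenza virus": 7.2e-5}

# Upper genome lengths quoted in the text (nt): filoviruses and coronaviruses.
GENOME_LENGTHS = {"19 kb genome": 19_000, "32 kb genome": 32_000}


def expected_errors(rate, length):
    """Mean number of misincorporations per genome copy (rate x length)."""
    return rate * length


def error_free_fraction(rate, length):
    """Fraction of copies with no error, assuming independent, uniform errors."""
    return (1.0 - rate) ** length


for virus, rate in ERROR_RATES.items():
    for label, length in GENOME_LENGTHS.items():
        print(
            f"{virus} rate on a {label}: "
            f"{expected_errors(rate, length):.2f} errors per copy, "
            f"{error_free_fraction(rate, length):.2e} error-free"
        )
```

Under these assumptions, even the lower of the two quoted rates would leave only about one copy in ten of a 32 kb genome free of error, which illustrates why larger genomes are thought to need greater fidelity.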
Interestingly, recent evidence suggests that coronaviruses might also have an active mechanism to promote fidelity. The nsp14 protein of these viruses possesses sequence motifs that are analogous to the active-site motifs of cellular exonuclease enzymes, and the nsp14 protein of severe acute respiratory syndrome (SARS)-related coronavirus has been shown to have 3′→5′ exonuclease activity in vitro (Chen et al., 2007; Minskaia et al., 2006). Evidence that this protein functions as a proofreading enzyme comes from the observation that mutant viruses with a defective nsp14 protein are at least 15-fold more error-prone than wild-type virus (Eckerle et al., 2007). These data suggest that nsp14 acts as an exonuclease to remove and repair misincorporated nucleotides, thus helping to maintain the sequence integrity of the large coronavirus RNA genomes.
RNA editing and the ‘rule of six’
Some non-segmented, negative-stranded RNA viruses possess a specialized ‘editing’ sequence that can allow the RdRp to sometimes add non-templated nucleotides during mRNA transcription. Because the RdRp sometimes copies the template faithfully and sometimes adds non-templated nucleotides, this results in generation of qualitatively distinct transcripts from the same gene (Cattaneo et al., 1989; Galinski et al., 1992; Sanchez et al., 1996; Thomas et al., 1988; Vidal et al., 1990). This mechanism allows the expression of multiple polypeptides from a single transcriptional unit and is essential for the virus to express its full complement of proteins. However, if non-templated additions were to occur during genome replication, this could lead to accumulation of defective genomes, which could be lethal for the virus. Interestingly, most (but not all) viruses that have an RNA-editing function share a strict requirement for their genome nucleotide length to be divisible by six, a phenomenon known as the ‘rule of six’ (Calain & Roux, 1993; Kolakofsky et al., 1998). In these viruses, the viral nucleocapsid protein (N) encapsidates the replicative RNA in a 5′→3′ direction as the RNA is being synthesized and remains associated with the genomic and antigenomic RNA throughout the virus replication cycle. The ‘rule of six’ is thought to reflect the stoichiometry of the RNA nucleotides to the N protein (Egelman et al., 1989). The viral RdRp recognizes RNA sequences within the context of the nucleocapsid structure, and the promoter sequences must be phased correctly in relation to each N protein monomer for efficient virus replication to occur (Vulliemoz & Roux, 2001). These observations led to a model proposed by Kolakofsky et al. (2005): if non-templated insertions were to occur during genome replication, generating genomes of non-hexameric length, the consequent disruption of N–RNA phase in the promoter region would preclude these mutant replication products from acting as templates for further rounds of replication (Fig. 2). If this model is correct, then the ‘rule of six’ is a non-catalytic ‘screening’ mechanism to eliminate genomes containing non-templated insertions from the genome pool and to help maintain genomic integrity in the virus population.
Fig. 2. The ‘rule of six’ and genome replication in the subfamily Paramyxovirinae. (a) Accurate genome replication yields an antigenome RNA that is the exact complement of the genome, which places the RNA promoter sequence at the 3′ end of the antigenome in the correct context relative to the associated N protein (red ovals). This allows the RdRp (grey) to bind specific nucleotides in the promoter (in upper case) to initiate genome RNA synthesis. (b) If the RdRp stutters at the transcriptional editing site during genome replication, it will insert non-templated residues in the RNA product (in this example, 1 nt). This results in an antigenome in which the RNA sequence is displaced by 1 nt relative to the N protein, thus preventing the RdRp from recognizing the promoter efficiently.
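The screening logic of this model can be expressed as a simple phase calculation. The sketch below is a deliberately reduced illustration, not an implementation of the published model: it assumes that each N monomer covers exactly 6 nt, that encapsidation starts at the 5′ end, and that any loss of phase at the 3′ promoter fully blocks further replication (the text says only that promoter recognition becomes inefficient). The genome length of 15 000 nt is a hypothetical value chosen because it is divisible by six, and the function names are our own.

```python
N_FOOTPRINT = 6  # nucleotides assumed to be covered by one N protein monomer


def promoter_phase_shift(inserted_nt):
    """Displacement (0-5 nt) of the 3' promoter relative to the N monomers.

    Encapsidation proceeds 5'->3', so every nucleotide inserted upstream
    pushes the 3'-terminal promoter out of register by one position.
    """
    return inserted_nt % N_FOOTPRINT


def can_template_next_round(genome_length, inserted_nt=0):
    """True if the replication product is still of hexameric length."""
    return (genome_length + inserted_nt) % N_FOOTPRINT == 0


HYPOTHETICAL_GENOME_LENGTH = 15_000  # illustrative only; 15 000 = 6 x 2500

for extra in (0, 1, 2, 6):
    print(
        f"{extra} nt inserted: phase shift {promoter_phase_shift(extra)}, "
        f"usable as template: "
        f"{can_template_next_round(HYPOTHETICAL_GENOME_LENGTH, extra)}"
    )
```

In this simplified form, a 1 nt insertion at the editing site yields a product that is screened out, whereas an insertion of exactly 6 nt would escape the screen because the promoter phase is preserved.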
Repair of alkylation damage to genomic RNA
Virus genomes may also be susceptible to chemical modification. One such modification is alkylation, the addition of carbon chains to N and O atoms in RNA bases, in the form of either methylation (addition of a single carbon) or addition of longer carbon chains. Alkylation could inhibit genome expression and replication and might cause base-mispairing to occur, resulting in an increased mutation rate. Alkylating agents can be found in the environment and inside cells, and prokaryotes and eukaryotes have evolved mechanisms to protect their genomic DNA from their effects. One of these mechanisms involves a protein called AlkB, which has homologues in all multicellular organisms, as well as in some bacteria and fungi, and which has been shown to repair damage to RNA, in addition to DNA (Aas et al., 2003). Bioinformatic analysis revealed that a number of different positive-strand RNA plant viruses contain an AlkB domain, and functional studies have shown that some of these domains can repair RNA damage by oxidative demethylation (Bratlie & Drablos, 2005; van den Born et al., 2008). These findings suggest that some viruses have acquired this protective mechanism and that it has a role in the virus replication cycle. Interestingly, the presence of the AlkB domains does not correlate with specific virus taxonomic lineages, suggesting that the gene was acquired relatively recently, and it has been suggested that it exists within viruses that are highly exposed to alkylating environments, such as pesticides and/or the phloem of woody or perennial plants (Bratlie & Drablos, 2005; van den Born et al., 2008).
Recombination in RNA viruses
RNA recombination is a property of positive- and negative-stranded RNA viruses that infect plant (Bujarski & Kaesberg, 1986), animal (Cooper et al., 1974) and bacterial (Munishkin et al., 1988) hosts. Recombination in a viral context generally occurs when a replicating RdRp stops copying one RNA strand and transfers to another (Cooper et al., 1974). When the transfer sites share common sequences, the recombination event is said to be homologous; conversely, when the sites are unrelated, the recombination event is non-homologous. Recent evidence suggests that recombination of the tripartite, positive-stranded RNA virus brome mosaic virus (BMV) is exceedingly common. Analysis of the progeny of co-infections with marked genomes gave observed and calculated BMV recombination frequencies which implied that each replicating RNA molecule recombined at least once during the amplification of its lineage, and thus that few viral progeny genomes contained nucleotides copied from a single parental template (Urbanowicz et al., 2005).
In addition to being a source of sequence diversity, RNA recombination has been shown to be a mechanism that also allows genome repair, a property that was recognized early on in studies of virus evolution, and was predicted to allow RNA viruses to repair defective or weakened genomes in the face of intense selective pressure. Recombination events could allow restoration of genomes damaged by a number of insults, including RdRp error, chemical modification and damage from UV light (Fig. 1). Genome repair by recombination has been reported for several viruses and can be extremely efficient. Bacteriophage MS2 possesses a critical secondary-structure element at an internal location within the genome, between the maturation and coat protein genes, that is involved in regulating coat protein expression. Disruption of this structural element by deleting 19 nt, including the Shine–Dalgarno sequence, reduced virus titres by 10 orders of magnitude, demonstrating the critical nature of this motif. However, despite the removal of such a large sequence, revertants capable of amplifying to high virus titres arose within overnight cultures, and analysis of the revertant MS2 RNA sequence showed that the structural element had been repaired and the Shine–Dalgarno sequence restored. Sequence analysis of the revertant suggested that repair was mediated through recruitment of additional nucleotides through a non-homologous recombination event (Olsthoorn & van Duin, 1996). Additional examples of internal sequence repair are the repair of 87 missing nucleotides of the mouse hepatitis virus nucleocapsid gene (Koetzner et al., 1992) and the repair of missing nucleotides of the bacteriophage Qβ replicase gene (Palasingam & Shaklee, 1992). 
In both instances, the missing nucleotides were supplied through homologous recombination from RNA templates supplied in trans and, whilst these systems may not reproduce the conditions of natural infections, they clearly highlight the capability of the respective viral RdRps to perform recombination-mediated repair given the presence of suitable donor and acceptor templates. The RdRp of the positive-stranded Sindbis virus is also thought to be capable of both homologous and non-homologous recombination, as demonstrated by the production of replication-competent genomes from donor and acceptor templates that were individually replication-incompetent (Raju et al., 1995). In a similar way, homologous recombination of two partially deleted RNA3 variants by the cowpea chlorotic mottle virus RdRp was shown to be responsible for rapid and frequent generation of an intact RNA3 sequence (Allison et al., 1990). It is also worth noting that recombination can occur very close to the end of templates to repair terminal sequences. An experiment with mutated bunyavirus templates showed that its RdRp was capable of correcting promoter deletions of 2 or 15 nt. This repair was dependent on the presence of an intact promoter supplied in tandem, which was lost during the repair process. These findings suggested that the bunyavirus RdRp could initiate on the external intact promoter and then ‘jump’ to the damaged internal promoter sequence, thus generating a corrected product (Walter & Barr, 2010). Whilst this is an artificial situation, it suggests that the bunyavirus RdRp would have the capability to repair damaged termini, provided that there was a source of intact templates for it to initiate on. Taken together, these experiments serve to demonstrate the promiscuous nature of some viral RdRps during genome replication and the potential benefit that might occur.
These varied examples of recombination have one aspect in common: all are mediated by the viral RdRp and the recombination event occurs co-transcriptionally during copying of the RNA template. However, recent evidence with poliovirus now indicates that recombination can occur in the absence of active polymerase. When purified poliovirus genomes split into two fragments, either with or without sequence overlap, were introduced into cells, intact genomes were recovered, despite neither fragment possessing an intact polymerase-encoding region. Although the mechanism responsible remains elusive, the absence of viral polymerase involvement strongly implies a role for host-cell components (Gmyl et al., 1999, 2003). Whether this non-replicative recombination acts during the course of natural infections is unknown. However, this mechanism possesses the potential not only to repair truncated genomes, but also to act as a potent source of sequence variation through combining both host and viral RNA sequences.
It should be noted that, whilst some RNA viruses demonstrate promiscuous recombination, in others it is a rare event. Template jumping or switching by the RdRp of non-segmented, negative-strand RNA viruses can occur, as demonstrated by the production of defective interfering particles during high-titre virus amplification (Huang & Baltimore, 1970; Lazzarini et al., 1981). There are also examples of sequence duplications arising in circulating viruses, consistent with the idea that the RdRp can ‘jump’ either within or between templates (McClure et al., 1992; Trento et al., 2003). In addition, the observation of significant phylogenetic incongruence in the genes of Ebola, mumps, Hantaan and Newcastle disease viruses also lends support to the idea that recombination may play a role in negative-strand RNA virus evolution (Chare et al., 2003; Han et al., 2008a; Wittmann et al., 2007). However, there are very few examples of recombination-mediated repair for this group of viruses. An experiment aimed specifically at testing the capacity for repair was performed with respiratory syncytial virus. Despite experimental conditions that were optimized deliberately to detect recombination, only one functional recombinant virus was generated in six co-infection experiments (Spann et al., 2003). Similarly, sequence analysis of circulating strains of influenza virus, a segmented, negative-strand RNA virus, demonstrated that intragenic recombination occurs only rarely (Boni et al., 2008; Han et al., 2008b). The infrequency of intragenic recombination in the negative-strand RNA viruses could be a consequence of their genome template structure, in which the RNA remains encapsidated throughout the replication cycle; however, there are examples of positive-strand RNA viruses that also demonstrate negligible levels of recombination (Taucher et al., 2010). 
Thus, efficient RNA recombination might be an activity that a subset of viruses has evolved that provides them with the opportunity for genetic variation and genomic repair.
Mechanisms to protect and restore the genome termini
The terminal sequences of RNA virus genomes are particularly vulnerable to deletion or degradation. RNA viruses have evolved sophisticated mechanisms to avoid truncation of terminal sequences during replication initiation and termination (Poranen et al., 2008; Tayon et al., 2001; van Dijk et al., 2004). However, despite these measures, the termini of virus genomes remain susceptible to host cell-mediated RNA degradation, either through specific antiviral host proteins (Bick et al., 2003; Silverman, 2007) or through components of the normal host-cell RNA-biosynthesis machinery (reviewed by Houseley & Tollervey, 2009). Virus genomes possess various adaptations that help them to evade these nuclease activities. For example, some positive-strand RNA virus genomes have a covalently bound cap, poly(A) tail and/or protein at their termini, and the genomes of negative-strand RNA viruses are sequestered within a helical nucleocapsid throughout the infection cycle. In addition, many RNA viruses have been shown to replicate in membranous compartments that might offer protection from nucleases (Mackenzie, 2005). As a second line of defence, RNA viruses have evolved means to repair their genome termini in the event that they are damaged (Fig. 1). Examples of terminal-repair mechanisms span numerous virus taxonomic lineages (Table 1), indicating that possession of a terminal-repair activity is a fundamental element of RNA virus molecular biology. The sections below describe examples of these repair mechanisms.
Table 1. Examples of terminal repair in RNA viruses, with suggestions of possible repair mechanisms
Repair as a consequence of the initiation process
Because virus RNA genomes are linear, they require replication mechanisms that allow either initiation opposite the 3′-terminal nucleotide of the template or recreation of terminal nucleotides during the initiation process. RNA viruses utilize a variety of strategies to achieve this, including protein-primed initiation, prime/realign mechanisms (in which the RdRp initiates internally on the template and then realigns the nascent RNA and utilizes it as a primer), and RdRps structured to ensure initiation opposite the 3′ terminus of a single-stranded template with a strong preference for the correct initiating NTP (reviewed by van Dijk et al., 2004). In some cases, these highly coordinated mechanisms not only facilitate accurate replication, but might also offer the possibility of repairing termini lacking a small number of nucleotides. For example, for several viruses, such as the picornaviruses coxsackie B3 virus (CB3V) and poliovirus and the potyviruses potato virus Y and plum pox virus, it has been shown that small deletions of 1 or 2 nt from genome termini are restored to wild-type sequence (Harmon et al., 1991; Jakab et al., 1997; Klump et al., 1990; Simon-Buela et al., 2000). Whilst the mechanism(s) for repair in these cases has not been established and could involve one of the mechanisms described in the sections below, the small size of the deletions could potentially allow repair as a consequence of replication initiation. For example, picornavirus RNA replication begins with uridylylation of a viral protein, VPg, to create a molecule VPg–pU–pU, which acts as a primer for RNA synthesis initiation. Typically, the primer anneals to adenylate residues at the 3′ terminus of the template; however, there are multiple factors that are important for positioning the replicase complex (Liu et al., 2009). Thus, it is possible that the primer can still function to initiate RNA replication, even if it cannot base-pair with the template. 
This would restore the missing nucleotides and allow rapid amplification of the repaired genome. A virus that initiates using a prime/realign mechanism has also been shown to be able to repair the terminus of its replication product. Alteration or deletion of the extreme 3′ residue of an influenza virus template resulted in subsequent restoration of the wild-type nucleotide. Evidence suggests that, in this case, the influenza virus RdRp initiates internally at position 4 and generates a dinucleotide primer complementary to positions 4 and 5, and then the RdRp and primer realign to recreate an intact 5′ end (see Table 1 for sequence information) (Deng et al., 2006).
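The prime/realign step described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a reproduction of the experiments of Deng et al. (2006): the 10 nt template is a hypothetical sequence chosen so that positions 1–2 and 4–5 are identical (which is what allows the realigned primer to stand in for the terminal residues), and the function names are our own. Templates are written 3′→5′ with position 1 first, and products are written 5′→3′.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(rna):
    """Base-by-base complement; reading direction is handled by the caller."""
    return "".join(COMPLEMENT[base] for base in rna)


def prime_realign_product(template_3to5):
    """Product (5'->3') of internal initiation followed by realignment.

    The dinucleotide primer is copied from template positions 4 and 5,
    then repositioned opposite positions 1 and 2; elongation resumes at
    position 3. Positions 1 and 2 of the template are never read.
    """
    primer = complement(template_3to5[3:5])
    return primer + complement(template_3to5[2:])


# Hypothetical toy templates, written 3'->5' ('-' marks a deleted residue).
INTACT = "UCAUCUUUGU"
ALTERED = "GCAUCUUUGU"   # position 1 mutated
DELETED = "-CAUCUUUGU"   # position 1 missing

for name, template in (("intact", INTACT), ("altered", ALTERED), ("deleted", DELETED)):
    print(name, prime_realign_product(template))
```

Because the primer is templated internally, all three toy templates give the same product, which shows how a change at the extreme 3′ residue need not be propagated.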
Primer-mediated repair using abortive transcripts
One means that the viral RdRp can use to repair damaged termini is to use genetic material derived from intact viral termini or from the termini of associated satellite RNAs. An example of this type of repair mechanism is primer-mediated repair. Initiation of RNA synthesis is a complex process involving several distinct stages, including a stage in which the RdRp transitions from an initiating to an elongating complex. If the RdRp fails to enter the elongation mode successfully, this results in the production of a short oligonucleotide abortive transcript, of approximately 3–12 nt, in a process known as abortive cycling (Carpousis & Gralla, 1980). Abortive initiations are a frequent event in normal RNA synthesis, with abortive transcripts reaching 10- to 100-fold molar excesses over full-length transcripts in some cases (Nagy et al., 1997). Primer-mediated repair involves initiating RNA replication using abortive transcripts generated from an intact viral RNA as primers to repair a damaged RNA.
Repair using abortive transcripts has been shown to occur for a satellite RNA of turnip crinkle virus (TCV) (Carpenter & Simon, 1996; Nagy et al., 1997). Satellite RNA C (satC) is one of several small RNAs that are associated with the TCV genome and replicated by the viral RdRp. The 3′ end of satC shares sequences with the 3′ terminus of the TCV genome that are required for efficient RNA replication, including a stable stem–loop structure followed by the sequence 5′-CUGCCC-3′. If a satC RNA lacking the terminal 6 nt of this sequence was introduced into plants together with wild-type TCV, satC was restored efficiently to its wild-type sequence. Marker mutations showed that the source of restored nucleotides was the TCV genome. The TCV RdRp was shown to generate abundant short (4–8 nt) abortive transcripts and was capable of using these in vitro to extend templates with partial terminal deletions (Fig. 3). The exact mechanism by which primer-mediated repair occurs is unclear, but it appears to involve specific interactions between the structures at the 3′ end of the RNA and the RdRp, and between the RdRp and the oligoribonucleotide primers. The stem–loop structure at the 3′ end of satC needed to be intact, indicating that the RdRp required a binding site on the truncated RNA to be able to use it as a template and mediate repair. Interestingly, no base-pairing between the 3′ end of the satC RNA and the oligoribonucleotide primer was necessary, but repair could only occur using certain oligoribonucleotide sequences (Nagy et al., 1997). These findings suggest that this repair mechanism involves binding of the TCV RdRp to the defective template RNA, recruiting oligoribonucleotides of specific sequences into its active site and using these to initiate RNA synthesis.
Fig. 3. Mechanism for TCV primer-mediated repair. (a) The TCV genome (black line) and satC RNA (red line) share common 3′-terminal sequences. In this scenario, the genome is a template for abortive transcripts, which are generated from the 3′-terminal sequences during replication initiation; the satC RNA 3′ terminus is degraded. (b) An abortive transcript generated from the intact TCV genome can be used to prime initiation from a truncated satC RNA template and be extended by the TCV RdRp to generate a restored complement of the satC RNA.
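The two steps of this scheme can be traced with a short Python sketch. It is a minimal illustration under stated assumptions and does not model RdRp binding or primer selection: the terminal motif 5′-CUGCCC-3′ is taken from the text, whereas the 12 nt upstream sequence is a hypothetical stand-in for the shared stem–loop region, and the function names are our own. All sequences are written 5′→3′.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Complementary strand, written 5'->3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def abortive_transcript(template, length):
    """Short product copied from the 3'-terminal `length` nt of a template."""
    return reverse_complement(template[-length:])


def primed_copy(truncated_template, primer):
    """Complementary strand made by extending a primer on a truncated template.

    The primer is not base-paired with the template (consistent with the
    observation that no base-pairing was necessary); it simply supplies the
    5' end of the new strand, which is then extended by copying the template.
    """
    return primer + reverse_complement(truncated_template)


TERMINAL_MOTIF = "CUGCCC"            # 3'-terminal sequence given in the text
HYPOTHETICAL_UPSTREAM = "GGACUUCGGUCC"  # illustrative stand-in for the stem-loop

genome_3_end = HYPOTHETICAL_UPSTREAM + TERMINAL_MOTIF  # intact TCV genome end
truncated_satc = HYPOTHETICAL_UPSTREAM                 # satC lacking the last 6 nt

primer = abortive_transcript(genome_3_end, len(TERMINAL_MOTIF))
complement_strand = primed_copy(truncated_satc, primer)
restored_satc = reverse_complement(complement_strand)  # next round of copying

print(primer, restored_satc)
```

In this sketch, copying the primed complement in the following round regenerates a satC 3′ end identical to that of the intact genome, in line with the marker-mutation result that the restored nucleotides derive from the TCV genome.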
These findings suggest that the strategy of utilizing a pool of abortive transcripts as primers may be a useful way to overcome the problem of terminal deletions. To our knowledge, repair of TCV-associated RNAs represents the only described example of primer-mediated template repair for an RNA virus to date, but RdRps of positive-, negative- and double-stranded RNA viruses have been shown to generate abortive transcripts (Dupuy et al., 1999; Farsetta et al., 2000; Klumpp et al., 1998; Sun & Kao, 1997) and short oligonucleotides can be used as primers to initiate RNA synthesis in several virus systems (Chen & Patton, 2000; Garcin & Kolakofsky, 1992; Honda et al., 1986; Kao & Sun, 1996; Kawakami et al., 1981; Nomaguchi et al., 2003). Interestingly, in some of these cases, it was shown that the RdRp demonstrated specificity for primers of the correct sequence or specific length (Chen & Patton, 2000; Kao & Sun, 1996; Kawakami et al., 1981). Thus, it might be a common feature of viral RdRps to have a preference for binding particular oligoribonucleotides in their substrate pocket during the initiation phase of RNA synthesis, allowing them to repair defective templates.
Non-templated polymerization by the viral replicase to generate random primers
In another series of experiments studying TCV satC RNA repair, templates with shorter deletions of 3–5 nt were examined. It was found that these deletions were also repaired to create RNA that was similar in length to the wild-type satC RNA, but, in this case, the satC terminus was not restored consistently to its wild-type sequence, but rather consisted of apparently random sequence, indicating that an alternative repair mechanism was employed (Guan & Simon, 2000). If the 3′ terminus of the deleted satC was fused to non-specific sequence, the non-specific sequence was removed and replaced with random sequence of appropriate length. This occurred in the context of an in vitro replicase reaction, suggesting that this repair occurred in a single step. Guan & Simon (2000) provide a model for these results in which the TCV RdRp generates a small random RNA independently of the template, then uses this RNA as a primer to initiate RNA synthesis at the 3′ end of the satC RNA sequence, positioning the primer using the stem–loop structure near the 3′ end of the satC RNA.
If the model proposed by Guan & Simon (2000) is correct, it raises the question of how the replicase generates RNA primers if there is no viable promoter sequence for it to initiate on. A possible explanation comes from studies with the replicase of the RNA phage Qβ. Some studies have suggested that the Qβ replicase is capable of generating RNA de novo in the absence of a template (Biebricher et al., 1986; Sumper & Luce, 1975). Other studies have suggested that the Qβ replicase is not truly template-independent, but rather can readily utilize any RNA as a template, even if the RNA is only present at low levels as a contaminant, and that, whilst the Qβ replicase can initiate RNA synthesis randomly, it does not enter a stable elongation mode if initiation occurs in the absence of the appropriate promoter (Chetverin et al., 1991; Hill & Blumenthal, 1983; Ugarov et al., 2003). Thus, an explanation for the TCV-mediated repair is that, in the absence of an appropriate promoter sequence, the viral replicase generates RNA in a template-independent manner or initiates randomly on any RNA available, thus generating a pool of random oligonucleotides, some of which could fit appropriately with the RdRp to be utilized as primers to initiate replication, similarly to the mechanism described above. Once the RdRp has generated a ‘patched’ RNA that is able to support even a low level of replication, the RNA would have the opportunity to evolve towards a wild-type sequence over multiple replication cycles.
TCV satC RNA is not the only example of an RNA virus (or virus-associated RNA) acquiring heterologous sequence at its termini. Evidence for additional non-templated residues has been found for the positive-strand RNA viruses cucumber mosaic virus (Burgyan & Garcia-Arenal, 1998) and dengue virus (Teramoto et al., 2008), and the negative-strand RNA viruses Borna disease virus (BDV), lymphocytic choriomeningitis virus (LCMV) and Hantaan hantavirus (Meyer & Schmaljohn, 2000; Meyer & Southern, 1994; Schneider et al., 2005). Although the mechanism by which the additional nucleotides were added to these virus genomes is not known and could involve a terminal transferase activity, such as that described below, the random nature of the additional sequence is consistent with the non-templated initiation mechanism described for TCV.
Terminal transferase and poly(A) tail repair mechanisms
Many positive-stranded RNA viruses possess a 3′ poly(A) tract, which serves roles in translation, RNA stability and genome replication. These poly(A) tails are heterogeneous, but must be of a minimum length to support efficient virus replication, and there are data that point to the existence of a viral mechanism that restores adenylates that are lost from the 3′ end of the poly(A) tail during normal genome replication (van Ooij et al., 2006). There is also evidence that completely deleted poly(A) tails can be repaired. For example, an engineered clone of cowpea mosaic virus regained a deleted poly(A) tail in vivo (Eggen et al., 1989) and similar examples exist for a number of other viruses (Chen & Frey, 1999; Guilford et al., 1991; Hill et al., 1997; Kusov et al., 2005; Riechmann et al., 1990; Tacahashi & Uyeda, 1999). It is possible that the poly(A) tails are restored by a cellular poly(A) polymerase activity and, in some cases, there is evidence for this: white clover mosaic virus can recover a deleted poly(A) tail, but this is dependent on an AAUAAA motif, which is a signal sequence for cellular poly(A) polymerase (Fitzgerald & Shenk, 1981; Guilford et al., 1991). Alternatively, some viruses have been shown to possess virus-encoded terminal adenyltransferase activity that enables them to transfer poly(A) sequences to the 3′ end of their templates (Neufeld et al., 1994; Tomar et al., 2006).
Interestingly, some flaviviruses that do not possess poly(A) tails have also been shown to encode terminal transferase activity (Behrens et al., 1996; Ranjith-Kumar et al., 2001). In vitro studies of hepatitis C virus terminal transferase activity showed that, in this case, the specificity of the added nucleotide is guided by the 3′-terminal sequence of the template RNA, and the terminal transferase activity allows the RdRp to add a terminal cytidylate residue to a mutated 3′ terminus and restore template function (Ranjith-Kumar et al., 2001). Indeed, terminal transferase activity is apparently widespread, having been reported for other positive-strand RNA viruses, a double-stranded RNA virus and a negative-strand RNA virus (Fullerton et al., 2007; Poranen et al., 2008; Rohayem et al., 2006; Smallwood & Moyer, 1993), suggesting that it might be a common mechanism to facilitate terminal repair.
The mechanism for viral terminal transferase activity has not been elucidated completely. The activity is dependent on the active site of the RdRp (Poranen et al., 2008; Ranjith-Kumar et al., 2001; Tomar et al., 2006), which is not surprising, as cellular ribonucleotidyltransferases use a similar catalytic motif (Martin & Keller, 2007). However, this raises the question of how an RdRp catalyses terminal transferase activity. The RdRp would normally position the 3′ end of template RNA in its template channel, add nucleotides onto the 3′ end of the nascent RNA in its active site and then extrude nascent RNA in a 5′→3′ direction via its exit channel. If the RdRp bound to the defective template in its usual orientation, the 3′ end of the RNA would not be positioned appropriately relative to the catalytic site for nucleotide additions. Studies on the RdRp/terminal transferase of bacteriophage φ6 suggest that the most likely explanation is that, for viral terminal transferase activity to occur, the RdRp subunit becomes oriented in the opposite direction with respect to the RNA than it would be for template-directed RdRp activity and draws the 3′ end of the RNA through its exit channel into the catalytic site, so that the 3′ terminus of the RNA is positioned appropriately in the active site for catalysis to occur (Fig. 4) (Poranen et al., 2008). However, it remains to be determined why terminal transferase activity can be specific for a particular NTP substrate, or how the terminal sequence of the template RNA can determine the specificity of the added nucleotides. Regardless of the specific mechanism, it seems likely that, if all of the elements for initiation of template-directed polymerization were present, including all of the 3′-terminal signals, the RdRp would be most likely to bind the template in the orientation illustrated in Fig. 4a and initiate RNA replication. However, if 3′-terminal signals were not intact, the RdRp would be unable to form a stable initiation complex, increasing its propensity to adopt a terminal transferase orientation until the RNA has been repaired sufficiently (Fig. 4c).
Fig. 4. Putative model for the RdRp terminal transferase mechanism. (a) Replication initiation involves assembly of the replicase complex, including the RdRp, onto the template RNA. In the case of de novo initiation at the 3′ terminus, the 3′ end of the template is located in the active site. (b) During RNA synthesis, the 3′ OH group of the nascent strand is positioned appropriately in the active site for addition of the next nucleotide, and the newly synthesized RNA is extruded through an exit channel. (c) For terminal transferase activity, it is proposed that the RdRp and the RNA ‘template’ are rotated relative to each other. In this case, the RNA enters via the exit channel, positioning the 3′ OH group appropriately in the active site for further nucleotides to be added.
A common but uncharacterized mechanism of repair: a possible role for the cellular RNA-degradation machinery
Some polyadenylated virus genomes that were shown to undergo repair to restore the poly(A) tail were found to have acquired novel U-rich sequences. This phenomenon was first described for beet necrotic yellow vein virus. In this case, if the poly(A) tail was removed completely, progeny viruses contained a novel heterogeneous U-rich region adjacent to the viral sequence, followed by a poly(A) tail (Jupin et al., 1990). Similar results have been observed for Sindbis virus (Raju et al., 1999), coxsackie B virus (van Ooij et al., 2006) and hepatitis C virus (van Leeuwen et al., 2006). The mechanism that generates the heterogeneous U-rich linker followed by the poly(A) tail is so far undetermined and it has been suggested that it might involve a virus-mediated activity, such as virus-encoded terminal transferase, or internal initiation followed by primer jumping (Raju et al., 1999). However, the fact that this type of repair has been observed for a diverse range of viruses, some of which have been shown to possess only adenyltransferase activity (with no evidence for uridyltransferase activity), is intriguing and suggests that this repair mechanism might involve cellular enzymes.
A possible source of the U-rich insertions is the cellular RNA-decay machinery. A growing body of evidence has identified relationships between replication complexes of a number of plant and animal viruses and cytoplasmic RNA-processing sites, such as processing bodies and stress granules (Beckham & Parker, 2008). This suggests a close involvement between the cellular degradation machinery and virus replication sites. A group of newly discovered factors in mRNA turnover are cellular poly(U) polymerases; the poly(U) tail added by these enzymes stimulates removal of the 5′ cap, promoting mRNA degradation (Song & Kiledjian, 2007; Wickens & Kwak, 2008). Poly(U) polymerases are widespread, from yeast to humans, and they are non-specific in their RNA substrate (Guschina & Benecke, 2008; Wickens & Kwak, 2008); thus, they have the potential to act on a variety of viral RNAs. Poly(U) polymerases can also be relaxed in their nucleotide specificity: there is an example of a poly(U) polymerase that can switch between adding U or A residues (Mellman et al., 2008). Thus, one possible explanation for how virus genomes acquire non-viral U-rich sequences is that a truncated viral RNA [lacking its poly(A) tail] becomes tailed by poly(U) polymerase to mark it for degradation (Fig. 5). Some viruses might have evolved mechanisms that allow them to then intercept the degradation pathway and promote polyadenylation, either by using a virus-encoded terminal transferase activity, as described above, or by provoking the poly(U) polymerase to switch NTP specificity to ATP. Addition of the poly(A) tail would be expected to confer stability to the virus genome and, in combination with cis-acting signals elsewhere in the virus genome, might be sufficient to enable genome replication and propagation.
Fig. 5. Model for the acquisition of a non-viral poly(U) sequence. According to this model, following removal or loss of the poly(A) tail at the 3′ end of the viral genome RNA, cellular poly(U) polymerase uridylates the truncated RNA to mark it for degradation. In some cases, viral proteins might be able to intervene in this process, either by altering the activity of the poly(U) polymerase, so that it adds adenylate residues onto the 3′ terminus, or by enabling the viral RdRp to polyadenylate the RNA by terminal transferase activity.
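The order of events proposed in this model can be made explicit with a toy sequence manipulation. The sketch below is illustrative only: the function names, the tail lengths and the example sequence are hypothetical, the U-rich linker is simplified to a homopolymeric U tract (the observed linkers are heterogeneous), and no attempt is made to represent which enzyme performs each step.

```python
def strip_poly_a(genome: str) -> str:
    """Remove the 3' poly(A) tail, mimicking loss of the tail from the genome RNA."""
    return genome.rstrip("A")


def add_tail(rna: str, nucleotide: str, length: int) -> str:
    """Append a non-templated homopolymeric tail to the 3' end of the RNA."""
    return rna + nucleotide * length


def repair_via_uridylation(genome: str, u_len: int = 5, a_len: int = 20) -> str:
    """Toy version of the proposed pathway (all parameters are assumptions).

    1. The poly(A) tail is lost.
    2. The truncated RNA is uridylated (marked for degradation).
    3. Adenylates are then added, giving a U-rich linker followed by poly(A).
    """
    truncated = strip_poly_a(genome)
    marked = add_tail(truncated, "U", u_len)
    return add_tail(marked, "A", a_len)
```

For a hypothetical genome `"GGCAUCG" + "A" * 15`, calling `repair_via_uridylation(genome, u_len=4, a_len=10)` returns the viral sequence followed by four uridylates and ten adenylates, which is the arrangement (viral sequence, U-rich linker, poly(A) tail) reported for the repaired progeny genomes.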
Repair of 3′ termini by tRNA mimicry
The final mechanism of terminal repair almost certainly involves a cellular enzyme and is facilitated by virus mimicry of cellular tRNAs. A 3′-terminal modification found in the RNA genomes of many plant RNA viruses is aminoacylation, which charges the 3′ terminus of the genome RNA with the amino acid valine, histidine or tyrosine. The mechanism by which aminoacylation of viral genomes occurs is related to tRNA aminoacylation: these viruses have 3′-terminal RNA sequences that can adopt a highly organized secondary/tertiary structure akin to that of host-cell tRNAs, allowing them to be charged by the respective aminoacyl-tRNA synthetase (reviewed by Dreher, 2009). The similarity between the virus genome and a tRNA structure not only allows acylation to occur, but also offers an opportunity for 3′-terminal sequence repair.
Cellular tRNAs have a characteristic cloverleaf structure with an unpaired CCA-OH motif at the 3′ terminus. The CCA motif is not encoded by the eukaryotic genome, but rather is added post-transcriptionally by the cellular enzyme tRNA nucleotidyltransferase, which also functions to maintain the tRNA 3′ terminus in the face of constant nuclease degradation. The tRNA nucleotidyltransferase functions independently of a nucleic acid template, with nucleotide selection apparently being guided by pockets within the protein structure, created by conformational changes following addition of each nucleotide (reviewed by Martin & Keller, 2007). The 3′ terminus of the BMV genome ends in CC, but is modified to CCA upon entry into the cell. The similarity between viral 3′ tRNA structures and authentic cellular tRNAs is thought to afford the virus genome the ability to bind to tRNA nucleotidyltransferase, allowing this enzyme to restore the 3′ A residue in the same manner as it would repair a cellular tRNA (Joshi et al., 1983). This tRNA mimicry also provides an opportunity for repair of genomes from which the 3′-terminal cytidylates have been removed. Experiments with BMV have shown that genomes containing 3′-terminal deletions to the CCA motif are repaired to wild-type sequence very rapidly in cells (Hema et al., 2005; Rao et al., 1989). This repair activity is not dependent on the other virus genome segments acting as a template (Hema et al., 2005) and does not occur in reactions performed in vitro using purified replicase (Miller et al., 1986). Thus, the most likely mechanism for repair is addition of the CCA motif by tRNA nucleotidyltransferase.
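The logic of template-independent CCA restoration can be illustrated with a minimal sketch. It is not a model of the enzyme: the function name, the `body_length` parameter (which stands in for structural recognition of the tRNA-like element) and the example sequence are hypothetical, and the sketch assumes that only nucleotides within the CCA motif itself have been lost.

```python
CCA = "CCA"


def repair_3prime_cca(rna: str, body_length: int) -> str:
    """Complete a truncated 3'-terminal CCA motif without using a template.

    body_length is the number of nucleotides upstream of the CCA motif
    (an assumption of this toy model; the enzyme recognizes structure).
    """
    if len(rna) < body_length:
        # The deletion extends beyond the CCA motif into the upstream sequence.
        raise ValueError("deletion extends upstream of the CCA motif")
    tail = rna[body_length:]
    if not CCA.startswith(tail):
        # The existing 3' end is not a prefix of CCA (e.g. 'CU' or 'CCAG').
        raise ValueError("3' end is not a truncated CCA motif")
    # Add only the missing nucleotides, one position at a time in effect.
    return rna + CCA[len(tail):]
```

In this sketch, an RNA ending in CC gains a single A, whereas an RNA that has lost all three residues regains the full CCA; an intact terminus is returned unchanged. Deletions that extend upstream of the motif are rejected, reflecting the assumption that this route repairs only the CCA motif.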
Potential benefits of terminal deletions: truncated genomes and viral persistence
Many RNA viruses that cause acute infections rely on rapid and efficient multiplication to enable virus spread to new host cells before clearance by the host immune system. This viral strategy requires maintenance of a genome sequence that is able to support sufficient levels of virus replication, and this requirement probably contributes to the selection pressure that drives development of the repair activities described above. A number of RNA viruses are also associated with persistent infections, particularly in certain tissues or organisms. During persistent infection, mutations that arise can play an important role. For example, mutations in the coding region of hepatitis C virus can allow the virus to evade immune attack (Burke & Cox, 2010). In addition, persistence is also frequently associated with truncations in terminal cis-acting sequences. Indeed, the high prevalence of terminally deleted genomes in some persistent virus infections suggests that terminal deletions might be involved directly in the establishment and/or maintenance of the persistent state.
Examples of associations between terminal truncations and persistent infections
LCMV, a member of the family Arenaviridae, is a segmented, negative-sense RNA virus. LCMV infection of newborn mice progresses from an initial acute phase to a persistent phase associated with a sustained low level of virion production. Extended time-course experiments in cell culture (Meyer & Southern, 1994) or mice (Meyer & Southern, 1997) showed that both of the viral genome segments were corrupted by short terminal deletions and nucleotide additions of up to 4 nt. The abundance of these corrupted genomes increased throughout the time-course to the extent that, by 28 days post-infection, between 60 and 70 % of all genome sequences exhibited deletions. Interestingly, these genomic sequence changes appeared to be matched by complementary changes in the corresponding antigenomic strands, suggesting that these altered RNAs were replication-competent; however, they were deficient in mRNA transcription (Meyer & Southern, 1997). The presence of replication-competent, but transcription-incompetent, genome segments is consistent with the diminished viral protein expression and low yield of infectious virus that are characteristics of the persistent infection. Terminally deleted genomes are also a feature of Seoul virus, a member of the genus Hantavirus, which is another segmented, negative-sense RNA virus that establishes long-term persistent infections in rodent hosts (Meyer & Schmaljohn, 2000). Extended time-course experiments in cell culture revealed that the abundance of genome segments with terminal deletions exhibited a cyclical pattern, rising and falling throughout the duration of the experiment. The abundance of viral RNA and virus progeny mirrored the proportion of truncated genomes, and thus strongly implicated the deletions in maintaining the persistent state (Meyer & Schmaljohn, 2000).
Terminal sequence deletions have also been shown to play a role in the transition between acute and persistent infections of a positive-strand RNA virus. CB3V, a human enterovirus, is frequently associated with an acute myocarditis that can often lead to heart failure. CB3V genomes can be detected in tissue from previously infected individuals, yet infectious viruses have been almost impossible to isolate, suggesting that CB3V may be capable of viral persistence. Evidence for a persistence mechanism came from sequence analysis, which showed that CB3V genomes within infected tissues possessed 5′-terminal deletions of between 7 and 49 nt (Kim et al., 2005). Complete termini correlated with acute infection, whereas truncated termini correlated with persistent infection (Kim et al., 2005). More recent work has detected terminally deleted genomes in human tissue collected from a fatal case of myocarditis, suggesting that terminal deletions are significant in the context of a human infection (Chapman et al., 2008).
Evidence for a specific mechanism for terminal truncation
The mechanism by which these viruses acquire truncations to their termini might differ. In the case of CB3V, terminal deletions arise more rapidly in either intact heart tissues or primary myocardial cell cultures than in other cells, suggesting that the cellular environment has a strong influence (Kim et al., 2005). However, one virus associated with persistent virus infections appears to have evolved a specific strategy for truncating its termini. BDV can establish a persistent infection in cell-culture systems and in brain cells of infected animals. This virus is a member of the order Mononegavirales, members of which typically possess genomes that exhibit terminal complementarity. However, the 5′ termini of the viral RNA and cRNAs of BDV isolates extracted from infected cells and animals are recessed by 4 nt compared with the 3′ termini (Rosario et al., 2005; Schneider et al., 2005). Experiments have shown that recombinant virus with perfectly complementary 3′ and 5′ ends quickly reverts to having recessed 5′ termini, similar to the naturally occurring virus (Schneider et al., 2005). The high level of precision of this terminal truncation suggests that it occurs by a specific mechanism, possibly involving replication initiation at an internally located nucleotide. Although that mechanism has not yet been elucidated, it apparently involves an activity that leaves a monophosphate moiety at the 5′ terminus, instead of the 5′ triphosphate that would typically be found at the 5′ end of viral RNA. Gene-expression studies indicate that trimmed genomes are efficient templates for transcription, but inefficient templates for replication, thus limiting the generation of infectious particles. In addition, the 5′ monophosphate moiety and the recessed 5′ terminus of genomic and antigenomic strands enable the viral RNA to avoid detection by retinoic acid-inducible gene I (RIG-I) (Habjan et al., 2008). This raises the interesting possibility that, in addition to modulating virus replication and gene expression, BDV genome trimming may be a mechanism to facilitate virus persistence by evading cellular antiviral responses.
Concluding remarks
The studies described above demonstrate a range of mechanisms that exist to enable maintenance and restoration of genome integrity in RNA viruses. These studies reveal that, despite their often fastidious mechanisms to ensure accurate replication initiation and termination, viral RdRps are sufficiently flexible to accommodate alternative modes of initiation and elongation, enabling terminal repair, terminal transferase activity and recombination. For any given virus, the behaviour of the RdRp at the 3′ end of a template might be impacted by the nature of the 3′-terminal sequence. It seems likely that, if the genome termini are intact and all promoter and accessory sequences required for replication initiation are present, the RdRp will engage preferentially in accurate replication initiation. However, if key terminal sequences are missing, the replicase complex might not be able to assemble correctly, releasing the RdRp to perform ‘abnormal’ actions, such as non-templated polymerization or terminal transferase activity. Once a genome has been repaired, the RdRp could revert to its ‘normal’ function of template-dependent polymerization. Viruses that replicate in particularly harsh environments or which have no passive defences to protect their genomes may have replicases that accommodate alternative initiation mechanisms to facilitate terminal repair more readily, and fundamental differences in virus genome and replicase architecture probably affect the propensity for RNA recombination. It is striking that most examples of RNA virus genome repair involve positive-strand viruses, whose genomes might be more vulnerable than those of the double-stranded or negative-sense viruses, in which the genomes are sequestered in protein capsids during the entire cycle of infection. 
The data also suggest that formation of truncated genomes, whilst hindering virus replication kinetics, might allow or aid some viruses to become persistent, which ultimately could aid their propagation within the host population. Thus, the capacity for genome repair could be an important factor in virus pathogenesis.
Acknowledgments
We thank Elke Mühlberger, Sean Whelan and Sarah Noton for constructive comments on the manuscript.