Abstract
Eel herpesvirus or anguillid herpesvirus 1 (AngHV1) frequently causes disease in freshwater eels. The complete genome sequence of AngHV1 and its taxonomic position within the family Alloherpesviridae were determined. Shotgun sequencing revealed a 249 kbp genome including an 11 kbp terminal direct repeat that contains 7 of the 136 predicted protein-coding open reading frames. Twelve of these genes are conserved among other members of the family Alloherpesviridae and another 28 genes have clear homologues in cyprinid herpesvirus 3. Phylogenetic analyses based on amino acid sequences of five conserved genes, including the ATPase subunit of the terminase, confirm the position of AngHV1 within the family Alloherpesviridae, where it is most closely related to the cyprinid herpesviruses. Our analyses support a recent proposal to subdivide the family Alloherpesviridae into two sister clades, one containing AngHV1 and the cyprinid herpesviruses and the other containing ictalurid herpesvirus 1 and the ranid herpesviruses.
The GenBank/EMBL/DDBJ accession number for the complete genome sequence of anguillid herpesvirus 1 strain 500138 determined in this study is FJ940765.
One of the commonly observed and economically most relevant viruses in wild and cultured freshwater eels of the genus Anguilla is anguillid herpesvirus 1 (AngHV1) (Haenen et al., 2002; van Ginneken et al., 2004), also known as eel herpesvirus and herpesvirus anguillae (Sano et al., 1990). Other formerly used names include eel herpesvirus in Formosa (Ueno et al., 1992, 1996), gill herpesvirus of eel (Lee et al., 1999) and European eel herpesvirus (Chang et al., 2002). AngHV1 was first isolated from cultured European eels (Anguilla anguilla) and Japanese eels (Anguilla japonica) in Japan in 1985 (Sano et al., 1990). Several herpesviral disease outbreaks in Japanese eels (Kobayashi & Miyazaki, 1997; Lee et al., 1999; Ueno et al., 1992) and European eels (Chang et al., 2002; Davidse et al., 1999; Haenen et al., 2002; Jakob et al., 2009; van Ginneken et al., 2004) have since been reported. Serological, molecular and sequence data indicated that the Asian and European eel herpesvirus isolates can be considered as a single virus species (Chang et al., 2002; Rijsewijk et al., 2005; Waltzek et al., 2009). Clinical and pathological findings of the infection vary among and within outbreaks but characteristically include haemorrhages in skin, fins, gills and liver, and a significantly increased mortality (Chang et al., 2002; Davidse et al., 1999; Haenen et al., 2002; Kobayashi & Miyazaki, 1997; Sano et al., 1990; Ueno et al., 1992; van Ginneken et al., 2004).
Herpesviruses are large and complex linear double-stranded DNA viruses with a distinctive morphology (Davison et al., 2005). Accordingly, AngHV1 virions consist of a core, an icosahedral nucleocapsid made up of hollow capsomers (T=16) with a diameter of about 110 nm, a proteinaceous tegument, and a host-derived envelope with a diameter of about 200 nm containing virus-encoded glycoproteins (Davidse et al., 1999; Sano et al., 1990). At the genome sequence level, only a single gene, encoding the putative ATPase subunit of the terminase (hereafter terminase), is convincingly conserved among all herpesviruses and to a lesser extent also in T4-like bacteriophages, implying descent from a common ancestor (Davison, 1992, 2002).
Based on genome sequence comparisons, the order Herpesvirales has recently been subdivided into three families: the family Herpesviridae, comprising the mammalian, avian and reptilian herpesviruses, the family Alloherpesviridae, comprising the piscine and amphibian herpesviruses, and the family Malacoherpesviridae, comprising a single invertebrate herpesvirus (Davison et al., 2009). To date, only four members of the family Alloherpesviridae have been sequenced completely, namely ictalurid herpesvirus 1 (IcHV1) (Davison, 1992), ranid herpesviruses 1 and 2 (RaHV1 and RaHV2) (Davison et al., 2006) and cyprinid herpesvirus 3 (CyHV3) (Aoki et al., 2007). Several other fish herpesviruses, including AngHV1, have been only poorly characterized genetically and are considered unassigned members of the family Alloherpesviridae (Davison et al., 2009). The objective of the present study was to sequence the complete genome of AngHV1 in order to determine its taxonomic position within the family Alloherpesviridae based on gene conservation and phylogenetic analyses.
The Dutch AngHV1 strain 500138 (van Nieuwstadt et al., 2001) was isolated and propagated in monolayers of eel kidney (EK-1) cells (Chen et al., 1982) in sterile plastic flasks in Leibovitz L15 medium (Gibco, Invitrogen), supplemented with 2 % (v/v) fetal bovine serum (Biochrom), 0.075 % (w/v) sodium bicarbonate (Gibco), 2 mM L-glutamine (Gibco) and antibiotics, in a CO2 incubator (Nuaire) at 26 °C. After the appearance of moderate cytopathic effect at 4 days post-infection, cell debris was cleared from the culture medium by centrifugation at 3500 g for 20 min at 10 °C (Hermle Labortechnik, Z400K) and cell-released virus was concentrated by ultracentrifugation at 87 300 g for 90 min at 10 °C (Optima L70K Ultracentrifuge, Beckman Coulter). DNA was extracted using a QIAamp DNA Blood Mini kit (Qiagen).
The complete AngHV1 genome sequence was determined by shotgun sequencing. Briefly, the DNA was randomly sheared into 2 and 8 kbp-sized fragments by nebulization, ligated into EcoRV-digested pBlueScriptSKI+ vector (Stratagene), transformed into Escherichia coli XL2-Blue (Stratagene), sequenced from both ends by using a DYEnamic ET Terminator Cycle Sequencing kit (GE Healthcare) and analysed on a 3730 XL DNA analyser (Applied Biosystems). Raw traces were preprocessed using pregap4 (Bonfield & Staden, 1996), basecalled with phred (Ewing & Green, 1998; Ewing et al., 1998), assembled using gap4 (Bonfield et al., 1995) with a mismatch threshold of 5 %, and parsed into a gap4 assembly database. Consensus calculations with a quality cutoff score of 40 were performed within gap4 using a probabilistic consensus algorithm based on the expected error rates output by phred. Remaining gaps in the assembly were closed by sequencing of PCR products. The final consensus sequence represented an average redundancy of 10.9. The terminal repeats (TRs) and genome termini were identified from the database and confirmed by a PCR method for amplifying termini using the Marathon cDNA amplification kit (Clontech Laboratories) as previously described (Davison et al., 2003).
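The quality cutoff and redundancy figures above can be read through the standard Phred relation Q = -10 log10(P), under which a score of 40 corresponds to an estimated error probability of 1 in 10 000 per base. The following Python sketch illustrates that conversion and the usual definition of average redundancy (total aligned read bases divided by consensus length). The function names are hypothetical and the numbers in the usage note are illustrative only; this is not the gap4 consensus algorithm itself.

```python
import math


def phred_to_error_probability(quality: float) -> float:
    """Convert a Phred quality score into an estimated base-call error probability."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability: float) -> float:
    """Convert an estimated error probability back into a Phred quality score."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -10 * math.log10(probability)


def average_redundancy(total_read_bases: int, consensus_length: int) -> float:
    """Mean number of read bases covering each consensus position."""
    if consensus_length <= 0:
        raise ValueError("consensus_length must be positive")
    return total_read_bases / consensus_length
```

For example, `phred_to_error_probability(40)` gives roughly 0.0001, and a made-up assembly with 1090 aligned read bases over a 100-base consensus would have an average redundancy of 10.9.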
The double-stranded, non-segmented DNA genome of the Dutch AngHV1 isolate 500138 is 248 531 bp in size, including a 10 634 bp terminal direct repeat, which is in line with earlier estimations based on restriction enzyme fragment analysis (Rijsewijk et al., 2005). Hence the AngHV1 genome belongs, with the ∼295 kbp cyprinid herpesvirus 1 (CyHV1) and CyHV3 genomes (Aoki et al., 2007; Waltzek et al., 2005), to the largest herpesvirus genomes, sizes of which range from 124 to 295 kbp (McGeoch et al., 2006). AngHV1's G+C content is 53.0 mol%, which falls within the wide overall nucleotide composition range of 32–75 mol% observed for herpesviruses (McGeoch et al., 2006) and the narrower range of 52.8–59.2 mol% for alloherpesviruses.
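As an illustration of how a G+C content figure such as the one above is obtained, the following Python sketch computes the percentage of G and C among unambiguous bases and checks it against a reference range. The function names are hypothetical, the toy sequence in the usage note is not real AngHV1 data, and how ambiguous bases were treated in the study is not stated, so skipping them here is an assumption.

```python
def gc_content_percent(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


def within_range(value: float, low: float, high: float) -> bool:
    """Check whether a value lies inside a closed interval."""
    return low <= value <= high
```

For example, `gc_content_percent("GGCCAATT")` returns 50.0, and `within_range(53.0, 52.8, 59.2)` confirms that the reported value lies inside the range quoted for alloherpesviruses.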
ATG-initiated open reading frames (ORFs) were identified using GeneMarkS (Besemer et al., 2001) and other criteria commonly applied in herpesvirus genome analysis, such as a minimum ORF size of 60 codons, rarity of ORF splicing and rarity of extensive ORF overlap (McGeoch et al., 2006). A total of 136 unique protein-coding ORFs was predicted in the genome, of which 7 were duplicated in the TRs. A tentative gene layout was composed, as depicted in Fig. 1. AngHV1 has an ORF density of 0.57 per kbp, taking the TR into account once. This resembles that of CyHV3, but is lower than the ORF densities of RaHV1, RaHV2 and IcHV1. Like the other completely sequenced alloherpesvirus genomes, the AngHV1 genome consists of one long unique region flanked by two short direct repeat regions at the termini. However, this genome organization does not seem to be a general feature of alloherpesviruses, since salmonid herpesvirus 1 is known to have a long unique region linked to a short unique region flanked by an inverted repeat (Davison, 1998).
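The ORF density quoted above follows from the reported numbers: 136 ORFs over 248 531 bp minus one copy of the 10 634 bp repeat gives about 0.57 ORFs per kbp. The Python sketch below reproduces that arithmetic and adds a deliberately naive ATG-to-stop scanner with a minimum codon count. The scanner is an assumption-laden illustration only: it is not GeneMarkS, it covers the forward strand only, and it ignores splicing and overlap criteria. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_atg_orfs(sequence: str, min_codons: int = 60) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for ATG-initiated ORFs on the forward strand.

    Coordinates are 0-based with an exclusive end that includes the stop codon.
    min_codons counts coding codons (ATG included, stop codon excluded).
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def orf_density_per_kbp(n_orfs: int, genome_bp: int, terminal_repeat_bp: int) -> float:
    """ORFs per kbp, counting one of the two terminal repeat copies only once."""
    effective_bp = genome_bp - terminal_repeat_bp
    return n_orfs / (effective_bp / 1000)
```

With the values from the text, `round(orf_density_per_kbp(136, 248_531, 10_634), 2)` evaluates to 0.57.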
Fig. 1. Tentative gene layout in AngHV1. Both strands are shown, with the locations of forward- and reverse-orientated predicted protein-coding ORFs shown on the respective strands. Conservation degree and gene families are defined in the key at the foot. Introns are depicted as thin lines connecting the exons. Terminal repeats are boxed. The scale is in bp. Nomenclature is given with the ORF prefix omitted.
Similarity searches were carried out against non-redundant protein sequences, available alloherpesvirus sequences and the AngHV1 ORF database itself by using blastp (Altschul et al., 1997) with the BLOSUM62 matrix; the results are shown in Fig. 1 and Table 1. The AngHV1 genome contains 12 of the 13 genes convincingly conserved among all members of the Alloherpesviridae sequenced so far (Aoki et al., 2007). As indicated by the available data for the other alloherpesviruses (Aoki et al., 2007; Davison, 1992; Davison et al., 2006), these genes encode proteins putatively involved in capsid morphogenesis (ORF36, ORF57 and ORF104), DNA replication (ORF21, ORF37 and ORF55) and DNA packaging (ORF10). The other five identified conserved genes encode proteins with unknown functions (ORF22, ORF52, ORF82, ORF98 and ORF100). With only 12 or 13 genes conserved among all family members (Aoki et al., 2007), the family Alloherpesviridae appears to be considerably more divergent than the family Herpesviridae, in which 43 genes are inherited from a common ancestor (McGeoch & Davison, 1999; McGeoch et al., 2006).
Table 1. Predicted functional properties and similarity of selected AngHV1 genes
AngHV1 gene properties, putative functions and similarities predicted using blastp carried out against non-redundant protein sequences are shown. Only AngHV1 genes with convincing homologues are included (E-values <10⁻⁵). Marginal sequence similarities (E-values >10⁻⁵) and doubtful homologies based on short repetitive sequences were omitted. AngHV1 genes in italics represent spliced genes (ORF10 presumably contains five exons; ORF100 contains two exons). Identity is shown as percentage sequence identity and alignment length. E-values and identities with completely sequenced CyHV3, IcHV1, RaHV1 and RaHV2 were determined using blastp carried out against available alloherpesvirus sequences. Identities with IcHV1, RaHV1 and RaHV2 for the 12 presumably conserved genes are given regardless of E-value; identities with E-values >10⁻⁵ are presented in italics or omitted when no sequence similarity was found.
Forty genes are convincingly conserved between AngHV1 and the completely sequenced CyHV3 (E-values <10⁻⁵, Table 1). The arrangement of the homologous genes in AngHV1 and CyHV3 appears to be positionally conserved in blocks, either in the same or in reverse orientation. This has been shown previously for IcHV1, salmonid herpesvirus 1 and the ranid herpesviruses (Davison, 1998; Davison et al., 2006). At Alloherpesviridae family level, the 12 or 13 conserved genes seem to be conserved within five or six of these blocks.
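The E-value criterion used to call a homologue convincing can be sketched as a simple filter over BLAST tabular output (the standard 12-column format, with the E-value in the eleventh column). The Python code below is a minimal illustration under that assumption; the authors' actual post-processing is not described, and the function names and the example hit lines in the usage note are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # threshold quoted in the text


def parse_blast_tabular(lines):
    """Parse tab-separated BLAST hits (12 standard columns) into dictionaries."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append({
            "query": fields[0],
            "subject": fields[1],
            "identity": float(fields[2]),
            "length": int(fields[3]),
            "evalue": float(fields[10]),
        })
    return hits


def convincing_homologues(hits, cutoff=E_VALUE_CUTOFF):
    """Keep, per query, the hit with the lowest E-value below the cutoff."""
    best = {}
    for hit in hits:
        if hit["evalue"] >= cutoff:
            continue  # marginal similarity, omitted
        current = best.get(hit["query"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["query"]] = hit
    return best
```

Given one hypothetical query with hits at E-values of 2e-30 and 1e-8 and a second query whose only hit has an E-value of 0.003, the filter keeps the first query with its 2e-30 hit and drops the second query entirely.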
The relationships within the nine sequence-similarity-based gene families within the AngHV1 genome (E-values <10⁻⁵) are generally distant and most ORFs encode proteins with unknown functions (Fig. 1, Table 1). ORF79 and ORF123 both encode proteins related to deoxynucleoside kinases. The proteins specified by ORF101 and ORF124 both contain a conserved domain of tumour necrosis factor receptors (TNFR) and presumably represent, together with an interleukin-10-related protein encoded by ORF25, host immune response modulating factors. The closely related CyHV3 also encodes these potential immunomodulatory proteins, suggesting an important role in the evolution of pathogenesis. Homology is, however, not entirely convincing and AngHV1 and CyHV3 may hence have acquired these genes through separate gene capture events rather than by common evolutionary descent.
In order to determine AngHV1's position within the family Alloherpesviridae, a concatenation of the five conserved regions of the terminase (Davison, 1992, 2002) was phylogenetically analysed. Homologous amino acid sequences were retrieved from RefSeq and GenBank (National Center for Biotechnology Information) (Benson et al., 2008): alloherpesviruses IcHV1 (accession no. NC_001493), RaHV1 (NC_008211), RaHV2 (NC_008210), CyHV1 (EU349288), cyprinid herpesvirus 2 (CyHV2; EU349285) and CyHV3 (NC_009127); human herpesvirus 1 (HHV1; NC_001806), human herpesvirus 5 (HHV5; NC_006273) and human herpesvirus 4 (HHV4; NC_007605) as representatives of the Herpesviridae subfamilies Alphaherpesvirinae, Betaherpesvirinae and Gammaherpesvirinae, respectively; ostreid herpesvirus 1 (OsHV-1; NC_005881) as representative of the family Malacoherpesviridae; and T4 bacteriophage (NC_000866). In addition, the full-length amino acid sequences for DNA polymerase, DNA helicase, the major capsid protein and capsid triplex protein 2 of AngHV1 were compared with those of the four completely sequenced alloherpesviruses and CyHV1 (accession nos AY939868, AY939858, AY939865 and AY939860). Multiple alignments of the amino acid sequences were generated with CLUSTAL_X version 1.81 (Thompson et al., 1997), using default settings followed by minor manual modifications. Concatenation of the alignments yielded a dataset of 266 residues for the five conserved regions of the terminase and a dataset of 2873 residues for the four conserved genes.
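Concatenating per-region or per-gene alignments into a single dataset, as described above, amounts to joining the aligned sequences of each taxon end to end. The following Python sketch shows one way to do this with basic consistency checks. The function name and the toy alignments in the usage note are hypothetical, and the study's own concatenation procedure is not specified.

```python
def concatenate_alignments(blocks):
    """Join aligned blocks end to end for each taxon.

    blocks is a list of dictionaries mapping taxon name to an aligned
    sequence; every block must cover the same taxa, and all sequences
    within one block must have the same aligned length.
    """
    if not blocks:
        return {}
    taxa = set(blocks[0])
    for block in blocks:
        if set(block) != taxa:
            raise ValueError("every block must contain the same taxa")
        if len({len(seq) for seq in block.values()}) != 1:
            raise ValueError("sequences within a block must have equal length")
    return {taxon: "".join(block[taxon] for block in blocks)
            for taxon in sorted(taxa)}
```

For two toy blocks of aligned length 4 and 2, the result has length 6 per taxon; in the same way, the block lengths in the study sum to the 266-residue and 2873-residue datasets.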
For inference of the phylogenetic trees, three computational approaches were used. The first approach used the neighbour-joining (NJ) method with the program MEGA version 4 (Tamura et al., 2007), using the Jones–Taylor–Thornton probability model with uniform rates of substitution and 10 000 bootstrap replicates. The second approach used phylogeny inference according to the maximum-likelihood criterion using the PHYLIP package version 3.68 (Felsenstein, 2008); bootstrap datasets (1000 replicates) were generated, the likelihood for each dataset was calculated using the Jones–Taylor–Thornton probability model, and the consensus tree with estimated branch lengths was established. Bayesian inference analysis was used as the third approach with the program MrBayes version 3.1.2 (Ronquist & Huelsenbeck, 2003), using the Jones model with equal rates of substitution; the Markov chain Monte Carlo analysis was run with four chains for 1 000 000 generations, sampling every 100 generations, and the first 2500 trees collected (25 %) were discarded as burn-in to allow the process to reach stationarity. For the latter two approaches the trees were displayed with TreeView version 1.6.6 (Page, 1996). The phylogenetic analyses of the four conserved genes were carried out for each of the four individual genes and for the genes concatenated, which showed similar results.
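Two of the quantities above are easy to check or illustrate. Sampling 1 000 000 generations every 100 generations yields 10 000 trees, of which 25 % is 2500 (this count ignores any sample of the starting state that a program may also record). A bootstrap dataset is built by resampling alignment columns with replacement. The Python sketch below shows both under those assumptions; the function names are hypothetical and the code does not reproduce the internals of PHYLIP, MEGA or MrBayes.

```python
import random


def mcmc_sample_counts(generations: int, sample_every: int, burn_in_fraction: float):
    """Return (sampled trees, discarded trees, retained trees) for an MCMC run."""
    n_samples = generations // sample_every
    n_burn_in = int(n_samples * burn_in_fraction)
    return n_samples, n_burn_in, n_samples - n_burn_in


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping each column intact."""
    n_columns = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {taxon: "".join(seq[c] for c in columns)
            for taxon, seq in alignment.items()}
```

With the settings from the text, `mcmc_sample_counts(1_000_000, 100, 0.25)` returns `(10000, 2500, 7500)`. Passing a seeded `random.Random` instance to `bootstrap_alignment` makes a replicate reproducible.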
The maximum-likelihood trees shown in Fig. 2, with reliability of the branching for all three different phylogenetic methods indicated, visualize the relationships within the family Alloherpesviridae. Fig. 2(a) represents the tree based upon the concatenated amino acid sequences of the five conserved regions of the terminase for seven alloherpesviruses and representatives of the families Herpesviridae and Malacoherpesviridae, rooted with T4 bacteriophage. Fig. 2(b) shows the unrooted tree based upon the concatenated amino acid sequences of four conserved genes from six alloherpesviruses. Regardless of the phylogenetic methodology used for both trees, the topology of the resulting trees is equivalent. AngHV1 is closely related to the cyprinid herpesviruses, together forming a phylogenetic group more distantly related to IcHV1 and both ranid herpesviruses. These findings are consistent with previous phylogenetic reconstructions of this family (Doszpoly et al., 2008; Waltzek et al., 2009), complementing those with inclusion of complete AngHV1 amino acid sequences.
Fig. 2. Phylogenetic trees depicting the relationship between the alloherpesviruses. (a) Maximum-likelihood tree based upon the concatenated amino acid sequences of five conserved regions of the terminase from members of the families Alloherpesviridae, Herpesviridae and Malacoherpesviridae, rooted with the T4 bacteriophage. (b) Unrooted maximum-likelihood tree based upon the concatenated amino acid sequences of DNA polymerase, DNA helicase, the major capsid protein and capsid triplex protein 2 from six alloherpesviruses. Reliability of the branching is indicated at the nodes as replicates for maximum-likelihood analysis (regular type), percentages for neighbour-joining (italic type) and inference support for the Bayesian analysis (bold type). Scale bars show the divergence for each tree.
Based on gene conservation and phylogenetic analyses using AngHV1's genome sequence, we propose that AngHV1 is a new virus species within the family Alloherpesviridae. The large genome sizes of the anguillid and cyprinid herpesviruses, the relatively high number of homologous genes and the phylogenetic analyses support the suggested subdivision of the family Alloherpesviridae into two sister clades, with one clade including herpesviruses from anguillid and cyprinid hosts and one clade including herpesviruses infecting other fish species and frogs (Waltzek et al., 2009). Future genome sequencing of other alloherpesviruses combined with the proper definition of criteria for the establishment of new subfamilies and genera within the family Alloherpesviridae should bring a higher resolution to the phylogenetic relationships and taxonomic organization.
Acknowledgments
We are grateful to Ineke Roozenburg for technical support.