Genomic characterization of the first insectivoran papillomavirus reveals an unusually long, second non-coding region and indicates a close relationship to Betapapillomavirus

Abstract

Knowledge about biological diversity is the prerequisite to reliably reconstruct the evolution of pathogens such as papillomaviruses (PV). However, complete genomes of non-human PV have only been cloned and sequenced from 8 out of 18 orders within the Placentalia, although the host-specific variety of PV is considered much larger. We isolated and sequenced the complete genome of the first insectivoran PV type from hair follicle cells of the European hedgehog (Erinaceus europaeus), designated EHPV. We conducted phylogenetic analyses (maximum-likelihood criterion and Bayesian inference) with the genomic information of a systematically representative set of 67 PV types including EHPV. As inferred from amino acid sequence data of the separate genes E1, E2 and L1 as well as of the gene combination E6–E7–E1–E2–L1, EHPV clustered within the β-γ-π-ξ-PV supertaxon and constituted the closest relative of genus Betapapillomavirus infecting primates. Beside the typical organization of the PV genome, EHPV exhibited a 1172 bp, non-coding region between the E2 and the L2 open reading frames. This trait has been previously described for the only distantly related Lambdapapillomavirus, but a common evolutionary origin of both non-coding regions is unlikely. Our results underscore the modular organization of the PV genome and the complex natural history of PV.

The GenBank/EMBL/DDBJ accession number for the sequence reported in this paper is FJ379293.

Supplementary material is available with the online version of this paper.

The diversity of papillomaviruses (PV) has not been sufficiently studied. Viruses from approximately 40 different host species have been isolated and (partly) sequenced, but it is assumed that all members of the Amniota potentially harbour specific PV (Van Ranst et al., 1995; de Villiers et al., 2004; Rector et al., 2007). The currently known hosts do not cover more than 8 of the 18 orders of the Placentalia, but PV from taxa such as Proboscidea and "Insectivora" have not been isolated or completely sequenced at this moment in time.

The dsDNA PV genome comprises approximately 8000 bp. It is structurally conserved and organized in up to eight well-defined open reading frames (ORF), including the four major genes E1, E2, L2 and L1 and additional, smaller genes. A non-coding region (NCR1) involved in regulation of gene expression is located upstream of the early genes. The genomic region between the genes E2 and L2 is an evolutionary hotspot. In genus Betapapillomavirus (β-PV, infecting primates) and other PV, this transitional region is short and non-coding, while it encodes the E5 oncogene(s) in genera Alphapapillomavirus (α-PV, infecting primates) and Deltapapillomavirus (δ-PV, infecting ungulates; Bravo & Alonso, 2004). In Lambdapapillomavirus (λ-PV, infecting Carnivora), this region spans more than 1000 bp and does not appear to encode any protein (Terai & Burk, 2002; Rector et al., 2007). It has been therefore described as the second non-coding region (NCR2).

PV systematics traditionally relies on the sequence comparison of the L1 gene. More rigorous evaluation of PV evolution, applying computation of confidence values for internal nodes and appropriate outgroup choice, have become standard procedures for phylogenetic analyses during the past years (García-Vallvé et al., 2005; Narechania et al., 2005; Rector et al., 2007). Using these approaches, four PV supertaxa have been identified, namely alpha-omikron (α+o)-, beta-gamma-pi-xi (β+γ+π+ξ)-, delta-epsilon (δ+ε)-, and kappa-lambda-mu-nu-sigma (κ+λ+µ+ν+σ)-PV. The interpretation of the corresponding molecular trees suggest multiple evolutionary mechanisms driving PV diversification (Gottschling et al., 2007b). They include (i) co-divergence between the viruses and their hosts (Van Ranst et al., 1995; Rector et al., 2007), (ii) infection across species borders (zoonosis: Myers et al., 1996; Rector et al., 2005; Gottschling et al., 2007a), (iii) establishment of new ecological niches by adaptive radiation (García-Vallvé et al., 2005; Jackson, 2005), and/or (iv) recombination (Narechania et al., 2005; Varsani et al., 2006).

Well-resolved host trees are the necessary prerequisite to test the hypothesis of co-phylogeny between PV and their hosts (Page et al., 1996). During the past years, molecular and morphological studies have tremendously improved the phylogenetic systematics of the Placentalia (Murphy et al., 2001; Reyes et al., 2004; Springer et al., 2004; Bininda-Emonds et al., 2007). Within Laurasiatheria, Erinaceus europaeus Linnaeus, 1758 (Erinaceidae) belongs to the Eulipotyphla that are a part of the former Mammalian order "Insectivora" (hence recognized polyphyletic). The Eulipotyphla are the sister group of the remaining laurasiatherian taxa such as Carnivora, Cetartiodactyla, Chiroptera, and Perissodactyla. If co-divergence with hosts alone drove PV evolution, a virus isolated from E. europaeus should be closely related to those laurasiatherian PV. This hypothesis would be rejected by significantly different host and PV topologies.

In this study, we present the complete genome of the first insectivoran PV isolated from E. europaeus. In order to avoid violating the principle of priority, we have designated the new type as EHPV (European hedgehog), because EEPV is given to a PV type from the European elk (Moreno-Lopéz et al., 1981; Ahola et al., 1986). We describe here the genome organization of EHPV and determine its phylogenetic relationships within the family Papillomaviridae by using the maximum-likelihood (ML) criterion and Bayesian inference.

DNA isolation and sequencing of the complete viral genome.
Approximately 30 hairs including the follicle cells were collected under sterile conditions from the facial region of a free ranging European hedgehog (Erinaceus europaeus; male adult, field sample, Raul-Wallenberg-Straße, 126 79 Berlin-Marzahn-Hellersdorf, Germany; Supplementary Table S1, available in JGV Online) and were stored at –20 °C. Extraction of genomic DNA was performed as previously described (Gottschling et al., 2008), and DNA was stored at –20 °C until further analysis. From this sample, a 574 bp L1 sequence fragment, generated and verified in two independent PCR experiments using degenerated FA-primers (Forslund et al., 1999; Antonsson & Hansson, 2002), has been published previously (GenBank accession number EF396272; Gottschling et al., 2008).

Rolling circle amplification (RCA) of the viral, episomal DNA was performed with a TempliPhi 500 Amplification kit (Amersham Biosciences), following an optimized protocol using additional nucleotides to obtain higher folds of amplification. Genomic DNA (1 µl) was added to 5 µl sample buffer, denatured at 95 °C for 3 min, and was subsequently stored on ice. The denatured DNA was mixed with 6.2 µl of a solution containing 5 µl reaction buffer, 0.2 µl enzyme mix, and 1 µl 4.5 mM additional dNTPs. Incubation was performed at 30 °C overnight, and the reaction was stopped at 65 °C for 10 min.

Synthetic hexamers (2 nM) containing the EcoRI cleavage site were added to replenish single stranded regions of the concatemeric RCA products. They were digested using 10 units of EcoRI (New England Biolabs). Restriction fragments were separated by electrophoresis on a 0.8 % agarose gel, and a band of approximately 8000 bp was excised. DNA was purified with a QIAquick Gel Extraction kit (Qiagen), following the manufacturer's instructions.

Purified DNA was cloned into pBK-CMV plasmid (Stratagene) at the EcoRI restriction site using T4 Ligase (New England Biolabs). After ligation, One Shot Top 10 competent E. coli (Invitrogen) were transformed with the construct, and plated on kanamycin-containing selective agar plates. Five white colonies were picked and plasmid DNA was isolated using a QIAquick Spin Miniprep kit (Qiagen) and analysed after digestion with EcoRI. Flanking regions of DNA fragments with a size of approximately 8000 bp were sequenced using standard M13 primers. A bacterial clone harbouring the complete PV genome was transferred to 300 ml Luria–Bertani medium containing kanamycin (50 ng µl^–1) and incubated at 250 r.p.m. and 37 °C overnight. Two stock cultures containing 30 % glycerine as well as aliquots of the plasmid maxi preps were stored at –80 °C. Plasmid maxi preps were generated with a Qiagen Plasmid Maxi kit. The insert containing the complete genome of the novel PV was sequenced by primer walking (GENterprise).

Protein prediction and codon usage.
Open reading frames (ORF) were predicted using ORF Finder (). They were confirmed by the manual alignment of nucleotide (nt) and amino acid (aa) sequences to homologous regions of most similar PV types in Se-Al (v2.0a72; ). Primary sequence analysis of the predicted proteins was performed with ProtParam (Gasteiger et al., 2005; ), Prosite (Hulo et al., 2008; ), and PSORT II (Nakai & Horton, 1999; /form2.html). The different codon usage of E4 with respect to the remaining ORF of EHPV (see below) was the basis to identify the putative start of the E4 gene, as previously described (Puigbó et al., 2008a).

Transcription factor binding sites (TFBS) were predicted with Match ( ), using a library of mononucleotide weight matrices from TRANSFAC 9.3 (Matys et al., 2003). A positive match was considered true positive, if the score value was higher or equal to the core similarity cut-off. We chose a high cut-off value to minimize the number of false-positive and false-negative results. The presence of the predicted TFBS in the NCR2 was confirmed with Cister (), using posterior decoding and based on a hidden Markov model (Frith et al., 2001). Only elements with a posterior probability higher than 0.8 were considered positive, and only results calculated by both algorithms were considered true positives.

The Codon Adaptation Index (CAI) is a measure of the similarity of synonymous codon usage between a gene and a reference set. An interval of expected CAI values were calculated using the E-CAI server, accounting for biases in nucleotide and/or amino acid composition (Puigbó et al., 2008b; ). We aimed to discern whether the differences in CAI between viruses and hosts were statistically significant, or whether they were merely artefacts arising from internal biases in the G+C content and/or aa composition of query sequences. Codon usage for E. europaeus was downloaded from the codon usage database ().

Sequence alignment and phylogenetic analyses.
The taxon sample for phylogenetic analyses covered the currently known diversity of PV and comprised 67 complete PV sequences (including 20 systematically representative human PV types; see Supplementary Table S1). After exclusion of the highly divergent E4 gene region, an aa alignment (comprising the genes E6–E7–E1–E2–L2–L1) was constituted using MAFFT (v6.523; Katoh et al., 2005; ), with the default settings. The final matrix is available at . A sequence identity matrix of the L1 was generated by BioEdit (v7.0.0; Hall, 1999). Phylogenetic analyses were performed for each of the four large genes separately, and a multi-gene analysis was conducted after exclusion of the L2 gene because it perturbs phylogenetic reconstructions (Gottschling et al., 2007b). All computations were run on a Linux cluster at the North German High Performance Computer.

ML-based phylogenetic analyses were conducted using the parallel Message Passing Interface (MPI) version of RAxML (v7.0.3; Stamatakis, 2006; ) applying the rtREV matrix (Dimmic et al., 2002), as previously described (Gottschling et al., 2007b). We executed 10-tree searches from distinct random stepwise addition sequence maximum-parsimony (MP) starting trees. Thereafter, we executed 1000 non-parametric bootstraps with RAxML under the partition data mode, and the bootstrap support values were drawn on the best-scoring ML-tree.

Bayesian phylogenetic analyses were performed with BEAST (v1.4.7; Drummond & Rambaut, 2007; ). For the WAG aa substitution model (Whelan & Goldman, 2001) with four discrete γ categories, we used an uncorrelated relaxed clock. The unweighted pair group method with arithmetic mean was used to construct a starting tree for BEAST analyses, and the final topology was estimated by combining three independent 20 000 000 chains using 2 000 000 burn-in cycles and sampling every 1000 iteration. Trees were rooted using the two known bird PV sequences, based on a previous E1 tree topology (García-Vallvé et al., 2005).

EHPV genome structure
The EHPV genome (GenBank accession no. FJ379293[GenBank] ) consisted of 8256 bp. As inferred from protein prediction software programs and sequence evaluation in a comparative alignment, seven ORFs were encoded on the same strand and in the same orientation, namely the genes E6, E7, E1, E2, E4, L2 and L1 (Fig. 1). The potential E4 ORF nested within the E2 ORF and was characterized by proline-rich stretches. All gene descriptions and protein primary sequence analyses are compiled in Table 1.

(12K):

Fig. 1. A 1172 bp non-coding region between the genes E2 and L2 of EHPV. Organization of the EHPV genome and distribution of potential cis regulative elements (see legend). Gene lengths are to scale and are indicated by boxes, including information on start and end nucleotide positions. BS, Binding site.

Table 1. Primary sequence analysis of the EHPV genes and optimal model parameters of the different alignment partitions

The genome of EHPV contained two NCRs. The NCR1 was located upstream the E6 gene and comprised 472 nt (6 % of the complete genome), whereas the NCR2 was 1172 bp (14 % of the complete genome) and was located between the genes E2 and L2. A compilation of the predicted TFBS (mainly located in the NCR1) and other non-coding elements as well as of structural elements of the EHPV genome is provided in the Supplementary Material (Supplementary Tables S2 and S3, available in JGV Online).

The nucleotide composition of the EHPV genome differed from that of the host. Besides, all ORF in EHPV were poorly adapted to the codon usage of the host, E. europaeus, even after accounting for these compositional differences. The only exception to this de-adaptation was E4, resembling the host genes more than any other EHPV gene in terms of composition and codon usage (Table 2).

Table 2. Open reading frames in the EHPV genome and codon adaptation index (CAI) either to the codon usage of the virus (EHPV) or to the host (Erinaceus europaeus) The expected CAI average was the average CAI value for 500 random sequences of the same length, same G+C content, and same amino acid composition as the corresponding viral sequence. The normalized CAI value was the quotient of CAI value and expected CAI average. The 95 % confidence was the upper one-side 95 % tolerance interval, representing the upper limit or estimated maximum value of the CAI of a random sequence. G+C content is given for the full-length gene as well as for each of the three possible coding positions. Reference values for the host genome are given in parentheses in the column head. The E4 gene is better adapted to the codon usage of the host than any other of the EHPV genes.

Systematic position of EHPV within the family Papillomaviridae
The L1 sequence of EHPV was most similar to that of McPV-2 (63.6 % similarity), as inferred from a sequence similarity matrix (available as Supplementary Data, available in JGV Online). The corresponding region in the complete genome was identical with an L1 sequence fragment (GenBank accession number EF396272), generated by using the primer pair FAP59 and FAP64. The alignment for the multi-gene phylogenetic analysis (i.e. excluding the genes E4 and L2) was 2262 aa positions in length. Of these sites, 1574 (70 %) were parsimony-informative (23.5 per terminal taxon). Details of the different partitions are given in Table 1.

Fig. 2 shows the best-scoring ML-tree (–ln=150 954.879), with the statistical support values for the two phylogenetic approaches used. Mammalian PV were monophyletic, irrespectively whether the data were analysed under the likelihood criterion (bootstrap support value from ML analysis: 100 LBS) or the Bayesian approach (Bayesian posterior probability: 1.00 BPP). They segregated into the four monophyletic supertaxa α+o- (50 LBS), β+γ+π+ξ- (60 LBS, 1.00 BPP), δ+ε- (100 LBS, 1.00 BPP), and κ+λ+µ+ν+σ-PV (99 LBS, 1.00 BPP). Within the β+γ+π+ξ-PV, EHPV was the sister of genus Betapapillomavirus in the multi-gene alignment (87 LBS, 1.00 BPP) as well as in separate gene analyses of E1 (67 LBS, 1.00 BPP), E2 (0.99 BPP), and L1 (63 LBS, 1.00 BPP, see Supplementary Figures S1–S4, available in JGV Online). In the L2 gene analysis, EHPV was the closest relative of CfPV-2 (58 LBS), and together they constituted the sister group of β-PV+MnPV-1+RaPV.

(46K):

Fig. 2. EHPV is the closest known relative of genus Betapapillomavirus. ML tree of PV comprising a representative set of 20 human and 47 non-human types and putatively novel types, as inferred from a combined E6–E7–E1–E2–L1 aa analysis (1574 parsimony-informative aa positions). Generic PV clades (de Villiers et al., 2004) are indicated by Greek lettering, and EHPV is highlighted by a pink arrow. ART, Artiodactyla; CAR, Carnivora; CET, Cetacea; CHI, Chiroptera; INS, Insectivora; LAG, Lagomorpha; PAS, Passeriformes; PER, Perissodactyla; PRI, Primates; PSI, Psittaciformes; ROD, Rodentia; SIR, Sirenia. Supertaxa in the sense of Gottschling et al. (2007b) are coloured red (α+o-PV), green (β+γ+π+ξ-PV), blue (δ+ε-PV) and ochre (κ+λ+µ+ν+σ-PV), respectively. Branch lengths are drawn to scale. Scale bar, number of aa substitutions per site. Numbers on branches are bootstrap support values to clusters of the right of them (above: ML criterion; below: Bayesian posterior probabilities; values under 50 and 0.90, respectively, are not shown). PV taxa with an E5 gene (i.e. α- and δ-PV) and a second, long non-coding region (i.e. λ-PV and EHPV) are highlighted by shaded boxes; note that the corresponding taxa are only distantly related to each other.

Knowledge about PV diversity is still sparse. In this study, we present the complete genome of the first PV type from the "Insectivora", EHPV, and have investigated its phylogenetic relationships within the family Papillomaviridae. We render the preliminary systematic position of this PV type more precisely than it has been possible in a previous L1 sequence fragment analysis (Gottschling et al., 2008). Despite the limited knowledge about PV diversity, the viruses have been generally regarded as co-diverging with their corresponding hosts (Chan et al., 1995; Van Ranst et al., 1995; Bernard et al., 2006). If co-phylogeny was the only driving force for viral diversification, EHPV should be closely related to types isolated from other Laurasiatheria. However, the sister group relationship between EHPV (host: E. europaeus, Eulipotyphla, Laurasiatheria) and β-PV (hosts: Primates, Afrotheria) contradicts this view, because the host taxa are only distantly related (Murphy et al., 2001; Springer et al., 2004; Bininda-Emonds et al., 2007).

Two possible interpretations are conceivable for the systematic position of EHPV: (i) either the close relationship between EHPV and β-PV results from alternative evolutionary processes such as interspecies transmission, as it has been also reconstructed for other regions of the PV tree (Gottschling et al., 2007a, b); or (ii) numerous PV lineages from various mammalian host taxa have not been identified so far (due to the limited investigation of non-human hosts) that would show a co-phylogenetic structure in a PV subtree if they were known. However, even the latter interpretation would lead to the rejection of the co-phylogenetic hypothesis in general because it necessarily postulates the origin of several (subsequently parallel and co-diverging, paralogous; Jackson, 2005) lineages by another evolutionary mechanism than co-phylogeny.

The main core of the genome, comprising the four major genes E1, E2, L2 and L1, is highly conserved among the family Papillomaviridae. PV appear to have a modular organization of the genome, and some clades are characterized by apomorphic traits such as deletions or insertions of several smaller genes. One such trait is the presence of a long NCR2 between the genes E2 and L2, as described for the λ-PV infecting carnivores (Terai & Burk, 2002; Rector et al., 2007). EHPV infecting an insectivoran species similarly exhibits a long NCR2 between E2 and L2, but homology with the NCR2 from λ-PV is unlikely. This assumption is reinforced by the distant relationships between λ-PV and EHPV, as inferred from the phylogenetic analysis.

Some of the predicted TFBS present in the NCR1 of EHPV, such as E2, AP-1, NF-Y, NF-1, GATA, USF or the CCAAT-box as promoter element, have been previously reported from other PV types and may constitute the core of the regulation repertoire of PV gene expression (García-Vallvé et al., 2006). Additionally, other predicted TFBS might be present in the NCR2, but their physiological importance remains to be investigated experimentally.

All ORFs in EHPV are poorly adapted to the codon usage of the host, E. europaeus, as is generally the case for PV (Bravo & Müller, 2005). Since PV do not encode any molecular machinery involved in translation, systematic deviation from the codon usage of the host may lead to inefficient translation, decreased amount of translated protein, or decreased amount of properly folded protein (Bravo & Müller, 2005; Drummond & Wilke, 2008). The only exception to this de-adaptation is E4 (Zhao et al., 2003), usually the most highly expressed gene during PV infection (Doorbar, 2005). The shift in codon usage in E4, nested within the E2 gene, can in fact be used to predict the E1^E4 annotation, despite the absence of a well-defined splice acceptor site (Puigbó et al., 2008a). However, an explanation integrating the evolution systematic codon usage bias and translation in PV is still wanting.

In conclusion, EHPV belongs to the β+γ+π+ξ-PV supertaxon, which thus comprises viruses infecting Eulipotyphla, Primates, Carnivora, Rodentia and "Artiodactyla" (the latter is a paraphyletic mammalian taxon). The new PV type constitutes the sister group of β-PV infecting primates. The intermingled phylogenetic relationships within the β+γ+π+ξ-PV supertaxon with respect to their hosts, and particularly the position of EHPV therein, underline the complex evolutionary history of the family Papillomaviridae.

We thank B. Kallies and T. Steinke (Konrad-Zuse-Zentrum für Informationstechnik, Berlin) for technical guidance during the establishment of our HPC access. I. G. B. was supported by the Volkswagen Foundation.

References

Ahola, H., Bergman, P., Ström, A. C., Moreno-Lopéz, J. & Pettersson, U. (1986). Organization and expression of the transforming region from the European elk papillomavirus (EEPV). Gene 50, 195–205.[CrossRef][Medline]

Antonsson, A. & Hansson, B. G. (2002). Healthy skin of many animal species harbors papillomaviruses which are closely related to their human counterparts. J Virol 76, 12537–12542.[Abstract/Free Full Text]

Bernard, H.-U., Calleja-Macias, I. E. & Dunn, S. T. (2006). Genome variation of human papillomavirus types: phylogenetic and medical implications. Int J Cancer 118, 1071–1076.[CrossRef][Medline]

Bininda-Emonds, O. R. P., Cardillo, M., Jones, K. E., MacPhee, R. D. E., Beck, R. M. D., Grenyer, R., Price, S. A., Vos, R. A., Gittleman, J. L. & Purvis, A. (2007). The delayed rise of present-day mammals. Nature 446, 507–512.[CrossRef][Medline]

Bravo, I. G. & Alonso, Á. (2004). Mucosal human papillomaviruses encode four different E5 proteins whose chemistry and phylogeny correlate with malignant or benign growth. J Virol 78, 13613–13626.[Abstract/Free Full Text]

Bravo, I. G. & Müller, M. (2005). Codon usage in papillomavirus genes. Papillomavirus Rep 16, 63–72.[CrossRef]

Chan, S.-Y., Delius, H., Halpern, A. L. & Bernard, H.-U. (1995). Analysis of genomic sequences of 95 papillomavirus types: uniting typing, phylogeny, and taxonomy. J Virol 69, 3074–3083.[Abstract/Free Full Text]

de Villiers, E.-M., Fauquet, C., Broker, T. R., Bernard, H.-U. & zur Hausen, H. (2004). Classification of papillomaviruses. Virology 324, 17–27.[CrossRef][Medline]

Dimmic, M. W., Rest, J. S., Mindell, D. P. & Goldstein, R. A. (2002). rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny. J Mol Evol 55, 65–73.[CrossRef][Medline]

Doorbar, J. (2005). The papillomavirus life cycle. J Clin Virol 32 (Suppl. 1), S7–S15.[Medline]

Drummond, A. J. & Rambaut, A. (2007). BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7, 214[CrossRef][Medline]

Drummond, D. A. & Wilke, C. O. (2008). Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell 134, 341–352.[CrossRef][Medline]

Forslund, O., Antonsson, A., Nordin, P., Stenquist, B. & Hansson, B. G. (1999). A broad range of human papillomavirus types detected with a general PCR method suitable for analysis of cutaneous tumours and normal skin. J Gen Virol 80, 2437–2443.[Abstract/Free Full Text]

Frith, M. C., Hansen, U. & Weng, Z. P. (2001). Detection of cis-element clusters in higher eukaryotic DNA. Bioinformatics 17, 878–889.[Abstract/Free Full Text]

García-Vallvé, S., Alonso, Á. & Bravo, I. G. (2005). Papillomaviruses: different genes have different histories. Trends Microbiol 13, 514–521.[CrossRef][Medline]

García-Vallvé, S., Iglesias-Rozas, J. R., Alonso, Á. & Bravo, I. G. (2006). Different papillomaviruses have different repertoires of transcription factor binding sites: convergence and divergence in the upstream regulatory region. BMC Evol Biol 6, 20[CrossRef][Medline]

Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M. R., Appel, R. D. & Bairoch, A. (2005). Protein identification and analysis tools on the ExPASy server. In The Proteomics Protocols Handbook, pp. 571–607. Edited by J. M. Walker. Totowa, NJ: Humana Press.

Gottschling, M., Köhler, A., Stockfleth, E. & Nindl, I. (2007a). Phylogenetic analysis of beta-papillomaviruses as inferred from nucleotide and amino acid sequence data. Mol Phylogenet Evol 42, 213–222.[CrossRef][Medline]

Gottschling, M., Stamatakis, A., Nindl, I., Stockfleth, E., Alonso, Á. & Bravo, I. G. (2007b). Multiple evolutionary mechanisms drive papillomavirus diversification. Mol Biol Evol 24, 1242–1258.[Abstract/Free Full Text]

Gottschling, M., Wibbelt, G., Wittstatt, U., Stockfleth, E. & Nindl, I. (2008). Novel papillomavirus isolates from Erinaceus europaeus (Erinaceidae, Insectivora) and the Cervidae (Artiodactyla), Cervus timorensis and Pudu puda, and phylogenetic analysis of partial sequence data. Virus Genes 36, 281–287.[CrossRef][Medline]

Hall, T. A. (1999). BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41, 95–98.

Hulo, N., Bairoch, A., Bulliard, V., Cerutti, L., Cuche, B. A., de Castro, E., Lachaize, C., Langendijk-Genevaux, P. S. & Sigrist, C. J. A. (2008). The 20 years of PROSITE. Nucleic Acids Res 36, D245–D249.[Abstract/Free Full Text]

Jackson, A. (2005). The effect of paralogous lineages on the application of reconciliation analysis by cophylogeny mapping. Syst Biol 54, 127–145.[Abstract/Free Full Text]

Katoh, K., Kuma, K., Toh, H. & Miyata, T. (2005). MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33, 511–518.[Abstract/Free Full Text]

Matys, V., Fricke, E., Geffers, R., Gößling, E., Haubrock, M., Hehl, R., Hornischer, K., Karas, D., Kel, A. E. & other authors (2003). TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 31, 374–378.[Abstract/Free Full Text]

Moreno-Lopéz, J., Pettersson, U., Dinter, Z. & Philipson, L. (1981). Characterization of a papilloma virus from the European elk (EEPV). Virology 112, 589–595.[CrossRef][Medline]

Murphy, W. J., Eizirik, E., O'Brien, S. J., Madsen, O., Scally, M., Douady, C. J., Teeling, E., Ryder, O. A., Stanhope, M. J. & other authors (2001). Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294, 2348–2351.[Abstract/Free Full Text]

Myers, G., Lu, H., Calef, C. & Leitner, T. (1996). Heterogeneity of papillomaviruses. Semin Cancer Biol 7, 349–358.[CrossRef][Medline]

Nakai, K. & Horton, P. (1999). PSORT: a program for detecting the sorting signals of proteins and predicting their subcellular localization. Trends Biochem Sci 24, 34–35.[CrossRef][Medline]

Narechania, A., Chen, Z., DeSalle, R. & Burk, R. D. (2005). Phylogenetic incongruence among oncogenic genital alpha human papillomaviruses. J Virol 79, 15503–15510.[Abstract/Free Full Text]

Page, R. D. M., Clayton, D. H. & Paterson, A. M. (1996). Lice and cospeciation: a response to Barker. Int J Parasitol 26, 213–218.[CrossRef][Medline]

Puigbó, P., Bravo, I. G. & Garcia-Vallvé, S. (2008a). CAIcal: a combined set of tools to assess codon usage adaptation. Biol Direct 3, 38[CrossRef][Medline]

Puigbó, P., Bravo, I. G. & García-Vallvé, S. (2008b). E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI). BMC Bioinformatics 9, 65[CrossRef][Medline]

Rector, A., Van Doorslaer, K., Bertelsen, M., Barker, I. K., Olberg, R. A., Lemey, P., Sundberg, J. P. & Van Ranst, M. (2005). Isolation and cloning of the raccoon (Procyon lotor) papillomavirus type 1 by using degenerate papillomavirus-specific primers. J Gen Virol 86, 2029–2033.[Abstract/Free Full Text]

Rector, A., Lemey, P., Tachezy, R., Mostmans, S., Ghim, S.-J., Van Doorslaer, K., Roelke, M., Bush, M., Montali, R. J. & other authors (2007). Ancient papillomavirus-host co-speciation in Felidae. Genome Biol 8, R57[CrossRef][Medline]

Reyes, A., Gissi, C., Catzeflis, F., Nevo, E., Pesole, G. & Saccone, C. (2004). Congruent mammalian trees from mitochondrial and nuclear genes using Bayesian methods. Mol Biol Evol 21, 397–403.[Abstract/Free Full Text]

Springer, M. S., Stanhope, M. J., Madsen, O. & de Jong, W. W. (2004). Molecules consolidate the placental mammal tree. Trends Ecol Evol 19, 430–438.[CrossRef][Medline]

Stamatakis, A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.[Abstract/Free Full Text]

Terai, M. & Burk, R. D. (2002). Felis domesticus papillomavirus, isolated from a skin lesion, is related to canine oral papillomavirus and contains a 1.3 kb non-coding region between the E2 and L2 open reading frames. J Gen Virol 83, 2303–2307.[Abstract/Free Full Text]

Van Ranst, M., Kaplan, J. B., Sundberg, J. P. & Burk, R. D. (1995). Molecular evolution of papillomaviruses. In Molecular Basis of Virus Evolution, pp. 455–476. Edited by A. Gibbs, C. H. Calisher & F. García-Arenal. Cambridge: Cambridge University Press.

Varsani, A., van der Walt, E., Heath, L., Rybicki, E. P., Williamson, A. L. & Martin, D. P. (2006). Evidence of ancient papillomavirus recombination. J Gen Virol 87, 2527–2531.[Abstract/Free Full Text]

Whelan, S. & Goldman, N. (2001). A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18, 691–699.[Abstract/Free Full Text]

Zhao, K. N., Liu, W. J. & Frazer, I. H. (2003). Codon usage bias and A+T content variation in human papillomavirus genomes. Virus Res 98, 95–104.[CrossRef][Medline]

Received 17 October 2008; accepted 27 November 2008.