Abstract
We present the complete genomic sequence of Mycoplasma fermentans, an organism suggested to be associated with the pathogenesis of rheumatoid arthritis in humans. The genome is composed of 977 524 bp and has a mean G+C content of 26.95 mol%. There are 835 predicted protein-coding sequences and a mean coding density of 87.6 %. Functions have been assigned to 58.8 % of the predicted protein-coding sequences, while 18.4 % of the proteins are conserved hypothetical proteins and 22.8 % are hypothetical proteins. In addition, there are two complete rRNA operons and 36 tRNA coding sequences. The largest gene families are the ABC transporter family (42 members), and the functionally heterogeneous group of lipoproteins (28 members), which encode the characteristic prokaryotic cysteine ‘lipobox’. Protein secretion occurs through a pathway consisting of SecA, SecD, SecE, SecG, SecY and YidC. Some highly conserved eubacterial proteins, such as GroEL and GroES, are notably absent. The genes encoding DnaK-DnaJ-GrpE and Tig, forming the putative complex of chaperones, are intact, providing the only known control over protein folding. Eighteen nucleases and 17 proteases and peptidases were detected as well as three genes for the thioredoxin-thioreductase system. Overall, this study presents insights into the physiology of M. fermentans, and provides several examples of the genetic basis of systems that might function as virulence factors in this organism.
- ADI, arginine deiminase
- CDS, coding DNA sequence
- COG, Clusters of Orthologous Groups
- IS, insertion sequence
The GenBank/EMBL/DDBJ accession number for the genome sequence of Mycoplasma fermentans is CP001995.
Edited by: C. Citti
INTRODUCTION
The mycoplasmas (class Mollicutes) form a large group of prokaryotic micro-organisms that are divided into nine genera with over 200 species. They are distinguished from ordinary bacteria by their small size, minute genome (0.58–2.20 Mb) and the total lack of a cell wall (Rottem, 2003). Phylogenetically, the mycoplasmas are related to Gram-positive bacteria, from which they developed by genome reduction (Maniloff, 2002). The smallest genome of an organism capable of growing in a microbial culture is that of Mycoplasma genitalium [0.58 Mb (Fraser et al., 1995)]. Comparative genomic studies suggest that the minute genome of M. genitalium still carries almost double the number of genes included in the theoretical minimal gene set essential for cellular function (Glass et al., 2006; Koonin, 2000).
Owing to their limited biosynthetic capabilities, most mycoplasmas are parasites, exhibiting strict host and tissue specificities (Razin et al., 1998). The mycoplasmas enter an appropriate host in which they multiply and survive for long periods of time. These micro-organisms have evolved the molecular mechanisms needed to deal with the host immune response and the transfer to and colonization of a new host. These mechanisms include mimicry of host antigens, survival within phagocytic and non-phagocytic cells, and generation of phenotypic plasticity. Mycoplasmas have long resisted detailed genetic analyses because of their complex nutritional requirements, poor growth yields and a paucity of useful genetic tools (Rottem & Naot, 1998).
Mycoplasma fermentans, isolated decades ago from the urogenital tract (Ruiter & Wentholt, 1952), has been implicated in several disease conditions. Interest in this organism has recently increased because of its possible role in the pathogenesis of rheumatoid arthritis (Horowitz et al., 2000; Kawahito et al., 2008; Schaeverbeke et al., 1996; Williams et al., 1970). Over the last decade, intensive studies have been carried out in order to understand the strategy employed by M. fermentans to interact with host cells and to avoid or subvert host protective measures. The identification of mycoplasmal membrane components that participate in the adhesion of the parasite and the finding that some mycoplasmas can reside intracellularly (Rottem, 2003) open up new horizons in the study of the role of mycoplasma and host surface molecules in mycoplasma–host cell interactions. Intracellular organisms are resistant to host defence mechanisms as well as to antibiotic treatment, which may account for the difficulty in eradicating mycoplasmas from cell cultures (Yavlovich et al., 2004). The fusion of M. fermentans with eukaryotic host cells raises exciting questions on how microinjection of mycoplasmal components into eukaryotic cells affects host cells (Franzoso et al., 1992; Rottem, 2003). The fusion process as well as the invasion of host cells by M. fermentans brings up an emerging theme in mycoplasma research, the subversion by M. fermentans of host cell functions mainly in signal-transduction pathways and cytoskeletal organization (Yavlovich et al., 2004).
The discovery of genetic systems that enable mycoplasmas including M. fermentans to rapidly change their antigenic characteristics has been one of the major developments in recent mycoplasma research (Wise, 1993; Wise et al., 1993). It is now clear that these minute, wall-less micro-organisms possess an impressive ability to maintain a surface architecture that is antigenically and functionally versatile (Razin et al., 1998). These variable surface antigens undoubtedly contribute to the implication of M. fermentans in rheumatoid arthritis, which is usually chronic in nature (Haier et al., 1999; Horowitz et al., 2000; Johnson et al., 2000). In the present study, we describe the complete genome sequence composition of M. fermentans JER, closely related to M. fermentans GII (ATCC 15474).
METHODS
Organisms and growth conditions.
M. fermentans JER [from our culture collection, derived by in vitro passages from a strain (ATCC 15474) isolated from a human patient] was used throughout the study. The organisms were grown in Hayflick's medium (Hayflick, 1965) supplemented with 5 % (v/v) horse serum. The cultures were grown for 24–48 h at 37 °C. Growth was monitored by measuring the OD640 of the culture and by recording pH changes. The organisms were collected at the mid-exponential phase of growth by centrifugation at 12 000 g for 20 min, washed twice, and resuspended in cold 10 mM Tris/HCl, adjusted to pH 7.5, and 10 mM EDTA in 250 mM NaCl. Total protein was determined by the Bradford assay (Bradford, 1976) and the protein concentration was adjusted to 0.5–1 mg ml−1.
Genome sequencing, assembly and gap closure.
A Sanger/pyrosequencing hybrid approach was used for whole-genome sequencing of M. fermentans JER. Total genomic DNA of M. fermentans JER was extracted by using the Bio-Rad AquaPure Genomic DNA Isolation kit (Bio-Rad) and sheared mechanically (Nebulizer, Invitrogen). For the Sanger sequencing approach, a shotgun library was constructed by using 1.5–4 kb size fractions. The fragments were cloned into vectors pTZ19R (cmR) (Amersham) and pCR4-TOPO (Invitrogen) or pCR2.1-TOPO (TOPO TA Cloning kit for Sequencing, Invitrogen). Plasmid DNAs were isolated using BioRobot 8000 instruments (Qiagen). Insert ends of the recombinant plasmids were sequenced with MegaBACE 1000 and 4000 (GE Healthcare) and ABI 377 or ABI Prism 3730XL (Applied Biosystems) automated DNA sequencers. Sequences were processed with Phred, and 16 779 sequences were assembled into contigs by using the Phrap assembly tool. A second portion of total genomic DNA of M. fermentans JER was pyrosequenced by using a fraction of nebulized chromosomal DNA, and Roche FLX sequencing was done according to the manufacturer's protocols (Roche Applied Science). The 454 reads were assembled into 41 large contigs (contigs larger than 500 bp) with a mean length of 22 475 bp and a maximum length of 214 917 bp by using the Newbler Assembler software (454 Life Sciences, Roche Applied Science). Editing of shotgun sequences and 454 sequences was done by using GAP4 as part of the Staden software package (Staden et al., 2000). To resolve misassembled regions caused by repetitive sequences and to close remaining sequence gaps, PCR, combinatorial multiplex PCR and primer walking with recombinant plasmids were used. PCR was carried out with the BioXact kit (Qiagen) as described by the manufacturer, with product-dependent variations of the cycling program and the amount of enzyme.
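The contig summary reported above (number of large contigs, mean length and maximum length) can be recomputed from a list of contig lengths. The following is a minimal sketch under stated assumptions: the function name and the example lengths are hypothetical, and only the 500 bp cut-off is taken from the text. It is not the Newbler output format.

```python
from statistics import mean


def summarize_large_contigs(lengths, min_length=500):
    """Summarize contigs strictly longer than min_length bp."""
    large = [n for n in lengths if n > min_length]
    if not large:
        return {"count": 0, "mean_length": 0.0, "max_length": 0}
    return {
        "count": len(large),
        "mean_length": mean(large),
        "max_length": max(large),
    }


# Hypothetical lengths for illustration only; not the actual assembly.
example_lengths = [120, 499, 501, 22_000, 214_917]
summary = summarize_large_contigs(example_lengths)
```

Contigs at or below the cut-off are ignored, mirroring the definition of "large contigs" used for the 454 assembly.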
Gene prediction and annotation.
Coding DNA sequences (CDSs) and ORFs were predicted with yacop (Tech & Merkl, 2003) using the ORF finders Glimmer, Critica and Z-curve therein. The output was verified and edited manually by using criteria such as the presence of a ribosome-binding site, GC frame plot analysis and similarity to known protein-encoding sequences. Annotation was done with a two-step approach. Initially, all proteins were screened against Swiss-Prot data and publicly available protein sequences from other completed genomes. All predictions were verified or modified manually by using the ergo software package (Overbeek et al., 2003) licensed by Integrated Genomics, comparing the protein sequences with the Pfam, GenBank, ProDom, Clusters of Orthologous Groups (COG) and PROSITE public databases. TMpred was used to predict transmembrane helices within the CDSs.
Bioinformatics.
Functional annotations were compared utilizing the Integrated Microbial Genomes (IMG) database of the Joint Genome Institute, US Department of Energy. Multiple sequence alignment was carried out using the clustal w2 program. Amino acid shading was performed using BoxShade 3.
Complete genome comparisons were done with act (Carver et al., 2008), based on replicon-specific nucleotide blast (Altschul et al., 1990, 1997), and with protein-based BiBlast comparisons (A. Wollherr, personal communication) with all known sequenced Mycoplasma genomes.
RESULTS AND DISCUSSION
General genome features
The genome of M. fermentans JER was sequenced and annotated as described above and deposited in GenBank. A comparison of the chromosomal map of M. fermentans JER with those of M. fermentans PG18 (GenBank accession no. AP009608) and the closely related Mycoplasma agalactiae strains PG2 and 5632 (CU179680 and FP671138, respectively) is presented in Fig. 1. The chromosome of M. fermentans JER is a single circular DNA of 977 524 bp with a mean G+C content of 26.95 mol% and 835 predicted CDSs. No plasmids were found, and the genome does not harbour genes of a cryptic prophage (Röske et al., 2004). The JER genome is at least 26.5 kb smaller than the partially sequenced genome of M. fermentans PG18. Pulsed-field analyses of clinical isolates of M. fermentans have revealed differences in the overall genome size (Schaeverbeke et al., 1998). Possible explanations for these differences include the presence of a prophage (Röske et al., 2004), integrative conjugal elements [ICEF (Calcutt et al., 2002)] or different numbers of copies of the insertion sequences (ISs). Table 1 shows that out of the 835 CDSs of M. fermentans JER, 633 have high similarity with those of M. fermentans PG18 and about 30 have high similarity with those of M. agalactiae strains.
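The mean G+C content quoted above is the proportion of G and C bases among all bases of the chromosome. A minimal sketch of that calculation follows, assuming the sequence is available as a plain string. The function name and the toy sequence are hypothetical, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_content_percent(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy A+T-rich sequence for illustration only; not genome data.
toy_sequence = "ATGCATATTTAA"
toy_gc = gc_content_percent(toy_sequence)
```

Applied to a sliding window instead of the whole chromosome, the same function would give the kind of local G+C profile drawn in the innermost circle of a genome map.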
Comparison of the chromosomal map of M. fermentans JER with those of M. fermentans PG18 and two M. agalactiae strains. The two outermost circles (blue) are positive and negative strand protein-coding sequences of M. fermentans JER (835 CDSs). The following three circles (from the outside inwards) are the results of a two-way genome comparison between M. fermentans JER and M. fermentans PG18, M. agalactiae 5632 and M. agalactiae PG2 performed as described in Methods. Genes shared between the pair under comparison are indicated in grey (core) and variable genome regions are indicated in red (pan). The innermost circle represents the percentage G+C content. A high G+C content is indicated in green and a low G+C content in purple.
Comparison of the CDSs of M. fermentans JER with those of M. fermentans PG18 and the closely related M. agalactiae
Pan, number of CDSs with no similarity; Core, number of CDSs with similarity.
The features of the M. fermentans JER genome and those of seven other Mycoplasma species of the hominis phylogenetic group sequenced so far are presented in Table 2. The functional classification of the proteins of these species according to the COG database (Tatusov et al., 1997) is presented in Table 3. The general breakdown of the M. fermentans JER proteins into functional categories is similar to that of other members of the hominis group, with an emphasis on transport of compounds and protein synthesis (Chambaud et al., 2001; Dybvig et al., 2008; Jaffe et al., 2004; Minion et al., 2004; Pereyre et al., 2009; Sirand-Pugnet et al., 2007; Vasconcelos et al., 2005). The main difference between M. fermentans JER and the other members of the hominis group examined was the higher number of proteins involved in replication, recombination and repair.
General genome features of Mycoplasma species of the hominis phylogenetic group
nr, Not reported.
Functional classification of protein CDSs of Mycoplasma species of the hominis phylogenetic group according to COG (Tatusov et al., 1997)
nd, Not detected.
As shown elsewhere, not all M. fermentans strains have been found to carry ICEF elements (Calcutt et al., 2002). The genomic location of ICEF insertion sites in M. fermentans PG18 and the possible ICEF excision sites in the genome of M. fermentans JER are shown in Fig. 2. Of the 22 ORFs present in the ICEF of strain PG18 (Calcutt et al., 2002), only ORF9 has close homologues in strain JER (MFE_04960, which shows 95 % identity and 97 % similarity, and MFE_02760, which shows 94 % identity and 97 % similarity). In addition, several distant homologues of ORF9 were identified (MFE_08000, which shows 64 % identity and 70 % similarity, and MFE_04903 and MFE_02860, which both show 25 % identity and 49 % similarity).
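Percentage identity and similarity values such as those quoted above are derived from pairwise alignments. The sketch below shows one way such figures can be computed from two pre-aligned sequences. It is an illustration only: the residue groups used to call two amino acids "similar" are an assumption made here, whereas blast derives its similarity counts from a substitution matrix, and the function name is hypothetical.

```python
# Assumed grouping of chemically similar residues, for illustration only.
SIMILAR_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over the aligned columns.

    Both inputs must have the same length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = 0
    similar = 0
    for a, b in columns:
        if a == "-" or b == "-":
            continue  # a gap counts towards length but is neither identical nor similar
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n
```

For example, aligning `MKLV` with `MKIV` gives 75 % identity and 100 % similarity under these assumed groups, because L and I fall in the same group.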
Genomic location of ICEF insertion sites in M. fermentans PG18 and the possible ICEF excision sites in the genome of M. fermentans JER. The locations of the four ICEF insertion sites (black arrows) in M. fermentans PG18 are shown, together with the aligned genomic regions of strain JER. CDSs and ISs flanking each of the four ICEF units are indicated by open arrows and open rectangles, respectively. The ORF numbers of M. fermentans PG18 and their designation (Calcutt et al., 2002), and the gene ID (MFE_) of M. fermentans JER are marked within the arrows and rectangles. A truncated transposase CDS in strain PG18 is labelled tnp. Genomic regions equivalent to insertion sites of ICEF-IIA and ICEF-IIB that are not present in strain JER are indicated by double-headed arrows.
Twenty-seven full or truncated ISs, comprising about 1.7 % of the genome, were found in M. fermentans JER. ISs are small transposable DNA fragments, which may provide the structural basis necessary to enable the rearrangement of genomic fragments, incorporating foreign DNA and mediating homologous recombination between multiple copies present in a given genome (Mahillon et al., 1999). Their expansion, genome location and composition may differ among related bacteria, representing an important source of genomic diversity.
The IS elements detected belong to the IS30, IS3 and mutator gene families, some of which have been identified in M. fermentans (Calcutt et al., 1999; Hu et al., 1990). For example, nine copies of IS1630 are present in the genome of M. fermentans strain JER. However, only four IS1630 isoforms (MFE_06860, MFE_07450, MFE_07770 and MFE_07880) translate to the full copy, a basic protein of either 366 or 387 aa (depending on the start codon utilized) with 27 bp inverted repeats (IRs). Similarly to IS1630 of M. fermentans PG18, four copies of M. fermentans JER IS1630 elements generated upon insertion long direct repeats (DRs), which varied between 19 and 34 bp and are derived from rho-independent transcription terminators of neighbouring genes. Interestingly, sequence analysis of the flanking regions of these ISs revealed that only one of the four IS1630 elements was inserted in the same place (downstream of p63 and upstream of malp), as shown for IS1630A of M. fermentans PG18 (Calcutt et al., 1999). The insertion of the other three IS1630 elements at different locations suggests that IS1630 has the ability to transpose. In addition, one copy of an N-truncated IS30-related element showing 43 % identity and 68 % similarity to ISMbov1 of Mycoplasma bovis (Lysnyansky et al., 2009; Thomas et al., 2005), and 42 % identity and 63 % similarity to the MAGa5890 transposase of M. agalactiae 5632 (Nouvel et al., 2010), was identified in M. fermentans JER.
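Terminal inverted repeats such as the 27 bp IRs of IS1630 can be recognized because one end of the element reads as the reverse complement of the other. A minimal sketch follows, assuming perfect matches only (real IRs are often imperfect). The function names and the toy element are hypothetical and do not correspond to any IS sequence from this study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element, max_length=50):
    """Length of the perfect inverted repeat shared by the two ends of element."""
    seq = element.upper()
    rc = reverse_complement(seq)
    limit = min(max_length, len(seq) // 2)
    length = 0
    # Position i of the left end pairs with position i of the reverse complement.
    while length < limit and seq[length] == rc[length]:
        length += 1
    return length


# Toy element: 5 bp left end, a spacer, then the reverse complement of the left end.
toy_element = "GGCAT" + "CCCCCC" + reverse_complement("GGCAT")
toy_ir_length = terminal_inverted_repeat_length(toy_element)
```

Allowing a small number of mismatches would be a natural extension when screening real elements.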
The family of IS3-related elements consists of six truncated or disrupted ISMi1 elements, as described by Hu et al. (1990). In addition, MFE_01730 (80 aa), demonstrating 100 % identity to transposase (tnp), which has been identified downstream of ICEF-IIA in M. fermentans PG18 (Calcutt et al., 2002), was found in M. fermentans JER. This IS element showed 73 % identity and 86 % similarity to a pseudogene of the IS1138 transposase of Mycoplasma hominis (Pereyre et al., 2009).
Finally, screening of the M. fermentans JER genome in silico revealed the presence of 10 CDSs, most of which were truncated or disrupted, belonging to the family of mutator-like transposable elements (MULEs). One such element has been identified in close proximity to ICEF-IIA and ICEF-IIB in M. fermentans PG18 and designated ISMf1 (Calcutt et al., 2002). MULEs have also been identified in the genomes of mycoplasmas such as Mycoplasma gallisepticum (Papazisi et al., 2003), Mycoplasma hyopneumoniae J (Vasconcelos et al., 2005) and Mycoplasma penetrans HF-2 (Sasaki et al., 2002), and also in Ureaplasma parvum (Glass et al., 2000).
The density of ISs in M. fermentans JER is much lower than in the phylogenetically related M. bovis PG45 (Lysnyansky et al., 2009), but is higher than in M. agalactiae strain PG2 (Sirand-Pugnet et al., 2007) and Mycoplasma synoviae strain 53 (Vasconcelos et al., 2005). Most of the IS elements identified in M. fermentans JER are truncated, suggesting their inability to transpose within the genome. However, high sequence similarity among IS isoforms (full and truncated) may allow them to be involved passively in homologous DNA recombination. Analysis of the impact of IS insertions revealed that most elements are inserted within intergenic regions. However, the IS30-like element (IS1630 isoform, MFE_00510) was found to be inserted into the 3′ region of MFE_00500, which encodes a hypothetical protein.
The comparative organization of the two hsd loci, encoding type I restriction-modification systems in M. fermentans JER, is presented in Fig. 3, which shows two clusters of type I restriction-modification systems containing nine CDSs. In addition, analysis of the M. fermentans genome revealed three genes that encode a site-specific recombinase (MFE_08380) and two phage family integrases [MFE_03330 and MFE_03370 (truncated)]. MFE_08380 belongs to the class of serine resolvase/invertases (Pin homologues), which may promote integration, excision and inversion of defined DNA segments through site-specific recombination.
Comparative organization of the two hsd loci encoding type I restriction-modification systems in M. fermentans JER. The gene ID (MFE_) and the designation of the CDSs are indicated below and above the arrows, respectively. Within the hsdS genes the positions of putative recombination sequence cassettes are indicated by vertical black bars.
Screening of the non-redundant protein sequences (nr) database revealed that the M. fermentans MFE_08380 protein shows 63 % identity and 77 % similarity to the DNA invertase of Clostridium perfringens CPE strain F4969 and 60 % identity and 77 % similarity to the putative resolvase of Clostridium difficile QCD-23m63. However, with the exception of the MFE_08380 homologue MBIO_0716 in M. fermentans PG18, no similarity was observed with mycoplasmal proteins deposited in GenBank. Notably, in M. fermentans JER, the CDS encoding the MFE_08380 protein is located downstream of the genes for the restriction-modification enzyme subunits (i.e. loci MFE_08330 to MFE_08370) and about 10.5 kb upstream of the CDS encoding the MFE_08440 protein that shows 50 % identity and 66 % similarity to ExiS of bacteriophage phiMFV1 (Röske et al., 2004). The location of MFE_08380 as well as MFE_08440 suggests that the former is involved in phase variation of the restriction-modification enzymes and that the latter is implicated in integration into or excision from the host chromosome of genetic elements such as bacteriophages. To the best of our knowledge, no data regarding the mycoplasmal serine site-specific recombinases and their involvement in recombination processes in mycoplasmas have yet been obtained. However, it has been shown that members of the other large family of proteins, i.e. the lambda integrases and tyrosine site-specific recombinases Xer1, MYPE2900 and HvsR, mediate site-specific rearrangements in the vpma loci of M. agalactiae, the mpl loci of M. penetrans, and in the hsd loci and the vsa locus of Mycoplasma pulmonis, respectively (Chopra-Dewasthaly et al., 2008; Horino et al., 2009; Sitaraman et al., 2002). Two tyrosine site-specific integrases, MFE_03330 and MFE_03370 (truncated), showing 28–32 % identity to the phage integrase (Int) of M. fermentans PG18 phiMFV1 bacteriophage (Röske et al., 2004), were detected. 
However, MFE_03330 and MFE_03370 do not show homology to previously described mycoplasmal tyrosine site-specific recombinases, which mediate rearrangements in loci that encode variable antigens. No gene encoding Int of phage phiMFV1 was found in M. fermentans JER.
Replication, transcription and translation
As in many other bacterial genomes, dnaA (MFE_00010) and dnaN (MFE_00020) were detected in the vicinity of the putative replication origin, assigned by GC skewing analysis (Kuwahara et al., 2004; Lobry, 1996), but the recA (MFE_05620), gyrA (MFE_00990) and gyrB (MFE_00980) genes were dispersed through the chromosome.
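GC skew analysis of the kind cited above compares the numbers of G and C bases along one strand. In many bacterial chromosomes the cumulative skew reaches its minimum near the replication origin. The sketch below illustrates the idea under that assumption. The window size, function names and toy sequence are hypothetical, and this is a simplification rather than the procedure used in this study.

```python
def gc_skew_windows(sequence, window=1000):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = sequence.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative_skew_minimum(sequence, window=1000):
    """Return the coordinate at which the cumulative GC skew is lowest.

    The coordinate is the end of the window where the minimum is reached.
    """
    total = 0.0
    best_total = float("inf")
    best_position = 0
    for index, skew in enumerate(gc_skew_windows(sequence, window)):
        total += skew
        if total < best_total:
            best_total = total
            best_position = (index + 1) * window
    return min(best_position, len(sequence))


# Toy sequence: a C-rich half followed by a G-rich half (illustration only).
toy_chromosome = "AC" * 10 + "AG" * 10
toy_origin = cumulative_skew_minimum(toy_chromosome, window=10)
```

In the toy sequence the skew switches sign at position 20, which is where the cumulative minimum is reported.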
Little is known about the regulation of gene expression in mycoplasmas. We detected 16 putative transcriptional regulators, among them a single sigma factor as well as the transcriptional repressor HrcA and also the cis-acting conserved palindromic repeated sequence (CIRCE) found upstream of the heat-shock genes (Chang et al., 2008). In M. fermentans JER, as in all other Mycoplasma species sequenced so far, UGA is read as tryptophan. The chromosome of M. fermentans JER contains 835 CDSs, with a mean size of 1026 bp, covering 87.6 % of the whole chromosome sequence. The components of a bacterial translational system are conserved in M. fermentans JER, and 51 CDSs encoding ribosomal proteins, two complete rRNA operons located on the leading strand and transcribed in the same direction, 36 tRNAs and 20 CDSs encoding tRNA synthetases were detected. CDSs encoding a single protein initiation factor, three elongation factors and a single peptide release factor (RF-1) were also found. However, as in other mycoplasmas, the peptide chain release factor 2 (RF-2) was not detected.
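Two points in this paragraph lend themselves to a short worked example: the reading of UGA as tryptophan, and the coding density implied by the CDS count and mean CDS size. The sketch below builds the standard codon table, derives a mycoplasma variant by reassigning TGA to W, and recomputes the coding density from the figures given in the text. The function names are hypothetical, and the density estimate assumes that CDSs do not overlap.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = ["".join(codon) for codon in product(BASES, repeat=3)]
STANDARD_CODE = dict(zip(CODONS, STANDARD_AAS))
# Mycoplasma variant: TGA (UGA) encodes tryptophan instead of stop.
MYCOPLASMA_CODE = dict(STANDARD_CODE, TGA="W")


def translate(dna, code=MYCOPLASMA_CODE):
    """Translate from the first base up to, but not including, the first stop."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code.get(dna[i:i + 3], "X")  # X marks an unrecognized codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def coding_density_percent(n_cds, mean_cds_length_bp, genome_length_bp):
    """Approximate coding density, assuming non-overlapping CDSs."""
    return 100.0 * n_cds * mean_cds_length_bp / genome_length_bp


density = coding_density_percent(835, 1026, 977_524)
```

With the standard table the toy sequence `ATGTGAAAATAA` is truncated after the first codon, whereas the mycoplasma table reads through the TGA codon. The recomputed density is consistent with the 87.6 % reported above.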
Energy metabolism and the F0F1-ATPase
We found 77 CDSs associated with energy metabolism, among them 11 components of the phosphotransferase (PTS) systems, a complete glycolysis pathway and the three components of the arginine deiminase (ADI) pathway (arcA, arcB and arcC; see Table 4). Arginine is utilized in the arginine-utilizing Mycoplasma species through the ADI pathway. This pathway is an example of the metabolic versatility of these organisms. In M. fermentans, the arc genes encoding the ADI pathway appear to be clustered in an operon-like structure in the order arcA (arginine deiminase), arcB (ornithine carbamoyltransferase) and arcC (carbamate kinase), followed by a putative arginine transporter (MFE_04130), supporting the notion that although M. fermentans is a glycolytic organism, it is also an arginine utilizer (Olson et al., 1993). Table 4 also shows the degree of identity of the ADI pathway CDSs of M. fermentans with those of the other Mycoplasma species fully sequenced so far. The three arc genes encoding the pathway are present in five of the 15 genomes of the Mycoplasma species. M. hominis and Mycoplasma arthritidis utilize arginine via the ADI system as their major energy source (Dybvig et al., 2008; Pereyre et al., 2009), whereas M. fermentans, M. penetrans and Mycoplasma capricolum are fermentative species that are capable of growing on arginine in the absence of carbohydrate (Sasaki et al., 2002). M. pneumoniae also possesses all the genes for the ADI pathway enzymes, but is incapable of growing on arginine (Himmelreich et al., 1996), probably because only arcC is present in its full length in this organism, whereas arcA is frameshifted and arcB is truncated (Himmelreich et al., 1996).
Comparison of the ADI operon of M. fermentans JER with the ADI protein CDSs of other Mycoplasma species
The degree of identity of the ADI pathway CDSs of M. fermentans JER with those of Mycoplasma species fully sequenced to date was analysed by protein blast. GenBank accession numbers of the analysed genomes are shown in parentheses. nd, Not detected. No arc genes were detected in: M. agalactiae (accession no. CU179680), M. pulmonis (AL445566), M. hyopneumoniae (AE017332), M. synoviae (AE017245), Mycoplasma conjunctivae (FM864216) and M. genitalium (L43967).
In bacteria, the F0F1-type ATPase serves two purposes. The major purpose is to catalyse the synthesis of ATP in response to an electrochemical proton gradient, and the second is to hydrolyse ATP for the generation of a transmembrane proton gradient (Futai et al., 1989; Groth, 2000; Holland & Blight, 1999; Junge et al., 2001). Mycoplasmas lack a cytochrome-containing electron transport chain and their F0F1-ATPase function seems to be restricted to maintaining a proton gradient (Futai et al., 1989). Lacking a rigid cell wall, mycoplasmas depend on mechanisms that regulate the osmotic balance between the intracellular space and the external environment. This is accomplished mostly by the active extrusion of one or more ions in excess of their Donnan distribution (Donnan, 1911). It has been previously suggested that for regulation of cell volume and prevention of lysis, mycoplasmas have an active ion extrusion mechanism, which may consist of the F0F1-type ATPase and a Na+/H+ antiporter that actively drives Na+ across the cell membrane against its electrochemical potential (Shirvan & Rottem, 1993). Table 5 shows that the F0F1-type ATPase operon of M. fermentans JER contains a cluster of eight genes encoding the structural subunits of the ATPase. The structural genes are preceded by a gene (MFE_02110) putatively encoding the regulatory subunit I that may be analogous to the regulator uncI gene of the Escherichia coli ATP synthase operon (Brusilow et al., 1983; Hansen et al., 1981). The atp gene order on the chromosome is as listed in Table 5. The α and β subunits of the M. fermentans JER F0F1-type ATPase showed a high degree of identity with the corresponding subunits of 16 mycoplasma genomes sequenced to date (54–79 % and 71–85 %, respectively), suggesting that these genes are highly conserved in mycoplasmas. The a, b, c, γ, δ and ϵ subunits showed a lower degree of identity with the corresponding mycoplasmal ATPase genes. In the previously investigated M. gallisepticum and M. pneumoniae b subunits, two hydrophobic helical stretches were detected at the N terminus (Pyrowolakis et al., 1998; Rasmussen et al., 1992). One of them is a predicted signal peptide with a cleavage site analogous to the consensus sequence of a prolipoprotein (Pyrowolakis et al., 1998). This observation led to the suggestion that the lipoprotein nature of the b subunit and its proposed membrane topology are characteristic of mycoplasmas (Pyrowolakis et al., 1998). Our DNA sequence-based analysis indicated that the b subunit of M. fermentans is not a lipoprotein but an integral membrane protein that traverses the membrane through a single hydrophobic helical stretch at the N terminus (Fig. 4). Comparison of the 16 mycoplasma genomes harbouring the F0F1-type ATPase sequenced to date revealed that only in the pneumoniae group [M. gallisepticum (Papazisi et al., 2003), M. pneumoniae (Himmelreich et al., 1996), M. genitalium (Fraser et al., 1995) and U. parvum (accession nos NC_012503 and NC_002162; previously regarded by Glass et al. (2000) as U. urealyticum)] can the lipoprotein nature of the b subunit be predicted.
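Hydrophobic helical stretches such as the one described for the b subunit are commonly located with a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6. This is a swapped-in illustrative technique, not the TMpred algorithm used in this study, and the function names and toy sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the protein."""
    # Non-standard residues are scored as 0.0 (neutral), an assumption.
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_hydrophobic_stretch(protein, window=19, threshold=1.6):
    """True if any window reaches the assumed hydropathy threshold."""
    return any(value >= threshold for value in hydropathy_profile(protein, window))


# Toy sequences for illustration only; not proteins from this study.
toy_membrane_like = "MKK" + "LIVALLGAVFLLIAVLGLI" + "KKDE"
toy_soluble_like = "MKKDEEKRQNSTDEKRKDEEKRQNS"
```

A protein shorter than the window yields an empty profile and is therefore reported as having no hydrophobic stretch.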
Multiple sequence alignment of the atpF gene products (F0F1-ATPase b subunit) of representatives of the mycoplasma groups analysed. The hominis group (M. fer, M. fermentans; M. aga, M. agalactiae; M. hom, M. hominis), the pneumoniae group (M. gen, M. genitalium; M. pne, M. pneumoniae; M. gal, M. gallisepticum) and the spiroplasma group (M. myc, M. mycoides subsp. mycoides SC; M. cap, M. capricolum) are shown. Predicted lipoprotein signal sequences are boxed. A black background represents identical amino acids, while a grey background represents highly similar amino acids. NCBI GenBank accession numbers of the genomes are presented in Table 5. cons, Consensus sequence.
Comparison of the F0F1-ATPase operon of M. fermentans JER with those of representative Mycoplasma species of the hominis, pneumoniae and spiroplasma phylogenetic groups
Characterization of the M. fermentans F0F1-ATPase subunits encoded by the atp operon genes and their degree of identity, analysed by blast, with the homologous subunits of representatives of mycoplasma groups. The hominis group (M. aga, M. agalactiae; M. hom, M. hominis), the pneumoniae group (M. gen, M. genitalium; M. pne, M. pneumoniae; M. gal, M. gallisepticum) and the spiroplasma group (M. myc, M. mycoides subsp. mycoides SC; M. cap, M. capricolum) are shown.
Membrane-associated proteins
The 42-member ABC transporter family and the group of 28 lipoproteins are the largest gene groups detected in the genome of M. fermentans. Assuming that each ABC transporter contains three to four proteins (Holland & Blight, 1999), one would expect that M. fermentans JER contains 10–13 transporters. The lipoproteins which serve in mycoplasmas as a major mechanism used to evade host defence systems (Chambaud et al., 1999) are anchored to the plasma membrane through their N-terminal lipid moiety (Hantke & Braun, 1973). Lipidation is directed by the presence of a cysteine-containing ‘lipobox’ within the lipoprotein signal peptide sequence. The properties of lipoprotein signal peptides have been described based on the analysis of the signal peptide features of these lipoproteins, yielding detailed and specific patterns that may be used for the identification of microbial lipoproteins (Klein et al., 1988). The usefulness of this pattern to identify probable lipoprotein sequences was demonstrated in our study by analysing the genome of M. fermentans in comparison with sequences identified in other mycoplasmas. We identified 28 CDSs of putative lpp genes on the basis of the presence of possible lipobox sequences at the N termini of translated protein sequences. However, it remains possible that some of these putative lpps are ‘false-positives’, misidentified due to the coincidental presence of a cysteine within the signal sequences of exported proteins or proteins targeted for insertion into plasma membranes.
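A pattern-based lipobox screen of the kind described above can be sketched with a regular expression applied to the N-terminal region of each translated CDS. The consensus used here, [LVI][ASTVI][GAS]C within the first 40 residues, is a simplified assumption chosen for illustration and is not the pattern of Klein et al. (1988) or the criteria applied in this study. The function name and toy sequences are hypothetical.

```python
import re

# Simplified lipobox consensus ending in the lipid-modified cysteine (assumption).
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox(protein, search_window=40):
    """Return the 0-based index of the lipobox cysteine, or None if absent.

    Only the first search_window residues are examined, since the lipobox
    lies within the N-terminal signal peptide.
    """
    match = LIPOBOX.search(protein.upper()[:search_window])
    return match.end() - 1 if match else None


# Toy sequences for illustration only; not CDSs from this study.
toy_lipoprotein = "MKKLLSILGSLALVSTAVVACGNKT"
toy_cytoplasmic = "MKKTEEDDRRKK"
```

As noted in the text, a match on its own does not prove lipidation: any exported or membrane protein with a suitably placed cysteine would also be reported by such a screen.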
We were unable to detect local homopolymeric tract sequences upstream of the lipoprotein CDSs that might generate phase-variable frameshift mutations, as has been seen in lipoproteins of some other mycoplasmas (Chambaud et al., 2001).
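A search for homopolymeric tracts in the region upstream of a CDS amounts to looking for runs of a single repeated base. A minimal sketch follows. The minimum run length of 8 is an arbitrary assumption made for illustration, and the function name and toy sequence are hypothetical.

```python
import re


def homopolymeric_tracts(dna, min_length=8):
    """Return (start, base, length) for each single-base run of at least min_length."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), match.end() - match.start())
        for match in pattern.finditer(dna.upper())
    ]


# Toy upstream region for illustration only.
toy_upstream = "ACGT" + "A" * 10 + "CG" + "TTT"
toy_tracts = homopolymeric_tracts(toy_upstream)
```

An empty result for every upstream region would correspond to the negative finding reported above.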
The number of acyl chains attached to the N-terminal cysteine of mycoplasmal lipoproteins remains unclear. Three acyl chains have been found in most bacterial lipoproteins investigated so far, two attached by an ester linkage to the glycerol moiety and one by an amide linkage to the free NH2 moiety. The amide linkage is formed by an N-acyltransferase (Hantke & Braun, 1973), but the gene for this enzyme was not found in our study, supporting an earlier study that showed that lipopeptides from M. fermentans have a free N terminus (Okusawa et al., 2004). The N-acyltransferase gene has not been found in other completely sequenced genomes of Mycoplasma species such as M. genitalium (Fraser et al., 1995), M. pneumoniae (Himmelreich et al., 1996), M. pulmonis (Chambaud et al., 2001), U. parvum (Glass et al., 2000), M. gallisepticum (Papazisi et al., 2003), M. penetrans (Sasaki et al., 2002), Mycoplasma mobile (Jaffe et al., 2004), Mycoplasma mycoides subsp. mycoides SC (Westberg et al., 2004), M. synoviae (Vasconcelos et al., 2005) and M. hyopneumoniae (Minion et al., 2004), questioning the presence of an N-acyltransferase in mycoplasmas.
Searching the database of the Mycoplasma species sequenced so far has revealed that these micro-organisms harbour up to 46 lipoprotein genes, some of which occur in multigene families (Hallamaa et al., 2006). In M. pneumoniae, the multigene families described include genes that encode proteins with sequence similarity to lipoproteins but which lack the characteristic N-terminal prolipoprotein signal sequence (Hallamaa et al., 2006). The large number of mycoplasmal membrane proteins harbouring the lipobox may allow a stronger anchoring of these proteins to the mycoplasmal membrane through additional acyl chains (Pyrowolakis et al., 1998). This might be important, as mycoplasmas do not have a rigid cell wall to protect them against osmotic and mechanical stress (Razin, 1992). The functions of most of the mycoplasmal lipoproteins are currently unknown. Our annotation studies suggest that three of the lipoprotein genes (CDSs MFE_07470, MFE_07870 and MFE_01840) are associated with ABC transporters. Furthermore, earlier studies have shown that a 48 kDa lipoprotein (MFE_07870) of M. fermentans is associated with the ability of this organism to modulate the host immune system (Hall et al., 1996; Mühlradt & Schade, 1991), and a 29 kDa lipoprotein (MFE_00220) is the major adhesin of this organism, playing a role in the adhesion of the mycoplasma to eukaryotic host cells (Leigh & Wise, 2002; Theiss et al., 1996). In addition to lipoproteins, 119 membrane-spanning proteins were predicted, 12 of them possessing six transmembrane domains. Three of these proteins contained a lipobox sequence in addition to the transmembrane domain.
As shown in other mycoplasmas (Dandekar et al., 2000; Fraser et al., 1995; Himmelreich et al., 1996; Jaffe et al., 2004; Minion et al., 2004; Papazisi et al., 2003), protein secretion in M. fermentans JER occurs through a pathway consisting of SecA, SecD, SecE, SecG, SecY and YidC. The prolipoproteins detected are apparently processed by a signal peptidase II, which has been identified in M. fermentans JER. However, as in many of the mycoplasma genomes sequenced so far, signal peptidase I, which carries out signal peptide cleavage in the processing of protein precursors in the membranes of prokaryotic and eukaryotic cells, was not detected.
Membrane lipids
The phospholipid (PL) composition of mycoplasmas is rather simple, comprising a few de novo-synthesized PLs, mainly phosphatidylglycerol (PG) and cardiolipin (CL). Phosphatidylcholine (PC) and sphingomyelin (SPM) are incorporated from the growth medium (Rottem, 1980). Most of the genes associated with classical PG biosynthesis employing CDP-diacylglycerol (Raetz & Dowhan, 1990) were detected in the genome of M. fermentans JER, but the CL synthase gene (cls) was not detected. Analysis of the 16 mycoplasma sequences deposited to date in GenBank revealed that 10 species contain cls and six do not. It is intriguing to note that lipid analyses of the Mycoplasma species that possess cls have revealed higher levels of exogenously incorporated PLs with a preferential incorporation of SPM (Rottem, 1980), whereas lower amounts of exogenous PLs were detected in species not containing cls, where the ratio of SPM to PC was similar to that found in the growth medium (Rottem, 1980).
The major polar lipid in the cell membrane of M. fermentans JER has been shown to be a phosphocholine-containing glycolipid [MfGLII (Deutsch et al., 1995)], similar to the glycolipid described by Matsuda et al. (1994) in M. fermentans PG18 but much more polar, apparently due to the presence of a 2-amino-1,3-propanediol moiety and an additional phosphate residue (Zähringer et al., 1997).
A gene (MFE_01510) encoding the cholinephosphotransferase involved in the biosynthesis of MfGLII was identified in our genomic analysis as well as in the genome of M. fermentans PG18 (Ishida et al., 2009). The gene contains an ORF of 762 bp encoding 254 aa. It has 27 % amino acid identity with LicD of Haemophilus influenzae (GenBank accession no. P14184) and 26 % identity with LicD of Streptococcus pneumoniae (CAI34638). Of the 16 mollicute genomes sequenced so far and deposited in GenBank, CDSs homologous to LicD were detected in M. pulmonis (NP_325836) and M. arthritidis (ACF07060).
Virulence factors
Only a few recognizable genes likely to be involved in virulence have been detected in the mycoplasmal genomes analysed so far. The lack of a rigid cell wall allows direct and intimate contact of the mycoplasma membrane with the cytoplasmic membrane of the host cell. In M. fermentans, under appropriate conditions, such contact may lead to cell fusion (Dimitrov et al., 1993). During the fusion process, mycoplasmal components are delivered into the host cell and affect the normal functions of the cell. An array of potent hydrolytic enzymes has been identified in M. fermentans. As in other mycoplasmas, the wealth of nucleases and proteases is most remarkable. Eighteen RNases and DNases that may degrade host cell nucleic acids and 17 proteases and peptidases, mainly aminopeptidases and endopeptidases, were detected in M. fermentans JER.
Oxidative stress resistance is one of the key properties that enable pathogenic bacteria to survive the effects of reactive oxygen within the host. The bacterial factors involved in resisting oxidative stress are therefore important for bacterial colonization, survival and pathogenesis. Although the presence of classical antioxidant enzymes would be expected in mycoplasmas, important genes directly related to protection against reactive oxygen species (ROS), such as superoxide dismutases and catalases, were not identified by sequence homology in M. fermentans JER or in the other mycoplasmal genomes sequenced so far. In addition, glutathione peroxidase, found in M. penetrans (Sasaki et al., 2002), was not detected in our study. Among the few identified M. fermentans genes encoding proteins possibly involved in the suppression of ROS-mediated damage, a gene encoding a peroxiredoxin (MFE_01390) was recognized. Sequence and phylogenetic analyses indicate that M. fermentans peroxiredoxin is closely related to the atypical 2-Cys peroxiredoxin subfamily. We detected two copies of a low-molecular-mass thioredoxin (MFE_06630 and MFE_07930) as well as one copy of an NADPH thioredoxin reductase (MFE_07420), suggesting that this organism also possesses a thioredoxin reductase system. Thioredoxin reductase activity has previously been detected in M. capricolum (Ben-Menachem et al., 1997), and it was concluded that this system functions as a detoxifying system to protect mycoplasmas from reactive oxygen compounds, either by direct catalysis of the reduction of protein disulphides or by maintaining intracellular pools of low-molecular-mass thiol metabolites in their reduced form.
In summary, this paper presents the first complete genome of M. fermentans as well as the primary features of the M. fermentans strain JER genome, and provides an analysis of this genome in comparison with the completely sequenced genomes of closely related species belonging to the hominis phylogenetic group. As such, it should provide the foundation for future experiments aimed at fully understanding the physiology of this organism and the molecular mechanisms that this organism, implicated in rheumatoid arthritis (Horowitz et al., 2000; Kawahito et al., 2008; Schaeverbeke et al., 1996; Williams et al., 1970), employs in the colonization of human tissues and the development of disease.
Acknowledgments
We thank Mechthild Bömeke, Frauke-Dorothee Meyer (Göttingen) and Avigail Katzenell (Jerusalem) for excellent technical assistance. We are grateful to Shmuel Razin and Richard Herrmann for critical reading of the manuscript. Hagai Rechnitzer was supported by a FEMS research fellowship and the Göttingen group was supported by a grant from the Niedersächsisches Ministerium für Wissenschaft und Kultur.