Abstract
The aim of this study was to determine the prevalence, virulence factors (stx, eae, ehxA and astA) and phylogenetic relationships [PFGE and multilocus sequence typing (MLST)] of Shiga toxin-producing Escherichia coli (STEC) strains isolated from four previous cohort studies in 2212 Peruvian children aged <36 months. STEC prevalence was 0.4 % (14/3219) in diarrhoeal and 0.6 % (15/2695) in control samples. None of the infected children developed haemolytic uraemic syndrome (HUS) or other complications of STEC. stx1 was present in 83 % of strains, stx2 in 17 %, eae in 72 %, ehxA in 59 % and astA in 14 %. The most common serotype was O26 : H11 (14 %) and the most common seropathotype was B (45 %). The strains belonged mainly to phylogenetic group B1 (52 %). The distinct combinations of alleles across the seven MLST loci were used to define 13 sequence types among 19 STEC strains. PFGE typing of 20 STEC strains resulted in 19 pulsed-field patterns. Comparison of the patterns revealed 11 clusters (I–XI), each usually including strains belonging to different serotypes; one exception was cluster VI, which gathered exclusively seven strains of seropathotype B, clonal group enterohaemorrhagic E. coli (EHEC) 2 and phylogenetic group B1. In summary, STEC prevalence was low in Peruvian children with diarrhoea in the community setting. The strains were phylogenetically diverse and associated with mild infections. However, additional studies are needed in children with bloody diarrhoea and HUS.
A table showing additional strains used for MLST sequence comparisons is available with the online version of this paper.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Introduction
Shiga toxin-producing Escherichia coli (STEC) has emerged as a group of foodborne pathogens that can cause severe human disease, such as haemolytic uraemic syndrome (HUS) (Banatvala et al., 2001; Nataro & Kaper, 1998). Enterohaemorrhagic E. coli (EHEC), a subclass of STEC, is also capable of causing haemorrhagic colitis. STEC produces two phage-encoded cytotoxins called Shiga toxins (encoded by stx1 and stx2). In addition to toxin production, STEC frequently possesses other virulence factors such as intimin (eae) (Boerlin et al., 1999), a haemolysin (EHEC-HlyA; ehxA) (Paton & Paton, 1998; Schmidt et al., 1995) and the enteroaggregative E. coli heat-stable enterotoxin EAST1 (astA) (Girardeau et al., 2005; Vaz et al., 2004).
Although human STEC strains belong to a large number of serotypes, most outbreaks and sporadic cases of haemorrhagic colitis and HUS are caused by serotype O157 : H7. As non-O157 STEC strains are more prevalent in animals and as contaminants in foods, humans are probably exposed more often to these strains. STEC serogroups have been classified into five seropathotypes (A–E) according to incidence and association with HUS and outbreaks (Karmali et al., 2003). STEC can be classified into four phylogenetic groups (B1, A, D and B2) (Clermont et al., 2000; Escobar-Páramo et al., 2004; Girardeau et al., 2005). Based on multilocus sequence typing (MLST), Whittam and co-workers studied the clonal relationships of STEC strains (STEC Reference Center). Two EHEC clonal groups and 11 STEC groups have been identified.
In our experience, HUS is common in Peru. We reviewed a retrospective case series of patients with HUS admitted over the past 10 years to one paediatric hospital in Lima; however, STEC was not adequately sought in these patients during that period, as only routine stool cultures were performed (unpublished data). There is little information on the prevalence, virulence factors and phylogenetic distribution of STEC strains in Peru. The aims of this study were to: (i) determine the prevalence of STEC in diarrhoea and control samples from Peruvian children; (ii) determine the distribution of critical virulence factors (stx1, stx2, eae, ehxA and astA); and (iii) determine the phylogenetic distribution (by MLST and PFGE) of the isolated STEC strains.
Methods
Bacterial strains.
We determined the prevalence of STEC in 3219 samples from children with diarrhoea and 2695 samples from healthy controls without diarrhoea from four prospective cohort studies conducted previously in 2212 Peruvian children aged <36 months. All studies were in the community setting: three in peri-urban communities of Lima [Villa el Salvador (N. Zavaleta, Instituto de Investigación Nutricional), Chorrillos (Ochoa et al., 2009) and Independencia (E. Chea-Woo, Universidad Peruana Cayetano Heredia)]; and one in the Andean region of the country [Huaraz (C. F. Lanata, Instituto de Investigación Nutricional)] (Table 1). STEC strains were identified by the presence of stx1, stx2 and eae using a previously validated multiplex real-time PCR system (Guion et al., 2008). For all studies, five lactose-positive colonies isolated from MacConkey plates were used for the PCR assay. The strain STEC W147 (stx1+ stx2+ eae+ ehxA+) provided by Dr C. Torres (Universidad de Rioja, Spain) was used as a positive control.
Detection of virulence factors.
One stx1- and/or stx2-positive colony per patient was tested to identify the presence of virulence genes. The sequences of the primers and amplicon sizes are described in Table 2. PCR for the other virulence genes (ehxA and astA) was performed in a 25 µl reaction mixture containing 2.5 µl each dNTP (2.5 mM; Bioline), 1.5 µl 50 mM MgCl2, 0.5 µl each primer (10 mM; Isogen Life Science), 2.5 µl 10× NH4 buffer (Bioline), 1.5 U Biotaq DNA Polymerase (Bioline) and 5 µl DNA template. For all amplification reactions, the mixture was heated to 94 °C for 10 min prior to thermocycling (iCycler; Bio-Rad). The mixture was held at 72 °C for 7 min after the final cycle before cooling at −20 °C. Amplified products were analysed by 1.5 % agarose gel electrophoresis and visualized by staining with ethidium bromide.
Serotyping.
Serotyping was performed at the E. coli Reference Center (Pennsylvania State University, PA, USA) for O (Orskov et al., 1977) and H (Machado et al., 2000) antigen typing. STEC strains were assigned to one of the five seropathotypes (A–E), as described previously (Karmali et al., 2003).
EHEC haemolysin production.
EHEC haemolysin production was detected using blood agar base (Difco) supplemented with 10 mM CaCl2 and 5 % defibrinated sheep blood. Plates were incubated at 37 °C and examined after 24 and 48 h for zones of haemolysis around colonies (Vieira et al., 2001).
Clermont’s phylogenetic group determination.
STEC strains were assigned to Clermont’s phylogenetic groups according to the presence or absence of the genes chuA, yjaA and tspE4C2 (Clermont et al., 2000).
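The dichotomous decision tree of Clermont et al. (2000) can be sketched in a few lines. This is a minimal illustration, assuming that the triplex PCR result for each strain has already been scored as present or absent for each marker; the function name and boolean arguments are hypothetical and are not part of the original protocol.

```python
def clermont_group(chuA: bool, yjaA: bool, tspE4C2: bool) -> str:
    """Assign a phylogenetic group from triplex PCR results.

    Follows the dichotomous key of Clermont et al. (2000):
    chuA is read first, then yjaA (if chuA is present)
    or tspE4C2 (if chuA is absent).
    """
    if chuA:
        # chuA-positive strains are B2 or D, separated by yjaA
        return "B2" if yjaA else "D"
    # chuA-negative strains are B1 or A, separated by tspE4C2
    return "B1" if tspE4C2 else "A"


if __name__ == "__main__":
    # Hypothetical PCR profiles, for illustration only
    print(clermont_group(chuA=False, yjaA=False, tspE4C2=True))   # B1
    print(clermont_group(chuA=True, yjaA=False, tspE4C2=False))   # D
```

Note that yjaA is ignored for chuA-negative strains and tspE4C2 is ignored for chuA-positive strains, as in the original key.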
MLST.
MLST was performed on seven conserved housekeeping genes (aspC, clpX, fadD, icdA, lysP, mdh and uidA) as described elsewhere. PCR products were purified using a Wizard SV Gel and PCR Clean-Up System (Promega). Sequencing was performed by Macrogen using an automatic DNA 3730xl sequencer (Applied Biosystems), and the sequences of the seven loci were concatenated for phylogenetic analyses.
DNA sequence analyses.
The sequences were reviewed and edited by visual inspection using Chromas Lite v.2.01 software (Technelysium Pty). After editing, the sequences were exported to BioEdit v.7.0.9 and aligned with the ClustalW module. A difference of a single nucleotide was sufficient to classify two sequences as different alleles. The different alleles of each housekeeping gene were numbered, and allelic profiles or sequence types (STs) were determined based on the seven studied loci. ST designations were assigned in accordance with the numbering system used by the STEC Center at Michigan State University (MI, USA). Strains belonging to the same ST were considered to be the same clone; one member of each ST was used in the phylogenetic analyses.
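The allele-numbering and profile-building step can be illustrated with a short sketch. It assumes that the trimmed, aligned sequence of each locus is available as a string for every strain. The allele numbers it produces are local and arbitrary (assigned in order of first appearance); they are not the designations of the STEC Center database, which must be looked up separately. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

# The seven housekeeping loci used in this MLST scheme
LOCI = ("aspC", "clpX", "fadD", "icdA", "lysP", "mdh", "uidA")


def assign_alleles(sequences: List[str]) -> List[int]:
    """Number distinct sequences 1, 2, ... in order of first appearance.

    Any single-nucleotide difference yields a new allele number.
    """
    numbering: Dict[str, int] = {}
    alleles = []
    for seq in sequences:
        key = seq.upper()
        if key not in numbering:
            numbering[key] = len(numbering) + 1
        alleles.append(numbering[key])
    return alleles


def allelic_profiles(strains: Dict[str, Dict[str, str]]) -> Dict[str, Tuple[int, ...]]:
    """Return the seven-locus allelic profile of each strain."""
    names = list(strains)
    per_locus = {
        locus: assign_alleles([strains[name][locus] for name in names])
        for locus in LOCI
    }
    return {
        name: tuple(per_locus[locus][i] for locus in LOCI)
        for i, name in enumerate(names)
    }


def group_by_profile(profiles: Dict[str, Tuple[int, ...]]) -> Dict[Tuple[int, ...], List[str]]:
    """Group strains sharing an identical profile (same ST, same clone)."""
    groups: Dict[Tuple[int, ...], List[str]] = {}
    for name, profile in profiles.items():
        groups.setdefault(profile, []).append(name)
    return groups
```

Taking one strain from each group returned by `group_by_profile` corresponds to using one member of each ST in the phylogenetic analyses.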
Phylogenetic analyses.
The MLST sequences of the strains were combined with those from 33 published E. coli and Shigella species genomes for comparison (see Supplementary Table S1, available in JMM Online). Sequences were aligned by ClustalW using the MegAlign module of the Lasergene software (DNASTAR). Neighbour-joining trees were constructed using the Kimura two-parameter model of nucleotide substitution with MEGA4 software (Tamura et al., 2007), and the inferred phylogenies were each tested with 500 bootstrap replications. Phylogenetic network analysis was conducted with the SplitsTree 4 program (Huson & Bryant, 2006) using the neighbour-net algorithm (Bryant & Moulton, 2004) and untransformed distances (p-distances). The Φw recombination test (Bruen et al., 2006) as implemented by SplitsTree 4 was used to distinguish recurrent mutation from recombination in generating genotypic diversity. The numbers of synonymous substitutions per synonymous site (dS) and non-synonymous substitutions per non-synonymous site (dN) were estimated by the modified Nei–Gojobori method using MEGA4. Allelic sequences were fitted to a nucleotide substitution model using the Datamonkey website, and the single likelihood ancestor counting method was used to fit a codon model to detect selection on individual codons (Pond & Frost, 2005).
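The two pairwise distances named above differ only in how they correct for multiple substitutions. The following sketch computes both for a pair of aligned sequences: the p-distance is the proportion of differing sites, and the Kimura two-parameter distance is d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It is an illustration of the formulas, not a reimplementation of MEGA4 or SplitsTree 4; skipping sites with gaps or ambiguous bases is an assumption made here, and the function names are hypothetical.

```python
import math
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def site_proportions(a: str, b: str) -> Tuple[float, float]:
    """Return (P, Q): proportions of transitions and transversions."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def p_distance(a: str, b: str) -> float:
    """Untransformed proportion of differing sites."""
    p, q = site_proportions(a, b)
    return p + q


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance."""
    p, q = site_proportions(a, b)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences, such as STs differing at about 1 % of sites, the two distances are nearly identical; the correction matters only as divergence grows.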
PFGE.
Preparation of genomic DNA and PFGE were performed as described previously (Gautom, 1997). Samples were digested with 40 U XbaI (Promega), and DNA fragments were resolved in 1 % agarose gels using a CHEF-DR-II system (Bio-Rad Laboratories). Lambda concatemers (New England Biolabs) with a molecular size range of 50–1000 kb were used as DNA size markers. Evaluation of PFGE profiles for similarity was performed using InfoQuest FP v.5 software (Bio-Rad). A UPGMA tree was constructed using Dice similarity indices and complete linkage, with optimization set at 1 % and position tolerance at 1.3 % (Beutin et al., 2005).
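The Dice index underlying the band-pattern comparison is 2 × (shared bands) / (total bands in both lanes), where two bands count as shared if their positions fall within the position tolerance. The sketch below illustrates the calculation only; it does not reproduce the optimization step or the band-matching algorithm of InfoQuest FP. It assumes band positions expressed as a percentage of the run length, so that the 1.3 % tolerance applies directly, and the function name is hypothetical.

```python
from typing import Sequence


def dice_similarity(bands_a: Sequence[float],
                    bands_b: Sequence[float],
                    tolerance: float = 1.3) -> float:
    """Dice coefficient between two band patterns.

    Positions are assumed to be percentages of the run length;
    bands closer than `tolerance` are treated as the same band.
    Each band is matched at most once (greedy scan over sorted positions).
    """
    a = sorted(bands_a)
    b = sorted(bands_b)
    total = len(a) + len(b)
    if total == 0:
        raise ValueError("at least one pattern must contain bands")
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return 2 * shared / total


if __name__ == "__main__":
    # Hypothetical lanes with four bands each, three of them matching
    lane1 = [10.0, 25.0, 40.0, 60.0]
    lane2 = [10.5, 25.0, 41.0, 80.0]
    print(dice_similarity(lane1, lane2))  # 0.75
```

A matrix of such pairwise values is the input from which the clustering software builds the dendrogram.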
Results and Discussion
Prevalence
We analysed 5914 samples in total. The prevalence of STEC was 0.4 % (14/3219) in diarrhoeal samples and 0.6 % (15/2695) in healthy controls (Table 1). This prevalence was markedly lower than the mean prevalence of the other diarrhoeagenic E. coli pathotypes isolated using the same PCR methodology: enteroaggregative E. coli, 9.9 %; enteropathogenic E. coli, 8.5 %; enterotoxigenic E. coli, 6.9 %; and diffusely adherent E. coli, 4.8 % (T. J. Ochoa, A. Llanos, J. Lee and F. Lopez, unpublished data). To our knowledge, this is the first study of the prevalence of STEC in Peruvian children. This is important because STEC is not routinely sought in clinical laboratories, even when the child presents with bloody diarrhoea or HUS. The small number of isolated STEC strains was one of the main limitations of this study. The age of the STEC-infected children was 4–36 months (mean 15 months). Among the STEC-positive diarrhoea samples, one was bloody (VES 230-5, isolation date 2 June 2004). Of the 29 STEC strains, 20 were available for further analysis (by MLST and PFGE).
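The reported percentages follow directly from the counts, as the sketch below shows. The Wilson score interval is included only to illustrate how much uncertainty accompanies such small numerators; no confidence intervals are reported in this study, and the choice of the Wilson method and the function names are assumptions made for this example.

```python
import math
from typing import Tuple


def prevalence(positives: int, total: int) -> float:
    """Prevalence as a percentage."""
    return 100.0 * positives / total


def wilson_interval(positives: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95 % Wilson score interval, in percent."""
    p = positives / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return 100 * (centre - half), 100 * (centre + half)


if __name__ == "__main__":
    # Counts taken from the text above
    for label, pos, n in (("diarrhoea", 14, 3219), ("control", 15, 2695)):
        low, high = wilson_interval(pos, n)
        print(f"{label}: {prevalence(pos, n):.1f} % "
              f"(illustrative 95 % interval {low:.2f}-{high:.2f} %)")
```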
Serotypes and seropathotypes
The typable STEC belonged to 18 serotypes. The most common serogroups were O26 (four strains, 14 %), O111 (three strains, 10 %) and O145 (three strains, 10 %), with similar distribution among the diarrhoea and control samples (Table 3). Infections with some non-O157 STEC types, such as O26 : H11 or H−, O91 : H21 or H−, O103 : H2, O111 : H−, O113 : H21, O117 : H7, O118 : H16, O121 : H19, O128 : H2 or H−, O145 : H28 or H− and O146 : H21, have been associated with severe illness in humans (Bettelheim, 2007; Coombes et al., 2008) and with a number of outbreaks (Hiruta et al., 2001; McMaster et al., 2001; Werber et al., 2002). The strains included examples of four of the five reported seropathotypes (Karmali et al., 2003). The most common seropathotype was B (13/29, 45 %), which comprised all O26 : H11 (four strains), O111 (H10, H8 and H7; three strains), O103 : H2 (two strains), O145 (H11 and H+; three strains) and O174 : H19 (one strain), and was associated with disease (Girardeau et al., 2005). There were five non-typable strains and four with only H-types, which did not belong to any of the known seropathotype groups.
Abbreviations used in Table 3: nd, not determined; +, positive for the gene; −, negative for the gene; nt, non-typable.
Distribution of virulence genes
Analysis of the frequency of virulence factors and clonal distribution of STEC is pivotal to improve our understanding of epidemiological characteristics of pathogens that pose a risk to public health. Epidemiological studies, together with in vivo and in vitro experiments, have revealed that stx2 (and its variants) is the most important virulence factor associated with severe human disease. STEC producing stx2 is more commonly associated with serious disease than isolates producing stx1 or stx1 plus stx2 (Boerlin et al., 1999; Louise & Obrig, 1995; Paton & Paton, 1998). In the current study, the majority of strains were stx1-producing strains (24/29, 83 %); only 5/29 (17 %) strains carried stx2. This fact presumably explains the mild illness found in these infections. There were too few stx2-positive isolates to assess its relationship to pathogenesis. Severe diarrhoea (especially haemorrhagic colitis) and HUS are closely associated with STEC types carrying the eae gene for intimin (Boerlin et al., 1999), although a large number of locus of enterocyte effacement-negative STEC have also caused human disease (Bettelheim, 2007). In this study, the eae gene was present in 72 % (21/29) of the STEC strains: 19 strains were stx1+ eae+, five were stx1+ eae−, three were stx2+ eae− and two were stx2+ eae+. The distribution of frequency of eae+ STEC among diarrhoea and control samples was similar.
The ehxA gene (enterohaemolysin) was detected in 59 % of the STEC strains (17/29) with a similar distribution among the diarrhoea and control samples. Haemolysis production was present in 11/17 strains positive for the ehxA gene; only one strain that was ehxA− was haemolytic (Table 3). The astA gene (toxin EAST1) was uncommon (4/29, 14 %), being found in 1/15 control samples (7 %) and 3/14 diarrhoeal samples (21 %) (Table 3). We did not find any significant association between the presence of specific virulence genes and a specific seropathotype as reported by others (Girardeau et al., 2005).
Clermont’s phylogenetic group distribution
Recent phylogenetic studies have indicated that STEC/EHEC strains fall principally into phylogenetic groups A, B1 and D (Escobar-Páramo et al., 2004; Girardeau et al., 2005; Ziebell et al., 2008). The strains in this study belonged to phylogenetic group B1 (52 %, 15/29), D (28 %, 8/29), A (17 %, 5/29) and B2 (3 %, 1/29). The most frequent phylogenetic group was B1, consistent with earlier reports (Girardeau et al., 2005). Among the 14 diarrhoeal strains, six belonged to group B1 and five to group D (Fig. 1, Table 3).
Fig. 1. PFGE profiles and clusters of 20 Peruvian STEC strains. The corresponding MLST ST, clonal group, phylogenetic group and serotype are listed for each strain based on the corresponding pulsed-field pattern (PFGE groups I–XI). D, Diarrhoea sample; C, control sample; S, sample; UNASS, unassigned; nd, not done. Strain D5018-15JAN08 (not shown) was ST895, clonal group STEC 12.
MLST analysis
MLST loci were sequenced in 19 STEC strains. For phylogenetic analyses, the sequenced internal fragments of the seven housekeeping genes were concatenated to yield 3732 nt. MLST analysis resolved a mean of 19.4 variable nucleotide sites per locus, which defined a number of alleles, ranging from five to eight (Table 4). The dS value ranged from 3.57 % for mdh to 6.64 % for fadD, with a mean of 5.03 synonymous substitutions per 100 synonymous sites (Table 4). The dN value per 100 non-synonymous sites was generally an order of magnitude lower than that of dS, ranging from 0.00 for aspC, clpX, fadD, icdA and lysP to 0.41 for uidA. Tests for natural selection operating on the allelic variation at each MLST locus based on the single likelihood ancestor counting method found no individual sites to be under significant negative or positive selection, indicating that the MLST loci are evolving neutrally.
The distinct combinations of alleles across the MLST loci were used to define 13 multilocus genotypes or STs among the 19 strains. The 13 STs differed on average at 1.2 and 0.2 % of the nucleotide and amino acid sites, respectively. ST106 was the most common multilocus genotype (5/19, 26 % of strains) (Fig. 1). In the phylogenetic tree based on the genetic relationships of the STEC (Fig. 2a), we observed that our strains were closely related to others of the same serotype from other studies. In addition, STEC strains ST106, ST896 and ST898 (EHEC 2) were related in the network, similar to the results observed in the tree based on the PFGE results (Fig. 1). The same results were observed in clonal group STEC 12.
Fig. 2. Phylogenetic relationships among 13 STEC STs. (a) Unrooted phylogenetic tree constructed by a neighbour-joining algorithm based on the Kimura two-parameter model of nucleotide substitution. Bootstrap values greater than 75 % based on 500 replications are given at internal nodes. The serotypes for the published E. coli and Shigella species genome strains are given in parentheses. (b) Phylogenetic (splits) network based on a neighbour-net algorithm using a p-distance matrix. The 13 STEC STs are indicated by filled circles in (a) and (b).
The correlation observed between Whittam’s clonal groups, Clermont’s phylogenetic groups and some of the serotypes in this study is of interest. EHEC 2 contains serotypes O26 : H11, O111 : H8 (O111 strains are often non-motile or of other H types) and O145 : H11, which are classified as seropathotype B and phylogenetic group B1, as observed by others (Karmali et al., 2003; Ziebell et al., 2008).
The splits network (Fig. 2b) revealed several parallel paths indicative of the presence of phylogenetic incompatibilities in the divergence of clones. Such incompatibilities could arise from recurrent mutation or recombination in MLST loci. To detect recombination, the Φw test, which discriminates between recurrent mutation and recombination (Bruen et al., 2006), was used. When applied to the concatenated sequences of the 13 STs, the Φw test found significant evidence of recombination (Table 4). Evidence for recombination was also detected among the alleles of fadD and icdA (Table 4).
PFGE
PFGE typing of 20 STEC strains resulted in 19 pulsed-field patterns.
Comparison of the patterns revealed 11 clusters (I–XI) with a general similarity of 70 % in the UPGMA tree. Each cluster included strains belonging to different serotypes (Fig. 1), with the exception of cluster VI, which exclusively contained seven STEC strains of clonal group EHEC 2, phylogenetic group B1 and seropathotype B. In addition, the strains with pulsed-field pattern 1 (cluster I) were indistinguishable by PFGE and belonged to the same clonal group, STEC 2. Most of the strains in this study were isolated from children living in separate geographical areas and sampled on different dates, suggesting that these pathogenic clones may be widespread in Peru.
MLST and PFGE were performed to establish the clonal relationships between representative STEC strains in this study. Both techniques identified strains that shared similar clonal origins (PFGE group VI and EHEC 2; Fig. 1). PFGE was more discriminative than MLST, as a single ST could be represented by more than one pulsed-field pattern. Differences between MLST and PFGE may reflect what each method measures: PFGE detects differences across the whole genome, whereas MLST analyses only small fragments of conserved metabolic genes. Therefore, events such as the recent acquisition of virulence factors cannot be detected by MLST; genome sequencing was not carried out in this study.
In summary, STEC prevalence was low in children with diarrhoea in the community setting in Peru. Strains were phylogenetically diverse and associated with mild infections. There was a good correlation between the seropathotypes, clonal groups, PFGE groups and Clermont’s phylogenetic groups. However, additional studies are needed in Peruvian children with bloody diarrhoea and HUS to determine the virulence genes and phylogenetic characteristics of more virulent strains.
Acknowledgements
The authors wish to thank Maruja Bernal, Rina Meza and David Cepeda for their help in laboratory analysis. This work was partially funded by: Agencia Española de Cooperación Internacional (AECID), Spain, Programa de Cooperación Interuniversitaria e Investigación Científica con Iberoamérica (D/019499/08 and D/024648/09); Institutional Research Funds from Universidad Peruana Cayetano Heredia, Perú (T. J. O.) and Instituto Nacional de Salud del Niño, Peru (L. H.); Military Infectious Disease Research Program work unit # 60000.000.0.B0017 (R. C. M.); Instituto de Investigación Nutricional, Peru (N. Z. and C. F. L.); Programa Miguel Servet (CP05/00130) (J. R.); and the National Institutes of Health, USA, Public Health Service awards 1K01TW007405 (T. J. O.) and R01-HD051716 (T. G. C. and E. C.-W.).
An author of this work (R. C. M.) is an employee of the US Government. This work was prepared as part of his official duties. There is no conflict of interest for any of the authors.