Hypothesis

AT Homopolymer Strings in Salmonella enterica Subspecies I Contribute to Speciation and Serovar Diversity

1
US Department of Agriculture, US National Poultry Research Center, Athens, GA 30605, USA
2
US Department of Agriculture, Genomics and Bioinformatics Research, Gainesville, FL 32608, USA
3
Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, WA 99164, USA
*
Author to whom correspondence should be addressed.
Submission received: 13 June 2021 / Revised: 8 September 2021 / Accepted: 27 September 2021 / Published: 1 October 2021
(This article belongs to the Special Issue Salmonella and Salmonellosis)

Abstract

Adenine and thymine homopolymer strings of at least 8 nucleotides (AT 8+mers) were characterized in Salmonella enterica subspecies I. The motif differed between other taxonomic classes but not between Salmonella enterica serovars. The motif in plasmids was possibly associated with serovar. Approximately 12.3% of the S. enterica motif loci had mutations. Mutability of AT 8+mers suggests that genomes undergo frequent repair to maintain optimal gene content, and that the motif facilitates self-recognition; in addition, serovar diversity is associated with plasmid content. A theory that genome regeneration accounts for both persistence of predominant Salmonella serovars and serovar diversity provides a new framework for investigating root causes of foodborne illness.

1. Introduction

Approximately 30 of 1500 Salmonella enterica subspecies I (S. enterica) serovars are persistent agents of foodborne illness in people [1]. Despite improved biosecurity throughout the food production pipeline, reduction of salmonellosis has plateaued over 20 years [2]. The inability to reduce salmonellosis indicates new approaches to understanding the biology of this important pathogen are needed. Recently, the most commonly occurring single nucleotide polymorphism (SNP) that caused disruption of a gene in S. enterica serovar Enteritidis (Enteritidis) was identified, and it was deletion of a single adenine in a homopolymer string of 8 nucleotides (nt) within the fimbrial gene sefD [3]. Mutational analysis, phenotype microarray, and infection experiments in the egg-laying hen indicated that the sefD mutation increased organ invasion and mortality in hens, disturbed egg production, enhanced growth of the pathogen to high cell density, and otherwise behaved as a regulator of dimorphism of phenotype [4]. As a result of this finding the performance of a killed vaccine for hens was enhanced by increasing SefD in preparations [5]. The drastic change in biological phenotype imparted by the single base pair deletion suggested that characterization of purine homopolymer strings of adenine, AAAAAAAA, and its pyrimidine base pair (bp) of thymine, TTTTTTTT, in S. enterica should be explored. It is not evident that S. enterica of different serovars would have conserved AT 8+mer content, because it has a mosaic genome with frequent inversions of sections of the core genome as well as differences originating from mobile genetic elements such as bacteriophage, transposons, insertion elements, and plasmids.
Previous analysis of homopolymeric A and T strings across 81 bacterial and 18 archaeal genomes showed that the motif in bacteria was hypermutable, occurred preferentially at the 5′ end of genes, and was not biased for AA- or TT- encoded amino acids [6]. Furthermore, detailed analysis of these tracts in the inlA gene of Listeria monocytogenes supported the premise that AT homopolymeric tracts were part of a robust regulatory mechanism that provided a reversible, rapid means of adaptation and phase variation in gene regulation. Finding that S. enterica serovar Enteritidis, the world’s leading cause of foodborne salmonellosis, had the hypermutable gene sefD linked to phenotypic phase variation suggested that review of the S. enterica genome for AT homopolymeric tracts was warranted.
In addition to the research already cited, there is other evidence that AT homopolymer tracts, referred to in this manuscript as AT 8+mers, constitute an important regulatory element in bacteria. It is a DNA motif suggested by conformational studies to bend DNA out of the Z-conformation [7]. Polyadenine regions can impact gene regulation in prokaryotes and can contribute to microsatellite instability in eukaryotes [8,9,10,11]. It has been shown that homopolymer nucleotide strings contribute to non-programmed slipped strand replication and the accumulation of errors in DNA [12,13,14]. Thus, the physicochemical impact of these strings was another reason to catalogue this motif in the genome of S. enterica.
The incidence of homopolymer strings in completed genomes of subspecies I of S. enterica and in two different taxonomic classes, namely Gammaproteobacteria and Bacilli, was compared to better understand if S. enterica was at all unique. Reference genomes of S. enterica serovars Enteritidis and Typhimurium were analyzed to locate mutated AT 8+mers in genomes because they are from different genomic lineages and have been extensively sequenced and annotated. Together they cause approximately 40% of all foodborne salmonellosis in the US [1]. S. enterica serovar Gallinarum was included in the same analysis because it is a biological outlier within subspecies I that does not cause foodborne disease. Unlike either Enteritidis or Typhimurium serovars, Gallinarum causes disease in poultry resulting in high morbidity, mortality, and economic loss. A comparative approach is useful for linking phenotype to single nucleotide polymorphisms occurring in genomes [15]. In this study, the three genomes were compared to better understand the association the motif might have with naturally occurring mutation that disrupts open reading frames of genes.

Additional Background on Poultry-Associated S. enterica Serovars

S. enterica serovars Enteritidis and Typhimurium differ biologically [16]. Epidemiological patterns for the two predominant pathogens also differ. Enteritidis is an exceptional Salmonella pathogen in part because it efficiently contaminates the internal contents of eggs produced by otherwise healthy-appearing hens. It produces a high molecular mass (HMM) O-antigen, which not only protects the pathogen from killing by the host but also acts as a protective capsule in the hostile environment of the egg [17,18,19]. Typhimurium is also resistant to complement but it does not produce HMM O-antigen, and thus it does not survive in the internal contents of eggs to an extent that can be detected by epidemiological surveillance. Both Typhimurium and Enteritidis can contaminate a broad spectrum of other food sources such as poultry carcasses, other meats, and fresh vegetables. Both serovars can invade organs and survive, which contributes to systemic spread during infection [20]. Variation between strains within each serovar occurs but serovar characteristics and general genome organization are maintained [21,22]. There are serovar-specific patterns in plasmid carriage and fimbrial genes. Comprehensive reviews are available on the similarities and differences between Salmonella serovars [23,24,25,26,27].
S. enterica serovars Gallinarum and Enteritidis are genetically closely related [28]. Serovar Gallinarum’s antigenic formula is 1,9,12:-:-, which indicates it has the same lipopolysaccharide O-antigen epitopes as Enteritidis; however, it lacks both H1 and H2 flagellin proteins and is thus non-motile. Both Gallinarum and Enteritidis can contaminate the internal contents of eggs; however, Gallinarum has mutations and rearrangements that restrict its host range to the avian host, possibly by reducing immunological response to infection and thereby facilitating systemic infection [20]. Thus, the most striking difference between the foodborne pathogen Enteritidis and host-restricted Gallinarum is that the latter makes poultry sick, reduces egg production, and causes mortality. In contrast, hens infected with Enteritidis often appear healthy and remain in production, while their eggs become contaminated internally and are a source of foodborne illness. The ability of Enteritidis to spread through flocks that appear healthy was one of the contributing factors in its world-wide spread through the layer industry. The differences in the epidemiology, association with food, and virulence characteristics of the three pathogens, all of which occur in the poultry environment, supported comparing them to better understand the association between the AT 8+mer motif and naturally occurring mutation of an important foodborne pathogen. Other pathogenic Salmonellae and bacteria with different taxonomy were also included in analysis to facilitate comparison to results obtained by others [6].

2. Materials and Methods

2.1. Salmonella enterica Subspecies I Strains Analyzed for Strings of Homopolymers

A database of 49 completed genomes of S. enterica subspecies I (taxid:59201) was accessed at the National Center for Biotechnology Information (NCBI) [29] (Table S1), and the last accession date was 20 August 2021. S. enterica serovars Typhimurium, Typhi, and Enteritidis genomes were over-represented compared to other serovars, and together, they comprised 39.4% of all completed genomes available at the time of analysis. Only 51.2% of S. enterica subspecies I genomes had a complete adenylate cyclase (cyaA) gene, which is required for virulence as a foodborne pathogen; thus, analysis was restricted to completed genomes to avoid issues associated with incomplete assembly and annotation. The other sequences were plasmids, which were also evaluated for the AT 8+mer motif. Genome CP018657 was excluded from analyses due to errors. Genome databases at NCBI show homopolymer strings, as well as other combinations of low-complexity regions, in lower-case gray font, because some sequence strings are recognized as susceptible to alignment error and thus require masking during the alignment process. For the BLAST searches conducted here, each gene was checked for high fidelity of surrounding regions; therefore, it is unlikely that low complexity impacted observed alignments. For S. enterica subspecies I, 12 complete genomes each were assessed for Typhimurium, Enteritidis, and Typhi, and another grouping included a mixture of 12 serovars often recovered from foods. It is important to note that the NCBI is in the process of consolidating redundancy in over-represented genomes; thus, the total number of genomes represented by high quality and well-annotated accession numbers may be larger than it appears as redundant material is condensed and re-annotated (accessed on 26 August 2021; https://www.ncbi.nlm.nih.gov/refseq/about/prokaryotes/reannotation/).

2.2. Other Bacteria Analyzed for Comparison to S. enterica

To compare homopolymer strings in S. enterica subspecies I, AT and GC 8+mer homopolymers were tabulated from 42 strains encompassing 5 other genera in the phylum Proteobacteria (Escherichia coli, Proteus mirabilis, Shigella sonnei, Yersinia pseudotuberculosis, Vibrio vulnificus (chromosome I and II)), and 5 species in the phylum Firmicutes (Staphylococcus aureus, Streptococcus pyogenes, Enterococcus faecalis, Bacillus anthracis and Bacillus cereus) (Table S1). Salmonella, Shigella, and Escherichia are all members of the family Enterobacteriaceae, whereas Staphylococcus, Streptococcus, Enterococcus, and Bacillus are in the class Bacilli. At least 3 complete genomes of the other genera were assessed, including 12 genomes from Escherichia coli (Table 1).

2.3. Statistical Analysis of Kmer Content

Strings of homopolymers of different lengths were entered as text in the find function and the genomes were searched. Searches would count single locations multiple times if the length exceeded 8nt, e.g., kmers of 10nt would often be counted 3 times, as 8, 9, and 10nt. Thus, each kmer was reviewed for length to make sure a single location was counted only once, and then it was catalogued according to total length. Results were also reviewed for undercounting of homopolymer strings caused by wrapping at the end of lines in sequence reports; for this reason, the Geneious or NCBI search programs were preferred. The genomes of interest were stored in SeqBuilder Pro, Lasergene V16.0.0 352 (DNASTAR, Madison, WI, USA) and in Geneious V2020.0.3 (Biomatters, Inc., San Diego, CA, USA) format (https://www.geneious.com, accessed on 15 July 2021). Results were copied as Unicode text into a “.csv” file in Microsoft Excel for Mac, V16.16.20 (200307). The text to column feature, with appropriate delimiters, was used to produce columns of data to calculate distance between nucleotide strings. The average, standard deviation, and median distances between AT 8+mer homopolymers were then calculated. Homopolymer strings of all 4 nucleotides, ranging from 5–20 nucleotides, were counted in S. enterica serovars and other genera and results are shown in columns F and G of Table 1. To account for every AT kmer of at least 8 nucleotides, the longer motifs were added to 8mers in further analyses; thus, the term AT 8+mer is applied throughout to describe the motif. A t-test was used to determine if differences in kmer counts between groupings were significant at p < 0.01.
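The once-per-locus counting described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure used in this study, which relied on the find functions of Geneious and NCBI followed by manual review. The function names `find_at_homopolymers` and `read_fasta_sequence` and the toy sequence are hypothetical.

```python
import re
from collections import Counter

def read_fasta_sequence(lines):
    """Join FASTA sequence lines into one string so that a run that wraps
    across a line break is not split and undercounted."""
    return "".join(line.strip() for line in lines if not line.startswith(">"))

def find_at_homopolymers(seq, min_len=8):
    """Return (1-based start, base, length) for every maximal run of A or T
    that is at least min_len long. Each run is matched greedily, so a 10-nt
    run is reported once as a 10mer, not three times as 8, 9, and 10."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    # Upper-case first: masked low-complexity regions may be in lower case.
    return [(m.start() + 1, m.group()[0], len(m.group()))
            for m in pattern.finditer(seq.upper())]

# Toy record (hypothetical): a 10-nt A run, an 8-nt T run split across two
# lines, and a 7-nt A run that is too short to count.
fasta = [">toy", "GC" + "A" * 10 + "CG", "TTTT", "TTTT" + "GC" + "A" * 7 + "GC"]
seq = read_fasta_sequence(fasta)
loci = find_at_homopolymers(seq)
length_counts = Counter(length for _, _, length in loci)
print(loci)
print(sum(length_counts.values()), "AT 8+mer loci")
```

Summing `length_counts` over all lengths gives the AT 8+mer total, which mirrors adding the longer motifs to the 8mers.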
S. enterica subspecies I serovar Typhimurium LT2 was the reference genome to produce a common denominator to normalize genomes of different sizes (Table 1). Thus, values greater than 1 indicated that more than the expected number of motifs were observed in comparison to S. enterica after normalizing for the size of the genome, and less than 1 indicated fewer motifs were observed.
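The exact normalization formula is not given in the text. One plausible reading, sketched below, is the ratio of motif density (motifs per nucleotide) in a genome to the motif density in the reference genome. The function name `normalized_motif_ratio` and all numbers in the example are hypothetical and are not taken from Table 1.

```python
def normalized_motif_ratio(count, genome_size, ref_count, ref_size):
    """Motif density of a genome divided by motif density of the reference.
    A value above 1 means more motifs than expected for the genome size;
    a value below 1 means fewer. (Assumed formula, see the note above.)"""
    if genome_size <= 0 or ref_size <= 0 or ref_count <= 0:
        raise ValueError("sizes and reference count must be positive")
    return (count / genome_size) / (ref_count / ref_size)

# Illustrative values only (hypothetical reference: 300 motifs in 4.8 Mb).
REF_COUNT, REF_SIZE = 300, 4_800_000
same_density = normalized_motif_ratio(150, 2_400_000, REF_COUNT, REF_SIZE)
enriched = normalized_motif_ratio(600, 4_800_000, REF_COUNT, REF_SIZE)
depleted = normalized_motif_ratio(75, 2_400_000, REF_COUNT, REF_SIZE)
print(same_density, enriched, depleted)
```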
AT 8+mers for Typhimurium LT2 were classified as intergenic, intragenic, or regulatory using Geneious Prime 2020.0.3 (Table S2). Another approach used to establish a baseline incidence of AT 8+mers occurring in genes was to generate a list of random numbers using the 4600 predicted genes of the reference genome. Two hundred random numbers were generated between 1 and 4600 corresponding to numbered genes, a FASTA file was then compiled, and the number of AT 8+mers within the randomly generated sets was determined.
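The random-gene baseline can be sketched as follows. The sketch assumes sampling without replacement and a fixed seed for repeatability, neither of which is stated in the text. The function names and the three toy gene sequences are hypothetical.

```python
import random
import re

MOTIF = re.compile(r"A{8,}|T{8,}")  # maximal A or T runs of 8 nt or more

def sample_gene_numbers(n_genes=4600, n_draws=200, seed=0):
    """Draw n_draws distinct gene numbers between 1 and n_genes inclusive."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n_genes + 1), n_draws))

def count_motifs(gene_seqs):
    """Total number of AT 8+mer loci across a collection of gene sequences."""
    return sum(len(MOTIF.findall(seq.upper())) for seq in gene_seqs)

picked = sample_gene_numbers()

# Toy stand-ins for sequences that would be compiled into the FASTA file.
toy_genes = {
    1: "ATG" + "A" * 8 + "GGC",   # one A 8mer
    2: "ATGCCGTTAGC",             # no motif
    3: "GTG" + "T" * 9 + "CA",    # one T 9mer
}
hits = count_motifs(toy_genes.values())
print(len(picked), "genes sampled;", hits, "motifs in the toy set")
```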

2.4. Locating AT 8+Mers within Mutated Genes

S. enterica serovar Typhimurium LT2 (NC_003197.2) was used as the primary reference sequence to name genes and gene functions [30]. The two other reference genomes were Enteritidis strain P125109 and Gallinarum strain 9184, with respective NCBI accession numbers of NC_011294.1 and CP019035.1 [31,32]. Classes and functions of proteins were assessed at https://uniprot.org (accessed on 20 August 2021). Determining impact on open reading frames (ORFs) within annotated genes was done with Geneious V2020.0.3. In addition, online software available at https://web.expasy.org (accessed on 26 August 2021) was used to translate proteins and align amino acids with nucleotides. BLAST analyses and generation of reverse complemented sequences used the NCBI website (https://blast.ncbi.nlm.nih.gov/Blast.cgi) (accessed on 26 August 2021).

3. Results

3.1. Homopolymer Strings of at Least 8 Nucleotides Were Dispersed in the Genome of S. enterica

The AT 8+mers were dispersed throughout the entire genome of serovar Typhimurium LT2 (Figure 1). The genome of reference strain S. enterica serovar Typhimurium LT2 is 52.2% GC. When data were expressed as ratios of AT:GC homopolymer strings, the AT 8mer homopolymers (e.g., AAAAAAAA and TTTTTTTT) were much more prevalent than GC 8mers in the reference genome (Figure 2). In total, there were 294 AT 8mers and 11 GC 8mers in the reference serovar, which is a ratio of approximately 27 AT 8mers to every GC 8mer. AT strings longer than 8bp were less frequently observed (Figure 2). On average, the motif occurred every 16,450nt (Table S2). The distance between AT 8+mers ranged from 11 to 117,141nt, and the median was 11,498nt (Table S2). Distances of 51,975nt or greater between motifs were more than 3 standard deviations above the mean, and these regions were thus possibly deficient in AT 8+mers. Of 13 putatively deficient regions, the 4 longest regions were assessed for phage genes, pseudo genes, insertion elements, transposases, ribosome binding sites, and regulons. The 4 regions were located between nucleotides (i) 1368633–1444823 (76,198nt), (ii) 2612956–2730097 (117,148nt), (iii) 4124625–4209022 (84,404nt), and (iv) 4342879–4418289 (75,418nt). At this time, no feature could be found that differentiated AT 8+mer deficient regions from regions separated by a shorter distance.
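The spacing statistics and the 3 standard deviation cutoff can be illustrated with a short sketch. It assumes the cutoff is the mean distance plus 3 sample standard deviations, which is not spelled out in the text. The function name `spacing_summary` and the toy motif positions are hypothetical and are not data from Table S2.

```python
import statistics

def spacing_summary(positions, n_sd=3):
    """Summarize distances between consecutive motif start positions and
    flag gaps longer than mean + n_sd * SD as possibly motif-deficient."""
    positions = sorted(positions)
    pairs = list(zip(positions, positions[1:]))
    gaps = [b - a for a, b in pairs]
    mean = statistics.mean(gaps)
    sd = statistics.stdev(gaps)          # sample SD (an assumption)
    cutoff = mean + n_sd * sd
    sparse = [(a, b, b - a) for a, b in pairs if b - a > cutoff]
    return {"mean": mean, "sd": sd, "median": statistics.median(gaps),
            "cutoff": cutoff, "sparse": sparse}

# Toy positions: 20 motifs spaced 10 nt apart, then one motif 1000 nt away.
positions = [100 + 10 * i for i in range(20)] + [1290]
summary = spacing_summary(positions)
print(summary["mean"], summary["median"], summary["sparse"])
```

Because a few long gaps inflate both the mean and the standard deviation, the median is a useful companion statistic, as it is in the results above.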

3.2. The AT 8+mer Motif in Bacteria Was Specific to Genus and Species

Results of comparisons between Bacteria are shown in Table 1. Details include: (1) AT 8+mers in S. enterica groups were significantly more frequent than what was observed for E. coli (p < 0.005); (2) across bacterial classes, Vibrio vulnificus cII had the minimum of 90.0 AT 8+mers and Proteus mirabilis had the maximum of 712.7; (3) standard deviations between strains in each genus ranged from 2.3 for Yersinia pseudotuberculosis to 84.1 for Enterococcus faecalis; (4) all the genera examined, including those that differed by phylum and class, had a relative paucity of GC 8+mers as compared to AT 8+mers; thus, it appears there is a bias for Bacteria maintaining AT 8+mers in genomes, or inversely, selecting against GC 8+mers; (5) each species appeared distinctly different from others; (6) Vibrio vulnificus had 180 and 90 AT 8+mers in chromosomes cI and cII, respectively; thus, AT 8+mer content might be a chromosomal characteristic that maintains the separation of the two chromosomes.
Results based upon taxonomic class were an average of 379.57 and 430.8 AT 8+mers, respectively, for the Gammaproteobacteria and Bacilli. In addition, standard deviations were large for both classes, namely 174.51 and 251.24 respectively. The range reflected by standard deviations masks the individuality of each genus and species. For example, Bacillus anthracis had an average of 432 AT 8+mers, whereas Bacillus cereus had an average of 700.3. Results from S. enterica subspecies I had a relatively small overall standard deviation of 12.83 AT 8+mers across 24 serovars, which agrees with the classification of S. enterica subspecies I as a single genomic grouping.

3.3. The AT 8+mer Motif in the Chromosome Is Not Specific to Serovar

As mentioned in the previous section, the number of AT 8+mers per S. enterica grouping ranged from 315.6 to 332.6, and the average was 322.2 +/− 12.83 AT 8+mers (Table 1). The length of the homopolymer is proposed to impact the physicochemical bending properties of DNA; thus, we wanted to account for every kmer of 8 nucleotides or more. Results from analysis of AT 8+mers between S. enterica serovars were: (i) The incidence of AT 8+mers in the reference genome for serovar Typhimurium LT2 was the lowest of the 12 strains in the group, which suggests that using the serovar as a reference would not over-estimate the incidence of AT 8+mers for S. enterica or other genera; (ii) the standard deviations for AT 8+mers in serovar Typhimurium and in the group of mixed serovars were, respectively, 13.0 and 13.9; (iii) serovars Enteritidis and Typhi, with respective standard deviations of 10.5 and 5.9, appeared more clonal than Typhimurium, and this finding agrees with current knowledge; (iv) the foodborne serovars, namely Typhimurium, Enteritidis, and the group of mixed serovars, had a more variable motif content than human-restricted Typhi. Overall, the S. enterica serovar groups were not significantly different from each other although Typhi had the lowest standard deviation. There were not enough completed genomes of the poultry-restricted serovar Gallinarum to include it for this analysis; however, a reference genome of this important host-restricted poultry pathogen is evaluated in the following text.

3.4. Characterization of the AT 8+mer Motif in Poultry-Associated Serovars of Salmonella

The association of motif content with gene function was catalogued. Table S2 lists 294 intergenic, regulatory, and gene AT 8+mer loci in serovar Typhimurium LT2; of these, 131 were intergenic, 150 were in genes, and 13 were in regulatory regions. Table S3 lists 185 genes and regulatory regions with AT 8+mers, which were located by comparing reference genomes of serovars Typhimurium, Enteritidis, and Gallinarum; thus, 20 more genes and regulatory regions were identified using the comparative approach. Gene functions were assigned in the last column. Some genes in serovars Enteritidis and/or Gallinarum did not have homologs in the Typhimurium reference strain, and vice versa. Serovars Typhimurium, Gallinarum, and Enteritidis each had 3, 22, and 5 pseudogenes with AT 8+mers, respectively, while each genome had a total of 40, 287, and 96 pseudogenes. Thus, respectively, 7.5%, 8.4%, and 5.2% of pseudogenes involved the motif.
The regulons with the motif encompassed mostly metabolic functions. Regulons associated with the genes STM2277, STM0664, and STM4585 have some association with antibiotics (references not shown). Another class of genes that could be of interest for targeting of antimicrobials is transporters, and there are 20 listed in Table S3 [33]. Cell surface molecules encoded by genes with the motif include colanic acid, lipopolysaccharide, flagella, and fimbriae. A virulence factor with the motif is MviM.

3.5. Location of AT 8+mers within Mutated Open Reading Frames

Table 2 lists 22 genes that are a subset of those from Table S3. Serovar Gallinarum had 17 pseudogenes that were intact in serovars Enteritidis and Typhimurium. There were 4 pseudogenes in both serovars Enteritidis and Gallinarum that were intact in Typhimurium. STM1666 was unusual because it was a pseudogene in serovar Typhimurium, absent in Gallinarum, and intact in Enteritidis. STM1666 is thus one of the few genes that differentiates serovar Enteritidis from both Gallinarum and Typhimurium. The gene was present in 31 other Salmonellae genomes available at NCBI, but no serovar was reported for these and no other information was present. There was no homolog of STM1666 in other bacteria.
The location of the first adenine or thymine in homopolymer strings in the subset of mutated genes was determined relative to start codons ATG or GTG. Results were that 8, 7, 6, and 1 gene(s) fell into the 1st, 2nd, 3rd, and 4th quartile of gene lengths, respectively. These findings thus agree in concept with others that homopolymer tracts are found closer to the 5′ end of open reading frames [6]. However, we suggest that these data are more accurately summarized as AT 8+mers of S. enterica do not often occur near the 3′ end of genes. An area of future research is to determine if taxonomic classes of bacteria differ in location of mutable homopolymer strings within open reading frames.
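Assigning a motif to a quartile of gene length can be sketched as below. The sketch assumes the offset is the 0-based distance from the first nucleotide of the start codon to the first base of the homopolymer string. The function name `quartile_of_offset` and the toy offsets are hypothetical; only the tallies of 8, 7, 6, and 1 genes come from the results above.

```python
from collections import Counter

def quartile_of_offset(offset, gene_length):
    """Return 1-4 for the quartile of the gene in which the run begins."""
    if not 0 <= offset < gene_length:
        raise ValueError("offset must fall inside the gene")
    return offset * 4 // gene_length + 1

# Toy (offset, gene length) pairs, one per quartile.
toy_loci = [(12, 900), (300, 1000), (620, 1200), (1400, 1500)]
toy_tally = Counter(quartile_of_offset(o, n) for o, n in toy_loci)

# Tallies reported in the text for the 22 mutated genes.
reported = {1: 8, 2: 7, 3: 6, 4: 1}
total = sum(reported.values())
share_first_half = (reported[1] + reported[2]) / total
share_last_quartile = reported[4] / total
print(total, round(share_first_half, 3), round(share_last_quartile, 3))
```

With the reported tallies, about 68% of loci fall in the first half of the gene and under 5% in the last quartile, which fits the summary that the motif seldom occurs near the 3′ end.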

3.6. S. enterica Plasmids Have AT 8+mer Motifs Possibly Associated with Serovar Preference

Some of the most important foodborne Salmonella serovars harbor a large low-copy virulence plasmid, which is consistently present even when plasmid profiles differ between strains [34,35]. S. enterica serovar Typhimurium, and related immunogenic variants with LPS O-antigens [1,4,5,12], has one of the largest virulence plasmids (pSLT) (NC_003277.2) [30]. At 93,933nt, pSLT would be expected to have 5 or 6 AT 8+mers if the average distance between motifs found within the chromosome, which was 16,450nt, applied to the extrachromosomal plasmid. Typhimurium pSLT had 6 AT 8+mer loci (Table 3). No evidence of hypermutability of these loci was found within serovar by BLAST analysis; thus, AT 8+mers within serovar appeared conserved. The motif in plasmids of the other two serovars differed substantially from each other and from Typhimurium (Table 3). The serovar Enteritidis plasmid pSENV (CP063701.1) and Gallinarum SG9 strain 9 plasmid (CM001154.1) were 59,372 and 87,371 nucleotides, respectively, and each had 4 AT 8+mers. Overall, there was substantial variation between the AT 8+mer motif in each of the plasmids for the three serovars reviewed, and substantial serovar-related differences in plasmid profiles have been reported [36]. While these results suggest that at least the pSLT virulence plasmids have serovar preference, there are not enough plasmids sequenced across subspecies I to comment further. Analysis of the relationship between AT 8+mer content of plasmids and serovar is a topic for future investigation.
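The expected motif counts follow from dividing plasmid length by the average chromosomal spacing of 16,450nt. The short sketch below applies that arithmetic to the three plasmid lengths and observed counts given above. Extending the same expectation to the Enteritidis and Gallinarum plasmids is an assumption made here for illustration, and the variable and function names are hypothetical.

```python
MEAN_SPACING_NT = 16_450  # average chromosomal distance between AT 8+mers

# name: (plasmid length in nt, observed AT 8+mers), values from the text
plasmids = {
    "pSLT (Typhimurium)": (93_933, 6),
    "pSENV (Enteritidis)": (59_372, 4),
    "SG9 strain 9 plasmid (Gallinarum)": (87_371, 4),
}

def expected_motifs(length_nt, spacing_nt=MEAN_SPACING_NT):
    """Expected motif count if the chromosomal spacing applied to the plasmid."""
    return length_nt / spacing_nt

expected = {name: round(expected_motifs(length), 2)
            for name, (length, _) in plasmids.items()}

for name, (length, observed) in plasmids.items():
    print(f"{name}: expected {expected[name]}, observed {observed}")
```

For pSLT the quotient is about 5.7, which is the basis for the expectation of 5 or 6 motifs.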

4. Discussion

The AT 8+mer motif was shown here to be associated with (1) S. enterica speciation, (2) serovar designation via details of the AT 8+mer in plasmids, and (3) hypermutability in genes and regulatory regions that impact phenotype, growth potential, virulence and metabolism of Salmonella enterica subspecies I. AT 8+mers influence microbial biology. For example, A and T homopolymers impact transcription termination in Archaea [37]. The canine herpesvirus thymidine kinase gene has mutational hotspots at stretches of 8 adenines [38]. T7 bacteriophage RNA polymerases undergo transcription slippage at A and T homopolymers [39]. As mentioned previously for S. enterica serovar Enteritidis, a mutational hotspot in 1 of 8 adenines increased virulence [3]. We did not conclude that mutation in the motif was primarily in the 5′ region of genes; however, it is a subject that requires development of better algorithms to address in detail [6]. These results suggest that the AT 8+mer motif facilitates maintenance of S. enterica as a persistent and important foodborne pathogen, as well as impacting epidemiological outcomes by contributing to ecological adaptation [40].
If AT 8+mers are mutational hotspots, then there must also exist a mechanism for repair. Otherwise, evolution of any one serovar of S. enterica would be unidirectional towards extinction. There are several examples of Salmonella serovars, e.g., Typhimurium, Enteritidis, Newport, Infantis, and Heidelberg, that persist over decades; however, the majority of the 1500 serovars within subspecies I cause illness inconsistently, rarely, or never [1,16]. For this reason, we theorize there is another function for AT 8+mers. It is proposed that chromosomal AT 8+mers align sections of genomes during replication and DNA repair processes. This function would result in repair of mutations occurring between stretches of wildtype AT 8+mers [41,42]. It would also account for an inherent mechanism of self-recognition, which would facilitate preferential, but not exclusive, DNA exchange within the species. The pan-genome of S. enterica subspecies I has a mosaic structure, with frequent inversions, deletions, and insertions occurring between serovars; however, the chromosomal arrangement of many Salmonella lineages is comparatively stable [25,32,43,44]. Thus, the motif would account for (i) the stability of some serovars with conserved genome features that are persistent, e.g., serovar Typhimurium [1], (ii) the occasional emergence of a new serovar that happens to undergo clonal expansion in an environment favorable for growth, e.g., serovar Tennessee in peanut butter [45,46], (iii) the rare emergence of a hybrid strain following a major recombination event that results in rapid proliferation of a serovar with new biological properties, e.g., serovar Enteritidis [47], and (iv) the periodic emergence and disappearance of serovars that are not optimized for survival in the environment in which they are generated. Another finding suggesting that the AT 8+mer impacts speciation is that the two chromosomes of Vibrio vulnificus differ substantially in motif content and would thus never be predicted to coalesce.

5. Conclusions

These results support and add to the concept that the homopolymer AT motif is an important regulator of bacterial phase transition, ecological persistence, and association with disease [6]. In regard to Salmonella enterica subspecies I, the motif is proposed to be important for maintaining optimal gene content, with the coincidental impact of contributing to serovar diversity [48,49,50,51]. Future research on the AT 8+mer contribution to these proposed functions will require proof of concept experimentation. Biological experimentation will focus on finding environmental niches that facilitate genomic exchange and repair mechanisms. Analyzing the impact of the motif on the safety of the food supply may require methods with detection limits that are orders of magnitude lower than those used to currently detect bacteria. This is because a successful recombinant may at first be a rare cell type [52,53]. Further analysis into the impact of AT 8+mers on the ability of S. enterica to survive and persist in environments associated with foodborne illness is thus warranted.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/microorganisms9102075/s1, Table S1: List of bacterial genomes analyzed for AT 8+mer homopolymers, Table S2: Location and classification of all AT 8+mers in Typhimurium LT2, Table S3: Categories of genes from 3 serovars of S. enterica subspecies I that vary in AT 8+mer mutations.

Author Contributions

Conceptualization, J.G. and A.R.R.; methodology, all authors; validation, J.G., A.R.R. and J.N.V.; formal analysis, A.R.R.; investigation, J.G. and J.N.V.; resources, J.G.; data curation, J.G.; writing—original draft preparation, J.G.; writing—review and editing, J.G., M.J.R.J., A.O. and D.H.S.; visualization, J.G., A.R.R., J.N.V., A.O. and D.H.S.; supervision, J.G.; project administration, J.G., A.R.R. and M.J.R.J.; funding acquisition, J.G., A.R.R. and M.J.R.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The U.S. Department of Agriculture, Agricultural Research Service project plan number 6040-32000-012-00-D and by The National Institute of Food and Agriculture, Agriculture and Food Research Initiative Grant Number 2019-67021-29924.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The database analyzed for this project can be found at the National Center for Biotechnology Information (NCBI) at https://www.ncbi.nlm.nih.gov (accessed on 26 August 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. CDC. An Atlas of Salmonella in the United States, 1968–2011: Laboratory-Based Enteric Disease Surveillance; Centers for Disease Control and Prevention (CDC): Atlanta, GA, USA, 2013.
  2. Tack, D.M.; Ray, L.; Griffin, P.M.; Cieslak, P.R.; Dunn, J.; Rissman, T.; Jervis, R.; Lathrop, S.; Muse, A.; Duwell, M.; et al. Preliminary Incidence and Trends of Infections with Pathogens Transmitted Commonly Through Food—Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 2016–2019. MMWR Morb. Mortal. Wkly. Rep. 2020, 69, 509–514.
  3. Guard, J.; Cao, G.; Luo, Y.; Baugher, J.D.; Davison, S.; Yao, K.; Hoffmann, M.; Zhang, G.; Likens, N.; Bell, R.L.; et al. Genome sequence analysis of 91 Salmonella Enteritidis isolates from mice caught on poultry farms in the mid 1990s. Genomics 2020, 112, 528–544.
  4. Morales, C.A.; Guard, J.; Sanchez-Ingunza, R.; Shah, D.H.; Harrison, M. Virulence and metabolic characteristics of Salmonella enterica serovar enteritidis strains with different sefD variants in hens. Appl. Environ. Microbiol. 2012, 78, 6405–6412.
  5. Sanchez-Ingunza, R.; Guard, J.; Morales, C.A.; Icard, A.H. Reduction of Salmonella Enteritidis in the spleens of hens by bacterins that vary in fimbrial protein SefD. Foodborne Pathog. Dis. 2015, 12, 836–843.
  6. Orsi, R.H.; Bowen, B.M.; Wiedmann, M. Homopolymeric tracts represent a general regulatory mechanism in prokaryotes. BMC Genom. 2010, 11, 102.
  7. Reich, Z.; Friedman, P.; Levin-Zaidman, S.; Minsky, A. Effects of adenine tracts on the B-Z transition. Fine tuning of DNA conformational transition processes. J. Biol. Chem. 1993, 268, 8261–8266.
  8. Hines, E.R.; Kolek, O.I.; Jones, M.D.; Serey, S.H.; Sirjani, N.B.; Kiela, P.R.; Jurutka, P.W.; Haussler, M.R.; Collins, J.F.; Ghishan, F.K. 1,25-dihydroxyvitamin D3 down-regulation of PHEX gene expression is mediated by apparent repression of a 110 kDa transfactor that binds to a polyadenine element in the promoter. J. Biol. Chem. 2004, 279, 46406–46414.
  9. Lindemose, S.; Nielsen, P.E.; Mollegaard, N.E. Polyamines preferentially interact with bent adenine tracts in double-stranded DNA. Nucleic Acids Res. 2005, 33, 1790–1803.
  10. Jung, B.H.; Beck, S.E.; Cabral, J.; Chau, E.; Cabrera, B.L.; Fiorino, A.; Smith, E.J.; Bocanegra, M.; Carethers, J.M. Activin type 2 receptor restoration in MSI-H colon cancer suppresses growth and enhances migration with activin. Gastroenterology 2007, 132, 633–644.
  11. Agnoli, K.; Haldipurkar, S.S.; Tang, Y.; Butt, A.T.; Thomas, M.S. Distinct Modes of Promoter Recognition by Two Iron Starvation sigma Factors with Overlapping Promoter Specificities. J. Bacteriol. 2019, 201.
  12. Roberts, J.D.; Nguyen, D.; Kunkel, T.A. Frameshift fidelity during replication of double-stranded DNA in HeLa cell extracts. Biochemistry 1993, 32, 4083–4089.
  13. Traverse, C.C.; Ochman, H. Genome-Wide Spectra of Transcription Insertions and Deletions Reveal That Slippage Depends on RNA:DNA Hybrid Complementarity. mBio 2017, 8.
  14. Gragg, H.; Harfe, B.D.; Jinks-Robertson, S. Base composition of mononucleotide runs affects DNA polymerase slippage and removal of frameshift intermediates by mismatch repair in Saccharomyces cerevisiae. Mol. Cell. Biol. 2002, 22, 8756–8762.
  15. Guard, J.; Morales, C.A.; Fedorka-Cray, P.; Gast, R.K. Single nucleotide polymorphisms that differentiate two subpopulations of Salmonella enteritidis within phage type. BMC Res. Notes 2011, 4, 369.
  16. Grimont, P.; Weill, F.-X. Antigenic formulae of the Salmonella serovars, 9th Edition. In WHO Collaborating Centre for Reference and Research on Salmonella; World Health Organization: Paris, France, 2007.
  17. Guard-Bouldin, J.; Gast, R.K.; Humphrey, T.J.; Henzler, D.J.; Morales, C.; Coles, K. Subpopulation characteristics of egg-contaminating Salmonella enterica serovar Enteritidis as defined by the lipopolysaccharide O chain. Appl. Environ. Microbiol. 2004, 70, 2756–2763.
  18. Gantois, I.; Ducatelle, R.; Pasmans, F.; Haesebrouck, F.; Van Immerseel, F. The Salmonella Enteritidis lipopolysaccharide biosynthesis gene rfbH is required for survival in egg albumen. Zoonoses Public Health 2009, 56, 145–149.
  19. Parker, C.T.; Liebana, E.; Henzler, D.J.; Guard-Petter, J. Lipopolysaccharide O-chain microheterogeneity of Salmonella serotypes Enteritidis and Typhimurium. Environ. Microbiol. 2001, 3, 332–342.
  20. Huang, K.; Fresno, A.H.; Skov, S.; Olsen, J.E. Dynamics and Outcome of Macrophage Interaction Between Salmonella Gallinarum, Salmonella Typhimurium, and Salmonella Dublin and Macrophages From Chicken and Cattle. Front. Cell. Infect. Microbiol. 2019, 9, 420.
  21. Foley, S.L.; Nayak, R.; Hanning, I.B.; Johnson, T.J.; Han, J.; Ricke, S.C. Population dynamics of Salmonella enterica serotypes in commercial egg and poultry production. Appl. Environ. Microbiol. 2011, 77, 4273–4279.
  22. Branchu, P.; Bawn, M.; Kingsley, R.A. Genome Variation and Molecular Epidemiology of Salmonella enterica Serovar Typhimurium Pathovariants. Infect. Immun. 2018, 86.
  23. McMillan, E.A.; Gupta, S.K.; Williams, L.E.; Jove, T.; Hiott, L.M.; Woodley, T.A.; Barrett, J.B.; Jackson, C.R.; Wasilenko, J.L.; Simmons, M.; et al. Antimicrobial Resistance Genes, Cassettes, and Plasmids Present in Salmonella enterica Associated With United States Food Animals. Front. Microbiol. 2019, 10, 832.
  24. Desai, P.T.; Porwollik, S.; Long, F.; Cheng, P.; Wollam, A.; Bhonagiri-Palsikar, V.; Hallsworth-Pepin, K.; Clifton, S.W.; Weinstock, G.M.; McClelland, M. Evolutionary Genomics of Salmonella enterica Subspecies. mBio 2013, 4.
  25. Achtman, M.; Hale, J.; Murphy, R.A.; Boyd, E.F.; Porwollik, S. Population structures in the SARA and SARB reference collections of Salmonella enterica according to MLST, MLEE and microarray hybridization. Infect. Genet. Evol. 2013, 16, 314–325.
  26. Turcotte, C.; Woodward, M.J. Cloning, DNA nucleotide sequence and distribution of the gene encoding the SEF14 fimbrial antigen of Salmonella enteritidis. J. Gen. Microbiol. 1993, 139, 1477–1485.
  27. Clouthier, S.C.; Collinson, S.K.; Kay, W.W. Unique fimbriae-like structures encoded by sefD of the SEF14 fimbrial gene cluster of Salmonella enteritidis. Mol. Microbiol. 1994, 12, 893–901.
  28. Matthews, T.D.; Schmieder, R.; Silva, G.G.; Busch, J.; Cassman, N.; Dutilh, B.E.; Green, D.; Matlock, B.; Heffernan, B.; Olsen, G.J.; et al. Genomic Comparison of the Closely-Related Salmonella enterica Serovars Enteritidis, Dublin and Gallinarum. PLoS ONE 2015, 10, e0126883.
  29. Sayers, E.W.; Agarwala, R.; Bolton, E.E.; Brister, J.R.; Canese, K.; Clark, K.; Connor, R.; Fiorini, N.; Funk, K.; Hefferon, T.; et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2019, 47, D23–D28.
  30. McClelland, M.; Sanderson, K.E.; Spieth, J.; Clifton, S.W.; Latreille, P.; Courtney, L.; Porwollik, S.; Ali, J.; Dante, M.; Du, F.; et al. Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature 2001, 413, 852–856.
  31. Thomson, N.R.; Clayton, D.J.; Windhorst, D.; Vernikos, G.; Davidson, S.; Churcher, C.; Quail, M.A.; Stevens, M.; Jones, M.A.; Watson, M.; et al. Comparative genome analysis of Salmonella Enteritidis PT4 and Salmonella Gallinarum 287/91 provides insights into evolutionary and host adaptation pathways. Genome Res. 2008, 18, 1624–1637.
  32. Allard, M.W.; Luo, Y.; Strain, E.; Pettengill, J.; Timme, R.; Wang, C.; Li, C.; Keys, C.E.; Zheng, J.; Stones, R.; et al. On the evolutionary history, population genetics and diversity among isolates of Salmonella Enteritidis PFGE pattern JEGX01.0004. PLoS ONE 2013, 8, e55254.
  33. Du, D.; van Veen, H.W.; Murakami, S.; Pos, K.M.; Luisi, B.F. Structure, mechanism and cooperation of bacterial multidrug transporters. Curr. Opin. Struct. Biol. 2015, 33, 76–91.
  34. Guiney, D.G.; Fang, F.C.; Krause, M.; Libby, S. Plasmid-mediated virulence genes in non-typhoid Salmonella serovars. FEMS Microbiol. Lett. 1994, 124, 1–9.
  35. Fardsanei, F.; Nikkhahi, F.; Bakhshi, B.; Salehi, T.Z.; Tamai, I.A.; Dallal, M.M.S. Molecular characterization of Salmonella enterica serotype Enteritidis isolates from food and human samples by serotyping, antimicrobial resistance, plasmid profiling, (GTG)5-PCR and ERIC-PCR. New Microbes New Infect. 2016, 14, 24–30.
  36. Jelesic, Z.; Kulauzov, M.; Kozoderovic, G. Analysis of the plasmid profile of various Salmonella serotypes. Med. Pregl. 2000, 53, 564–567.
  37. Santangelo, T.J.; Cubonova, L.; Skinner, K.M.; Reeve, J.N. Archaeal intrinsic transcription termination in vivo. J. Bacteriol. 2009, 191, 7102–7108.
  38. Yamada, S.; Matsumoto, Y.; Takashima, Y.; Otsuka, H. Mutation hot spots in the canine herpesvirus thymidine kinase gene. Virus Genes 2005, 31, 107–111.
  39. Koscielniak, D.; Wons, E.; Wilkowska, K.; Sektas, M. Non-programmed transcriptional frameshifting is common and highly RNA polymerase type-dependent. Microb. Cell Fact. 2018, 17, 184.
  40. Silva, C.; Puente, J.L.; Calva, E. Salmonella virulence plasmid: Pathogenesis and ecology. Pathog. Dis. 2017, 75, ftx070.
  41. Brandis, G.; Cao, S.; Hughes, D. Co-evolution with recombination affects the stability of mobile genetic element insertions within gene families of Salmonella. Mol. Microbiol. 2018, 108, 697–710.
  42. Brandis, G.; Cao, S.; Hughes, D. Measuring Homologous Recombination Rates between Chromosomal Locations in Salmonella. Bio. Protoc. 2019, 9, e3159.
  43. Achtman, M.; Zhou, Z.; Alikhan, N.F.; Tyne, W.; Parkhill, J.; Cormican, M.; Chiou, C.S.; Torpdahl, M.; Litrup, E.; Prendergast, D.M.; et al. Genomic diversity of Salmonella enterica -The UoWUCC 10K genomes project. Wellcome Open Res. 2020, 5, 223.
  44. Alikhan, N.F.; Zhou, Z.; Sergeant, M.J.; Achtman, M. A genomic overview of the population structure of Salmonella. PLoS Genet. 2018, 14, e1007261.
  45. Dong, H.J.; Cho, S.; Boxrud, D.; Rankin, S.; Downe, F.; Lovchik, J.; Gibson, J.; Erdman, M.; Saeed, A.M. Single-nucleotide polymorphism typing analysis for molecular subtyping of Salmonella Tennessee isolates associated with the 2007 nationwide peanut butter outbreak in the United States. Gut Pathog. 2017, 9, 25.
  46. Wilson, M.R.; Brown, E.; Keys, C.; Strain, E.; Luo, Y.; Muruvanda, T.; Grim, C.; Beaubrun, J.J.-G.; Jarvis, K.; Ewing, L.; et al. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks. PLoS ONE 2016, 11, e0146929.
  47. Guard, J.; Shah, D.; Morales, C.A.; Call, D. Evolutionary trends associated with niche specialization as modeled by whole genome analysis of egg-contaminating Salmonella enterica serovar Enteritidis. In Salmonella: From Genome to Function; Porwollik, S., Ed.; Caister Academic Press: San Diego, CA, USA, 2011; pp. 91–106.
  48. Zhou, Z.; McCann, A.; Litrup, E.; Murphy, R.; Cormican, M.; Fanning, S.; Brown, D.; Guttman, D.S.; Brisse, S.; Achtman, M. Neutral genomic microevolution of a recently emerged pathogen, Salmonella enterica serovar Agona. PLoS Genet. 2013, 9, e1003471.
  49. Sangal, V.; Harbottle, H.; Mazzoni, C.J.; Helmuth, R.; Guerra, B.; Didelot, X.; Paglietti, B.; Rabsch, W.; Brisse, S.; Weill, F.X.; et al. Evolution and population structure of Salmonella enterica serovar Newport. J. Bacteriol. 2010, 192, 6465–6476.
  50. Park, C.J.; Andam, C.P. Distinct but Intertwined Evolutionary Histories of Multiple Salmonella enterica Subspecies. mSystems 2020, 5.
  51. Liu, Y.; Zhang, D.F.; Zhou, X.; Xu, L.; Zhang, L.; Shi, X. Comprehensive Analysis Reveals Two Distinct Evolution Patterns of Salmonella Flagellin Gene Clusters. Front. Microbiol. 2017, 8, 2604.
  52. Richards, A.K.; Hopkins, B.A.; Shariat, N.W. Conserved CRISPR arrays in Salmonella enterica serovar Infantis can serve as qPCR targets to detect Infantis in mixed serovar populations. Lett. Appl. Microbiol. 2020, 71, 138–145.
  53. Shariat, N.; Dudley, E. CRISPR Typing of Salmonella Isolates. Methods Mol. Biol. 2021, 2182, 39–44.
Figure 1. Locations of AT 8+mers in the genome of Salmonella enterica serovar Typhimurium LT2 NC_003197.2.
Figure 2. Ratios of AT homopolymers from 5 to 10 nucleotides in Salmonella enterica serovar Typhimurium LT2 NC_003197.2. The ratio of AT homopolymer kmers, either adenine or thymine but not mixed, to GC homopolymers was determined using Geneious software as described in the text. The range in number of nucleotides per kmer searched was 5 to 10 (see legend label). Results show that a nucleotide motif of 8 was the most commonly encountered, and that approximately 27 AT homopolymers were found for every 1 GC homopolymer in the reference sequence of S. enterica LT2 NC_003197.2.
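The kind of count summarized in Figure 2 can be illustrated with a minimal sketch. The authors used Geneious; the regular-expression approach below is a substitute chosen for illustration, not their workflow, and the function names `count_homopolymers` and `at_gc_counts` are hypothetical. It assumes a single-strand nucleotide string and counts each maximal run of one base (A or T, never mixed; likewise G or C) that reaches a minimum length.

```python
import re

def count_homopolymers(seq, bases, min_len=8):
    """Count maximal single-base runs of at least min_len for each base in `bases`.

    Illustrative helper (not from the article): 'AT' counts runs of A and
    runs of T separately, so a mixed stretch such as ATATATAT is not counted.
    """
    # Builds a pattern such as "A{8,}|T{8,}"; the greedy quantifier makes
    # each maximal run match exactly once.
    pattern = re.compile("|".join(f"{b}{{{min_len},}}" for b in bases))
    return sum(1 for _ in pattern.finditer(seq.upper()))

def at_gc_counts(seq, min_len=8):
    """Return (AT homopolymer count, GC homopolymer count) for one sequence."""
    return (count_homopolymers(seq, "AT", min_len),
            count_homopolymers(seq, "GC", min_len))

# Toy sequence: one A run of 8, one T run of 9, one G run of 9,
# and a mixed AT stretch that must not be counted.
toy = "ACGT" + "A" * 8 + "C" + "T" * 9 + "G" * 9 + "AT" * 5
at_count, gc_count = at_gc_counts(toy)
print(at_count, gc_count)
```

Dividing the AT count by the GC count for each value of `min_len` from 5 to 10 would give ratios of the kind plotted in the figure, provided the GC count is nonzero.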
Table 1. Expected versus observed occurrence of homopolymer strings of 8 and more nucleotides in genomes of Bacteria.
| Phylum | Genus Species 1 | Other Genome Information | Number of Genomes Analyzed | Characteristics | Genome Size (bp) | Common Denominator (nt) 2 | Expected Number of 8+kmers | Observed AT 8+mers | Observed GC 8+mers | Observed vs. Expected AT 8+mers 3 | Observed vs. Expected GC 8+mers 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Proteobacteria | Salmonella enterica | Typhimurium | 12 | Average | 4,890,448 | 16,299 | 300.0 | 332.6 | 17.2 | 1.11 | 0.06 |
| | | | | stdev | 50,356 | --- | 3.1 | 13.0 | 3.6 | --- | --- |
| Proteobacteria | Salmonella enterica | Enteritidis | 12 | Average | 4,686,462 | 16,299 | 287.5 | 323.7 | 21.5 | 1.13 | 0.07 |
| | | | | stdev | 20,384 | --- | 1.3 | 10.5 | 4.2 | --- | --- |
| Proteobacteria | Salmonella enterica | Typhi | 12 | Average | 4,770,414 | 16,299 | 292.7 | 316.9 | 29.5 | 1.08 | 0.10 |
| | | | | stdev | 60,270 | --- | 3.7 | 5.9 | 3.2 | --- | --- |
| Proteobacteria | Salmonella enterica | mixed | 12 | Average | 4,713,701 | 16,299 | 289.2 | 315.6 | 17.2 | 1.09 | 0.06 |
| | | | | stdev | 80,652 | --- | 4.9 | 13.9 | 4.4 | --- | --- |
| Proteobacteria | Escherichia coli | --- | 12 | Average | 5,087,133 | 16,299 | 312.1 | 281.9 | 18.3 | 0.90 | 0.06 |
| | | | | stdev | 262,098 | --- | 16.1 | 30.2 | 6.5 | --- | --- |
| Proteobacteria | Proteus mirabilis | --- | 3 | Average | 4,124,431 | 16,299 | 253.0 | 712.7 | 15.7 | 2.82 | 0.06 |
| | | | | stdev | 83,305 | --- | 5.1 | 42.0 | 2.1 | --- | --- |
| Proteobacteria | Shigella sonnei | --- | 3 | Average | 4,929,599 | 16,299 | 302.4 | 261.3 | 11.7 | 0.86 | 0.04 |
| | | | | stdev | 90,607 | --- | 5.6 | 8.4 | 3.5 | --- | --- |
| Proteobacteria | Yersinia pseudotuberculosis | --- | 3 | Average | 4,802,245 | 16,299 | 294.6 | 429.3 | 120.3 | 1.46 | 0.41 |
| | | | | stdev | 118,706 | --- | 7.3 | 2.3 | 3.8 | --- | --- |
| Proteobacteria | Vibrio vulnificus | chromosome I | 3 | Average | 3,330,104 | 16,299 | 204.3 | 180.0 | 10.3 | 0.88 | 0.05 |
| | | | | stdev | 79,423 | --- | 4.9 | 14.0 | 4.9 | --- | --- |
| Proteobacteria | Vibrio vulnificus | chromosome II | 3 | Average | 1,756,668 | 16,299 | 107.8 | 90.0 | 3.3 | 0.83 | 0.03 |
| | | | | stdev | 87,177 | --- | 5.3 | 7.9 | 3.1 | --- | --- |
| Firmicutes | Staphylococcus aureus | --- | 3 | Average | 2,948,373 | 16,299 | 180.9 | 108.3 | 0.0 | 0.60 | 0.00 |
| | | | | stdev | 114,371 | --- | 7.0 | 10.6 | 0.0 | --- | --- |
| Firmicutes | Streptococcus pyogenes | --- | 3 | Average | 1,895,707 | 16,299 | 116.3 | 263.7 | 0.3 | 2.27 | 0.00 |
| | | | | stdev | 42,370 | --- | 2.6 | 15.4 | 0.6 | --- | --- |
| Firmicutes | Enterococcus faecalis | --- | 3 | Average | 3,090,387 | 16,299 | 189.6 | 649.7 | 2.0 | 3.42 | 0.01 |
| | | | | stdev | 117,259 | --- | 7.2 | 84.1 | 3.5 | --- | --- |
| Firmicutes | Bacillus anthracis | --- | 3 | Average | 5,228,732 | 16,299 | 320.8 | 432.0 | 1.3 | 1.35 | 0.00 |
| | | | | stdev | 1349 | --- | 0.1 | 11.5 | 0.6 | --- | --- |
| Firmicutes | Bacillus cereus | --- | 3 | Average | 5,406,060 | 16,299 | 331.7 | 700.3 | 13.0 | 2.11 | 0.04 |
| | | | | stdev | 16,615 | --- | 1.0 | 53.7 | 6.1 | --- | --- |
1 Genomes included in the analysis are listed in Supplementary Table S1 with NCBI accession numbers.
2 The common denominator of 16,299 nucleotides (nt), used to normalize variation in genome size, was obtained from Salmonella enterica subspecies I serotype Typhimurium LT2 (NC_003197.2) as described in the text.
3 Values greater than one indicate that more motifs were observed than expected; values less than one indicate that fewer were observed than expected.
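The normalization in Table 1 can be checked with a short calculation. The sketch below assumes, based on the tabulated values rather than on a formula stated here, that the expected number of 8+mers is the genome size divided by the common denominator of 16,299 nt, and that the observed-versus-expected value is the observed count divided by that expectation. The function names are hypothetical.

```python
# Common denominator (nt) reported in Table 1 for S. Typhimurium LT2.
COMMON_DENOMINATOR_NT = 16_299

def expected_8mers(genome_size_bp, denominator=COMMON_DENOMINATOR_NT):
    """Expected number of 8+mers, assuming one motif per `denominator` nt."""
    return genome_size_bp / denominator

def observed_vs_expected(observed, genome_size_bp,
                         denominator=COMMON_DENOMINATOR_NT):
    """Ratio above 1 means more motifs observed than expected; below 1, fewer."""
    return observed / expected_8mers(genome_size_bp, denominator)

# Average values from the Typhimurium row of Table 1.
genome_size = 4_890_448
print(round(expected_8mers(genome_size), 1))               # expected 8+mers
print(round(observed_vs_expected(332.6, genome_size), 2))  # AT 8+mers
print(round(observed_vs_expected(17.2, genome_size), 2))   # GC 8+mers
```

Under that assumption, the Typhimurium averages reproduce the tabulated 300.0 expected motifs and the ratios of 1.11 (AT) and 0.06 (GC).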
Table 2. AT 8+mer motifs associated with gene disruption or altered regulatory regions in S. enterica serovars Typhimurium, Enteritidis, and Gallinarum 1.
| STM Gene Accession | SEEG Gene Accession | SEN Gene Accession | AT 8+mer Sequence | Common Name of Gene | Description of Target Gene | Gene Function | Biological Process |
|---|---|---|---|---|---|---|---|
| no homolog | SEEG9184_21515 | SEN_RS22080 | conserved (3 locations) | sefC | SEEG pseudogene | Has 3 AT 8mers in sequence; outer membrane fimbrial protein SefC, pseudo in SG, which has an extra A/T to make a 7mer | adhesion |
| STM0071 | SEEG9184_20585 | SEN_RS00360 | conserved | caiC | SEEG pseudogene | crotonobetaine/carnitine-CoA ligase | transporter |
| STM0858 | SEEG9184_16520 | SEN_RS04155 | conserved | unnamed | SEEG pseudogene | electron transfer flavoprotein-ubiquinone oxidoreductase | transporter |
| STM2020 | SEEG9184_10295 | SEN_RS10515 | conserved | cbiO | SEEG pseudogene | cobalt transport ATP-binding protein CbiO: B12 synthesis associated? Truncated, maybe shorter product? | transporter |
| STM2241 | SEEG9184_09260 | SEN_RS11570 | conserved | sspH2 | SEEG pseudogene | E3 ubiquitin-protein ligase; induced by the SPI-2 regulatory ssrA/B | virulence factor |
| STM2274 | SEEG9184_09070 | SEN_RS11735 | conserved | unnamed | SEEG pseudogene | MFS transporter | transporter |
| STM2691 | SEEG9184_07115 | SEN_RS13585 | conserved | unnamed | SEEG pseudogene | type I secretion system permease/ATPase: TolC family OMP | virulence factor |
| STM3658 | SEEG9184_00930 | SEN_RS18105 | conserved | yiaH | SEEG pseudogene | acetyltransferase | biosynthesis |
| STM1054 | SEEG9184_09275 | SEN_RS23040 | conserved | unnamed | SEN pseudogene, SEEG 78 bp tRNA region | Gifsy-2 prophage protein in STM: GC rich region has a 7 bp deletion in SEN in a guanine-rich fragment, causing a frameshift; homology in SEEG to tRNA-pro | phage associated |
| STM4039 | SEEG9184_23265 | SEN_RS24515 | conserved | unnamed | SEN, SEEG pseudogene | HO protein | inner membrane protein |
| STM1666 | no homolog | SEN_RS07090 | conserved | unnamed | STM pseudogene, SEEG absent | STM has in-frame stop following codon 24; SEN, hypothetical protein | unknown |
| STM1550 | no homolog | SEN_RS07790 | Deletion of 154 bp in SEN | unnamed | SEN pseudogene, SEEG absent | type II toxin-antitoxin system mRNA interferase toxin | cellular detoxification |
| STM0341 | SEEG9184_19090&19095 | SEN_RS01660 | SEEG 1 bp deletion | unnamed | SEEG pseudogene or split into two genes | STM and SEN, putative inner membrane protein; SEEG, 2 transmembrane regulators | inner membrane protein |
| STM1130 | SEEG9184_15700 | SEN_RS05135 | SEEG 1 bp deletion | unnamed | SEEG pseudogene | N-acetylneuraminic acid mutarotase | metabolism |
| STM1602 | SEEG9184_13085 | SEN_RS07535 | SEEG 1 bp deletion | sifB | SEEG pseudogene | effector protein SifB | virulence factor |
| STM1698 | SEEG9184_13605 | SEN_RS06925 | SEEG 1 bp deletion | steC | SEEG pseudogene | secreted effector kinase SteC | virulence factor |
| STM1869 | SEEG9184_14565 | SEN_RS06020 | SEEG 1 bp deletion | unnamed | SEEG pseudogene | HO protein | phage associated |
| STM1939 | SEEG9184_15200 | SEN_RS05535 | SEEG 1 bp deletion | unnamed | SEEG pseudogene | putative glucose-6-phosphate dehydrogenase | metabolism |
| STM2129 | SEEG9184_09800 | SEN_RS11065 | SEEG 1 bp deletion | yegB | SEEG pseudogene | multidrug transporter subunit MdtD | transporter |
| no homolog | SEEG9184_21510 | SEN_RS22085 | SEEG 1 bp substitution | sefD | SEEG pseudogene | adhesin and master global regulator of phase transition; often mutated in SEN due to 1 bp deletion in adenine homopolymer 8mer | regulon, adhesion |
| STM1941 | SEEG9184_15210 | SEN_RS05525 | SEEG, SEN 1 bp deletion | unnamed | SEN, SEEG pseudogene | HO protein | inner membrane protein |
| STM3674 | SEEG9184_00845 | SEN_RS18190 | SEEG, SEN 1 bp substitution | lyxK | SEEG pseudogene | carbohydrate kinase | metabolism |
1 Reference genomes are S. Typhimurium NC_003197.2 (STM), S. Enteritidis NC_011294.1 (SEN), and S. Gallinarum CP019035.1 (SEEG).
Table 3. AT 8+mer motif variation in 3 serotypes of Salmonella enterica subspecies I.
| AT-mer Size | Typhimurium (NC_003277.2; pSLT): Description of Loci | Enteritidis (NZ_CP063701.1; pSENV): Variation from pSLT | Gallinarum (CM001154.1; str. SG9): Variation from pSLT | pSLT Start | pSLT End | Gene Function |
|---|---|---|---|---|---|---|
| 9mer | Intergenic (PSLT039-PSLT038) | 1 nt deletion | 1 nt deletion | 27961 | 27969 | spvB-spvC; 58 nt upstream from spvC start |
| 10mer | Intergenic (PSLT041-PSLT042) | 1 nt deletion | 2 nt deletion | 32324 | 32333 | spvR-PSLT041: 10 nt upstream from spvR start |
| 9mer | PSLT076 | present | absent | 62299 | 62307 | traY: conjugative transfer: oriT nicking |
| 8mer | PSLT088 | absent | present | 69093 | 69100 | traC: conjugative transfer: assembly |
| 8mer | PSLT102 | absent | absent | 82835 | 82842 | traS: conjugative transfer: surface exclusion |
| 8mer | PSLT111 | truncated | truncated | 93512 | 93519 | finO: conjugative transfer: regulation |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Guard, J.; Rivers, A.R.; Vaughn, J.N.; Rothrock, Jr., M.J.; Oladeinde, A.; Shah, D.H. AT Homopolymer Strings in Salmonella enterica Subspecies I Contribute to Speciation and Serovar Diversity. Microorganisms 2021, 9, 2075. https://doi.org/10.3390/microorganisms9102075