Dendrograms on the left are derived from Figure 3a (branch lengths do not represent inferred distances). Detected orthologs are only present in the genomes in bold. Arrows in black represent genes in an OG of the highlighted pattern and grey arrows represent other genes nearby in
the genome. Blue lines linking genes indicate inferred orthology. Gene numbers correspond to the last part of the original gene names. Numbers in colours other than black indicate genes with products putatively secreted (red) or with transmembrane domains (green). The clusters are (a) one including a wrongly annotated pathogenicity-related gene (yapH) and a phage gene (Φ-hk97); and (b) one possibly related to the type IV secretion system.

The second cluster (Figure 5b) is present in XamC and Xfa0 but not in Xfa1, despite the high genome-wide similarity between Xfa1 and Xfa0 (Figure 2a). The classification of putative homologs of the genes in this cluster (see methods) revealed that it is mainly composed of sequences similar to proteins in Escherichia coli, Siphoviridae, Stenotrophomonas sp. SKA14, Salmonella enterica and
Pseudomonas aeruginosa (Additional file 5). Moreover, members of the Siphoviridae viral family are known to be Pseudomonas and Xanthomonas phages, suggesting the presence of virus-mediated LGT. We cannot attribute the pattern to the mixture of chromosomal and plasmid DNA in draft genomes (XamC and Xfa0), because none of the sequences showed similarity to genes in Xanthomonas plasmids. Note that the gene at the locus XAUC_17260_1
(Xfa0:1726 in Figure 5b) was originally annotated as yapH, but its product is a large protein of 1231 aa in Xfa0 and 1482 aa in XamC, putatively xenologous with a component of a phage tail (group COG4733 in the COG database). Two genes in the cluster (XamCg00977 and XamCg00978) presented a G+C content more than one standard deviation below the mean of the coding sequences in the XamC genome (i.e., 64.82 ± 3.31%), and a low CAI with respect to the whole set of predicted coding sequences (0.516 and 0.486, respectively). The other seven genes in the cluster presented average features, which would have precluded their identification as units potentially under LGT.

Discussion

The results of the genome-based phylogenetic reconstruction suggest that certain changes should be considered in the nomenclature of the Xanthomonas genus. For instance, X. fuscans was recently proposed as a new species [27], but here we show that it should be considered a later heterotypic synonym of X. citri, as previously suggested [18, 31]. Other clades in the standing bacterial nomenclature [63] within the Xanthomonas genus were consistent with the phylogenetic reconstruction. Nevertheless, we observed paraphyly in the genus Xanthomonas when Xylella fastidiosa was included, with X. albilineans outside the Xanthomonas group. Our results suggest that X.