Other criteria for the blastx comparison were tested, but we observed no significant difference in the results after the subsequent filters. Candidates with some of their best hits in stramenopiles in addition to bacteria were also retained, since some HGTs may be shared between stramenopiles, and genes for which orthologs were identified in non-stramenopile species were discarded. The evolutionary origin of the candidate genes was then investigated using phylogenetic approaches. For each gene, homologues were retrieved from the protein nr database using Blastp. The sequences were aligned using Muscle 3.6. The resulting alignments were visually inspected and manually refined using the MUST software. Ambiguously aligned regions were removed prior to phylogenetic analysis.
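The candidate retention rule stated at the start of this section can be sketched as a small predicate. This is only an illustration under stated assumptions: the taxon labels, the function name and the boolean ortholog flag are hypothetical, and the source does not describe how best hits or orthologs were actually encoded.

```python
from typing import Iterable

# Hypothetical taxon label; the source does not say how blastx hits were labelled.
BACTERIA = "bacteria"


def keep_candidate(best_hit_groups: Iterable[str],
                   has_non_stramenopile_ortholog: bool) -> bool:
    """Return True if a gene stays in the HGT candidate list.

    best_hit_groups: taxonomic groups observed among the gene's best hits
        (assumed labels such as "bacteria" or "stramenopiles").
    has_non_stramenopile_ortholog: True if an ortholog was identified in a
        non-stramenopile species (assumed to be determined upstream).
    """
    groups = set(best_hit_groups)
    # Genes with orthologs in non-stramenopile species are discarded.
    if has_non_stramenopile_ortholog:
        return False
    # A bacterial best hit is required; additional stramenopile hits are
    # tolerated because some HGTs may be shared between stramenopiles.
    return BACTERIA in groups


if __name__ == "__main__":
    print(keep_candidate(["bacteria", "stramenopiles"], False))
    print(keep_candidate(["bacteria"], True))
```

The predicate only encodes the two conditions given in the text; the phylogenetic confirmation that follows is a separate step.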
Maximum likelihood phylogenetic tree reconstructions were carried out on the remaining positions using PhyML with the Le and Gascuel model with a gamma correction to take into account evolutionary rate variation among sites. Tree robustness was estimated by a non-parametric bootstrap approach using PhyML and the same parameters with 100 replicates of the original dataset. Bayesian phylogenetic trees were also reconstructed using MrBayes version 3.1.2. We used a mixed model of amino acid substitution and a gamma distribution to take into account site rate variation. MrBayes was run with four chains for 1 million generations and trees were sampled every 100 generations. To construct the consensus tree, the first 1,500 trees were discarded as burn-in. The candidates with clear eukaryotic origin were then discarded.
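The MrBayes sampling settings above imply a simple count of trees contributing to the consensus. The following sketch works through that arithmetic; the function name is hypothetical, and it assumes one sampled tree per sampling interval, ignoring the starting tree and any second independent run.

```python
def retained_tree_samples(generations: int, sample_every: int, burn_in: int) -> int:
    """Number of sampled trees left after discarding the burn-in.

    Assumes exactly one tree is sampled every `sample_every` generations
    and that the initial (generation 0) tree is not counted.
    """
    if sample_every <= 0:
        raise ValueError("sample_every must be positive")
    sampled = generations // sample_every
    if burn_in > sampled:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return sampled - burn_in


if __name__ == "__main__":
    # Settings reported in the text: 1 million generations, sampling every
    # 100 generations, first 1,500 trees discarded as burn-in.
    print(retained_tree_samples(1_000_000, 100, 1_500))
```

Under these assumptions, 10,000 trees are sampled and 8,500 remain for the consensus, so the burn-in corresponds to 15% of the samples.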
This process provided 133 candidate genes. These candidates contain a high proportion of monoexonic genes compared to the average proportion of monoexonic genes in Blastocystis sp.

Protein domain analysis

InterProScan was run against all C. merolae, P. sojae, T. pseudonana and Blastocystis sp. proteins. Matches that fulfilled the following criteria were retained: the match was tagged as true positive by InterProScan, and the match had an e-value of at most 10^-1. A total of 2,305 InterPro domains were found in Blastocystis sp., corresponding to 4,096 proteins.

Functional annotation

Enzyme annotation

Enzyme detection in predicted Blastocystis sp. proteins was performed with PRIAM, using the PRIAM July 2006 Enzyme release. A total of 428 different EC numbers, corresponding to enzyme domains, are associated with 1,140 Blastocystis sp.
proteins. Therefore, about 19% of Blastocystis sp. proteins contain at least one enzymatic domain.

Association of metabolic pathways with enzymes and Blastocystis sp.

Potential metabolic pathways were deduced from EC numbers using the KEGG pathway database. Links between EC numbers and metabolic pathways were obtained from the KEGG website. Using this file and the PRIAM results, 906 Blastocystis sp. proteins were assigned to 201 pathways.
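The assignment of proteins to pathways through EC numbers amounts to joining two mappings. A minimal sketch follows, assuming the PRIAM results have been parsed into a protein-to-EC mapping and the KEGG link file into an EC-to-pathway mapping; the function names, protein identifiers and pathway identifiers are hypothetical toy values, and the actual file formats are not described in the source.

```python
from typing import Dict, Set, Tuple


def assign_pathways(protein_to_ec: Dict[str, Set[str]],
                    ec_to_pathways: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each protein to the pathways reachable through its EC numbers.

    Proteins whose EC numbers have no pathway link are left out.
    """
    protein_to_pathways: Dict[str, Set[str]] = {}
    for protein, ec_numbers in protein_to_ec.items():
        pathways: Set[str] = set()
        for ec in ec_numbers:
            pathways |= ec_to_pathways.get(ec, set())
        if pathways:
            protein_to_pathways[protein] = pathways
    return protein_to_pathways


def summarize(protein_to_pathways: Dict[str, Set[str]]) -> Tuple[int, int]:
    """Return (number of assigned proteins, number of distinct pathways)."""
    distinct = set().union(*protein_to_pathways.values())
    return len(protein_to_pathways), len(distinct)


if __name__ == "__main__":
    # Toy data only; identifiers are placeholders, not real annotations.
    priam = {
        "prot_A": {"1.1.1.1"},
        "prot_B": {"1.1.1.1", "2.7.1.1"},
        "prot_C": {"9.9.9.9"},  # EC number with no pathway link
    }
    kegg_links = {
        "1.1.1.1": {"pathway_1", "pathway_2"},
        "2.7.1.1": {"pathway_2", "pathway_3"},
    }
    print(summarize(assign_pathways(priam, kegg_links)))
```

Proteins with an EC number but no linked pathway drop out of the result, which is one plausible reason why fewer proteins are assigned to pathways (906) than carry an EC number (1,140).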