Microbial or Other Small Genome Sequencing
Thanks to powerful next-generation sequencing (NGS) technologies, it has never been easier to sequence whole genomes de novo or to re-sequence them. One of Microsynth’s core competencies is the sequencing of small microbial genomes and the profiling of microbial communities. Besides offering its customers state-of-the-art NGS platforms, Microsynth has also made significant investments in bioinformatics. This application note gives an overview of Microsynth’s bioinformatics tools and services for microbial re-sequencing and their possible impact on your research. In brief, Microsynth offers a basic bioinformatics service that detects small InDels and substitutions and their effects on proteins, as well as a more advanced service capable of finding even difficult-to-detect large insertions and deletions (see also the application note "Microbial Illumina Resequencing").
Re-sequencing including Basic Bioinformatics Analysis
Sequencing: DNA quality check, library preparation and MiSeq sequencing are performed on either genomic DNA isolated by the client or DNA isolated at Microsynth.
Mapping: Sequencing reads are mapped to the reference sequence/genome; this step builds the foundation for the following analyses.
Variant Calling: Possible small insertions, deletions and substitutions are detected and reported in the VCF format.
Annotation of Variants: The called variants are summarized in a user-friendly graphic showing the major findings for all investigated samples (e.g. bacterial strains). For variants that occur within protein-coding regions, the impact on the translated amino acid sequence is shown. As a consequence, each detected mutation can be classified as a silent, missense or nonsense mutation.
Provided Output Files: Raw data: FASTQ; Mapping: BAM files (see Fig. 1); Variant calling, protein consequences: VCF (for each sample separately) and HTML (includes all samples, see Fig. 2).
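To illustrate how a substitution inside a protein-coding region can be classified as silent, missense or nonsense, here is a minimal Python sketch. It is not Microsynth's annotation pipeline: the function name classify_substitution and its inputs are hypothetical, the coding sequence is assumed to lie on the forward strand and to start at the first base of a codon, and the standard genetic code is assumed. Loss of a stop codon is not handled separately and would be reported as missense.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed for codons enumerated in
# TCAG order (first base varies slowest, third base fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(cds, pos, alt):
    """Classify a single-base substitution in a coding sequence.

    cds: forward-strand coding sequence, in frame (assumption)
    pos: 0-based position of the substituted base within cds
    alt: the alternative base observed in the sample
    """
    start = (pos // 3) * 3            # first base of the affected codon
    offset = pos % 3                  # position of the change within the codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"               # same amino acid is encoded
    if alt_aa == "*":
        return "nonsense"             # premature stop codon
    return "missense"                 # a different amino acid is encoded


cds = "ATGGAAAAATAA"                  # ATG GAA AAA TAA (toy example)
print(classify_substitution(cds, 5, "G"))   # GAA -> GAG
print(classify_substitution(cds, 3, "A"))   # GAA -> AAA
print(classify_substitution(cds, 6, "T"))   # AAA -> TAA
```

In practice the position and alternative base would come from the POS and ALT columns of a VCF record, translated into coordinates relative to the annotated gene.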
Re-sequencing including Advanced Bioinformatics Analysis
The advanced bioinformatics analysis includes the results from the basic analysis and additionally summarizes all observations providing evidence for possible large deletions and insertions. Such mutations are difficult to detect using standard bioinformatics tools and often need additional experiments for confirmation.
Large InDels: Possible large insertions and deletions are detected using a breakpoint identification algorithm. Regions are reported in a table visualizing the read alignment at the position of the candidate InDel.
Unmapped Reads: Reads that could not be mapped to the reference sequence are assembled de novo, and the resulting contigs are searched with BLAST against NCBI's nucleotide database. This analysis provides useful information about sample contaminants (e.g. plasmids, phages) that are not part of the reference sequence. In addition, large InDels can often be recovered in the de novo assembly.
Provided Output Files: Large InDels: HTML (see Fig. 3); Unmapped Reads: FASTA (assembled contigs), NCBI BLAST hits.
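The idea behind breakpoint identification can be sketched with one common heuristic: reads that span the edge of a large insertion or deletion often align only partially, and the unaligned part is soft-clipped (operation S in the SAM CIGAR string). Positions where several reads are clipped at the same reference coordinate are candidate breakpoints. The Python sketch below is an illustration under that assumption and does not describe the algorithm Microsynth uses; the function names and the min_reads threshold are hypothetical.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_positions(pos, cigar):
    """Return 1-based reference positions at which a read is soft-clipped.

    pos: 1-based leftmost aligned reference position (SAM POS field)
    cigar: CIGAR string of the alignment
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    ops = [o for o in ops if o[1] != "H"]      # ignore hard clips
    if len(ops) < 2:
        return []                              # nothing clipped
    # Operations that consume reference bases determine the alignment end.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    positions = []
    if ops[0][1] == "S":
        positions.append(pos)                  # clipped on the left side
    if ops[-1][1] == "S":
        positions.append(pos + ref_len - 1)    # clipped on the right side
    return positions


def candidate_breakpoints(alignments, min_reads=3):
    """Count clip positions and keep those supported by >= min_reads reads."""
    counts = Counter()
    for pos, cigar in alignments:
        counts.update(clip_positions(pos, cigar))
    return {p: c for p, c in sorted(counts.items()) if c >= min_reads}


# Toy alignments as (POS, CIGAR) pairs; three reads are clipped at position 149.
alignments = [(100, "50M25S"), (110, "40M35S"), (120, "30M45S"),
              (150, "20S55M"), (10, "75M")]
print(candidate_breakpoints(alignments))
```

The (POS, CIGAR) pairs would normally be read from the BAM files listed above. As noted earlier, candidates found this way still often need additional experiments for confirmation.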
Examples of Output Files Provided by Microsynth