Small and microRNA Sequencing

Introduction
microRNAs (miRNAs) are a class of small non-coding RNAs, typically 21-23 nt long, found in plants, animals, and some viruses (Figure 1). miRNAs play a pivotal role in RNA silencing and post-transcriptional regulation of gene expression. Since their discovery in the early 1990s, research has revealed (i) multiple roles for miRNAs in development, (ii) disease-associated aberrant expression of miRNAs, and (iii) the importance of miRNAs in many other biological processes. Next generation sequencing (NGS) technologies have become a powerful tool to study genome-wide miRNA expression patterns and have helped to identify disease associations and miRNA isoforms and to discover previously uncharacterized miRNAs.
Figure 1. Typical structure of two precursor miRNAs (pre-miRNAs), showing their characteristic stem-loop structure. The primary miRNA transcript, which varies between 500 and 3000 nt in length, is processed by RNase III and the dsRNA-binding protein, resulting in a 70-80 nt long pre-miRNA. The pre-miRNA is then actively transported from the nucleus to the cytoplasm, where it is further processed by the protein Dicer, resulting in mature 17-23 nt long miRNAs.

Microsynth Competences and Services
Experimental Design:
As an expert in the area of miRNA-Seq, Microsynth provides a one-stop service from experimental design consulting through to bioinformatics analysis. If you do not involve Microsynth in your experimental design, please consider the importance of the number of biological replicates. We usually advise including at least 3 biological replicates per condition so that the differential miRNA expression analysis has enough statistical power to detect significant changes.
RNA Isolation: You can either leave the isolation to Microsynth or use a commercial kit to isolate the total RNA used for the Illumina miRNA-Seq protocol.
Library Preparation and Sequencing: Following a quality check of your total RNA samples, Microsynth will perform miRNA enrichment. An Illumina cDNA library is generated by reverse transcription, including specific sequencing adaptors with barcodes. Finally, the libraries are pooled and sequenced on the Illumina machine. The envisaged number of reads per library depends on the organism under study and the desired sensitivity. For higher eukaryotic species (e.g. human, rat, mouse), approx. 5-15 million reads are usually required, depending on whether complex tissues or a single cell type are analyzed.
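The read requirement per library determines how many barcoded libraries can share one pooled run. The following Python sketch illustrates that arithmetic only. The run output of 400 million reads and the 90% usable fraction are hypothetical placeholders, not Microsynth or Illumina specifications, and the function name is chosen for illustration.

```python
def libraries_per_run(run_output_reads: int, reads_per_library: int,
                      usable_percent: int = 90) -> int:
    """Return how many pooled libraries fit into one sequencing run.

    usable_percent is an assumed share of reads that survive
    demultiplexing and quality filtering (hypothetical default).
    """
    if reads_per_library <= 0:
        raise ValueError("reads_per_library must be positive")
    if not 0 < usable_percent <= 100:
        raise ValueError("usable_percent must be in the range 1-100")
    # Integer arithmetic keeps the result exact for large read counts.
    usable_reads = run_output_reads * usable_percent // 100
    return usable_reads // reads_per_library


# Hypothetical run output; the 5 and 15 million targets come from the text.
RUN_OUTPUT = 400_000_000
for target in (5_000_000, 15_000_000):
    n = libraries_per_run(RUN_OUTPUT, target)
    print(f"{target:>11,} reads per library: {n} libraries per run")
```

Under these assumptions, moving from the lower to the upper end of the recommended range reduces the number of libraries per run to a third, which is why the desired sensitivity should be fixed before pooling is planned.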

Bioinformatics Analysis: The analysis pipeline at Microsynth addresses three main questions: (i) what is the distribution of miRNAs and which of them are novel, (ii) which pathways are influenced by the miRNAs and in which way, and (iii) which miRNAs are differentially expressed. The first step of the analysis is based on the sequence data itself. In short, the sequence data are quality filtered and clustered for each condition of the experiment. A representative sequence of each cluster is then compared against the miRBase database using UBLAST to identify known miRNAs. Sequence clusters without a significant hit may be regarded as putative novel miRNAs. In the second step of the analysis, the quality-filtered reads are mapped against the reference genome using STAR. HOMER is then employed to find miRNA peaks and motifs and to annotate them exhaustively (e.g. proximity to genes and gene ontology). This in-depth annotation is, however, only supported for a limited set of model organisms (e.g. human, mouse, zebrafish). Finally, differentially expressed miRNAs are identified using DESeq2.
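The logic of the first analysis step (filter, cluster, look up, flag the remainder as putative novel) can be sketched in a few lines of Python. This is a simplified illustration, not the Microsynth pipeline: identical reads are collapsed instead of being clustered by similarity, an exact dictionary lookup stands in for the UBLAST search against miRBase, and the length and quality thresholds, the function names, and the entry "toy-miR-A" are all assumptions made for the example.

```python
from collections import Counter


def passes_filter(seq, quals, min_len=17, max_len=25, min_mean_q=20):
    """Keep reads of plausible miRNA length with an acceptable mean Phred score.

    All three thresholds are assumed values for illustration.
    """
    if not quals or len(quals) != len(seq):
        return False
    if not (min_len <= len(seq) <= max_len):
        return False
    return sum(quals) / len(quals) >= min_mean_q


def collapse_reads(reads):
    """Collapse identical reads into counts (a stand-in for clustering)."""
    counts = Counter()
    for seq, quals in reads:
        if passes_filter(seq, quals):
            # Sequencers report cDNA; convert to the RNA alphabet.
            counts[seq.upper().replace("T", "U")] += 1
    return counts


def split_known_and_novel(counts, known):
    """Exact lookup replaces the UBLAST search against miRBase in this sketch."""
    known_hits, putative_novel = {}, {}
    for seq, n in counts.items():
        if seq in known:
            known_hits[known[seq]] = n
        else:
            putative_novel[seq] = n
    return known_hits, putative_novel


# Hypothetical reference entry and reads, each read as (sequence, Phred scores).
KNOWN = {"UGAGGUAGUAGGUUGUAUAGUU": "toy-miR-A"}
reads = [
    ("TGAGGTAGTAGGTTGTATAGTT", [35] * 22),
    ("TGAGGTAGTAGGTTGTATAGTT", [30] * 22),
    ("ACGTACGTACGTACGTACGTA", [32] * 21),   # no match: putative novel
    ("ACGT", [35] * 4),                     # rejected: too short
    ("TGAGGTAGTAGGTTGTATAGTT", [10] * 22),  # rejected: low quality
]

counts = collapse_reads(reads)
known_hits, putative_novel = split_known_and_novel(counts, KNOWN)
print("known:", known_hits)
print("putative novel:", putative_novel)
```

A similarity search such as UBLAST additionally tolerates mismatches and length variants, so sequences that this exact lookup would flag as novel may in practice be isoforms of known miRNAs.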
Provided Output Files: See examples below

Examples of the Most Important Output Files Provided by Microsynth


Figure 2. Exemplary extract of the comparison of cluster representative sequences (query id) against the miRBase database (subject id) using UBLAST. Clusters that show no significant UBLAST hit can be regarded as putative novel miRNAs.


Figure 3. Exemplary extract of the de novo motif search results. Given its p-value and false discovery rate, motif 5 (*) is rated as a putative false-positive motif. Additionally, the total numbers of target and background sequences are listed, and links to known motifs and gene ontology enrichment results are provided if present (not shown).

Figure 4. Exemplary extract of the peak annotation table generated during pathway analysis. Peak locations and known genomic features in close proximity are annotated.

Figure 5. In addition to multiple detailed gene ontology tables, the 20 most significant terms are plotted in a connected network for a quick overview. One such exemplary plot is shown above for the "biological process" gene ontology (shades of red denote the strength of the respective significance).

Figure 6. Differential miRNA expression for three different experimental conditions, each including three replicates. The expression data were subjected to principal component analysis (PCA) to show differences among replicates and conditions.
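To make the replicate-based comparison behind Figure 6 more concrete, the following Python sketch normalizes each library to counts per million (CPM), averages the replicates of a condition, and reports a log2 fold change per miRNA. This is a deliberately simplified stand-in and not the DESeq2 model: it estimates no dispersion and performs no significance test. The function names, the pseudocount, and the toy counts are assumptions made for the example.

```python
import math


def counts_per_million(library):
    """Scale raw counts of one library to counts per million (CPM)."""
    total = sum(library.values())
    return {name: count * 1_000_000 / total for name, count in library.items()}


def mean_cpm(replicates):
    """Average CPM over the replicates of one condition.

    Assumes every replicate reports the same set of miRNA names.
    """
    normalized = [counts_per_million(lib) for lib in replicates]
    return {
        name: sum(lib[name] for lib in normalized) / len(normalized)
        for name in normalized[0]
    }


def log2_fold_changes(condition_a, condition_b, pseudocount=1.0):
    """log2(B / A) per miRNA; the pseudocount avoids division by zero."""
    mean_a = mean_cpm(condition_a)
    mean_b = mean_cpm(condition_b)
    return {
        name: math.log2((mean_b[name] + pseudocount) / (mean_a[name] + pseudocount))
        for name in mean_a
    }


# Hypothetical counts: three replicates per condition, as advised in the text.
control = [{"toy-miR-A": 3, "other": 999_997} for _ in range(3)]
treated = [{"toy-miR-A": 7, "other": 999_993} for _ in range(3)]

for name, lfc in log2_fold_changes(control, treated).items():
    print(f"{name}: log2 fold change = {lfc:.3f}")
```

With identical replicates, as in this toy input, the fold change is trivially reproducible. Real replicates scatter, and it is this scatter that a tool such as DESeq2 models in order to decide whether a change is significant.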

Related Topics
  • siRNA synthesis service at Microsynth
  • RNA-Sequencing at Microsynth
  • ChIP-Seq analysis pipeline at Microsynth

Further Reading
  1. Griffiths-Jones S. (2004) The microRNA Registry. Nucleic Acids Res. 32 (suppl 1): D109-D111.
  2. Dobin et al. (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29: 15-21.
  3. Love et al. (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15: 550.
  4. Edgar R.C. (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26: 2460-2461.
  5. Heinz et al. (2010) Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol. Cell 38: 576-589.
  6. Mestdagh et al. (2014) Evaluation of quantitative miRNA expression platforms in the microRNA quality control (miRQC) study. Nat. Methods 11: 809-815.



Contact Form
Interested in discussing your NGS project with an expert or in receiving an offer? Then please fill in our NGS contact form.

Related Downloads
AppNote_miRNASeq


