RNA Sequencing For Differential Gene Expression Analysis
RNA sequencing (RNA-Seq) uses next-generation sequencing (NGS) to perform genome-wide transcriptional analysis of any biological organism. By comparing two or more conditions, RNA-Seq makes it possible to find differentially expressed genes, that is, genes that are up- or down-regulated under specific conditions. Typical examples include the comparison of transcription profiles from normal versus cancer tissues, cells in high versus low nutrient environments, unstressed versus stressed cells, or distinct developmental stages of an organism. A prerequisite for any RNA-Seq study is the availability of an annotated reference genome or a reference transcriptome (see also the application note "Illumina RNA-Seq" under "Related Downloads").
The advantages of RNA-Seq over conventional microarray studies are that (i) no prior knowledge about gene models is necessary and (ii) an increased dynamic range is observed, with overall higher sensitivity, reliability and reproducibility. In addition, many RNA-Seq protocols allow analysis of both the sense and the natural antisense transcripts (NATs) of genes. NATs are widespread in eukaryotic and prokaryotic genomes and are now acknowledged as important modulators of gene expression.
Microsynth Competences and Services
Experimental Design: As an expert in the area of RNA-Seq, Microsynth is able to provide a full service, from experimental design consulting through to bioinformatics analysis. Should you not involve Microsynth in your experimental design, please consider the importance of the number of biological replicates. To obtain statistically significant results in your differential gene expression analysis, we usually advise including at least 3 biological replicates per condition.
RNA Isolation: You can either leave the isolation to Microsynth or isolate total RNA yourself using a commercial kit.
Library Preparation and Sequencing: Following a quality check of your samples, Microsynth will perform an mRNA enrichment or an rRNA depletion, depending on the studied organism. This step is essential because the fraction of rRNA is high and sequencing should be restricted to mRNA (or miRNA). An Illumina cDNA library is then generated by reverse transcription, incorporating specific sequencing adaptors with barcodes. Finally, the libraries are pooled and sequenced on an Illumina instrument. The envisaged number of reads per library depends on the organism under study and the desired sensitivity. Whereas the benchmark for complex eukaryotic genomes (e.g. human, rat, mouse) is 100-150 M reads for high sensitivity or 20-30 M reads for low sensitivity, 10-fold fewer reads are required for bacteria.
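The read-depth benchmarks above can be turned into a simple pooling estimate. The following Python sketch is only an illustration: the function names and the run capacity of 400 M reads are hypothetical and do not describe a specific Illumina instrument or Microsynth's actual pooling scheme.

```python
import math

# Read-depth benchmarks quoted in the text, in millions of reads per library
# for complex eukaryotic genomes.
EUKARYOTE_READS_M = {"high": (100, 150), "low": (20, 30)}


def target_reads_m(sensitivity, bacterial=False):
    """Return the (min, max) target reads per library, in millions."""
    lower, upper = EUKARYOTE_READS_M[sensitivity]
    if bacterial:
        # The text states that 10-fold fewer reads are required for bacteria.
        return lower / 10, upper / 10
    return lower, upper


def libraries_per_run(run_capacity_m, sensitivity, bacterial=False):
    """Libraries that fit in one run when each gets the upper target depth."""
    _, upper = target_reads_m(sensitivity, bacterial)
    return math.floor(run_capacity_m / upper)


# Hypothetical run yielding 400 M reads (an assumption, not a figure from the text).
print(target_reads_m("low", bacterial=True))  # (2.0, 3.0)
print(libraries_per_run(400, "low"))          # 13
```

Budgeting against the upper end of each range is a conservative choice; using the lower end would allow more libraries per run at reduced sensitivity.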
Bioinformatics Analysis: Reads derived from the sequencing are mapped against the reference genome of the organism under study using the Bowtie2 and TopHat software. TopHat primarily addresses the difficulty of mapping spliced reads in eukaryotic genomes (i.e. reads spanning two exons). The reads per gene are then counted and used as input for statistical analysis. Specific statistical packages are used to search for differentially expressed genes. These packages first normalize the data, then calculate the variance based on the replicates for each condition, and finally compute statistical tests to find differentially expressed genes.
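The three statistical steps (normalize, estimate variance from replicates, test) can be illustrated with a deliberately simplified Python sketch. It is not the method of the packages used by Microsynth, which the text does not name; as stand-ins it assumes counts-per-million normalization and a Welch t-statistic on log2 values, whereas dedicated RNA-Seq packages typically rely on count-based models. All function names and the toy counts are hypothetical.

```python
import math
from statistics import mean, variance


def counts_per_million(sample_counts):
    """Normalize raw per-gene counts of one sample to counts per million."""
    total = sum(sample_counts.values())
    return {gene: count * 1e6 / total for gene, count in sample_counts.items()}


def log2_cpm(sample_counts, pseudocount=1.0):
    """Log2-transform normalized counts; the pseudocount avoids log2(0)."""
    cpm = counts_per_million(sample_counts)
    return {gene: math.log2(value + pseudocount) for gene, value in cpm.items()}


def welch_t(a, b):
    """Welch t-statistic; needs at least 2 replicates per condition."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def differential_expression(control, treated):
    """Return per-gene log2 fold change (treated vs control) and t-statistic."""
    ctrl = [log2_cpm(sample) for sample in control]
    trt = [log2_cpm(sample) for sample in treated]
    results = {}
    for gene in ctrl[0]:
        c = [sample[gene] for sample in ctrl]
        t = [sample[gene] for sample in trt]
        results[gene] = {"log2fc": mean(t) - mean(c), "t": welch_t(t, c)}
    return results


# Toy data: 3 biological replicates per condition, 3 genes (made-up counts).
control = [
    {"geneA": 100, "geneB": 500, "geneC": 400},
    {"geneA": 110, "geneB": 490, "geneC": 400},
    {"geneA": 90, "geneB": 510, "geneC": 400},
]
treated = [
    {"geneA": 400, "geneB": 500, "geneC": 400},
    {"geneA": 420, "geneB": 490, "geneC": 400},
    {"geneA": 380, "geneB": 510, "geneC": 400},
]

for gene, stats in differential_expression(control, treated).items():
    print(gene, round(stats["log2fc"], 2), round(stats["t"], 2))
```

With only three genes, the rise of geneA also lowers the normalized values of the other genes. Such composition effects are one reason real analyses use more robust normalization than this sketch.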
Provided Output Files: You will receive a report with the following content:
- raw counts of the mapping
- differential analysis results
- heatmap with top 30 genes
- sample clustering
In addition, the raw sequence data, BAM mapping files and a brochure describing some statistical details will be provided.
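To give an idea of what a sample clustering expresses, the following Python sketch ranks sample pairs by the similarity of their expression profiles. It assumes a distance of 1 minus the Pearson correlation, which is only one common choice; the text does not state which measure or clustering method the report uses. The function names and the toy profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long profiles (must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranked_sample_pairs(profiles):
    """Return (distance, sample_a, sample_b) tuples, most similar pair first.

    profiles maps a sample name to its log expression values, with genes
    in the same order for every sample.
    """
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs.append((1 - pearson(profiles[a], profiles[b]), a, b))
    return sorted(pairs)


# Made-up log expression values for 4 genes in 3 samples.
profiles = {
    "ctrl_1": [1.0, 2.0, 3.0, 4.0],
    "ctrl_2": [1.1, 2.0, 3.2, 3.9],
    "treat_1": [4.0, 3.0, 2.0, 1.0],
}

for distance, a, b in ranked_sample_pairs(profiles):
    print(a, b, round(distance, 3))
```

In a well-behaved experiment, replicates of the same condition are expected to be closer to each other than to samples of another condition, as the two control samples are here.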
Examples for Most Important Output Files Provided by Microsynth