Estimating Transcript Abundance
The BAM isn’t the final file
- BAM files give the location of mapped reads;
- But, for each individual, how many reads should be counted as coming from any particular gene?
- The count table represents this;
A common and logical method to estimate transcript abundance across a reference transcriptome using RNA-seq data is to count the number of reads that map uniquely to each transcript. Reads that map to multiple contigs or transcripts may provide ambiguous information and therefore introduce more noise than is desirable in some contexts. If, on the other hand, alternative splice variants are separate features in the reference, one would expect multiply mapped reads and may want to count them, but in an unbiased manner. In any case, it is straightforward to count the number of reads aligned to each reference feature from a SAM file, write a tabular file containing this information, and use this file for differential gene expression analysis.
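The counting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's own pipeline: the toy SAM records, the function names `count_reads` and `write_count_table`, and the `min_mapq` threshold are all hypothetical. Mapping quality is used here only as a rough stand-in for "uniquely mapped"; how MAPQ relates to uniqueness depends on the aligner. The sketch also assumes single-end reads, so each alignment line counts as one read.

```python
import csv
import io
import sys
from collections import Counter

# Toy SAM content (made-up records, for illustration only).
# Columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
TOY_SAM = "\n".join([
    "@SQ\tSN:geneA\tLN:1000",
    "@SQ\tSN:geneB\tLN:1000",
    "r1\t0\tgeneA\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tgeneA\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tgeneB\t20\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tgeneB\t70\t0\t4M\t*\t0\t0\tACGT\tIIII",    # low MAPQ: treated as ambiguous
    "r4\t256\tgeneA\t90\t0\t4M\t*\t0\t0\tACGT\tIIII",  # secondary alignment
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",          # unmapped read
])


def count_reads(sam_lines, min_mapq=10):
    """Count primary, mapped alignments per reference feature (RNAME).

    min_mapq is a hypothetical cutoff used as a proxy for unique mapping.
    """
    counts = Counter()
    for line in sam_lines:
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4 or rname == "*":
            continue  # read is unmapped
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        if mapq < min_mapq:
            continue  # likely multiply mapped; excluded in this sketch
        counts[rname] += 1
    return counts


def write_count_table(counts, handle):
    """Write a two-column, tab-separated table: feature name and read count."""
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(["feature", "count"])
    for feature in sorted(counts):
        writer.writerow([feature, counts[feature]])


if __name__ == "__main__":
    table = count_reads(TOY_SAM.splitlines())
    write_count_table(table, sys.stdout)
```

To count multiply mapped reads instead of discarding them, as may be appropriate when splice variants are separate reference features, the `min_mapq` filter could be relaxed. Doing so in an unbiased manner requires more than this sketch provides.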
Count-table Example: