English Seminars
Transcriptome Analysis of E. coli and Integration of RNA-Seq data in GenExpDB
Title | Transcriptome Analysis of E. coli and Integration of RNA-Seq data in GenExpDB |
Lecturer | Prof. Tyrrell Conway (Genome Center, Oklahoma Univ.) |
Language | English |
Date&Time | 07/13/2011 (Wed) 13:00~14:00 |
Venue | Large Lecture Hall |
Detail | We determined the transcriptome of Escherichia coli K-12 at single-nucleotide resolution by ultra-high-throughput RNA sequencing (RNA-Seq). Bacteria were cultured on MOPS glucose minimal medium at 37 °C in a pH- and oxygen-controlled fermenter. Ten samples were taken at various times during logarithmic growth, at the end of growth, and following the transition into stationary phase. RNA was prepared and subjected to RNA-Seq on the SOLiD 4 platform to obtain more than 310 million paired-end, 75 bp reads. Data were processed with ABI’s BioScope software, and the raw reads were subjected to error correction with the SOLiD Accuracy Enhancer (SAET). Color-space mapping of the SAET-corrected reads to the E. coli K-12 MG1655 reference genome was accomplished in the BioScope Whole Transcriptome Analysis (WTA) pipeline. The *.bam files were further processed with SAMtools (sam2wig) to generate base count datasets. Conversion of SAM data to WIG data results in a 250-fold reduction in file size. We devised a simple strategy to normalize the data: the base count at each base position is expressed as a value per billion base counts. The normalized data were then log2 transformed for differential gene expression analysis and for display in JBrowse (GMOD), a powerful analysis tool that allows comparison to other data types such as the transcription start sites (TSSs) cataloged in RegulonDB. Examples of this analysis may be viewed at the Enterobacteriaceae Gene Expression Profiling Database (EGEPdb.org), which accommodates many reference genomes. For differential gene expression (dRNA-seq), we converted base count data to gene counts by calculating the average base count from the start codon to the stop codon of each gene. We then calculated gene expression ratios for selected conditions, just as we would for microarray data, using the same global normalization approach. 
In this way, we visualized dRNA-seq data and integrated the next-generation sequencing results with historical microarray data in GenExpDB (http://www.genexpdb.ou.edu), an archive of all publicly available E. coli microarray gene expression profiling data. Thus, GenExpDB is being upgraded to include RNA-Seq data. |
Contact | Systems Microbiology, Hirotada Mori (hmori@gtc.naist.jp) |
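
The normalization outlined in the Detail field (base counts expressed per billion base counts, log2 transformation, per-gene averaging from start to stop codon, and expression ratios between conditions) can be illustrated with a minimal Python sketch. This is not the speaker's code. The function names, the toy counts, the 1-based inclusive gene coordinates, and the pseudocount of 1 used to avoid taking the log of zero are all assumptions made for illustration; the abstract also does not say whether the per-gene averaging is done before or after the log2 transformation, so the sketch averages the per-billion values first.

```python
import math


def per_billion(base_counts):
    """Scale per-position base counts so that they sum to one billion."""
    total = sum(base_counts)
    if total <= 0:
        raise ValueError("total base count must be positive")
    return [c * 1e9 / total for c in base_counts]


def log2_transform(values, pseudocount=1.0):
    """log2 of each value; the pseudocount is an assumption to handle zeros."""
    return [math.log2(v + pseudocount) for v in values]


def gene_mean(values, start, stop):
    """Average value over a gene, given 1-based inclusive coordinates (assumed)."""
    if not 1 <= start <= stop <= len(values):
        raise ValueError("gene coordinates fall outside the profile")
    window = values[start - 1:stop]
    return sum(window) / len(window)


def log2_ratio(test, control, pseudocount=1.0):
    """log2 expression ratio between two conditions (pseudocount is assumed)."""
    return math.log2((test + pseudocount) / (control + pseudocount))


# Toy base-count profiles for two hypothetical samples over the same 6 positions.
log_phase = [0, 2, 4, 4, 0, 10]
stationary = [0, 8, 8, 8, 0, 16]

norm_log = per_billion(log_phase)
norm_stat = per_billion(stationary)

# Per-position track of the kind that could be displayed in a genome browser.
track = log2_transform(norm_log)

# Hypothetical gene spanning positions 2 to 4.
ratio = log2_ratio(gene_mean(norm_stat, 2, 4), gene_mean(norm_log, 2, 4))
print(track)
print(ratio)
```

Scaling every sample to the same total makes profiles from runs of different sequencing depth directly comparable, which is what allows ratios to be computed in the same way as for globally normalized microarray data.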