Data package for transcriptomic profiles from mouse brain regions during pregnancy and the postpartum period
- Experimental data were generated by Ray et al. (2015).
- Processing:
- Sequencing reads were downloaded from SRA, at PRJNA289455.
- `fastq` files were checked for adapter content using FastQC (no adapter sequence was found).
- Reads were aligned to the mouse genome GRCm38.p6 plus 92 ERCC sequences, with the GENCODE M25 annotation, using STAR 2.7.1a, and then quantified with RSEM for abundance at the gene and transcript levels.
Install the package, import the library and load the data set
devtools::install_github('ttdtrang/data-rnaseq-MmusPreg')
library(data.rnaseq.MmusPreg)
data(mpreg.rnaseq.gene)
dim(mpreg.rnaseq.gene@assayData$exprs)
The package includes 2 data sets, one for transcript-level counts/TPM and another for gene-level counts/TPM. Counts are the non-integer `expected_count` estimates produced by RSEM.
cd data-raw
- Download all necessary raw data files, which include:
3.0M feature_attrs.genes.tsv
7.0M feature_attrs.transcripts.tsv
8.0K featureCounts-summary.genes.tsv
8.0K featureCounts-summary.transcripts.tsv
5.2M matrix.gene.expected_count.RDS
4.2M matrix.gene.featureCounts.RDS
5.4M matrix.gene.tpm.RDS
17M matrix.transcripts.expected_count.RDS
15M matrix.transcripts.tpm.RDS
24K PRJNA289455_metadata_cleaned.tsv
20K starLog.final.tsv
- Set the environment variable `DBDIR` to point to the path containing said files
- Run the R notebook `make-data-package.Rmd` to assemble parts into `ExpressionSet` objects.
You may need to change some code chunk settings from `eval=FALSE` to `eval=TRUE` so that all chunks are run. These chunks are disabled by default to avoid overwriting existing data files in the folder.
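Before running the notebook, it can help to confirm that the directory `DBDIR` points to contains every file in the listing above. The following is a minimal shell sketch of such a check, not part of the package: the function name `check_raw_files` is hypothetical, and the file names are copied from the listing.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the package): report which of the
# raw data files named in the listing above are absent from a directory.
check_raw_files() {
    crf_dir="$1"
    crf_status=0
    for crf_file in \
        feature_attrs.genes.tsv \
        feature_attrs.transcripts.tsv \
        featureCounts-summary.genes.tsv \
        featureCounts-summary.transcripts.tsv \
        matrix.gene.expected_count.RDS \
        matrix.gene.featureCounts.RDS \
        matrix.gene.tpm.RDS \
        matrix.transcripts.expected_count.RDS \
        matrix.transcripts.tpm.RDS \
        PRJNA289455_metadata_cleaned.tsv \
        starLog.final.tsv
    do
        if [ ! -f "$crf_dir/$crf_file" ]; then
            echo "missing: $crf_file"
            crf_status=1
        fi
    done
    # Returns 0 only when every expected file is present.
    return "$crf_status"
}

# Run the check only when DBDIR is already set in the environment.
if [ -n "${DBDIR:-}" ]; then
    check_raw_files "$DBDIR" && echo "all raw data files found in $DBDIR"
fi
```

To use it, export `DBDIR` first (for example `export DBDIR=/path/to/raw/files`, where the path is a placeholder), then run the script. It prints one `missing:` line per absent file, so a clean run indicates the notebook should find all of its inputs.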