Transcriptomics Team Repository | HackBio Internship

rana-salah/Transcriptomics

Hello!! Welcome to the Transcriptomics Lab, where we do all things RNA :mechanical_arm: :tada:

👨‍🔬 👩‍🔬 We are research enthusiasts from diverse backgrounds in the life sciences 🧑‍🔬 👨‍🔬. In our lab, we study the transcriptome to gain insight into how genes work 🦠. This ReadMe presents a complete analysis of an RNA-seq experiment profiling Drosophila cells after depletion of a regulatory gene called Pasilla (Ps).

Enjoy! 🌟

Summary

A SNEAK PEEK INTO OUR LAB

The Pasilla gene 🧬 encodes a set of proteins that are most similar to the human proteins Nova-1 and Nova-2. Nova-1 and Nova-2 are nuclear RNA-binding proteins that are normally expressed in the CNS and regulate splicing directly.

There are numerous applications for RNA sequencing data with a reference genome, and no single pipeline is optimal for all cases. We go over all the major steps in reference-based RNA-seq data analysis 💻: quality control (FastQC, MultiQC, Cutadapt), read alignment (RNA STAR, IGV), gene and transcript quantification (HTSeq-count), differential gene expression (DESeq2), functional profiling (goseq) and advanced analysis. Following the Galaxy tutorial on reference-based RNA-seq data analysis, we performed differential gene expression analysis and identified the genes and pathways regulated by the Pasilla gene, that is, those affected by its depletion. To know more about our adventures 🤯 with the analysis process, don't hesitate to check the following hyperlinks 🔥🔗. You will explore a lot about the field of transcriptomics in the next few minutes 👀
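To make the differential expression and Z-score steps more concrete, here is a minimal Python sketch on made-up counts. It is not the Galaxy workflow we ran and not DESeq2 itself: it only illustrates the median-of-ratios normalization idea that DESeq2 builds on, a plain log2 fold change, and a per-gene Z-score. The gene names, counts, sample grouping and all function names are hypothetical, and the Z-score uses the population standard deviation for simplicity. DESeq2 additionally estimates dispersion and runs a statistical test, which this sketch leaves out.

```python
import math
from statistics import mean, median, pstdev

# Hypothetical raw read counts: gene -> counts in four samples.
# Sample order: untreated_1, untreated_2, treated_1, treated_2.
COUNTS = {
    "geneA": [100, 120, 400, 480],
    "geneB": [200, 240, 200, 240],
    "geneC": [50, 60, 50, 60],
}
UNTREATED = [0, 1]  # column indices of the untreated samples
TREATED = [2, 3]    # column indices of the Pasilla-depleted samples


def size_factors(counts):
    """Per-sample size factors following the median-of-ratios idea."""
    # Log geometric mean of each gene; genes with any zero count are skipped.
    log_geo_means = {
        gene: mean([math.log(c) for c in row])
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    n_samples = len(next(iter(counts.values())))
    factors = []
    for j in range(n_samples):
        # Ratio of each gene's count to its geometric mean, on the log scale.
        log_ratios = [math.log(counts[g][j]) - lgm for g, lgm in log_geo_means.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide every count by the size factor of its sample."""
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}


def log2_fold_change(row, treated, untreated):
    """log2(mean treated / mean untreated); assumes both means are non-zero."""
    return math.log2(mean([row[i] for i in treated]) / mean([row[i] for i in untreated]))


def row_z_scores(row):
    """Z-score of one gene across samples (population standard deviation)."""
    mu, sd = mean(row), pstdev(row)
    if sd < 1e-9:  # effectively constant gene: no variation to scale
        return [0.0] * len(row)
    return [(x - mu) / sd for x in row]


if __name__ == "__main__":
    normalized = normalize(COUNTS, size_factors(COUNTS))
    for gene, row in normalized.items():
        lfc = log2_fold_change(row, TREATED, UNTREATED)
        zs = ", ".join(f"{z:+.2f}" for z in row_z_scores(row))
        print(f"{gene}: log2FC={lfc:+.2f}  z=[{zs}]")
```

In this toy table only geneA changes between conditions: after normalization its log2 fold change is about 2 (a four-fold increase), while geneB and geneC stay flat.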

📌 Our Workflow 📍

To check out the results of our workflow in Galaxy, click on the links below:

Quality_Control | Mapping | Analysis_Visualisation_Functional Enrichment

📃 Datasets and Results 📈

Click on each image for the detailed procedure of that analysis step.




Finally, after finishing our analysis 🎉🎉, our team started an initiative to help all the members learn from each other and develop the broadest set of skills 🧰 during this stage! We organized a training on Friday, with a workshop for each step in the workflow. The highly passionate members 👨‍🔬 👩‍🔬 who volunteered to give the workshops are highlighted in the contributions list. In each workshop, the moderators explained the purpose of the step and how it benefits the analysis 🔖, and highlighted some points for improvement. They also ran the analysis live so that the other members could follow along. At the end, there was a troubleshooting and Q&A session.

Now that we have reached the end of our work, are you excited to meet our team members?!! 😍🥳🥳

👩‍🔬 Team Members 👨‍🔬

👩‍💻 Contributions 👨‍💻

| Team Sub-groups | Specific Task | Contributors | Slack IDs |
| --- | --- | --- | --- |
| Quality Control | Sub-samples | Yasmeen & Eman | @Sam & @Eman |
| Quality Control | Full datasets | Bandana, Jaspreet, Pankaj | @Bandana, @Jaspreet, @Pankaj |
| Quality Control | Markdown Documentation | Jaspreet, Bandana, Eman | @Eman, @Bandana & @Jaspreet |
| Mapping | Inspection of Mapping Results | Yasmeen, Bandana, Dawoud, Nirvana | @Sam, @Bandana, @Dawoud, @Nirvana |
| Mapping | Counting the number of reads per annotated gene | Yasmeen, Saket, Johny | @Sam, @Saket, @Johny |
| Mapping | Estimation of strandness | Eman, Saket, Johny | @Saket, @Johny, @Eman |
| Mapping | Counting reads per gene | Ankita, Favour, Nirvana | @Anku., @Nirvana, @OYEFAVOUR |
| Mapping | Markdown Documentation | Eman, Yasmeen, Dawoud, Johny | @Sam & @Eman, @Dawoud, @Johny |
| Differential Gene Expression Analysis | Identification of the differentially expressed features | Rana, Utkarsha, Osama, Nikita, Chigozie, Yasmeen, Johny, Shruti | @RanaSalah, @-Utkarsha12-, @Osama, @Nikita2Chimera, @GozieNkwocha, @Sam, @Johny, @ShrutiG |
| Differential Gene Expression Analysis | Extraction of annotation of differentially expressed genes | Utkarsha, Osama, Rana, Jaspreet, Nikita, Chigozie | @RanaSalah, @-Utkarsha12-, @Osama, @Nikita2Chimera, @GozieNkwocha, @Jaspreet |
| Differential Gene Expression Analysis | Markdown Documentation | Rana | @RanaSalah |
| Visualization of the DE genes' expression | Visualization of the normalized counts | Osama, Utkarsha, Rana, Ankita, TosinA, Dawoud | @RanaSalah, @-Utkarsha12-, @Osama, @Anku., @TosinA, @Dawoud |
| Visualization of the DE genes' expression | Computation & Visualization of the Z-score | Saket, Utkarsha, Osama, Rana, Ankita, TosinA, Diyar | @Saket, @RanaSalah, @-Utkarsha12-, @Osama, @Anku., @TosinA, @diyar |
| Visualization of the DE genes' expression | Markdown Documentation | Osama, Jaspreet, Utkarsha | @Osama, @Jaspreet, @-Utkarsha12- |
| Functional enrichment analysis of the DE genes | Gene Ontology analysis | Amira, TosinA, Bandana, Johny, Shruti | @Amira, @TosinA, @Bandana, @Johny, @ShrutiG |
| Functional enrichment analysis of the DE genes | KEGG pathways analysis | Amira, TosinA, Chigozie, Rana | @Amira, @TosinA, @GozieNkwocha, @RanaSalah |
| GitHub Markdown Development | Format & Organization | Main ReadMe: Utkarsha, Osama, Rana, Ankita, TosinA, Bandana. Quality control: Jaspreet. Mapping: Saket, Yasmeen, Johny, Dawoud. Differential Gene Expression Analysis: Rana. Visualization: Osama, Jaspreet, Utkarsha. Functional Enrichment Analysis: TosinA | @RanaSalah, @-Utkarsha12-, @Osama, @Anku., @TosinA, @Bandana, @Jaspreet, @Sam, @Saket, @Johny, @Dawoud |
| Graphical Abstract | Design | Rana, Osama, Jaspreet, Ankita, Diyar | @RanaSalah, @Osama, @Jaspreet, @Anku., @diyar |
| Advertisement | Writing post on transfer-market | Tosin | @TosinA |
| Training | Moderated the training workshops & presented the workflow steps practically | Quality control: Yasmeen & Jaspreet. Mapping: Saket, Yasmeen. Differential Gene Expression Analysis: Rana & Osama. Visualization: Osama. Functional Enrichment Analysis: Amira | @Sam & @Jaspreet, @Saket, @RanaSalah, @Osama, @Amira |

References

Trapnell, C., L. Pachter, and S. L. Salzberg, 2009 TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111. https://academic.oup.com/bioinformatics/article/25/9/1105/203994

Bérénice Batut, Mallory Freeberg, Mo Heydarian, Anika Erxleben, Pavankumar Videm, Clemens Blank, Maria Doyle, Nicola Soranzo, Peter van Heusden, 2021 Reference-based RNA-Seq data analysis (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/ref-based/tutorial.html Online; accessed Sat Aug 21 2021

Batut et al., 2018 Community-Driven Data Analysis Training for Biology. Cell Systems. https://doi.org/10.1016/j.cels.2018.05.012

Levin, J. Z., M. Yassour, X. Adiconis, C. Nusbaum, D. A. Thompson et al., 2010 Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nature Methods 7: 709. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3005310/

Young, M. D., M. J. Wakefield, G. K. Smyth, and A. Oshlack, 2010 Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11: R14. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-2-r14

Martin, M., 2011 Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17. http://journal.embnet.org/index.php/embnetjournal/article/view/200

Brooks, A. N., L. Yang, M. O. Duff, K. D. Hansen, J. W. Park et al., 2011 Conservation of an RNA regulatory map between Drosophila and mammals. Genome Research 21: 193–202. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3032923/

Robinson, J. T., H. Thorvaldsdóttir, W. Winckler, M. Guttman, E. S. Lander et al., 2011 Integrative genomics viewer. Nature Biotechnology 29: 24. https://www.nature.com/nbt/journal/v29/n1/abs/nbt.1754.html

Wang, L., S. Wang, and W. Li, 2012 RSeQC: quality control of RNA-seq experiments. Bioinformatics 28: 2184–2185. https://www.ncbi.nlm.nih.gov/pubmed/22743226

Dobin, A., C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski et al., 2013 STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21. https://academic.oup.com/bioinformatics/article/29/1/15/272537

Liao, Y., G. K. Smyth, and W. Shi, 2013 featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930. https://doi.org/10.1093/bioinformatics/btt656

Kim, D., G. Pertea, C. Trapnell, H. Pimentel, R. Kelley et al., 2013 TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology 14: R36. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-4-r36

Luo, W., and C. Brouwer, 2013 Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics 29: 1830–1831. https://academic.oup.com/bioinformatics/article-abstract/29/14/1830/232698

Love, M. I., W. Huber, and S. Anders, 2014 Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15: 550. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8

Kim, D., B. Langmead, and S. L. Salzberg, 2015 HISAT: a fast spliced aligner with low memory requirements. Nature Methods 12: 357. https://www.nature.com/articles/nmeth.3317

Anders, S., P. T. Pyl, and W. Huber, 2015 HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31: 166–169. https://academic.oup.com/bioinformatics/article/31/2/166/2366196

Ewels, P., M. Magnusson, S. Lundin, and M. Käller, 2016 MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32: 3047–3048. https://academic.oup.com/bioinformatics/article/32/19/3047/2196507

Thurmond, J., J. L. Goodman, V. B. Strelets, H. Attrill, L. S. Gramates et al., 2018 FlyBase 2.0: the next generation. Nucleic Acids Research 47: D759–D765. https://academic.oup.com/nar/article-abstract/47/D1/D759/5144957

Kim, D., J. M. Paggi, C. Park, C. Bennett, and S. L. Salzberg, 2019 Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature Biotechnology 37: 907–915. https://www.nature.com/articles/s41587-019-0201-4

Galaxy server used (usegalaxy.org): "The sequencing data were uploaded to the Galaxy web platform, and we used the public server at usegalaxy.org to analyze the data (Brooks et al. 2011)."
