This is a collection of notebooks describing the basic analysis workflow for a 16S rRNA gene amplicon sequencing project - from raw reads to statistics and plots. It is intended for beginners, and therefore includes an introductory section on the R programming language and statistics.
If you find errors or typos, or would like a notebook or section on a specific topic, open a GitHub issue or send me an email: jakob.russel@bio.ku.dk
The notebooks use a simulated example dataset designed to resemble an infant gut microbiome dataset. It is available for download here.
Only for KU students:
Background material:
- Intro to R and RStudio
- Intro to R programming language
- Statistics 101
- PCA, PCoA, PERMANOVA
- Multiple testing correction
- The phyloseq object
- Phyloseq operations
- Compositionality and rarefaction
- Extra plotting with ggplot2
Bioinformatic workflow:
Analysis workflow:
Troubleshooting
This section contains notebooks for more specific and/or advanced data presentations or analyses.