Genic annotation of peaks including promoters, gene bodies, gene-centric windows, and proximal genes.
PeaksAnno provides genic annotation of peaks at promoters, gene bodies, and gene-centric windows. Annotated regions are collated into a binary overview of proximal genes, and the peak occupancy percentages are presented graphically as a bar plot. PeaksAnno is part of the main analysis for ChIPSeq or Cut&Run-Seq in our comprehensive pipeline, SEASEQ.
peaksanno.py -p ["PEAK bed"] [-s ["MACS Summit bed"]] -g ["GTF"] -c ["CHROM SIZES"]
The script requires the following files:
- Peaks bed file
- GTF/GFF/GFF3 having either gene or transcript (preferred) annotation
- UCSC provided chromosomal sizes file
An optional summit file from MACS can also be supplied with -s.
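As an illustration of how the required and optional inputs map onto the command line, here is a minimal Python sketch that assembles the argument list. The flags (`-p`, `-s`, `-g`, `-c`) come from the usage line above; the helper name `build_peaksanno_command` and all file names are hypothetical placeholders, not part of PeaksAnno.

```python
import shlex

def build_peaksanno_command(peaks, gtf, chrom_sizes, summits=None,
                            script="peaksanno.py"):
    """Assemble the argument list for a PeaksAnno run (hypothetical helper)."""
    cmd = [script, "-p", peaks]
    if summits is not None:
        # The MACS summit file is optional, so -s is only added when given.
        cmd += ["-s", summits]
    cmd += ["-g", gtf, "-c", chrom_sizes]
    return cmd

# Placeholder file names; substitute your own inputs.
cmd = build_peaksanno_command(
    "sample_peaks.bed", "genes.gtf", "hg19.chrom.sizes",
    summits="sample_summits.bed",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the script and bedtools are available on your `PATH`.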
PeaksAnno generates the following files:
- TSS nearest the center of peaks in centerofpeaks_closest.regions.txt.
- Peaks overlapping gene body regions in peaks_within_genebody.regions.txt.
- Peaks overlapping promoters in peaks_within_promoter.regions.txt.
- Peaks overlapping windows in peaks_within_window.regions.txt.
- Peaks identified in the overlapping regions above, with a comparison across all regions, in peaks_compared_regions.peaks.txt.
- Genes identified in the overlapping regions above, with a comparison across all regions, in peaks_compared_regions.genes.txt.
- Distribution graphs in peaks_compared_regions.distribution.pdf.
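To make the "TSS nearest the center of peaks" output concrete, the idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions: PeaksAnno itself relies on bedtools, and the function names, gene names, and coordinates here are hypothetical. All features are assumed to lie on a single chromosome.

```python
def peak_center(start, end):
    """Midpoint of a peak interval (integer coordinate)."""
    return (start + end) // 2

def closest_tss(center, tss_by_gene):
    """Return (gene, signed offset) for the TSS closest to a peak center.

    tss_by_gene maps gene name -> TSS coordinate on the same chromosome.
    The offset is TSS minus center, so it is positive when the TSS lies
    at a higher coordinate than the peak center. On ties, the gene that
    appears first in the mapping is returned.
    """
    gene = min(tss_by_gene, key=lambda g: abs(tss_by_gene[g] - center))
    return gene, tss_by_gene[gene] - center

# Hypothetical example: one peak and two candidate genes.
tss_by_gene = {"geneA": 900, "geneB": 1350}
center = peak_center(1000, 1400)
gene, offset = closest_tss(center, tss_by_gene)
print(gene, offset)
```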
- TSS: transcription start site
- promoters: 1kb upstream and downstream of the TSS
- window: 10kb upstream to 3kb downstream of the gene locus
- genebody: 1kb upstream of TSS to TES (transcription end site)
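The region definitions above can be expressed as interval arithmetic. The sketch below is one possible reading, not the script's actual implementation: it assumes that "upstream" and "downstream" are interpreted relative to the gene's strand, that coordinates are 0-based, and that intervals are clipped to the chromosome length taken from the chromosomal sizes file. The `Gene` type and `regions` function are hypothetical names.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    chrom: str
    start: int   # leftmost coordinate of the gene locus
    end: int     # rightmost coordinate of the gene locus
    strand: str  # "+" or "-"

def clip(start, end, chrom_len):
    """Keep an interval inside [0, chrom_len]."""
    return max(0, start), min(chrom_len, end)

def regions(gene, chrom_len):
    """Promoter, window, and genebody intervals for one gene (assumed strand-aware)."""
    if gene.strand == "+":
        tss, tes = gene.start, gene.end
        window = (gene.start - 10_000, gene.end + 3_000)
        genebody = (tss - 1_000, tes)
    else:
        # On the minus strand the TSS is the rightmost coordinate,
        # so upstream extends toward higher coordinates.
        tss, tes = gene.end, gene.start
        window = (gene.start - 3_000, gene.end + 10_000)
        genebody = (tes, tss + 1_000)
    promoter = (tss - 1_000, tss + 1_000)
    raw = {"promoter": promoter, "window": window, "genebody": genebody}
    return {name: clip(s, e, chrom_len) for name, (s, e) in raw.items()}

# Hypothetical gene on a 1 Mb chromosome.
plus = regions(Gene("chr1", 20_000, 30_000, "+"), 1_000_000)
minus = regions(Gene("chr1", 20_000, 30_000, "-"), 1_000_000)
print(plus)
print(minus)
```

Peaks can then be assigned to a region by testing interval overlap, which is the step bedtools handles in practice.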
- bedtools
- python3
If you use PeaksAnno, please cite its parent paper: Adetunji, M.O., Abraham, B.J. SEAseq: a portable and cloud-based chromatin occupancy analysis suite. BMC Bioinformatics 23, 77 (2022). https://doi.org/10.1186/s12859-022-04588-z
Please use the GitHub issues page for reporting any issues/suggestions (recommended).
Alternatively, you can e-mail Modupe Adetunji at modupeore.adetunji@stjude.org.