Scripts and software for identifying NUMTs and investigating selection signals.
- "numt_identification": scripts for NUMT and cNUMT identification
- "numt_selection": scripts for dcNUMT clustering and selection signal identification
Running on a multi-core workstation is recommended when many species are tested.
- Linux: Ubuntu 20.04.6
- Python modules: os, sys, Bio, Bio.Seq, time, pandas, re, numpy, collections, ete3, glob, Bio.Phylo.PhyloXML
- Genome download: ncbi-genome-download
- KaKs calculation: KaKs_Calculator2.0
- BLAST: BLASTN v2.12.0+ (https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/)
- Selection signal investigation: paml4.9j
- Sequence extraction: BEDTools v2.26.0
- Reverse sequence: revseq in EMBOSS v6.6.0.0
- Sequence alignment: MAFFT v7.310
- Rate of evolution: BayesTraits v4.0.1 (https://www.evolution.reading.ac.uk/BayesTraitsV4.0.1/BayesTraitsV4.0.1.html)
- dcNUMT and mito gene tree reconstruction: iqtree v2.0.3
- Clustered NUMT tree reconstruction: FastTree v2.1.11
- Focal clade extraction: Newick Utilities v1.7.0
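Before running the scripts, it can help to confirm that the dependencies listed above are installed. The following is a minimal sketch, not part of the repository: it checks the third-party Python packages by import name and looks for a few command-line tools on PATH. The executable names (`blastn`, `bedtools`, `revseq`, `mafft`, `ncbi-genome-download`) are assumptions about how the tools are installed; the remaining tools (PAML, KaKs_Calculator, BayesTraits, IQ-TREE, FastTree, Newick Utilities) can be added under whatever names they have locally.

```python
import importlib.util
import shutil

# Third-party Python packages from the list above, by top-level import name.
PYTHON_MODULES = ["Bio", "pandas", "numpy", "ete3"]

# Assumed executable names; adjust them to match the local installation.
EXECUTABLES = ["ncbi-genome-download", "blastn", "bedtools", "revseq", "mafft"]


def missing_modules(modules):
    """Return the top-level modules that the import system cannot locate."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def missing_executables(names):
    """Return the executables that are not found on PATH."""
    return [n for n in names if shutil.which(n) is None]


if __name__ == "__main__":
    for label, missing in (
        ("Python modules", missing_modules(PYTHON_MODULES)),
        ("executables", missing_executables(EXECUTABLES)),
    ):
        print(f"Missing {label}: {', '.join(missing) if missing else 'none'}")
```

The check only reports whether a tool is present; it does not verify that the installed version matches the one listed.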
git clone https://github.com/chnyuch/numt_vertebrate.git
Download the genomes, mitochondrial genomes, and mitochondrial genes of the focal species.
Identify the NUMTs and cNUMTs.
Calculate dN/dS to identify dcNUMTs.
Cluster the cNUMTs and calculate dN/dS for clustered NUMTs and single NUMTs.
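As an illustration of how the BLAST and sequence-extraction steps can fit together, the sketch below converts tabular BLASTN hits into BED intervals that a tool such as BEDTools could use for extraction. It is not the repository's implementation. It assumes the mitochondrial genome is the query, the nuclear genome is the subject, and the hits are in the standard 12-column tabular format (`-outfmt 6`). The function name `hits_to_bed` and the thresholds `max_evalue` and `min_length` are hypothetical, chosen only for this example.

```python
import csv
import io

# Standard column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6 = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Two made-up hits: one on the plus strand, one on the minus strand.
EXAMPLE = (
    "mito\tchr1\t95.0\t100\t5\t0\t1\t100\t1000\t1099\t1e-40\t150\n"
    "mito\tchr2\t90.0\t50\t5\t0\t200\t249\t5049\t5000\t1e-10\t60\n"
)


def hits_to_bed(handle, max_evalue=1e-3, min_length=30):
    """Convert tabular BLASTN hits into BED6 tuples on the subject sequence.

    max_evalue and min_length are illustrative thresholds only.
    """
    bed = []
    for row in csv.DictReader(handle, fieldnames=OUTFMT6, delimiter="\t"):
        if float(row["evalue"]) > max_evalue or int(row["length"]) < min_length:
            continue
        sstart, send = int(row["sstart"]), int(row["send"])
        # BLASTN reports minus-strand hits with sstart > send.
        strand = "+" if sstart <= send else "-"
        low, high = min(sstart, send), max(sstart, send)
        # BLAST coordinates are 1-based inclusive; BED is 0-based half-open.
        bed.append((row["sseqid"], low - 1, high, row["qseqid"], 0, strand))
    return bed


if __name__ == "__main__":
    for interval in hits_to_bed(io.StringIO(EXAMPLE)):
        print("\t".join(str(field) for field in interval))
```

Keeping the strand in the sixth BED column lets a strand-aware extraction return minus-strand hits already reverse-complemented, which may or may not match how the repository handles orientation.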