Releases: ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit
Version 2.2.1
Version 2.2.0
- Fixed bugs in novel end detection; novel ends must not be on CGP. Split up the flags to a boolean and a distance.
- Fixed an issue when IsoSeq was given to some genomes but not all.
- Reverted hal2fasta changes to match changes to the hal toolkit.
- Removed the automated singularity testing because it stopped working.
- Tried to add a new dockerfile for a more complete image, but it was not passing in Travis. It now lives in the repo as an experimental dockerfile.
V2.1.0
Version 2.0
Version 2.0 adds a wide range of new features. Highlights:
- Python 3 is now required
- BLAT has been fully replaced with parasail (pairwise DNA alignments) or exonerate (protein-genome alignments)
- Support for updated clusterGenes that allows for genes to be considered not the same cluster even if they share a few bases of overlap. This is useful for compact genomes. This value can be modulated by the `--overlapping-gene-distance` flag, and defaults to 30 (exonic) bases.
- New flags for controlling how de novo gene predictions are incorporated:
  - `--denovo-ignore-novel-genes`: For de-novo predictions, discard any transcripts that are predicted to be novel genes. In other words, only retain putative novel isoforms.
  - `--denovo-novel-end-distance`: For de-novo predictions, allow transcripts to be included if they provide a novel 5' or 3' end N distance away from any existing ends. Default is 0.
  - `--denovo-allow-unsupported`: For de-novo predictions, allow novel isoforms to be called if they contain splices that are not supported by the reference annotation even if they are also not supported by RNA-seq. Without this flag, novel isoforms will only be called if they have one or more splices that have RNA-seq/IsoSeq support and no reference annotation support.
  - `--denovo-allow-bad-annot-or-tm`: For de-novo predictions, allow novel isoforms to be called that were flagged as BadAnnotOrTm. These predictions overlap instances where multiple genes transMapped to the same location with significant overlap, and so may be alignment mistakes, collapsed repeats or gene family collapse.
- GFF3 parsing is now more rigid. CAT only accepts GFF3 files that fit the required format. To help with this, new parsers have been placed in the programs folder that massage GenBank files from RefSeq and from GenBank, as well as GFF3 files produced by Prokka.
- You can test your GFF3 against the parser with the script `validate_gff3`. If your GFF3 passes this tool, it will work with CAT.
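As a rough illustration of the idea behind `--overlapping-gene-distance`, the sketch below counts the exonic bases shared by two genes and treats them as one cluster only when that count exceeds a threshold. This is not the clusterGenes implementation: the function names, the half-open `(start, end)` interval representation, the assumption that exons within a gene do not overlap each other, and the strict "greater than" comparison are all assumptions made for illustration.

```python
def exonic_overlap(exons_a, exons_b):
    """Count bases shared between two lists of half-open (start, end) exon intervals.

    Assumes the exons within each list do not overlap one another.
    """
    total = 0
    for a_start, a_end in exons_a:
        for b_start, b_end in exons_b:
            # Overlap of two half-open intervals, or 0 if they are disjoint.
            total += max(0, min(a_end, b_end) - max(a_start, b_start))
    return total


def same_cluster(exons_a, exons_b, overlapping_gene_distance=30):
    """Hypothetical rule: genes cluster together only if their shared exonic
    bases exceed the threshold (30 mirrors the default stated in the notes)."""
    return exonic_overlap(exons_a, exons_b) > overlapping_gene_distance


gene_a = [(100, 200), (300, 400)]
gene_b = [(380, 500)]               # shares 20 exonic bases with gene_a
gene_c = [(150, 200), (300, 350)]   # shares 100 exonic bases with gene_a

print(same_cluster(gene_a, gene_b))
print(same_cluster(gene_a, gene_c))
```

Under these assumptions, a small overlap such as the one between `gene_a` and `gene_b` does not merge the two genes, which is the behavior the release notes describe as useful for compact genomes.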
Version 1.0
This release represents the final Python 2.7 code base.
Protein alignments are now possible
Bug fix for constructing hints from protein sequences. NOTE: This update requires that you update your Kent repository to at least commit 5b8e436 and rebuild to get the latest version of `pslCheck`.