Releases: bcbio/bcbio-nextgen
v1.1.9
- Fix retrieval of the VEP cache.
- Support Picard's new syntax for ReorderSam (REFERENCE -> SEQUENCE_DICTIONARY).
- Remove mitochondrial reads from ChIP/ATAC-seq calling.
- Add documentation describing ATAC-seq outputs.
- Add ENCODE library complexity metrics for ATAC/ChIP-seq to the MultiQC report
(see https://www.encodeproject.org/data-standards/terms/#library for a description of the metrics).
- Add STAR sample-specific 2-pass. This helps assign a moderate number of reads per gene. Thanks
to @naumenko-sa for the initial implementation and push to get this going.
- Index transcriptomes only once for pseudo/quasi aligner tools. This fixes race conditions that
can happen.
- Add --buildversion option, for tracking which version of a gene build was used. This is used
during `bcbio_setup_genome.py`. Suggested formats are source_version, so Ensembl_94,
EnsemblMetazoa_25, FlyBase_26, etc.
- Sort MACS2 bedgraph files before compressing. Thanks to @LMannarino for the suggestion.
- Check for the reserved field `sample` in RNA-seq metadata and quit with a useful error message.
Thanks to @marypiper for suggesting this.
- Split ATAC-seq BAM files into nucleosome-free (NF) and mono/di/tri nucleosome (MN/DN/TN) files,
so we can call peaks on them separately.
- Call peaks on NF/MN/DN/TN regions separately for each caller during ATAC-seq.
- Allow viral contamination to be assayed on non-tumor/normal samples.
- Ensure EBV coverage is calculated when run on genomes with it included as a contig.
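
As a rough illustration of the ENCODE library complexity metrics mentioned above, the following sketch computes NRF, PBC1 and PBC2 from a list of read positions, following the definitions on the linked ENCODE page. The `library_complexity` helper and the toy `reads` list are hypothetical and are not taken from bcbio; how bcbio derives these numbers internally is not described in these notes.

```python
from collections import Counter


def library_complexity(positions):
    """Compute ENCODE-style complexity metrics from read positions.

    `positions` holds one hashable key per uniquely mapped read, for
    example (chrom, start, strand). Hypothetical helper for illustration.
    """
    counts = Counter(positions)
    total_reads = sum(counts.values())
    distinct = len(counts)  # locations with at least one read
    m1 = sum(1 for c in counts.values() if c == 1)  # exactly one read
    m2 = sum(1 for c in counts.values() if c == 2)  # exactly two reads
    return {
        # Non-Redundant Fraction: distinct locations / total reads
        "NRF": distinct / total_reads if total_reads else None,
        # PCR Bottlenecking Coefficient 1: M1 / distinct locations
        "PBC1": m1 / distinct if distinct else None,
        # PCR Bottlenecking Coefficient 2: M1 / M2
        "PBC2": m1 / m2 if m2 else None,
    }


# Toy input: six reads over four distinct locations.
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 250, "-"),
    ("chr2", 40, "+"),
    ("chr2", 90, "+"),
    ("chr2", 90, "+"),
]
metrics = library_complexity(reads)
print(metrics)
```

Values closer to 1 for NRF and PBC1 indicate a less redundant library; the ENCODE page gives the thresholds used to grade them.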
v1.1.8
- Add `antibody` configuration option. Setting a specific antibody for ChIP-seq will use appropriate
settings for that antibody. See the documentation for supported antibodies.
- Add `use_lowfreq_filter` for forcing vardict to report variants with low allelic frequency,
useful for calling somatic variants in panels with high coverage.
- Fix for checking for pre-existing inputs with python3.
- Add `keep_duplicates` option for ChIP/ATAC-seq, which skips duplicate removal before peak calling.
Defaults to False.
- Add `keep_multimappers` option for ChIP/ATAC-seq, which skips multimapper removal before peak calling.
Defaults to False.
- Remove ethnicity as a required column in PED files.
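
To make the effect of `keep_duplicates` and `keep_multimappers` concrete, here is a minimal sketch of a per-read filter. It is an assumption-laden illustration, not bcbio's implementation: the `keep_read` function is hypothetical, duplicates are detected through the SAM duplicate flag bit (0x400), and treating an NH tag greater than 1 as a multimapper is a convention chosen for this sketch.

```python
FLAG_DUPLICATE = 0x400  # SAM FLAG bit: PCR or optical duplicate


def keep_read(flag, num_hits, keep_duplicates=False, keep_multimappers=False):
    """Decide whether a read survives filtering before peak calling.

    flag: integer SAM FLAG field of the read.
    num_hits: value of the NH tag (number of reported alignments).
        Using NH > 1 to mean "multimapper" is an assumption of this sketch.
    Both options default to False, mirroring the defaults in the notes.
    """
    if not keep_duplicates and flag & FLAG_DUPLICATE:
        return False
    if not keep_multimappers and num_hits > 1:
        return False
    return True


# A duplicate read is dropped by default and kept when asked for.
print(keep_read(0x400, 1))
print(keep_read(0x400, 1, keep_duplicates=True))
```

With both options left at their defaults, only non-duplicate, uniquely placed reads reach the peak caller.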
v1.1.7
v1.1.6
- GATK ApplyBQSRSpark: avoid StreamClosed issue with GATK 4.1+
- RNA-seq: fixes for cufflinks preparation due to python3 transition.
- RNA-seq: output count tables from tximport for genes and transcripts. These
are in `bcbioRNASeq/results/date/genes/counts` and
`bcbioRNASeq/results/data/transcripts/counts`.
- qualimap (RNA-seq): disable stranded mode, as it gives incorrect results with
the hisat2 aligner; for RNA-seq, qualimap is now simply set to unstranded.
- Add `quantify_genome_alignments` option to use genome alignments to quantify
with Salmon.
- Add `--validateMappings` flag to Salmon read quantification mode.
- The VEP cache is no longer installed during a bcbio run.
- Add support for Salmon SA method when STAR alignments are not available
(for hg38).
- Add support for the new read model for filtering in Mutect2. This is
experimental, and a little flaky, so it can optionally be turned on via
`tools_on: mutect2_readmodel`. Thanks to @lbeltrame for implementing this
feature and doing a ton of work debugging.
- Swap pandas `from_csv` call to `read_csv`.
- Make STAR respect the `transcriptome_gtf` option.
- Prefix regular expressions with r (raw strings). Thanks to @smoe for finding all of these.
- Add informative logging messages at beginning of bcbio run. Includes the version
and the configuration files being used.
- Swap samtools mpileup to use bcftools mpileup as samtools mpileup is being
deprecated (https://github.com/samtools/samtools/releases/tag/1.9).
- Ensure locale is set to one supporting UTF-8 bcbio-wide. This may need to get
reverted if it introduces issues.
- Added hg38 support for STAR. We did this by taking hg38 and removing the alts,
decoys and HLA sequences.
- Added support for the arriba fusion caller.
- Added back programs missing from the version provenance file. Fixed formatting
problems introduced by switch to python3.
- Added initial support for whole genome bisulfite sequencing using bismark. Thanks to
@hackdna for implementing this and @jnhutchinson for drafting the initial
pipeline. This is a work in progress in collaboration with @gcampanella, who
has a similar implementation with some extra features that we will be merging
in soon.
- qualimap for RNA-seq runs on the downsampled BAM files by default. Set
`tools_on: [qualimap_full]` to run on the full BAM files.
- Add STAR junction files to the files captured at the end of a run.
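
The raw-string change for regular expressions noted above is easy to underestimate, so here is a small sketch of the kind of bug it prevents. The pattern and sample text are made up for illustration and do not come from the bcbio code base.

```python
import re

text = "reads on chrM and chr1"

# In a normal string literal, "\b" is the backspace character (0x08),
# so this pattern looks for backspace + "chrM" + backspace.
plain = re.compile("\bchrM\b")

# With the r prefix the backslash reaches the regex engine untouched,
# and \b means "word boundary" as intended.
raw = re.compile(r"\bchrM\b")

plain_hit = plain.search(text)
raw_hit = raw.search(text)

print(plain_hit)         # no match: the text contains no backspace characters
print(raw_hit.group(0))  # the intended match
```

Escapes such as `\d` that Python does not recognize happen to survive in normal strings, but recent Python versions warn about them, which is another reason to use the r prefix consistently.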