Releases · bcbio/bcbio-nextgen

15 Dec 03:37

naumenko-sa

v1.2.9

a2cbdd1

v1.2.9 Latest

Fix vcf header bug: T/N SAMPLE lines are back - needed for import to SolveBio
add strandedness: auto for -l A option in salmon
report 10x more peaks in ChIP/ATAC-seq - use a q-value cutoff of 0.05
fix misleading RNA-seq duplicated reads statistics: thanks @sib-bcf
reorganize conda environments
snpEff 5.0
strandedness: auto
document WGBS pipeline steps
make --local an option, not default in bismark alignment - too slow
bcbioRNASeq update to 0.3.44
pureCN update to 2.0.1
octopus update to 0.7.4

Contributors

sib-bcf

14 Apr 13:29

naumenko-sa

v1.2.8

e1192f4

v1.2.8

Set ENCODE library complexity flags properly for ChIP-seq. Thanks to @mistrm82.
Fix greylisted peaks not being propagated to the output directory. Thanks to @mistrm82.
Better error message when no sample barcodes are found for single-cell RNA-seq.
Better trimming for two WGBS kits
enable setting parameters for deduplicate_bismark
custom threading for bismark via YAML
reproducible WGBS user story with the data from ENCODE
During consensus peak calling, keep the highest scoring peak instead of calling the summit for the highest scoring peak and expanding the peak to 250 bases.
Enable consensus peak calling for broad peaks. Thanks to @mistrm82 and @yoonsquared for pointing out this was missing.
Re-enable ATAC-seq tests, they work now.
svprioritize for mm10
purecn_Dx.R: mutational signatures still require a manual update of deconstructsigs or a new release of it
make sure purecn uses sv_regions bed to call variants
fix misleading disambiguation fastqc read statistics (total, hg38, mm10)
wgbs: nebemseq kit: add --maxins 1000 and --local to bismark align
WGBS: sorted indexed deduplicated bam for ready.bam
print error message when aligner: false and hla typing is on
make sure that mark_duplicates is false with collapsed UMI input
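
The consensus peak change in this release (keep the highest scoring peak as-is rather than re-centering on its summit and expanding to 250 bases) can be illustrated with a short sketch. This is not bcbio's implementation: `Peak` and `consensus_peaks` are hypothetical names, and the sketch assumes that peaks from all samples are pooled, overlapping peaks are grouped, and the single highest scoring member of each group is reported with its original coordinates.

```python
from typing import List, NamedTuple, Optional


class Peak(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    score: float


def consensus_peaks(peaks: List[Peak]) -> List[Peak]:
    """Group overlapping peaks and keep the highest scoring peak of each group."""
    result: List[Peak] = []
    best: Optional[Peak] = None   # best-scoring peak in the current group
    group_chrom = None            # chromosome of the current group
    group_end = -1                # right edge of the current group
    for peak in sorted(peaks, key=lambda p: (p.chrom, p.start, p.end)):
        overlaps = peak.chrom == group_chrom and peak.start < group_end
        if overlaps and best is not None:
            # Extend the group and remember the better peak, coordinates untouched.
            group_end = max(group_end, peak.end)
            if peak.score > best.score:
                best = peak
        else:
            # Close the previous group (if any) and start a new one.
            if best is not None:
                result.append(best)
            best, group_chrom, group_end = peak, peak.chrom, peak.end
    if best is not None:
        result.append(best)
    return result


pooled = [
    Peak("chr1", 100, 400, 12.0),
    Peak("chr1", 350, 600, 30.5),
    Peak("chr1", 900, 1100, 8.0),
    Peak("chr2", 100, 300, 5.0),
]
kept = consensus_peaks(pooled)
for peak in kept:
    print(peak)
```

In the example the two overlapping chr1 peaks collapse to the one scoring 30.5, and it keeps its own start and end instead of a fixed-width window around the summit.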

23 Feb 03:07

roryk

v1.2.7

c862626

v1.2.7

RNASeq: Add gene body coverage plots to multiqc report.
Restore ability to opt out of contamination checking via tools_off.
Properly invoke threading for verifybamid2.
Fix circular import issue when using bcbio functions outside of the main bcbio script.
Enable setting custom PureCN options via YAML file.

05 Feb 00:03

roryk

v1.2.6

c0db51a

v1.2.6

RNASeq: Fail more gracefully if SummarizedExperiment object cannot be created.
Fixes to handle DRAGEN BAM files from the first stage of UMI processing.
Fix issue with double-annotating with dbSNP. Separating out somatic variant annotation into its own vcfanno configuration.

09 Jan 16:57

roryk

v1.2.5

ec84064

v1.2.5

1.2.5 (01 January 2021)

Joint calling for RNA-seq variant calling requires setting jointcaller to bring it in line
with the configuration options for variant calling.
Allow pre-aligned BAMs and gVCFs for RNA-seq joint variant calling. Thanks to @WimSpree for the
feature.
Allow CollectSequencingArtifacts to be turned off via tools_off: [collectsequencingartifacts].
Fix getiterator -> iter deprecation in ElementTree. Thanks to @smoe.
Add SummarizedExperiment object from RNA-seq runs, a simplified version of the bcbioRNASeq object.
Add umi_type: dragen. This enables bcbio to run with first-pass, pre-consensus called UMI BAM files from DRAGEN.
Turn off inferential replicate loading when creating the gene x sample RNA-seq count matrix. This allows loading of thousands of RNA-seq samples.
Only make isoform to gene file from express if we have run express.
Allow "no consensus peaks found" as a valid endpoint of a ChIP-seq analysis.
Allow BCBIO_TEST_DIR environment variable to control where tests end up.
Collect OxoG and other sequencing artifacts due to damage.
Round tximport estimated counts.
Turn off consensus peak calling for broad peaks. Thanks to @lbeltrame and @LMannarino for diagnosing the broad-peaks-run-forever bug.
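
Several entries above name sample configuration options (`jointcaller`, `tools_off: [collectsequencingartifacts]`, `umi_type: dragen`). The sketch below shows where such keys might sit, written as a Python dictionary that mirrors the YAML layout. It assumes the options live under the `algorithm` section of each `details` entry; the sample names, the `jointcaller` value, and the `tool_is_off` helper are illustrative assumptions, not taken from the release notes.

```python
import json

# Hypothetical sample entries; placing the keys under "algorithm" is an assumption.
config = {
    "details": [
        {
            "description": "rna_sample",  # hypothetical sample name
            "analysis": "RNA-seq",
            "algorithm": {
                "variantcaller": "gatk-haplotype",
                # Joint calling now has to be requested explicitly (assumed value).
                "jointcaller": "gatk-haplotype-joint",
            },
        },
        {
            "description": "umi_sample",  # hypothetical sample name
            "analysis": "variant2",
            "algorithm": {
                # Pre-consensus UMI BAM files from DRAGEN.
                "umi_type": "dragen",
                # Opt out of CollectSequencingArtifacts.
                "tools_off": ["collectsequencingartifacts"],
            },
        },
    ]
}


def tool_is_off(sample: dict, tool: str) -> bool:
    """Return True if the sample lists the tool under algorithm.tools_off."""
    return tool in sample.get("algorithm", {}).get("tools_off", [])


print(json.dumps(config, indent=2))
print(tool_is_off(config["details"][1], "collectsequencingartifacts"))
```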

21 Sep 19:34

roryk

v1.2.4

aff0de6

v1.2.4

1.2.4 (21 September 2020)

Remove deprecated --genomicsdb-use-vcf-codec option as this is now the default.
Add bismark output to MultiQC.
Fix PS genotype field from octopus to have the correct type.
Edit VarDict headers to report VCFv4.2, since htsjdk does not fully support VCFv4.3 yet.
Attempt to speed up bismark by implementing the parallelization strategy suggested here: FelixKrueger/Bismark#96
Add --enumerate option to OptiType to report the top 10 calls and scores, to make it easier to decide how confident we are in
a HLA call.
Performance improvements when HLA calling during panel sequencing. This skips running bwa-kit during the initial
mapping for consensus UMI detection, greatly speeding up panel sequencing runs.
Allow custom options to be passed to featureCounts.
Fix race condition when running tests.
Add TOPMed as a datatarget.
Add predicted transcript and peptide output to arriba.
Add mm10 as a supported genome for arriba.
Skip bcbioRNASeq for more than 100 samples.
Add rRNA_pseudogene as a rRNA biotype.
Add --genomicsdb-use-vcf-codec when running GenotypeGVCFs. See https://gatk.broadinstitute.org/hc/en-us/articles/360040509751-GenotypeGVCFs#--genomicsdb-use-vcf-codec for
a discussion. Thanks to @amizeranschi for finding the issue and posting the solution.
update VEP to v100
Add consensus peak calling using https://bedops.readthedocs.io/en/latest/content/usage-examples/master-list.html
to collapse overlapping peaks.
Pre-filter consensus peaks by removing peaks with FDR > 0.05 before performing consensus peak calling.
Add support for Qiagen's Qiaseq UPX 3' transcriptome kit for DGE. Support for 96 and 384 well configurations
by specifying umi_type: qiagen-upx-96 or umi_type: qiagen-upx-384.
Add consensus peak counting using featureCounts.
Skip using autosomal-reference when calling ataqv for mouse/human, as this has a problem with ataqv
(see ParkerLab/ataqv#10 for discussion and followup).
Add pre-generated ataqv HTML report to upload directory.
Support single-end reads for ATAC-seq.
Move featureCount output files to featureCounts directory in project directory.
Remove RNA and reads in peak stats from MultiQC table when they are not calculated for a pipeline.
Only show somatic variant counts in the general stats table, if germline variants are calculated.
Add kit parameter for setting options for pipelines via just listing the kit. Currently only implemented for WGBS.
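
The pre-filtering step listed above (removing peaks with FDR > 0.05 before consensus peak calling) can be sketched as follows. This is a minimal illustration, not bcbio's code: it assumes MACS2-style narrowPeak input, where column 9 holds -log10(q-value), and `filter_peaks_by_fdr` is a hypothetical name.

```python
from typing import Iterable, Iterator


def filter_peaks_by_fdr(lines: Iterable[str], max_fdr: float = 0.05) -> Iterator[str]:
    """Yield narrowPeak lines whose q-value is at most max_fdr."""
    for line in lines:
        # Skip blank lines and header-style lines.
        if not line.strip() or line.startswith(("#", "track")):
            continue
        fields = line.rstrip("\n").split("\t")
        neg_log10_q = float(fields[8])  # column 9: -log10(q-value)
        if 10 ** (-neg_log10_q) <= max_fdr:
            yield line


example = [
    "chr1\t100\t400\tpeak_1\t250\t.\t8.1\t6.2\t4.0\t150",  # -log10(q) = 4.0
    "chr1\t900\t1100\tpeak_2\t20\t.\t2.0\t1.5\t0.5\t90",   # -log10(q) = 0.5
]
kept = list(filter_peaks_by_fdr(example))
for line in kept:
    print(line)
```

Only `peak_1` survives here, since a -log10(q-value) of 0.5 corresponds to a q-value of roughly 0.32.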

07 Apr 16:35

roryk

v1.2.3

f0d6a88

v1.2.3

Hotfix for not being able to upgrade from stable distribution.

05 Apr 22:53

roryk

v1.2.2

4b88065

v1.2.2

Fix for not properly looking up R environment variables in the base environment.
Remove --use-new-qual-calculator which was eliminated in GATK 4.1.5.0.
Ensure header is not written for a Series. In pandas 0.24.0 the default for header was changed from
False to True so we have to set it explicitly now.
Remove unused Dockerfile. Thanks to @matthdsm.
ATAC-seq: Skip peak-calling on fractions with < 1000 reads.

25 Mar 18:01

roryk

v1.2.1

48139c4

v1.2.1

Update ChIP and ATAC bowtie2 runs to use --very-sensitive.
Properly pad TSS BED file for ataqv TSS enrichment metrics.
Skip bcbioRNASeq if there are fewer than three samples.
Run joint-calling with single cores to save resources.
Re-support PureCN.
Skip segments with no informative SNPs when creating the LOH VCF file from PureCN output.
Fix for duplicated output for mosdepth in quality control report.
Fix for missing rRNA statistics.

07 Feb 20:02

roryk

v1.2.0

9289bdf

v1.2.0

Fix for bismark not being a supported aligner.
Run ataqv (https://github.com/ParkerLab/ataqv) to calculate additional ATAC-seq quality control
metrics.
Workaround for some bcbioRNASeq plots failing with many samples when interesting_groups is not set.
Add known_fusions parameter for passing in known fusions to arriba.
Fix for tx2gene not working properly on some GTF files.
Sort MACS2 output with UNIX sort to avoid memory issues.
Run RiP on full peak file for ATAC-seq.
Run ataqv on unfiltered BAM file with the full peak file.
Run peddy on the population variant file, not the individual sample level file if joint calling was done.
Add STAR to MultiQC metrics.
Throw an error if STAR is run on a genome with alts.
Don't run bcbioRNASeq if there is only one sample. Thanks to @kmendler for the suggestion.
Improve arriba sensitivity by setting --peOverlapNbasesMin 10 and --alignSplicedMateMapLminOverLmate 0.5 when
running STAR (see suhrig/arriba#41).
Make TPM and counts files from tximport automatically.
Use --keepDuplicates when making the Salmon index. This keeps transcripts that are identical in the index instead of
randomly choosing one. This helps when comparing to other ways of quantifying the transcripts, ensuring all of
the transcripts are represented.
Remove unnecessary "quant" subdirectory for Salmon runs. This allows MultiQC to properly name the samples.
Ensure STAR log file is propagated to the upload directory.
Fix issue with memory not being specified properly when running bcbio_prepare_samples.py.
Run tximport automatically and store TPM in project/date/tpm and counts in project/date/counts.
Calculate ENCODE quality flags for ATAC-seq. See https://www.encodeproject.org/data-standards/terms/#library for a
description of what the metrics mean.
Fix for command line being too long while joint genotyping thousands of samples.
Fix for command line being too long when running the CWL workflow with cromwell.
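
The ENCODE library complexity metrics behind the quality flags mentioned above are NRF, PBC1 and PBC2. The sketch below computes them from a list of read locations. It is an illustration under stated assumptions, not bcbio's code: `library_complexity` is a hypothetical name, and the input is assumed to be already reduced to one (chromosome, position, strand) tuple per uniquely mapped read. The thresholds that turn these values into flags are described on the ENCODE page linked above and are not reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Location = Tuple[str, int, str]  # (chromosome, 5' position, strand)


def library_complexity(reads: Iterable[Location]) -> Dict[str, float]:
    """Compute NRF, PBC1 and PBC2 from per-read genomic locations."""
    counts = Counter(reads)
    total = sum(counts.values())  # all reads
    if total == 0:
        raise ValueError("no reads supplied")
    distinct = len(counts)        # distinct locations
    m1 = sum(1 for n in counts.values() if n == 1)  # locations with exactly one read
    m2 = sum(1 for n in counts.values() if n == 2)  # locations with exactly two reads
    return {
        "NRF": distinct / total,    # non-redundant fraction
        "PBC1": m1 / distinct,      # PCR bottlenecking coefficient 1
        "PBC2": m1 / m2 if m2 else float("inf"),  # PCR bottlenecking coefficient 2
    }


reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 250, "-"),
    ("chr1", 400, "+"),
    ("chr2", 50, "+"),
]
metrics = library_complexity(reads)
print(metrics)
```

With five reads at four distinct locations, three seen once and one seen twice, the example gives NRF 0.8, PBC1 0.75 and PBC2 3.0.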
