Commit

Merge pull request #62 from nf-core/initial-release-review-changes
Apply the second set of reviewer recommendations
scwatts authored Jul 5, 2024
2 parents 26abf3f + bbf66f6 commit 1f30f98
Showing 7 changed files with 57 additions and 48 deletions.
16 changes: 14 additions & 2 deletions .bumpversion.cfg
@@ -9,5 +9,17 @@ search = version = '{current_version}'
replace = version = '{new_version}'

[bumpversion:file:README.md]
-search = -revision v{current_version}
-replace = -revision v{new_version}
+search = -revision {current_version}
+replace = -revision {new_version}
+
+[bumpversion:file (example commands):docs/usage.md]
+search = -revision {current_version}
+replace = -revision {new_version}
+
+[bumpversion:file (urls):docs/usage.md]
+search = /{current_version}/
+replace = /{new_version}/
+
+[bumpversion:file (templated example):docs/usage.md]
+search = {current_version}`
+replace = {new_version}`
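
The three `docs/usage.md` sections above differ only in their search and replace templates. A minimal Python sketch of how such rules could be applied follows, assuming plain substring replacement once the placeholders are expanded. The `bump` helper, the sample text, and the target version `1.0.1` are hypothetical, and this is not bump2version's actual implementation.

```python
def bump(text: str, search: str, replace: str, current: str, new: str) -> str:
    """Apply one search/replace rule after expanding the version placeholders."""
    versions = {"current_version": current, "new_version": new}
    old = search.format(**versions)
    updated = replace.format(**versions)
    if old not in text:
        # Fail loudly rather than silently leaving a stale version behind.
        raise ValueError(f"pattern not found: {old!r}")
    return text.replace(old, updated)


# Templates copied from the docs/usage.md sections of .bumpversion.cfg.
RULES = [
    ("-revision {current_version}", "-revision {new_version}"),
    ("/{current_version}/", "/{new_version}/"),
    ("{current_version}`", "{new_version}`"),
]

# Hypothetical excerpt containing one match for each rule.
SAMPLE = (
    "nextflow run nf-core/oncoanalyser -revision 1.0.0\n"
    "https://github.com/nf-core/oncoanalyser/blob/1.0.0/lib/Constants.groovy\n"
    "find the latest pipeline version (eg. `1.0.0`)\n"
)

bumped = SAMPLE
for search, replace in RULES:
    bumped = bump(bumped, search, replace, current="1.0.0", new="1.0.1")
print(bumped)
```

Keeping each pattern narrow (a flag prefix, surrounding slashes, a trailing backtick) makes it less likely that an unrelated occurrence of the version string is rewritten.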
14 changes: 7 additions & 7 deletions README.md
@@ -39,7 +39,7 @@ For detailed information on each component of the Hartwig workflow, please refer

## Pipeline summary

-The following processes and tools can be run with oncoanalyser:
+The following processes and tools can be run with `oncoanalyser`:

- Simple DNA/RNA alignment (`bwa-mem2`, `STAR`)
- Post-alignment processing (`MarkDups`, `Picard MarkDuplicates`)
@@ -69,7 +69,7 @@ P1__wgts,P1,SB,tumor,dna,fastq,library_id:SB_library;lane:001,/path/to/SB.tumor.
P1__wgts,P1,SC,tumor,rna,fastq,library_id:SC_library;lane:001,/path/to/SC.tumor.rna.wts.001.R1.fastq.gz;/path/to/SC.tumor.rna.wts.001.R2.fastq.gz
```

-Launch oncoanalyser:
+Launch `oncoanalyser`:

```bash
nextflow run nf-core/oncoanalyser \
@@ -96,7 +96,7 @@ For more details about the output files and reports, please refer to the

### Extended support

-As oncoanalyser is used in clinical settings and subject to accreditation standards in some instances, there is a need
+As `oncoanalyser` is used in clinical settings and subject to accreditation standards in some instances, there is a need
for long-term stability and reliability for feature releases in order to meet operational requirements. This is
accomplished through long-term support of several nominated feature releases, which all receive bug fixes and security
fixes during the period of extended support.
@@ -111,22 +111,22 @@ Versions nominated to have current long-term support:

### Release parity

-Versioning between oncoanalyser and hmftools naturally differ, however it is often necessary to relate the functional
+Versioning between `oncoanalyser` and hmftools naturally differ, however it is often necessary to relate the functional
equivalence of these two pieces of software. The functional/feature parity with regards to version releases are detailed
in the below table.

| oncoanalyser | hmftools |
| ------------------- | -------- |
| 0.1.0 through 0.2.7 | 5.33 |
-| 0.3.0 through 0.4.5 | 5.34 |
+| 0.3.0 through 1.0.0 | 5.34 |

## Known issues

There are currently no known issues.

## Credits

-The oncoanalyser pipeline was written by Stephen Watts while in the [Genomics Platform
+The `oncoanalyser` pipeline was written by Stephen Watts while in the [Genomics Platform
Group](https://mdhs.unimelb.edu.au/centre-for-cancer-research/our-research/genomics-platform-group) at the [University
of Melbourne Centre for Cancer Research](https://mdhs.unimelb.edu.au/centre-for-cancer-research).

@@ -146,7 +146,7 @@ channel](https://nfcore.slack.com/channels/oncoanalyser) (you can join with [thi

## Citations

-You can cite the oncoanalyser zenodo record for a specific version using the following doi:
+You can cite the `oncoanalyser` zenodo record for a specific version using the following doi:
[10.5281/zenodo.XXXXXXX](https://doi.org/10.5281/zenodo.XXXXXXX)

An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md)
2 changes: 0 additions & 2 deletions assets/methods_description_template.yml
@@ -3,8 +3,6 @@ description: "Suggested text and references to use when describing pipeline usag
section_name: "nf-core/oncoanalyser Methods Description"
section_href: "https://github.com/nf-core/oncoanalyser"
plot_type: "html"
-## TODO nf-core: Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
-## You inject any metadata in the Nextflow '${workflow}' object
data: |
<h4>Methods</h4>
<p>Data was processed using nf-core/oncoanalyser v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
16 changes: 8 additions & 8 deletions assets/samplesheet.csv
@@ -1,12 +1,12 @@
group_id,subject_id,sample_id,sample_type,sequence_type,filetype,filepath
-subject_one__to__dna,subject_one,sample_a,tumor,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_a.tumor.bam
+subject_one__to__dna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam

-subject_one__tn__dna,subject_one,sample_a,tumor,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_a.tumor.bam
-subject_one__tn__dna,subject_one,sample_b,normal,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_b.normal.bam
+subject_one__tn__dna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam
+subject_one__tn__dna,subject_one,sample_b,normal,dna,bam,/path/to/subject_one/sample_b.normal.bam

-subject_one__tn__dna_rna,subject_one,sample_a,tumor,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_a.tumor.bam
-subject_one__tn__dna_rna,subject_one,sample_b,normal,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_b.normal.bam
-subject_one__tn__dna_rna,subject_one,sample_c,tumor,rna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_c.tumor_rna.bam
+subject_one__tn__dna_rna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam
+subject_one__tn__dna_rna,subject_one,sample_b,normal,dna,bam,/path/to/subject_one/sample_b.normal.bam
+subject_one__tn__dna_rna,subject_one,sample_c,tumor,rna,bam,/path/to/subject_one/sample_c.tumor_rna.bam

-subject_one__to__dna_rna,subject_one,sample_a,tumor,dna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_a.tumor.bam
-subject_one__to__dna_rna,subject_one,sample_c,tumor,rna,bam,/Users/stephen/repos/oncoanalyser/subject_one/sample_c.tumor_rna.bam
+subject_one__to__dna_rna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam
+subject_one__to__dna_rna,subject_one,sample_c,tumor,rna,bam,/path/to/subject_one/sample_c.tumor_rna.bam
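
Each row of the example samplesheet belongs to a group, and the combination of sample types and sequence types within a group describes the inputs available for that analysis (for example tumor/normal DNA with tumor RNA). The following Python sketch groups rows in that way with the standard `csv` module. `summarise_groups` is a hypothetical helper for illustration and does not reflect how the pipeline itself parses the samplesheet.

```python
import csv
import io
from collections import defaultdict

# Two groups taken from the updated assets/samplesheet.csv.
SAMPLESHEET = """\
group_id,subject_id,sample_id,sample_type,sequence_type,filetype,filepath
subject_one__to__dna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam

subject_one__tn__dna_rna,subject_one,sample_a,tumor,dna,bam,/path/to/subject_one/sample_a.tumor.bam
subject_one__tn__dna_rna,subject_one,sample_b,normal,dna,bam,/path/to/subject_one/sample_b.normal.bam
subject_one__tn__dna_rna,subject_one,sample_c,tumor,rna,bam,/path/to/subject_one/sample_c.tumor_rna.bam
"""


def summarise_groups(text: str) -> dict[str, list[tuple[str, str]]]:
    """Collect the (sample_type, sequence_type) pairs present in each group."""
    groups = defaultdict(set)
    # DictReader skips the blank separator lines between groups.
    for row in csv.DictReader(io.StringIO(text)):
        groups[row["group_id"]].add((row["sample_type"], row["sequence_type"]))
    return {group_id: sorted(inputs) for group_id, inputs in groups.items()}


for group_id, inputs in summarise_groups(SAMPLESHEET).items():
    print(group_id, inputs)
```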
4 changes: 1 addition & 3 deletions conf/targeted_parameters.config
@@ -1,6 +1,4 @@
-process.'withName:^.*:COBALT_PROFILING:COBALT'.ext.args = [
-'-pcf_gamma 50',
-].join(' ').trim()
+process.'withName:^.*:COBALT_PROFILING:COBALT'.ext.args = '-pcf_gamma 50'

process.'withName:^.*:SAGE_CALLING:SOMATIC'.ext.args = [
'-high_depth_mode',
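
The COBALT change above preserves behaviour: joining a one-element list with spaces and trimming the result yields the element itself. A rough Python analogue of the Groovy expression follows, offered only as an illustration since the config itself is Groovy.

```python
# Python analogue of the Groovy `[...].join(' ').trim()` idiom used in the config.
old_style = " ".join(["-pcf_gamma 50"]).strip()
new_style = "-pcf_gamma 50"

# With a single argument both forms produce the same string.
print(old_style == new_style)
```

The list form remains useful where several arguments are set, as in the SAGE block above.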
4 changes: 2 additions & 2 deletions docs/output.md
@@ -78,13 +78,13 @@ output/

### Simple DNA/RNA alignment

-Alignment functionality in oncoanalyser is simple and rigid, and exists only to meet the exact requirements of the
+Alignment functionality in `oncoanalyser` is simple and rigid, and exists only to meet the exact requirements of the
hmftools.

#### bwa-mem2

[bwa-mem2](https://github.com/bwa-mem2/bwa-mem2) is a short-read mapping tool used to align reads to a large reference
-sequences. In oncoanalyser, bwa-mem2 is used to align DNA reads to the human genome.
+sequences. In `oncoanalyser`, bwa-mem2 is used to align DNA reads to the human genome.

_No outputs are published directly from bwa-mem2, see [MarkDups](#markdups) for the fully processed alignment outputs_

49 changes: 25 additions & 24 deletions docs/usage.md
@@ -6,20 +6,20 @@
## Introduction

-The oncoanalyser pipeline typically runs from FASTQs or BAMs and supports two modes: (1) whole genome and/or
+The `oncoanalyser` pipeline typically runs from FASTQs or BAMs and supports two modes: (1) whole genome and/or
transcriptome, and (2) targeted panel. Launching an analysis requires only the creation of a samplesheet that describes
details of each input such as the sample type (tumor or normal), sequence type (DNA or RNA), and filepath.

-Various aspects of an oncoanalyser analysis can be configured to fit a range of needs, and many of these are considered
-[advanced usage](#advanced-usage) of the pipeline. The most useful include:
+Various aspects of an `oncoanalyser` analysis can be configured to fit a range of needs, and many of these are
+considered [advanced usage](#advanced-usage) of the pipeline. The most useful include:

- precise process selection
- starting from existing data
- granular control over reference/resource files

-These features enable oncoanalyser to be run in a highly flexible way. For example, an analysis can be run with existing
-PURPLE data as the starting point and skip variant calling processes. Additionally, reference/resource files can be
-staged locally to optimise execution or modified to create user-defined driver gene panels.
+These features enable `oncoanalyser` to be run in a highly flexible way. For example, an analysis can be run with
+existing PURPLE data as the starting point and skip variant calling processes. Additionally, reference/resource files
+can be staged locally to optimise execution or modified to create user-defined driver gene panels.

:::danger

@@ -35,7 +35,7 @@ When starting from BAMs rather than FASTQ it is expected that:

## Supported analyses

-A variety of analyses are accessible in oncoanalyser and are implicitly run according to the data described in the
+A variety of analyses are accessible in `oncoanalyser` and are implicitly run according to the data described in the
samplesheet. The supported analysis types for each workflow are listed below.

| Input sequence data | WGS/WTS workflow | Targeted sequencing workflow<sup>\*</sup> |
@@ -50,7 +50,7 @@ samplesheet. The supported analysis types for each workflow are listed below.

## Samplesheet

-A samplesheet that contains information of each input in CSV format is needed to run oncoanalyser. The required input
+A samplesheet that contains information of each input in CSV format is needed to run `oncoanalyser`. The required input
details and columns are [described below](#column-descriptions).

Several different input filetypes beyond FASTQ and BAM are recognised, including intermediate output files generated
@@ -157,7 +157,7 @@ This will launch the pipeline with the `docker` configuration profile. See below

:::note

-Reference data will be retrieved by oncoanalyser for every analysis run. It is therefore strongly recommended when
+Reference data will be retrieved by `oncoanalyser` for every analysis run. It is therefore strongly recommended when
running multiple analyses to pre-stage reference data locally to avoid it being retrieved multiple times. See [Staging
reference data](#staging-reference-data).

@@ -214,7 +214,7 @@ nextflow pull nf-core/oncoanalyser

It is a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.

-First, go to the [nf-core/oncoanalyser releases page](https://github.com/nf-core/oncoanalyser/releases) and find the latest pipeline version - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag.
+First, go to the [nf-core/oncoanalyser releases page](https://github.com/nf-core/oncoanalyser/releases) and find the latest pipeline version - numeric only (eg. `1.0.0`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.0.0`. Of course, you can switch to another version by changing the number after the `-r` flag.

This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports.

@@ -228,8 +228,8 @@ If you wish to share such profile (such as upload as supplementary material for

### Selecting processes

-Most of the major components in oncoanalyser can be skipped using `--processes_exclude` (the full list of available
-processes can be view [here](https://github.com/nf-core/oncoanalyser/blob/1.0.0/lib/Constants.groovy#L36-L56)).
+Most of the major components in `oncoanalyser` can be skipped using `--processes_exclude` (the full list of available
+processes can be viewed [here](https://github.com/nf-core/oncoanalyser/blob/1.0.0/lib/Constants.groovy#L36-L56)).
Multiple processes can be given as a comma-separated list. While there are some use-cases for this feature (e.g.
skipping resource intensive processes such as VIRUSBreakend), it becomes more powerful when combined with existing
inputs as described in the following section.
@@ -243,10 +243,11 @@ processes.

### Existing inputs

-The oncoanalyser pipeline has been designed to allow entry at arbitrary points, which is particularly useful in
-situations where previous outputs exist and re-running oncoanalyser is desired (e.g. to subsequently execute an
-optional sensor or use an upgrade component such as PURPLE). The primary advantage of this approach is that only the
-required processes are executed, reducing costs and runtimes by skipping unnecessary processes.
+The `oncoanalyser` pipeline has been designed to allow entry at arbitrary points, which is particularly useful in
+situations where previous outputs exist and re-running `oncoanalyser` is desired (e.g. to subsequently execute an
+optional sensor/workflow or re-run an analysis with an upgraded tool such as PURPLE). The primary advantage of this
+approach is that only the required processes are executed, reducing costs and runtimes by skipping unnecessary
+processes.

In order to effectively utilise this feature, existing inputs must be set in the [samplesheet](#samplesheet) and the
appropriate [processes selected](#selecting-processes). Take the below example where existing PURPLE inputs are used so
@@ -261,7 +262,7 @@ P1__wgts,P1,SB,tumor,dna,purple_dir,/path/to/P1.purple_dir/

:::note

-The original source input file (i.e. BAM or FASTQ) must always be provided for oncoanalyser to infer the correct
+The original source input file (i.e. BAM or FASTQ) must always be provided for `oncoanalyser` to infer the correct
analysis type.

:::
@@ -281,7 +282,7 @@ nextflow run nf-core/oncoanalyser \

:::warning

-Providing existing inputs will cause oncoanalyser to skip the corresponding process but _not any_ of the upstream
+Providing existing inputs will cause `oncoanalyser` to skip the corresponding process but _not any_ of the upstream
processes. It is the responsibility of the user to skip all relevant processes.

:::
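
Because only the process matching the supplied input is skipped, the upstream processes have to be listed explicitly. One way to derive that list is sketched below in Python. The dependency map and the process names in it are simplified assumptions for illustration; the authoritative names are those in the `Constants.groovy` list linked from the Selecting processes section.

```python
# Hypothetical, simplified map of each process to its direct upstream processes.
UPSTREAM = {
    "purple": ["amber", "cobalt", "pave", "gripss"],
    "pave": ["sage"],
    "gripss": ["gridss"],
    "amber": [],
    "cobalt": [],
    "sage": [],
    "gridss": [],
}


def upstream_of(process: str, graph: dict[str, list[str]]) -> list[str]:
    """Return every direct and indirect upstream process, sorted by name."""
    seen = set()
    stack = list(graph.get(process, []))
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            # Follow the chain further upstream.
            stack.extend(graph.get(name, []))
    return sorted(seen)


# Existing PURPLE output is supplied, so everything upstream of it can be excluded.
exclude = upstream_of("purple", UPSTREAM)
print("--processes_exclude " + ",".join(exclude))
```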
@@ -315,7 +316,7 @@ params {
}
```

-To use these hmftools resource file overrides in oncoanalyser the local bundle directory must be provided with
+To use these hmftools resource file overrides in `oncoanalyser` the local bundle directory must be provided with
`--ref_data_hmf_data_path`.

#### Customise other data
@@ -326,8 +327,8 @@ for the complete list.

#### Staging reference data

-Default reference data can be staged locally with oncoanalyser by providing a samplesheet for the desired analysis and
-setting the `--prepare_reference_only` argument. The samplesheet and oncoanalyser configuration will determine the
+Default reference data can be staged locally with `oncoanalyser` by providing a samplesheet for the desired analysis and
+setting the `--prepare_reference_only` argument. The samplesheet and `oncoanalyser` configuration will determine the
relevant reference data to download. For example the following command will download the `GRCh38_hmf` genome plus
indices, reference data, and databases required to run a WGTS analysis for tumor/normal DNA with tumor RNA:

@@ -353,7 +354,7 @@ Executing the above command will download and unpack default reference data with
complete the prepared reference files can found in `./prepare_reference/reference_data/1.0.0/<datetimestamp>/`. It is
recommended to remove the Nextflow work directory after staging data to free disk space.

-For oncoanalyser to use locally staged reference data a custom config can be used:
+For `oncoanalyser` to use locally staged reference data a custom config can be used:

```text title="refdata.local.config"
params {
@@ -428,13 +429,13 @@ params {
}
```

-Each index required for the analysis will first be created before running the rest of oncoanalyser with the following
+Each index required for the analysis will first be created before running the rest of `oncoanalyser` with the following
command:

:::note

In a process similar to [staging reference data](#staging-reference-data), you can first generate the required indexes
-by setting `--prepare_reference_only` and then provide the prepared reference files to oncoanalyser through a custom
+by setting `--prepare_reference_only` and then provide the prepared reference files to `oncoanalyser` through a custom
config file. This avoids having to regenerate indexes for each new analysis.

:::