Major tombo update. Added 6mA alternative model, improved re-squiggle via parameter tuning, per-read statistics output including random genomic access API, cleaner UX (python3 compatible, mappy/minimap2 genomic mapping, simplified options) and more. Fixes #24. Addresses issues #23, #22, #21, #19, #17 and #16.
marcus1487 committed Feb 13, 2018
1 parent 3980290 commit 1afadeb
Showing 44 changed files with 4,503 additions and 4,030 deletions.
5 changes: 4 additions & 1 deletion .travis.yml
@@ -1,6 +1,9 @@
language: python
python:
- "2.7"
- "3.4"
- "3.5"
- "3.6"
dist: trusty
sudo: required

@@ -38,4 +41,4 @@ deploy:
target_branch: gh-pages
on:
branch: master
python: "2.7"
python: "3.6"
97 changes: 43 additions & 54 deletions README.rst
@@ -2,7 +2,9 @@
Tombo Summary
=============

.. image:: https://travis-ci.org/nanoporetech/tombo.svg?branch=master
|travis_badge|

.. |travis_badge| image:: https://travis-ci.org/nanoporetech/tombo.svg?branch=master
:target: https://travis-ci.org/nanoporetech/tombo

Tombo is a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data.
@@ -19,9 +21,9 @@ Installation
:target: http://bioconda.github.io/recipes/ont-tombo/README.html

.. |pypi_badge| image:: https://badge.fury.io/py/ont-tombo.svg
:target: https://badge.fury.io/py/ont-tombo
:target: https://pypi.org/project/ont-tombo/

Basic tombo installation (python2.7 support only)
Basic tombo installation (python 2.7 and 3.4+ support)

::

@@ -30,7 +32,7 @@ Basic tombo installation (python2.7 support only)

# or install pip package (numpy install required before tombo for cython optimization)
pip install numpy
pip install ont-tombo
pip install ont-tombo[full]

..
@@ -51,32 +53,32 @@ Re-squiggle (Raw Data Alignment)

::

tombo resquiggle path/to/amplified/dna/fast5s/ genome.fasta --minimap2-executable ./minimap2 --processes 4
tombo resquiggle path/to/amplified/dna/fast5s/ genome.fasta --processes 4

..
Only R9.4/5 data is supported at this time.

DNA or RNA is automatically determined from FAST5s (set explicitly with `--dna` or `--rna`).
DNA or RNA is automatically determined from FAST5s (set explicitly with ``--dna`` or ``--rna``).

FAST5 files need not contain Events data, but must contain Fastq slot. See `annotate_raw_with_fastqs` for pre-processing of raw FAST5s.
FAST5 files need not contain Events data, but must contain Fastq slot. See ``annotate_raw_with_fastqs`` for pre-processing of raw FAST5s.

Identify Modified Bases
^^^^^^^^^^^^^^^^^^^^^^^

::

# comparing to an alternative 5mC model (recommended method)
# comparing to an alternative 5mC and 6mA model (recommended method)
tombo test_significance --fast5-basedirs path/to/native/dna/fast5s/ \
--alternate-bases 5mC --statistics-file-basename sample_compare
--alternate-bases 5mC 6mA --statistics-file-basename sample

# comparing to a control sample (e.g. PCR)
tombo test_significance --fast5-basedirs path/to/native/dna/fast5s/ \
--control-fast5-basedirs path/to/amplified/dna/fast5s/ --statistics-file-basename sample_compare
--control-fast5-basedirs path/to/amplified/dna/fast5s/ --statistics-file-basename sample_compare

# compare to the canonical base model
tombo test_significance --fast5-basedirs path/to/native/dna/fast5s/ \
--statistics-file-basename sample --processes 4
--statistics-file-basename sample_de_novo --processes 4

..
@@ -100,7 +102,7 @@ Extract Sequences Surrounding Modified Positions

::

tombo write_most_significant_fasta --statistics-filename sample_compare.5mC.tombo.stats \
tombo write_most_significant_fasta --statistics-filename sample.6mA.tombo.stats \
--genome-fasta genome.fasta

Plotting Examples
@@ -117,11 +119,11 @@ Plotting Examples
# plot raw signal at genome locations with the most significantly/consistently modified bases
tombo plot_most_significant --fast5-basedirs path/to/native/rna/fast5s/ \
--statistics-filename sample_compare.5mC.tombo.stats --plot-alternate-model 5mC
--statistics-filename sample.5mC.tombo.stats --plot-alternate-model 5mC
# plot per-read test statistics using the 5mC alternative model testing method
# plot per-read test statistics using the 6mA alternative model testing method
tombo plot_per_read --fast5-basedirs path/to/native/rna/fast5s/ \
--genome-locations chromosome:1000 chromosome:2000:- --plot-alternate-model 5mC
--genome-locations chromosome:1000 chromosome:2000:- --plot-alternate-model 6mA
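
The ``--genome-locations`` values in the example above appear to follow a ``chromosome:position[:strand]`` pattern. As a minimal sketch of how such strings could be split apart, assuming that reading of the format is correct: ``parse_genome_location`` is a hypothetical helper written for illustration and is not part of Tombo.

```python
def parse_genome_location(location):
    """Split a 'chrom:pos' or 'chrom:pos:strand' string into its parts.

    Hypothetical helper for illustration only; this is not Tombo's own parser.
    Returns (chrom, pos, strand), where strand is None when it is not given.
    """
    parts = location.split(":")
    if len(parts) == 2:
        chrom, pos = parts
        strand = None
    elif len(parts) == 3:
        chrom, pos, strand = parts
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-': %r" % location)
    else:
        raise ValueError("expected chrom:pos[:strand]: %r" % location)
    return chrom, int(pos), strand


# The two locations used in the plot_per_read example
for loc in ("chromosome:1000", "chromosome:2000:-"):
    print(parse_genome_location(loc))
```

This simple split assumes chromosome names contain no colons; names that do would need a different rule.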

===============
Common Commands
@@ -190,66 +192,47 @@ Read Filtering:
Note on Tombo Models
====================

Tombo is currently provided with two standard models (DNA and RNA) and one alternative model (DNA::5mC). These models are applicable only to R9.4/5 flowcells with 1D or 1D^2 kits (not 2D).
Tombo is currently provided with two standard models (DNA and RNA) and two alternative models (DNA::5mC, DNA::6mA). These models are applicable only to R9.4/5 flowcells with 1D or 1D^2 kits (not 2D).

These models are used by default for the re-squiggle and testing commands. The correct model is automatically selected for DNA or RNA based on the contents of each FAST5 file and processed accordingly. Additional models will be added in future releases.

============
Requirements
============

At least one supported mapper:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- minimap2 (https://github.com/lh3/minimap2)
- BWA-MEM (http://bio-bwa.sourceforge.net/)
- graphmap (https://github.com/isovic/graphmap)

- HDF5 (http://micro.stanford.edu/wiki/Install_HDF5#Install)
python Requirements (handled by conda or pip):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

python Requirements (handled by pip):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- numpy (must be installed before installing tombo)
- numpy
- scipy
- h5py
- cython
- mappy

Optional packages for plotting (install R packages with ``install.packages([package_name])`` from an R prompt):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- rpy2 (along with an R installation)
- ggplot2 (required for any plotting subcommands)
- cowplot (required for plot_motif_with_stats subcommand)
Optional packages (handled by conda, but not pip):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Optional packages for alternative model estimation:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Plotting Packages

+ R
+ rpy2
+ ggplot2
+ gridExtra (required for ``plot_motif_with_stats`` and ``plot_kmer`` subcommands)

- sklearn
- On-disk Random Fasta Access

+ pyfaidx
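
Since the optional packages listed above are not installed by a plain pip install, it can be useful to check which of them are importable before running plotting commands. The following is a rough sketch using only the standard library; the feature labels and the ``check_optional`` helper are assumptions made for illustration, and it checks only the Python-side modules (``rpy2``, ``pyfaidx``), not R itself or the R packages.

```python
import importlib.util

# Feature label -> top-level Python module name (labels are illustrative)
OPTIONAL_MODULES = {
    "plotting": "rpy2",
    "on-disk random fasta access": "pyfaidx",
}


def check_optional(modules=OPTIONAL_MODULES):
    """Return {feature: True/False} depending on whether each module can be found."""
    return {
        feature: importlib.util.find_spec(name) is not None
        for feature, name in modules.items()
    }


for feature, available in check_optional().items():
    print("%s: %s" % (feature, "available" if available else "missing"))
```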

Advanced Installation Instructions
----------------------------------

Install tombo with all optional dependencies (for plotting and model estimation)

::

pip install ont-tombo[full]

Install tombo with plotting dependencies (requires separate installation
of R packages ggplot2 and cowplot)

::

pip install ont-tombo[plot]

Install tombo with alternative model estimation dependencies
Minimal tombo installation without optional dependencies (enables re-squiggle, all modified base testing methods and text output)

::

pip install ont-tombo[alt_est]
pip install ont-tombo

Install github version of tombo (most versions on pypi should be up-to-date)
Install github version of tombo (versions on conda/pypi should be up-to-date)

::

@@ -267,4 +250,10 @@ http://biorxiv.org/content/early/2017/04/10/094672
Gotchas
=======

- If plotting commands fail referencing rpy2 images, shared object files, etc., this may be an issue with the version of libraries installed by conda. In order to resolve this issue, remove the conda-forge channel and re-install ont-tombo.
- The Tombo conda environment (especially with python 2.7) may have installation issues.

+ The first troubleshooting step would be to install in a python 3.4+ environment.
+ The R ``cowplot`` package was also causing several installation issues. As of Tombo version 1.2 the ``cowplot`` dependency has been replaced by the ``gridExtra`` package which should resolve this inter-dependency issue.
+ If python2 is a requirement, un-installing and re-installing the offending package may help.
+ Moving ``conda-forge`` to the end of the conda channel list (or removing it altogether) may help ``conda config --append channels conda-forge``.
+ In python 2.7 there is an issue with the conda scipy.stats package. Down-grading to version 0.17 fixes this issue.
Binary file added docs/_images/dampened_fraction.png
Binary file added docs/_images/roc.png
Binary file modified docs/_images/single_samp.png
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -54,7 +54,7 @@
#

# Get the version number from __init__.py
verstrline = open(os.path.join('..', __pkg_name__, '_version.py'), 'r').read()
verstrline = open(os.path.join('..', __pkg_name__, '_version.py'), 'r').readlines()[-1]
vsre = r"^TOMBO_VERSION = ['\"]([^'\"]*)['\"]"
mo = re.search(vsre, verstrline)
if mo:
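
The change in this hunk matters because ``re.search`` with a leading ``^`` (and no ``re.MULTILINE`` flag) only matches at the very start of the string, so searching the whole file text fails as soon as anything precedes the ``TOMBO_VERSION`` line. A small sketch of the last-line approach follows; ``parse_version`` and the sample file text are assumptions for illustration, not the actual contents of ``_version.py``.

```python
import re

# Same pattern as in docs/conf.py
VERSION_RE = r"^TOMBO_VERSION = ['\"]([^'\"]*)['\"]"


def parse_version(version_file_text):
    """Return the version string found on the last line, or None if absent."""
    lines = version_file_text.splitlines()
    if not lines:
        return None
    match = re.search(VERSION_RE, lines[-1])
    return match.group(1) if match else None


# Hypothetical file contents: one leading line, then the version assignment
sample = 'from __future__ import unicode_literals\nTOMBO_VERSION = "1.2"\n'

# Searching the whole text fails because '^' anchors to the start of the string
print(re.search(VERSION_RE, sample))
# Searching only the last line finds the version
print(parse_version(sample))
```

Passing ``re.MULTILINE`` to ``re.search`` would be an alternative that does not depend on the version being on the final line.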