add setup section
semiller10 committed Sep 6, 2024
1 parent a3e9172 commit 4eb0414
Showing 1 changed file with 34 additions and 10 deletions.
44 changes: 34 additions & 10 deletions anvio/docs/programs/anvi-reaction-network.md
@@ -1,25 +1,49 @@
This program **stores a metabolic %(reaction-network)s in a %(contigs-db)s or %(pan-db)s.**

The network consists of data on biochemical reactions predicted to be encoded by the genome or pangenome, referencing the [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) and [ModelSEED Biochemistry](https://github.com/ModelSEED/ModelSEEDDatabase) databases.
The network consists of data on biochemical reactions predicted to be encoded by the genome or pangenome.

Information on the predicted reactions and the involved metabolites is stored in two tables of the %(contigs-db)s or %(pan-db)s. The program, %(anvi-get-metabolic-model-file)s, can be used to export the %(reaction-network)s from the database to a %(reaction-network-json)s file formatted for flux balance analysis.
Information on the predicted reactions and the involved metabolites is stored in tables of the %(contigs-db)s or %(pan-db)s. The program, %(anvi-get-metabolic-model-file)s, can be used to export the %(reaction-network)s from the database to a %(reaction-network-json)s file formatted for input into programs for flux balance analysis.

## Usage
## Setup

%(anvi-reaction-network)s takes either a %(contigs-db)s OR a %(pan-db)s and %(genomes-storage-db)s as required input. Genes stored within the %(contigs-db)s or %(genomes-storage-db)s must have KO protein annotations, which can be assigned by %(anvi-run-kegg-kofams)s.
%(anvi-setup-kegg-data)s downloads [binary relations files](https://www.genome.jp/brite/br08906) needed to construct a %(reaction-network)s from [KEGG Orthology (KO)](https://www.genome.jp/kegg/ko.html) sequence annotations. Make sure to run that program with the `--kegg-snapshot` option to use the newest snapshot of %(kegg-data)s, [`v2024-08-30`](https://figshare.com/articles/dataset/KEGG_build_2024-08-30/26880559?file=48903154), which includes binary relations files.

{{ codestart }}
anvi-setup-kegg-data --kegg-snapshot v2024-08-30
{{ codestop }}

The KO and ModelSEED Biochemistry databases must be set up and available to the program. By default, these are expected to be set up in default anvi'o data directories. %(anvi-setup-kegg-data)s and %(anvi-setup-modelseed-database)s must be run to set up these databases.
%(anvi-setup-modelseed-database)s sets up the ModelSEED Biochemistry database, which harmonizes biochemical data from various reference databases, including KEGG.

{{ codestart }}
anvi-reaction-network -c /path/to/contigs-db
anvi-setup-modelseed-database
{{ codestop }}

Custom locations for the reference databases can be provided with the flags, `--ko-dir` and `--modelseed-dir`.
### Download newest available KEGG files

Alternatively, KEGG data including binary relations files can be set up not from a snapshot but by downloading the newest files available from KEGG using the `-D` flag. In the following command, a higher number of download threads than the default of 1 is provided by `-T`, which significantly speeds up downloading.

{{ codestart }}
anvi-reaction-network -c /path/to/contigs-db \
--ko-dir /path/to/set-up/ko-dir \
--modelseed-dir /path/to/set-up/modelseed-dir
anvi-setup-kegg-data -D -T 5
{{ codestop }}

### Install in non-default location

At the moment, KEGG data that includes binary relations files does _not_ include "stray" KOs (see %(anvi-setup-kegg-data)s) due to changes in the available model files. To preserve KEGG data that you already have set up, for this reason or another, the new snapshot or download can be placed in a non-default location using the option, `--kegg-data-dir`.

{{ codestart }}
anvi-setup-kegg-data --kegg-snapshot v2024-08-30 --kegg-data-dir path/to/other/directory
{{ codestop }}

`anvi-reaction-network` requires a `--kegg-dir` argument to seek KEGG data in a non-default location.

Likewise, different versions of the ModelSEED Biochemistry database can be set up in non-default locations and used with the `--modelseed-dir` argument.
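
Putting the two options together, a run against reference data in non-default locations might look like the sketch below. The paths are placeholders rather than real locations, and the snippet assumes that `--kegg-dir` and `--modelseed-dir` can be combined in a single call. The command is only printed (a dry run), so it is safe to paste before the directories exist.

```shell
# Placeholder paths (assumptions for illustration); replace with your own.
CONTIGS_DB="/path/to/contigs-db"
KEGG_DIR="path/to/other/directory"
MODELSEED_DIR="path/to/modelseed/directory"

# Assemble the command as a string and print it instead of executing it.
NETWORK_CMD="anvi-reaction-network -c $CONTIGS_DB --kegg-dir $KEGG_DIR --modelseed-dir $MODELSEED_DIR"
echo "$NETWORK_CMD"
```

Once the printed command looks right, it can be run directly in a shell where anvi'o is installed.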

## Usage

%(anvi-reaction-network)s takes either a %(contigs-db)s OR a %(pan-db)s and %(genomes-storage-db)s as required input. Genes stored within the %(contigs-db)s or %(genomes-storage-db)s must have KO protein annotations, which can be assigned by %(anvi-run-kegg-kofams)s.

{{ codestart }}
anvi-reaction-network -c /path/to/contigs-db
{{ codestop }}

If a %(contigs-db)s already contains a %(reaction-network)s from a previous run of this program, the flag `--overwrite-existing-network` can overwrite the existing network with a new one. For example, if %(anvi-run-kegg-kofams)s is run again on a database using a newer version of KEGG, then %(anvi-reaction-network)s should be rerun to update the %(reaction-network)s derived from the KO annotations.
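
As a rough illustration of the rerun described above, the flag is simply appended to the basic command. The database path is a placeholder, and the command is printed rather than executed so that no existing network is overwritten by accident.

```shell
# Placeholder path (assumption for illustration); replace with your own.
CONTIGS_DB="/path/to/contigs-db"

# Dry run: build and print the command that would replace the stored network.
OVERWRITE_CMD="anvi-reaction-network -c $CONTIGS_DB --overwrite-existing-network"
echo "$OVERWRITE_CMD"
```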