Commit

Docs, missing clusterGenes fix. For issue #213 and update to PR #215.
ifiddes committed Nov 24, 2020
1 parent 551d22d commit f47dbdd
Showing 2 changed files with 15 additions and 2 deletions.
13 changes: 13 additions & 0 deletions README.md
@@ -135,6 +135,19 @@ As described above, the primary method to executing the pipeline is to follow th

`--workers`: Number of local cores to use. If running `toil` in singleMachine mode, care must be taken with the balance of this value and the `--maxCores` parameter.

## Augustus config options

The Augustus config files for all modes live in the CAT folder under `augustus_cfgs`. If you run CAT from a folder other than the installation folder, you must point CAT to these files directly.

`--tm-cfg`: Config file for AugustusTM. Defaults to `augustus_cfgs/extrinsic.ETM1.cfg`.

`--tmr-cfg`: Config file for AugustusTMR. Defaults to `augustus_cfgs/extrinsic.ETM2.cfg`.

`--augustus-cgp-cfg-template`: Config file template for AugustusCGP. Defaults to `augustus_cfgs/cgp_extrinsic_template.cfg`.

`--pb-cfg`: Config file for AugustusPB. Defaults to `augustus_cfgs/extrinsic.M.RM.PB.E.W.cfg`.
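
When CAT is launched from outside its installation folder, the relative defaults above will not resolve. The following is a minimal sketch, assuming the config file names listed above, of how the four flags could be given absolute paths. The helper name `augustus_cfg_args` and the example install path are hypothetical and not part of CAT.

```python
import os


def augustus_cfg_args(cat_install_dir):
    """Build a flat list of [flag, path, flag, path, ...] arguments.

    Hypothetical helper (not part of CAT): it only joins the documented
    default file names onto <cat_install_dir>/augustus_cfgs.
    """
    cfg_dir = os.path.join(os.path.abspath(cat_install_dir), 'augustus_cfgs')
    # Flag names and default file names are taken from the option list above.
    flag_to_file = {
        '--tm-cfg': 'extrinsic.ETM1.cfg',
        '--tmr-cfg': 'extrinsic.ETM2.cfg',
        '--augustus-cgp-cfg-template': 'cgp_extrinsic_template.cfg',
        '--pb-cfg': 'extrinsic.M.RM.PB.E.W.cfg',
    }
    args = []
    for flag, file_name in flag_to_file.items():
        args.extend([flag, os.path.join(cfg_dir, file_name)])
    return args
```

The returned list can be appended to whatever command line is used to start the pipeline; only the flags for the Augustus modes actually in use need to be passed.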


## transMap options
`--global-near-best`: Adjusts the `globalNearBest` parameter passed to `pslCDnaFilter`. Defaults to 0.15. The `globalNearBest` algorithm determines which alignments are within a certain distance of the highest-scoring alignment for a given source transcript. Making this value smaller will increase the number of alignments filtered out, decreasing the apparent paralogous alignment rate. Alignments that survive this filter are putatively paralogous.
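
To illustrate the effect of this threshold, here is a minimal sketch of a near-best filter. It assumes each alignment of a source transcript has a single numeric score and that "distance" means a fraction of the top score. It is not the `pslCDnaFilter` implementation, and the function name and score values are hypothetical.

```python
def near_best_filter(scores, near_best=0.15):
    """Keep alignments scoring within `near_best` (a fraction) of the best.

    `scores` maps alignment IDs to scores for ONE source transcript.
    Illustrative only: the real scoring used by pslCDnaFilter may differ.
    """
    if not scores:
        return {}
    best = max(scores.values())
    # A smaller near_best raises the cutoff, so fewer alignments survive.
    cutoff = best * (1.0 - near_best)
    return {aln_id: s for aln_id, s in scores.items() if s >= cutoff}
```

With scores of 100, 90 and 80, the default of 0.15 keeps the first two alignments, while a value of 0.05 keeps only the best one, which is the behaviour described above: a smaller value filters out more alignments.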

4 changes: 2 additions & 2 deletions cat/filter_transmap.py
@@ -148,10 +148,10 @@ def hash_aln(aln, aln_id):
     tools.fileOps.print_row(out_coding, tx.get_gene_pred())
 else:
     tools.fileOps.print_row(out_noncoding, tx.get_gene_pred())
-cmd = ['clusterGenes', '-cds', f'-minOverlappingBases={overlapping_ignore_bases}',
+cmd = ['clusterGenes', '-cds', f'-ignoreBases={overlapping_ignore_bases}',
        coding_tmp, 'no', coding_clusters]
 tools.procOps.run_proc(cmd)
-cmd = ['clusterGenes', f'-minOverlappingBases={overlapping_ignore_bases}',
+cmd = ['clusterGenes', f'-ignoreBases={overlapping_ignore_bases}',
        noncoding_tmp, 'no', noncoding_clusters]
 tools.procOps.run_proc(cmd)
 coding_clustered = pd.read_csv(coding_tmp, sep='\t')
