🌹 RoseTTAFold-2track micro

Predict protein-protein interaction interfaces using deep learning.

rf2t-micro wraps an implementation of RoseTTAFold-2track with streamlined installation, a command-line interface, a Python API, and resilience to out-of-memory errors. It takes (paired) multiple-sequence alignments in A3M format (as generated by tools like hhblits) as input, and outputs a matrix of inter-residue contacts.
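
As a rough illustration of the input format, the sketch below reads A3M text in plain Python. It assumes the usual A3M convention that lowercase letters mark insertions relative to the query and that '-' marks a gap. The helper names read_a3m and match_columns and the toy sequences are hypothetical; they are not part of rf2t-micro.

```python
def read_a3m(text):
    """Parse A3M text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines before the first header are ignored.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def match_columns(sequence):
    """Drop lowercase insertion states, keeping only aligned columns."""
    return "".join(ch for ch in sequence if not ch.islower())


# Toy alignment, made up for illustration only.
example = """>query
MKTAYIA
>hit1
MKtaTAY-A
>hit2
-KTAYIA
"""

records = read_a3m(example)
aligned = [match_columns(seq) for _, seq in records]

# After removing insertions, every row should span the same columns.
assert len({len(row) for row in aligned}) == 1
print(len(records), "sequences,", len(aligned[0]), "aligned columns")
```

A check like this can catch a truncated or malformed alignment before any model weights are loaded.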

Installation

Install rf2t-micro directly from GitHub using pip:

$ pip install git+https://github.com/scbirlab/rf2t-micro

The embedded model requires the RoseTTAFold-2track weights. These are downloaded automatically, but by using rf2t-micro you agree that the trained weights for RoseTTAFold are made available for non-commercial use only, under the terms of the Rosetta-DL Software license.

Usage

You can always get more help by running:

$ rf2t-micro run --help

Once you have your (paired) MSA file in A3M format, you can run:

$ rf2t-micro run msa-file.a3m --chain-a-length 224

You must specify the length of the first protein in the pair using the --chain-a-length option. If you haven't run the model before, the weights will be downloaded automatically, which may take a few minutes.
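
One way to obtain the value for --chain-a-length is to count the residues in the first protein's own sequence. The sketch below assumes that the query row of the paired MSA is chain A followed directly by chain B; the sequences and the helper name chain_lengths are made up for illustration and are not part of rf2t-micro.

```python
def chain_lengths(chain_a_seq, paired_query_seq):
    """Return (length of chain A, length of chain B) for a paired query row.

    Assumes the paired query is chain A immediately followed by chain B.
    """
    if not paired_query_seq.startswith(chain_a_seq):
        raise ValueError("paired query does not start with chain A")
    len_a = len(chain_a_seq)
    len_b = len(paired_query_seq) - len_a
    if len_b <= 0:
        raise ValueError("no residues left for chain B")
    return len_a, len_b


# Hypothetical sequences, for illustration only.
chain_a = "MKTAYIAKQR"
chain_b = "GSHMLEDP"

len_a, len_b = chain_lengths(chain_a, chain_a + chain_b)
print(f"--chain-a-length {len_a}")
```

Passing a length that does not match the actual boundary between the two proteins would shift every residue index in the second chain, so a sanity check like this is cheap insurance.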

By default, rf2t-micro tries to use GPU acceleration where available. To force CPU-only execution, use the --cpu option.

You can ask for visualisations of the residue interaction matrix by specifying --plot output-directory. You can use your own RF-2track parameters with the --params param-file.npz option.
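
When scripting many runs, it can help to assemble the command line programmatically. The sketch below uses only the options described above (--chain-a-length, --cpu, --plot, --params); the function build_command is a hypothetical helper, and the order in which it places the options is an assumption rather than a documented requirement.

```python
import shlex


def build_command(msa_file, chain_a_length, cpu=False, plot_dir=None, params_file=None):
    """Assemble an rf2t-micro command line as a list of arguments."""
    cmd = ["rf2t-micro", "run", str(msa_file),
           "--chain-a-length", str(chain_a_length)]
    if cpu:
        cmd.append("--cpu")  # force CPU-only execution
    if plot_dir is not None:
        cmd += ["--plot", str(plot_dir)]  # write visualisations here
    if params_file is not None:
        cmd += ["--params", str(params_file)]  # custom RF-2track parameters
    return cmd


cmd = build_command("msa-file.a3m", 224, cpu=True, plot_dir="output-directory")
print(shlex.join(cmd))

# To execute the command (requires rf2t-micro to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids shell-quoting problems when file names contain spaces.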

... if you want larger scale and unpaired MSAs

The rf2t-micro package was built primarily as a lightweight dependency of yunta, a portable, pip-installable tool for protein-protein interaction screening. yunta can in turn be used in our nf-ggi Nextflow pipeline, which makes it convenient to screen for protein-protein interactions at scale on HPC clusters.

yunta provides a command-line interface and Python API for predicting protein-protein interactions using GPU-accelerated direct coupling analysis (DCA) and RoseTTAFold-2track, and structures using AlphaFold2, from unpaired MSAs.

Credit to performer-pytorch and SE(3)-Transformer codes

The code in network/performer_pytorch.py is closely based on this repo, which is a PyTorch implementation of the Performer architecture. The code in network/equivariant_attention comes from the original SE(3)-Transformer repo, which accompanies the paper 'SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks' by Fuchs et al.

References

M. Baek, et al., Accurate prediction of protein structures and interactions using a three-track neural network, Science (2021). link

I.R. Humphreys, J. Pei, M. Baek, A. Krishnakumar, et al., Computed structures of core eukaryotic protein complexes, Science (2021). link