Predict protein-protein interaction interfaces using deep learning.

`rf2t-micro` provides an implementation of RoseTTAFold-2track with streamlined installation, a command-line interface, a Python API, and resilience to out-of-memory errors. It takes (paired) multiple-sequence alignments in A3M format (as generated by tools like `hhblits`) as input, and outputs a matrix of inter-residue contacts.
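For reference, a paired A3M is FASTA-like: the first record is the query, formed by concatenating the chain A and chain B sequences, and each subsequent record is a pair of homologs joined the same way, with `-` marking deletions and lowercase letters marking insertions relative to the query. A minimal, purely illustrative example (here chain A is the first 10 residues, so you would pass `--chain-a-length 10`):

>query_AB
MKVLATGGDEMRIEELKKA
>paired_homolog_1
MRVLSkTGGEE-RLDEIKRA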
Obtaining and setting up `rf2t-micro` is easy:
$ pip install git+https://github.com/scbirlab/rf2t-micro
Using the embedded model requires the RoseTTAFold-2track weights. These are downloaded automatically, but by using `rf2t-micro` you agree that the trained weights for RoseTTAFold are made available for non-commercial use only under the terms of the Rosetta-DL Software license.
You can always get more help by running
$ rf2t-micro run --help
Once you have your (paired) MSA file in A3M format, you can run
$ rf2t-micro run msa-file.a3m --chain-a-length 224
You must specify the length of the first protein of the pair using the `--chain-a-length` option. If you haven't run the model before, the weights will be downloaded automatically, which may take a few minutes.
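If you're unsure of the first chain's length, you can count the residues in its unaligned FASTA file, for example (assuming `chain-a.fasta` is a placeholder file holding only the chain A sequence):

$ grep -v '^>' chain-a.fasta | tr -d '\n' | wc -c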
By default, `rf2t-micro` tries to use GPU acceleration where available. To force CPU-only execution, use the `--cpu` option.
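For example, to run the same prediction on CPU only:

$ rf2t-micro run msa-file.a3m --chain-a-length 224 --cpu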
You can ask for visualisations of the residue interaction matrix by specifying `--plot output-directory`. You can use your own RF-2track parameters with the `--params param-file.npz` option.
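These options can be combined in a single call; for example (`plots/` and `my-params.npz` are placeholder names):

$ rf2t-micro run msa-file.a3m --chain-a-length 224 --plot plots/ --params my-params.npz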
The `rf2t-micro` package was made primarily to be a lightweight dependency of `yunta`, a portable, `pip`-installable protein-protein interaction screening tool, which in turn can be used in our `nf-ggi` Nextflow pipeline so that HPC clusters can conveniently be used to screen for protein-protein interactions at scale.
`yunta` provides a command-line interface and Python API for predicting protein-protein interactions using GPU-accelerated direct coupling analysis (DCA) and RoseTTAFold-2track, and for predicting structures using AlphaFold2 from unpaired MSAs.
The code in `network/performer_pytorch.py` is strongly based on this repo, a PyTorch implementation of the Performer architecture. The code in `network/equivariant_attention` is from the original SE(3)-Transformer repo, which accompanies the paper 'SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks' by Fuchs et al.
M. Baek et al., Accurate prediction of protein structures and interactions using a three-track neural network, Science (2021). link
I.R. Humphreys, J. Pei, M. Baek, A. Krishnakumar, et al., Computed structures of core eukaryotic protein complexes, Science (2021). link