NoLabs

Open source biolab

About

NoLabs is an open source biolab that lets you run experiments with the latest state-of-the-art models, bioinformatics tools and a scalable no-code workflow engine for bio research.

Features

Workflow engine

Create workflows combining different models and data
Schedule jobs and observe results for big data processing
Adjust input parameters for particular jobs

Bio Buddy - lab assistant agent

BioBuddy is a drug discovery copilot that supports:

Downloading data from ChEMBL
Downloading data from RCSB PDB
Answering questions about the drug discovery process, targets, chemical components, etc.
Writing review reports based on published papers
Creating a workflow schema for you

For example, you can ask

"Can you pull some latest approved drugs?"
"Can you download 1000 rhodopsins?"
"What does an aspirin molecule look like?"

BioBuddy will carry out the request and answer other questions as well.

Starting

⚠️ Warning: On macOS, there are known issues with running Docker Compose properly for certain setups. Please use Colima: https://github.com/abiosoft/colima

# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs
# Create .env files (you will be able to adjust them)
$ chmod +x scripts/gen-envs.sh
$ make gen-envs

OR if you use Windows (untested!)

# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs
# Create .env files (you will be able to adjust them)
$ Makefile.bat gen-envs

Generate a new token for the Docker registry at https://github.com/settings/tokens/new?scopes=read:packages and select the 'read:packages' scope (it should be preselected when you follow the link). Then log in:

$ docker login ghcr.io -u username -p ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If you want to run a single feature (recommended)

$ docker compose up nolabs-frontend nolabs-api nolabs-worker mongo redis
# mongo, redis and worker are required
$ docker compose up esmfold-light
$ docker compose up diffdock
...

OR if you want to run everything on one machine:

$ docker compose up

The server will be available at http://localhost:9000

These commands will start

NoLabs frontend (application UI server)
NoLabs worker (fault-tolerant distributed workflow scheduler)
NoLabs API backend (API backend for UI)

Since the workflow engine is backed by Celery and Redis, the Redis instance must be accessible from all workers!
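Before starting a worker on another machine, it can help to confirm that the Redis port is reachable from that machine. Here is a minimal sketch using only the Python standard library. The host and port are placeholders (assumptions), not values taken from the NoLabs configuration, and the check only confirms that a TCP connection opens, not that Redis accepts your credentials.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder address: replace with the host and port of your Redis deployment.
    print(can_connect("127.0.0.1", 6379))
```

Run it on each worker machine; a `False` result usually points to a firewall rule or a Redis instance bound only to localhost.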

Scalability

The workflow engine is scalable thanks to Celery and Redis, so you can run multiple instances of, say, DiffDock, and jobs will be distributed automatically across multiple machines. By default, each worker processes one job at a time.
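The "one job at a time per worker" behaviour can be pictured with a toy model. The sketch below is not NoLabs or Celery code: it uses Python's standard `queue` and `threading` modules, and the worker names and job IDs are made up for illustration. Each simulated worker takes a single job from a shared queue, finishes it, and only then asks for the next one.

```python
import queue
import threading


def run_jobs(job_ids, worker_names):
    """Hand out jobs from one shared queue; each worker holds one job at a time."""
    jobs = queue.Queue()
    for job_id in job_ids:
        jobs.put(job_id)

    handled = {name: [] for name in worker_names}

    def worker(name):
        while True:
            try:
                job_id = jobs.get_nowait()  # take exactly one job
            except queue.Empty:
                return  # no work left, so this worker stops
            handled[name].append(job_id)  # stand-in for docking or folding work

    threads = [threading.Thread(target=worker, args=(n,)) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    # Hypothetical worker names; in NoLabs these would be separate containers.
    for name, done in run_jobs(range(6), ["diffdock-1", "diffdock-2"]).items():
        print(name, done)
```

Every job is handled exactly once, and which worker picks up which job depends on timing, so the printed split can vary between runs.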

Settings

You can find .env files in nolabs/infrastructure/.env and under the microservices folder. Adjust the settings as needed. To regenerate the .env files from the templates, run

make gen-envs
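To review which settings a generated file defines, a small reader like the one below may be handy. It is a simplified sketch that assumes plain KEY=VALUE lines; it does not handle `export` prefixes or inline comments, and the `read_env` helper name is hypothetical rather than part of NoLabs.

```python
from pathlib import Path


def read_env(path):
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    settings = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


if __name__ == "__main__":
    env_path = Path("nolabs/infrastructure/.env")  # path named in this section
    if env_path.exists():
        for key in sorted(read_env(env_path)):
            print(key)  # print names only, so secrets stay out of the terminal
    else:
        print(f"{env_path} not found; run `make gen-envs` first")
```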

APIs

We provide individual Docker containers backed by FastAPI for each feature, which are available in the /microservices folder. You can use them individually as APIs.

For example, to run the esmfold service, you can use Docker Compose:

$ docker compose up esmfold-api

Once the service is up, you can make a POST request to perform a task, such as predicting a protein's folded structure. Here's a simple Python example:

import requests

# Define the API endpoint
url = 'http://127.0.0.1:5736/inference'

# Specify the protein sequence in the request body
# (empty here; supply your own amino acid sequence)
data = {
    'fasta_sequence': ''
}

# Make the POST request and fail early on an HTTP error status
response = requests.post(url, json=data)
response.raise_for_status()

# Extract the PDB content from the response
pdb_content = response.json().get('pdb_content', '')

print(pdb_content)

This Python script makes a POST request to the esmfold microservice with a protein sequence and prints the predicted PDB content.
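If you would rather not depend on `requests`, the same call can be sketched with the standard library. The endpoint, the `fasta_sequence` field and the `pdb_content` key come from the example above; the helper names and the timeout value are illustrative assumptions.

```python
import json
import urllib.request

ESMFOLD_URL = "http://127.0.0.1:5736/inference"  # endpoint from the example above


def build_inference_request(sequence, url=ESMFOLD_URL):
    """Build (but do not send) a JSON POST request carrying the protein sequence."""
    body = json.dumps({"fasta_sequence": sequence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_pdb(sequence, timeout=600.0):
    """Send the request and return the 'pdb_content' field ('' if it is missing)."""
    request = build_inference_request(sequence)
    # The timeout is an assumed value; folding long sequences may need more.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload.get("pdb_content", "")
```

With the esmfold-api service running, `predict_pdb(your_sequence)` returns the PDB text. Connection problems and HTTP error statuses surface as `urllib.error.URLError` (or its subclass `HTTPError`), so wrap the call in a `try` block if you need to handle them.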

Running workers/API on a separate machine

Since we provide individual Docker containers backed by Celery/FastAPI for each feature, available in the microservices folder, you can run them on separate machines. This setup is particularly useful if you're developing on a computer without GPU support but have access to a VM with a GPU for tasks like folding, docking, etc.

For instance, to run the diffdock service, use Docker Compose on the VM or computer equipped with a GPU.

On the machine with a GPU, do the following:

1) Adjust the microservices/diffdock/service/.env file with REDIS_URL pointing to your Redis deployment
2) Run the following command

$ docker compose up diffdock

You should see the Celery startup log message. The service is now ready to use from a separate machine!
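A malformed REDIS_URL is an easy mistake in step 1. The sketch below splits a URL into host, port and database index with the standard library, so you can check it before starting the worker. The `redis://host:port/db` shape and the fallback defaults (port 6379, database 0) follow the common Redis URL convention and are assumptions here; check the .env template for the exact format NoLabs expects. The example address is a placeholder.

```python
from urllib.parse import urlparse


def parse_redis_url(url):
    """Split a redis:// or rediss:// URL into (host, port, db)."""
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unexpected scheme in {url!r}")
    if not parts.hostname:
        raise ValueError(f"missing host in {url!r}")
    port = parts.port or 6379  # conventional Redis port when none is given
    db_text = parts.path.lstrip("/")
    db = int(db_text) if db_text else 0  # database 0 when the path is empty
    return parts.hostname, port, db


if __name__ == "__main__":
    print(parse_redis_url("redis://10.0.0.2:6379/0"))  # placeholder address
```

A `ValueError` here means the URL should be fixed before running `docker compose up diffdock`.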

If you only want to start the API (it is not integrated into NoLabs, but you can use the service on its own):

$ docker compose up diffdock-api

Once the service is up, you can check that it is accessible from your computer by navigating to http://<gpu_machine_ip>:5737/docs

How to run workers

1) RFdiffusion (protein design)

Model: RFdiffusion

RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).

make download-rfdiffusion-weights
docker compose up rfdiffusion

2) ESMFold (folding)

Model: ESMFold - Evolutionary Scale Modeling

make download-esmfold-weights
docker compose up esmfold

3) ESMAtlas (folding)

Model: ESMAtlas

docker compose up esmfold-light

4) Diffdock (protein-ligand binding prediction)

Model: DiffDock

make download-diffdock-weights
docker compose up diffdock

5) Proteinmpnn (design fasta from pdb)

Model: Proteinmpnn

docker compose up proteinmpnn

6) Adaptyv bio Protein Affinity Characterization component

Uses the Adaptyv Bio API

To enable real data flow to Adaptyv Bio, change the ENVIRONMENT variable in nolabs/infrastructure/.env to production

You also need to set NOLABS_ADAPTYV_BIO_API_TOKEN to your actual token, which you must request from Adaptyv Bio

7) Adaptyv bio Protein Binding Screening component

Uses the Adaptyv Bio API

To enable real data flow to Adaptyv Bio, change the ENVIRONMENT variable in nolabs/infrastructure/.env to production

You also need to set NOLABS_ADAPTYV_BIO_API_TOKEN to your actual token, which you must request from Adaptyv Bio

8) Biobuddy

To enable biobuddy:

Adjust NOLABS_ENABLE_BIOBUDDY environment variable in nolabs/infrastructure/.env
Adjust OPENAI_API_KEY and TAVILY_API_KEY in microservices/biobuddy/biobuddy/.env
Start docker compose

$ docker compose up biobuddy nolabs-frontend nolabs-worker nolabs-api mongo redis
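Before starting the containers, you may want to confirm that the BioBuddy keys are actually filled in. This is a small sketch that assumes the .env file uses plain KEY=VALUE lines; the `missing_keys` helper is hypothetical, and the key names and file path are the ones listed in the steps above.

```python
from pathlib import Path

REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")  # names from the steps above


def missing_keys(env_text, required=REQUIRED_KEYS):
    """Return required keys that are absent or have an empty value in .env text."""
    found = {}
    for line in env_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            found[key.strip()] = value.strip().strip("\"'")
    return [key for key in required if not found.get(key)]


if __name__ == "__main__":
    env_file = Path("microservices/biobuddy/biobuddy/.env")
    if env_file.exists():
        print("Missing:", missing_keys(env_file.read_text(encoding="utf-8")) or "none")
    else:
        print(f"{env_file} not found; run `make gen-envs` first")
```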

NoLabs runs on GPT-4 for the best performance. You can change the model in microservices/biobuddy/biobuddy/services.py

9) Blast (using this api)

docker compose up blast-query

10) Arxiv abstracts AI search (Standalone)

This microservice contains LLM RAG search over arXiv abstracts (up to 01/12/2024). How to use this docker image:

Generate a new token for the Docker registry at https://github.com/settings/tokens/new?scopes=read:packages and select the 'read:packages' scope (it should be preselected when you follow the link).

$ docker login ghcr.io -u username -p ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx # generated token

Download the ChromaDB database for vector search

$ make gen-envs
$ make download-arxiv-abstracts-db

You must set your OpenAI API key either in microservices/arxiv_abstracts/service/.env or in the UI
Start Docker

$ docker compose -f docker-compose.api.yaml up arxiv-ai-abstractions-search-api

Wait until the FastAPI startup messages appear
You can then access the UI in your browser at http://0.0.0.0:8001/chat

Requirements

[Recommended for laptops]

RAM > 16GB
[Optional] GPU memory >= 16GB (REALLY speeds up the inference)

[Recommended for powerful workstations] Else, if you want to host everything on your machine and have faster inference (also a requirement for folding sequences > 400 amino acids in length):

RAM > 30GB
[Optional] GPU memory >= 40GB (REALLY speeds up the inference)

Made by Igor Bruev and Tim Ishmuratov
