Huang-lab/MAVEN

MAVEN v1.0.0

MAVEN is a machine learning model trained to predict variant effects on protein stability and function. It is trained on labels derived from Multiplexed Assays of Variant Effect (MAVE), together with evolutionary, structural, and amino acid physicochemical features.



Directory

training_pipeline/                 # Directory containing MAVEN training scripts
|--  train_maven.py                     # Main pipeline for training MAVEN
|--  train_maven_support.py             # Functions called by train_maven.py
training_data/                 # Directory containing training datasets and files
|--  categorical_features.txt         # List of all categorical features
|--  maven_all_features_order.txt     # Order of features MAVEN takes as input
|--  maven_y_data.csv                 # Binary MAVE-guided labels for different cutoff thresholds for all variants
|--  metadata.csv                     # Metadata for all MAVE datasets 
|--  numeric_features.txt             # List of all numeric features
|--  X_testing_data_all.csv           # Testing data with features
|--  X_training_data_all.csv          # Training data with features
|--  y_testing_data_all.csv           # Testing data labels
|--  y_training_data_all.csv          # Training data labels
maven_v1.0.0/                 # Directory storing data, results, figures, and models produced during MAVEN training
|--  data/...                     
|--  figures/...             
|--  model/...            
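
The three feature list files in training_data/ can be cross-checked against each other before training. The following is a minimal sketch, assuming each .txt file holds one feature name per line (the file format is not documented here). The helpers `read_feature_list` and `check_feature_lists` are hypothetical names used for illustration and are not part of the MAVEN code.

```python
from pathlib import Path


def read_feature_list(path):
    """Read one feature name per line, skipping blank lines (assumed format)."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_feature_lists(order, categorical, numeric):
    """Report inconsistencies between the three feature lists.

    order       -- names from maven_all_features_order.txt
    categorical -- names from categorical_features.txt
    numeric     -- names from numeric_features.txt
    """
    typed = set(categorical) | set(numeric)
    return {
        # features in the input order that have no declared type
        "untyped": sorted(set(order) - typed),
        # typed features that never appear in the input order
        "unused": sorted(typed - set(order)),
        # features declared as both categorical and numeric
        "both": sorted(set(categorical) & set(numeric)),
    }


if __name__ == "__main__":
    # Assumption: the script is run from the repository root.
    data_dir = Path("training_data")
    names = [
        "maven_all_features_order.txt",
        "categorical_features.txt",
        "numeric_features.txt",
    ]
    if all((data_dir / n).is_file() for n in names):
        order, categorical, numeric = (read_feature_list(data_dir / n) for n in names)
        print(check_feature_lists(order, categorical, numeric))
    else:
        print("Feature list files not found; nothing to check.")
```

If all three lists in the returned dictionary are empty, every input feature has exactly one declared type. Whether MAVEN requires this property is an assumption of the sketch, not something the repository states.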



Installation

1. Clone the MAVEN repository

git clone https://github.com/Huang-lab/MAVEN.git

cd MAVEN/

2. Create and activate a Python virtual environment, then install dependencies

python3 -m venv venv_maven

source venv_maven/bin/activate

pip install -r requirements.txt

3. Download training datasets

mkdir training_data/

*** Training data will be available at a later date ***



Usage

Training MAVEN

cd training_pipeline/

python3 train_maven.py
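
Because the training data is distributed separately from the code, it can help to confirm that the files are in place before starting a training run. The sketch below is a hypothetical preflight helper, not part of MAVEN. The file names come from the Directory listing above; which of them train_maven.py actually reads, and the assumption that it looks for them in ../training_data relative to training_pipeline/, are not confirmed by this README.

```python
from pathlib import Path

# File names taken from the Directory listing in this README.
EXPECTED_FILES = [
    "categorical_features.txt",
    "maven_all_features_order.txt",
    "maven_y_data.csv",
    "metadata.csv",
    "numeric_features.txt",
    "X_testing_data_all.csv",
    "X_training_data_all.csv",
    "y_testing_data_all.csv",
    "y_training_data_all.csv",
]


def missing_training_files(data_dir):
    """Return the expected file names that are absent from data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: run from training_pipeline/, with the data one level up.
    missing = missing_training_files(Path("..") / "training_data")
    if missing:
        print("Missing training files:", ", ".join(missing))
    else:
        print("All expected training files are present.")
```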



Disclaimer

The MAVEN tool is designed for research purposes only and is not intended for medical use. MAVEN outputs and models should not be directly used to make medical decisions, provide medical advice, or substitute for professional healthcare consultation. The predictions generated by MAVEN are based on current scientific data and methodologies, which may evolve over time. Users are advised to consult with qualified healthcare professionals for any medical concerns or decisions.
