conda env create --file=environment.yml
Download the drug response data (IC50 values), named PANCANCER_IC, from GDSC, and the gene expression data, named CCLE_expression, from CCLE (listed under mRNA expression).
- Create a folder called root_folder in your project directory.
mkdir root_folder
- Place the PANCANCER_IC data under data/GDSC and the CCLE_expression data under data/CCLE.
- Run the following command to preprocess the data. The preprocessed data will be saved under root_folder/<branch_num>.
python load_data.py <branch_num>
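Before running the preprocessing step, it can help to confirm that the folders described above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and it only checks that the directories exist, since the exact file names and extensions of the downloads are not specified here.

```python
from pathlib import Path

# Directory names are taken from the setup steps above.
# The helper itself is hypothetical and not shipped with the repository.
EXPECTED_DIRS = ("data/GDSC", "data/CCLE", "root_folder")


def missing_dirs(project_dir="."):
    """Return the expected directories that do not exist under project_dir."""
    base = Path(project_dir)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks complete; ready to run load_data.py.")
```

Run it from the project directory; an empty result means all three folders were found.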
python train.py \
    --model <model_num> \
    --branch <branch_num> \
    --do_cv \
    --do_attn
- Available models: 0:GCN, 1:GAT, 2:GAT_Edge, 3:GATv2, 4:SAGE, 5:GIN, 6:GINE, 7:WIRGAT, 8:ARGAT, 9:RGCN, 10:FiLM
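Since models are selected by number, a small lookup can make scripted runs less error-prone. The sketch below is illustrative only: `train_command` is a hypothetical helper, the mapping is copied from the list above, and treating `--do_cv` and `--do_attn` as optional switches is an assumption based on how they appear in the training command.

```python
# Model numbers as listed above.
MODELS = {
    0: "GCN", 1: "GAT", 2: "GAT_Edge", 3: "GATv2", 4: "SAGE", 5: "GIN",
    6: "GINE", 7: "WIRGAT", 8: "ARGAT", 9: "RGCN", 10: "FiLM",
}
MODEL_NUMBERS = {name: num for num, name in MODELS.items()}


def train_command(model_name, branch_num, do_cv=True, do_attn=True):
    """Build the train.py argument list for a model given by name.

    Hypothetical helper; assumes --do_cv and --do_attn are optional switches.
    """
    if model_name not in MODEL_NUMBERS:
        raise ValueError(f"Unknown model: {model_name!r}")
    cmd = ["python", "train.py",
           "--model", str(MODEL_NUMBERS[model_name]),
           "--branch", str(branch_num)]
    if do_cv:
        cmd.append("--do_cv")
    if do_attn:
        cmd.append("--do_attn")
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(train_command("GATv2", 1)))
```

The returned list can be passed to `subprocess.run` if you want to launch training from a script.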
python gnnexplainer.py \
    --model <model_num> \
    --branch <branch_num> \
    --do_attn \
    --explain_type <type>
python draw_gnnexplainer.py \
    --model <model_num> \
    --branch <branch_num> \
    --explain_type <type> \
    --annotation <type>
- Available explaining types: 0:model, 1:phenomenon
- Available annotation types: 0:numbers, 1:heatmap, 2:both, 3:functional group-level heatmap
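The explaining and annotation types are also given as numbers, so validating them before launching a plot can catch typos early. This is a rough sketch under assumptions: `draw_command` is a hypothetical helper, and it assumes the scripts expect the numeric index (for example `--explain_type 1` for phenomenon), as the numbered lists above suggest.

```python
# Type numbers as listed above.
EXPLAIN_TYPES = {0: "model", 1: "phenomenon"}
ANNOTATION_TYPES = {
    0: "numbers",
    1: "heatmap",
    2: "both",
    3: "functional group-level heatmap",
}


def draw_command(model_num, branch_num, explain_type, annotation):
    """Build the draw_gnnexplainer.py argument list after checking the types.

    Hypothetical helper; assumes the flags take the numeric index.
    """
    if explain_type not in EXPLAIN_TYPES:
        raise ValueError(f"explain_type must be one of {sorted(EXPLAIN_TYPES)}")
    if annotation not in ANNOTATION_TYPES:
        raise ValueError(f"annotation must be one of {sorted(ANNOTATION_TYPES)}")
    return ["python", "draw_gnnexplainer.py",
            "--model", str(model_num),
            "--branch", str(branch_num),
            "--explain_type", str(explain_type),
            "--annotation", str(annotation)]


if __name__ == "__main__":
    # Example: GAT (model 1), branch 1, phenomenon explanations, heatmap annotation.
    print(" ".join(draw_command(1, 1, explain_type=1, annotation=1)))
```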
python integrated_gradients.py \
    --model <model_num> \
    --branch <branch_num> \
    --do_attn \
    --iqr_baseline
- Download the gene sets from MSigDB and place them under data/.
- Refer to pathway_analysis.ipynb for the pathway analysis experiments based on the gene saliency scores.