This repository implements the method developed in Latent Multimodal Functional Graphical Model Estimation.
This form documents the artifacts associated with the article (i.e., the data and code supporting the computational findings) and describes how to reproduce the findings.
- This paper does not involve analysis of external data (i.e., no data are used or the only data are generated by the authors via simulation in their code).
- I certify that the author(s) of the manuscript have legitimate access to and permission to use the data used in this manuscript.
The dataset consists of simulated data and real EEG-fMRI data. The simulated data, along with the data generation code, are available. We provide generation code for four types of simulated graphs under two different noise models, as detailed in Appendix K. We will provide a download link for simulated datasets covering these four graph types with sample size 100 and dimensions 50, 100, and 150. The real data are available upon request: since we do not own the original dataset of concurrent EEG-fMRI measurements, we kindly ask that requests be sent to the original authors referenced in the manuscript.
- Data are publicly available.
- Data cannot be made publicly available.
If the data are publicly available, see the Publicly available data section. Otherwise, see the Non-publicly available data section, below.
- Data are available online at: here
- Data are available as part of the paper’s supplementary material.
- Data are publicly available by request, following the process described here:
- Data are or will be made available through some other mechanism, described here:
The data we use are originally from Morillon et al. (2010). We have contacted the data owner, Anne-Lise Giraud, about data sharing, and she has agreed to share the data with individuals who request it. Please contact Anne-Lise Giraud (email: anne-lise.giraud-mamessier@pasteur.fr) to request the data. The simulated data are available at the link above. Part of the simulated data, namely the data used for the sample complexity experiments in Section 7.2 of the manuscript, is not available online because it is too large (~1 TB); however, the data generation code is provided so practitioners can generate the data on their own.
Reference
Morillon, B., Lehongre, K., Frackowiak, R. S., Ducorps, A., Kleinschmidt, A., Poeppel, D., & Giraud, A. L. (2010). Neurophysiological origin of human brain asymmetry for speech and language. Proceedings of the National Academy of Sciences, 107(43), 18688-18693.
- CSV or other plain text.
- Software-specific binary format (.Rda, Python pickle, etc.): pickle
- Standardized binary format (e.g., netCDF, HDF5, etc.):
- Other (please specify):
- Provided by authors in the following file(s): `data/README.md`
- Data file(s) is(are) self-describing (e.g., netCDF files)
- Available at the following URL:
The code contains source files and testing files. We briefly outline the content of each directory; see also the layout sketch below. The `code/synth_data` directory contains files to generate synthetic data and script files to generate batches of synthetic data. The `code/src` directory contains all the source code. We do not provide the code for the other methods we compare against, as we do not own it. Under the directory `code/tests`, the `notebook` directory contains the step-by-step instructions and visualization code, and the `script` folder contains the execution scripts. The `code/experiments` directory contains the data preprocessing and graph estimation code for the real data.
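For orientation, the repository layout described above can be summarized as follows (only the directories and files mentioned in this document are shown):

```
code/
├── synth_data/     # synthetic data generation and batch-generation scripts
├── src/            # source code of the proposed method
├── tests/
│   ├── notebook/   # step-by-step instructions and visualization notebooks
│   └── script/     # execution scripts
└── experiments/    # preprocessing and graph estimation for the real data
data/
└── README.md       # data documentation
```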
- Script files
- R
- Python
- Matlab
- Other:
- Package
- R
- Python
- MATLAB toolbox
- Other:
- Reproducible report
- R Markdown
- Jupyter notebook
- Other:
- Shell script
- Other (please specify):
R version 3.6.0; Python version 3.7.3
- R-packages
- wordspace_0.2-6
- fields_12.5
- viridis_0.6.1
- viridisLite_0.4.0
- spam_2.7-0
- dotCall64_1.0-1
- plotly_4.10.0
- ggplot2_3.3.5
- pracma_2.3.3
- R.matlab_3.6.2
- far_0.6-5
- nlme_3.1-139
- matrixcalc_1.0-5
- poweRlaw_0.70.6
- fgm_1.0
- mvtnorm_1.1-2
- fda_5.4.0
- deSolve_1.30
- fds_1.8
- RCurl_1.98-1.5
- rainbow_3.6
- pcaPP_1.9-74
- MASS_7.3-51.3
- Matrix_1.2-17
- RSpectra_0.16-0
- doParallel_1.0.16
- iterators_1.0.10
- foreach_1.4.4
- Python packages
- numpy_1.19.1
- scipy_1.5.2
- pathos_0.2.8
- matplotlib_3.4.3
- multiprocessing_0.70.12.2
- nilearn_0.9.0
- rpy2_2.9.4
Platform: x86_64-conda_cos6-linux-gnu (64-bit); running under CentOS Linux 7 (Core).
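A minimal sketch of one way to set up this environment with conda; the environment name `lmfgm` is a placeholder, only a subset of the packages listed above is shown, and exact builds may differ from the authors' setup:

```bash
# Create an environment with the R and Python versions listed above.
conda create -n lmfgm python=3.7 r-base=3.6
conda activate lmfgm

# Python dependencies (versions as listed above).
pip install numpy==1.19.1 scipy==1.5.2 pathos==0.2.8 matplotlib==3.4.3 \
    nilearn==0.9.0 rpy2==2.9.4

# A subset of the R dependencies; install the remaining packages analogously.
Rscript -e 'install.packages(c("fda", "fgm", "mvtnorm", "doParallel", "foreach",
                               "RSpectra", "R.matlab", "matrixcalc", "poweRlaw"),
                             repos = "https://cloud.r-project.org")'
```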
All experiments were run on a cluster; no GPUs are required. Any single run of an experiment can be executed on a standalone desktop. However, if practitioners want to vary different parameters to generate ROC curves, using a cluster is highly recommended.
- No parallel code used
- Multi-core parallelization on a single machine/node
- Number of cores used: 8-160
- Multi-machine/multi-node parallelization
- Number of nodes and cores used:
- MIT License (default)
- BSD
- GPL v3.0
- Creative Commons
- Other: (please specify)
The provided workflow reproduces:
- Any numbers provided in text in the paper
- The computational method(s) presented in the paper (i.e., code is provided that implements the method(s))
- All tables and figures in the paper
- Selected tables and figures in the paper, as explained and justified below:
The workflow is available:
- As part of the paper’s supplementary material.
- In this Git repository: The Git repository will be made public if the paper is accepted. For now, we include its contents under the directory `code/`.
- Other (please specify):
- Single master code file
- Wrapper (shell) script(s)
- Self-contained R Markdown file, Jupyter notebook, or other literate programming approach
- Text file (e.g., a readme-style file) that documents workflow
- Makefile
- Other (more detail in Instructions below)
Each simulated experiment consists of three steps: (i) generate the simulated data, (ii) run the proposed algorithm (with variable selection), and (iii) visualize the results. The code for the first step is under the directory `code/synth_data`. The files for the second step are under `code/tests`; the proposed algorithm can also be run in batch via the script files in `code/tests/script`. The tools to visualize the results are under `code/tests/notebook`. Each directory also contains a `README.md` file with more detailed instructions. A minimal command-line sketch of one full run is given below.
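As an illustration, the following shell commands sketch one full run for noise model 2 with sample size 100, using the scripts and notebooks referenced in this document; the scripts must first be edited to point to your own conda environment and paths:

```bash
# (i) Generate a batch of synthetic data for noise model 2; alternatively,
#     download the data and store it under data_batch_N2 (see below).
bash ./code/synth_data/run_dgp_N2.sh

# (ii) Run the proposed algorithm in batch. Edit each script beforehand to
#      set the conda environment, data path, and save path.
for s in ./code/tests/script/noise_model_2/*N100.sh; do
    bash "$s"
done

# (iii) Visualize the results using the notebooks under code/tests/notebook/,
#       e.g. the comparison plots.
jupyter notebook ./code/tests/notebook/
```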
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_N2.sh`
  - Data download link. Store the files under `data_batch_N2`.
- Estimation
  - Script file: `./code/tests/script/noise_model_2/*N100.sh`. Modify the script file to specify the conda environment, file path, and save path.
- Visualization
  - Result download link. The directory `/proposed` contains the results of the proposed method; the directory `/comparison/` contains the results of the other comparison methods.
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_sample.sh` (a command-line sketch for regenerating these data is given after this list)
- Estimation
  - Script files: `./code/tests/script/sample_sample/`. Modify the script files to specify the conda environment, file path, and save path.
- Visualization
  - Result download link. Please download the directories `./p50`, `./p100`, and `./p150`.
  - Visualization notebook: `/code/notebook/plot_SampleComplexity.ipynb`
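Because the sample-complexity data are not hosted online (~1 TB; see the data availability section above), they are intended to be regenerated locally. A minimal sketch, assuming the estimation scripts in this directory are `.sh` files like the others:

```bash
# Regenerate the sample-complexity data locally; make sure sufficient disk
# space (on the order of 1 TB) is available.
bash ./code/synth_data/run_dgp_sample.sh

# Run the estimation scripts in batch; edit each script first to set the
# conda environment, data path, and save path.
for s in ./code/tests/script/sample_sample/*.sh; do
    bash "$s"
done
```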
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_N1.sh`
  - Data download link. Store the files under `data_batch_N1`.
- Estimation
  - Script file: `./code/tests/script/noise_model_1/*N100.sh`. Modify the script file to specify the conda environment, file path, and save path.
- Visualization
  - Result download link. The directory `/proposed` contains the results of the proposed method; the directory `/comparison/` contains the results of the other comparison methods.
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_N2.sh`
  - Data download link. Store the files under `data_batch_N2`.
- Estimation
  - Script files: `./code/tests/script/noise_model_2/`. Modify the script files to specify the conda environment, file path, and save path.
- Visualization
  - Result download link. The directory `/proposed` contains the results of the proposed method.
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_N1.sh`
  - Data download link. Store the files under `data_batch_N1`.
- Estimation
  - Script files: `./code/tests/script/noise_model_1/`. Modify the script files to specify the conda environment, file path, and save path.
- Visualization
  - Result download link. The directory `/proposed` contains the results of the proposed method.
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
- Data preparation
  - Data generation script: `/code/synth_data/run_dgp_k.sh`
  - Data download link
- Estimation
  - Script files: `./code/tests/script/sample_k/`
- Visualization
  - Result download link
  - Visualization notebook: `/code/notebook/plot_SampleComplexity_2.ipynb`
- Estimation
  - Run `/code/tests/notebook/plot_elbo.ipynb` and save the result.
- Visualization
  - Visualization notebook: `/code/tests/notebook/plot_elbo2.ipynb`
- Estimation
  - Run `/code/tests/notebook/plot_elbo.ipynb` and save the result.
- Visualization
  - Visualization notebook: `/code/tests/notebook/plot_elbo2.ipynb`
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_N2.sh`
  - Data download link. Store the files under `data_batch_N2`.
- Estimation
  - Script files: `./code/tests/script/noise_model_2/`. Modify the script files to specify the conda environment, file path, and save path.
- Visualization
  - Result download link
  - Visualization notebook: `/code/tests/notebook/plot_VariableSelection.ipynb`
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_sample.sh`
- Estimation
  - Script files: `./code/tests/script/sample_alpha/`
- Visualization
  - Result download link
  - Visualization notebook: `/code/notebook/plot_SampleComplexity_2.ipynb`
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_kmk.sh`
  - Data download link
- Estimation
  - Script files: `./code/tests/script/noise_model1_varykmk/`
- Visualization
  - Result download link
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
- Data preparation
  - Data generation script: `./code/synth_data/run_dgp_kmk.sh`
  - Data download link
- Estimation
  - Script files: `./code/tests/script/noise_model1_varykmk/`
- Visualization
  - Result download link
  - Visualization notebook: `/code/notebook/plot_Comparison.ipynb`
  - Instruction to generate table: to print the AUC and AUC15, set `verbose=True`.
Approximate time needed to reproduce the analyses on a standard desktop machine:
- < 1 minute
- 1-10 minutes
- 10-60 minutes
- 1-8 hours
- > 8 hours
- Not feasible to run on a desktop machine, as described here: It is safest to run on a cluster, as the original tests are implemented with parallelization. One can modify the number of cores in the test files to make them suitable for a desktop machine.
We provide a demo example that can be run on a standard desktop. Please see `/code/README.md` for further instructions.