Test and Reproducibility for Dialysis of Weak Polyelectrolytes (#36)

* Add test of G-RxMC (ideal).
* Adjust G-RxMC test
* Add test on cpH
* add test mode to speed up CI tests, get rid of numpy warnings, clean up sample scripts, change order of the CI tests
* Remove plot option from samples
* CI check
* Test and reproducibility of Landsgesell dialysis of weak PE
* Changes suggested by PMB
* remove verbose from test, add README on how to reproduce the data

---------

Co-authored-by: Pablo M. Blanco <pblanco@dama.icp.uni-stuttgart.de>
pyMBE-dev · Apr 19, 2024 · 55b78a2
1 parent 8a8f2c5
Showing 10 changed files with 1,621 additions and 309 deletions.
diff --git a/Makefile b/Makefile
@@ -10,12 +10,13 @@ docs:
 tests:
 	python3 testsuite/lj_tests.py
 	python3 testsuite/generate_perpendicular_vectors_test.py
+	python3 testsuite/create_molecule_position_test.py
 	python3 testsuite/read-write-df_test.py
 	python3 testsuite/henderson_hasselbalch_tests.py
 	python3 testsuite/cph_ideal_tests.py
 	python3 testsuite/grxmc_ideal_tests.py
 	python3 testsuite/peptide_tests.py
-	python3 testsuite/create_molecule_position_test.py
+	python3 testsuite/weak_polyelectrolyte_dialysis_test.py
 
 visual:
 	python3 handy_scripts/vmd-traj.py

diff --git a/maintainer/standarize_data.py b/maintainer/standarize_data.py
@@ -8,7 +8,8 @@
 pmb = pyMBE.pymbe_library()
 
 # Expected inputs
-supported_filenames=["Glu-HisMSDE.csv",
+supported_filenames=["data_landsgesell.csv",
+                     "Glu-HisMSDE.csv",
                      "Lys-AspMSDE.csv",
                      "histatin5_SoftMatter.txt"]
 
@@ -23,18 +24,20 @@
 filename=args.src_filename
 
 # Outputs
-output_filenames={"Lys-AspMSDE.csv": "Lunkad2021a.csv",
+output_filenames={"data_landsgesell.csv": "Landsgesell2020a.csv",
+                  "Lys-AspMSDE.csv": "Lunkad2021a.csv",
                   "Glu-HisMSDE.csv": "Lunkad2021b.csv",
                   "histatin5_SoftMatter.txt": "Blanco2020a.csv"}
 
 # Sanity checks
 if filename not in supported_filenames:
     ValueError(f"Filename {filename} not supported, supported files are {supported_filenames}")
 
-# Extact the data from Ref.
+# Extract the data from Ref.
 ref_path=pmb.get_resource(f"testsuite/data/src/{filename}")
 Refs_lunkad=["Glu-HisMSDE.csv","Lys-AspMSDE.csv"]
 Ref_blanco=["histatin5_SoftMatter.txt"]
+Ref_landsgesell=["data_landsgesell.csv"]
 
 if filename in Refs_lunkad:
     data=pd.read_csv(ref_path)
@@ -47,16 +50,22 @@
     data=np.loadtxt(ref_path, delimiter=",")
     Z_ref=data[:,1]         
     Z_ref_err=data[:,2]
+
+elif filename in Ref_landsgesell:
+    data = pd.read_csv(ref_path, sep="\t", index_col=False)
+
 else:
     raise RuntimeError()
 
-pH_range = np.linspace(2, 12, num=21)
 
-# Store the data
-data=pd.DataFrame({"pH": pH_range,
-                  "charge": Z_ref,
-                  "charge_error": Z_ref_err})
+if filename in Refs_lunkad+Ref_blanco:
+    pH_range = np.linspace(2, 12, num=21)
+
+    # Store the data
+    data=pd.DataFrame({"pH": pH_range,
+                      "charge": Z_ref,
+                      "charge_error": Z_ref_err})
 
 data_path=pmb.get_resource(f"testsuite/data")
 data.to_csv(f"{data_path}/{output_filenames[filename]}", 
-            index=False)
+            index=False)
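
One detail in the sanity check of this script: the `ValueError` is constructed but never raised, so an unsupported filename appears to be rejected only later, by the bare `raise RuntimeError()`. A minimal sketch of how the lookup could fail early with a descriptive message is given below. The helper name `resolve_output_name` is hypothetical and not part of the script; the mapping is copied from the diff above.

```python
# Mapping copied from the diff; the helper below is a hypothetical illustration.
output_filenames = {
    "data_landsgesell.csv": "Landsgesell2020a.csv",
    "Lys-AspMSDE.csv": "Lunkad2021a.csv",
    "Glu-HisMSDE.csv": "Lunkad2021b.csv",
    "histatin5_SoftMatter.txt": "Blanco2020a.csv",
}


def resolve_output_name(filename):
    """Return the standardized output name, or raise for unsupported inputs."""
    if filename not in output_filenames:
        supported = sorted(output_filenames)
        # `raise` is required: a bare ValueError(...) expression does nothing.
        raise ValueError(
            f"Filename {filename} not supported, supported files are {supported}"
        )
    return output_filenames[filename]


print(resolve_output_name("data_landsgesell.csv"))
```

Using the dictionary keys as the single list of supported inputs also avoids keeping `supported_filenames` and `output_filenames` in sync by hand.
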
diff --git a/samples/Beyer2024/README.md b/samples/Beyer2024/README.md
@@ -0,0 +1,13 @@
+The scripts in this folder are designed to reproduce the data and plots showcased in our publication [1].
+To reproduce the data, one simply needs to run the following script:
+```bash
+python3 create_paper_data.py --fig_label 7a --mode long-run --plot
+```
+where the previous line runs the script that produces Fig. 7a in Ref. [1]. The user can use the argparse argument `--fig_label` to create any of the plots that we presented in that publication as benchmarks: 7a, 7b, 7c, 8a, 8b, 9. The argparse argument `--mode` controls the statistical accuracy (i.e. the number of samples) of the measurement. The mode `long-run` should be used to generate data with the same statistical accuracy as in Ref. [1]. The mode `short-run` can be used for a shorter run, for testing or for trying out the scripts for each of our benchmarks:
+- peptide.py: for the peptide benchmarks
+- globular_protein.py: for the globular protein benchmarks
+- weak_polyelectrolyte_dialysis.py: for the weak polyelectrolyte dialysis benchmarks
+The optional argparse argument `--plot` controls whether the scripts generate the corresponding plot or simply store the data to file. We note that the format of the plots can differ from that of our publication [1]. These scripts are part of the continuous integration (CI) scheme of the pyMBE library and are used to ensure that any stable version of the library reproduces the benchmarks.
+
+
+[1] Beyer, D., Torres, P. B., Pineda, S. P., Narambuena, C. F., Grad, J. N., Košovan, P., & Blanco, P. M. (2024). pyMBE: the Python-based Molecule Builder for ESPResSo. arXiv preprint [arXiv:2401.14954](https://arxiv.org/abs/2401.14954).
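
The command-line interface described in the README can be sketched as follows. This is a minimal sketch, not the actual `create_paper_data.py`. Only the flag names `--fig_label`, `--mode` and `--plot` and their listed values come from the README. The `build_parser` helper, the help strings and the default mode are assumptions made for illustration.

```python
import argparse

# Values taken from the README; everything else here is an assumption.
FIG_LABELS = ["7a", "7b", "7c", "8a", "8b", "9"]
MODES = ["short-run", "long-run"]


def build_parser():
    """Hypothetical parser mirroring the flags documented in the README."""
    parser = argparse.ArgumentParser(
        description="Reproduce benchmark data (sketch)."
    )
    parser.add_argument("--fig_label", type=str, required=True,
                        choices=FIG_LABELS,
                        help="Label of the figure to reproduce")
    parser.add_argument("--mode", type=str, default="short-run",
                        choices=MODES,
                        help="Statistical accuracy of the run (assumed default)")
    parser.add_argument("--plot", action="store_true",
                        help="Generate the plot in addition to storing the data")
    return parser


# Equivalent to: python3 create_paper_data.py --fig_label 7a --mode long-run --plot
args = build_parser().parse_args(
    ["--fig_label", "7a", "--mode", "long-run", "--plot"]
)
print(f"fig_label={args.fig_label} mode={args.mode} plot={args.plot}")
```

Restricting `--fig_label` and `--mode` with `choices` makes argparse reject unsupported values before any simulation starts, which is helpful when the same script is invoked from CI.
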