Splitting database and website #24

GiovanniBussi · 2019-04-17T09:28:24Z

I think it is a bit annoying to update the website in the way it works now. I would like to restructure it as follows:

One repository (say plumed-nest/eggs-database) could contain only what's currently in the eggs19 directory, plus a single yml file corresponding to the current _data/eggs.yml file. This is the repo to which the script would push all database updates; its content would be generated automatically.
Another repo (say plumed-nest/plumed-nest.github.io) could contain only the website information (everything else in the current plumed-nest/plumed-nest) plus a git submodule eggs-database corresponding to the repository mentioned above. In addition, a file _data/eggs.yml could be a symbolic link to eggs-database/eggs.yml. This repository will be edited by us maintainers manually.
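
To make the proposed layout concrete, here is a minimal sketch of how the symbolic link could be created. It assumes the submodule is checked out at eggs-database/ in the root of the website repository; the helper name link_eggs_yml is made up for illustration. Note that the link target has to be relative to _data/, hence the leading "..".

```python
import os

def link_eggs_yml(site_root):
    """Create _data/eggs.yml as a relative symlink into the submodule.

    Hypothetical helper: assumes the submodule lives at
    <site_root>/eggs-database and provides eggs.yml at its top level.
    """
    data_dir = os.path.join(site_root, "_data")
    os.makedirs(data_dir, exist_ok=True)
    link = os.path.join(data_dir, "eggs.yml")
    # Relative target, so the link stays valid wherever the repo is cloned.
    target = os.path.join("..", "eggs-database", "eggs.yml")
    # Replace a stale link or regular file left over from the old layout.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(target, link)
    return link
```

Git stores the link itself, so this would only need to be done once and committed.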

The repository plumed-nest would still run nest.py and push to plumed-nest/eggs-database (as it does now). Whenever we commit on plumed-nest, the script should do the following:

recreate the database and, if everything is ok, push to eggs-database. The push would preserve the history of the repository.
open a pull request on plumed-nest/plumed-nest.github.io asking to merge the submodule update.
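
The two steps above could be sketched roughly as the following sequence of git commands. This is only an outline under some assumptions: the branch names, the commit message, and the function names plan_update and run_plan are placeholders, the submodule is assumed to be registered at the path eggs-database, and the final step of actually opening the pull request (through the GitHub web interface or API) is left out.

```python
import subprocess

def plan_update(db_dir, site_dir, branch, base="master"):
    """Return the git commands for one database update, in order.

    Hypothetical sketch: db_dir is a clone of eggs-database holding the
    regenerated files, site_dir is a clone of the website repository.
    """
    msg = "Update eggs database"
    return [
        # 1. Commit the regenerated database on top of the existing history.
        ["git", "-C", db_dir, "add", "--all"],
        ["git", "-C", db_dir, "commit", "-m", msg],
        # Ordinary push, not a forced one, so earlier builds stay reachable.
        ["git", "-C", db_dir, "push", "origin", base],
        # 2. Move the submodule pointer on a fresh branch of the website repo.
        ["git", "-C", site_dir, "checkout", "-b", branch],
        ["git", "-C", site_dir, "submodule", "update", "--remote", "eggs-database"],
        ["git", "-C", site_dir, "add", "eggs-database"],
        ["git", "-C", site_dir, "commit", "-m", msg],
        ["git", "-C", site_dir, "push", "origin", branch],
    ]

def run_plan(commands):
    """Run the commands one by one, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Since run_plan stops at the first failing command, nothing reaches the website repository unless the push to the database succeeded. Reverting to an older database would then just mean pointing the submodule at an earlier commit.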

This change will improve the workflow in several ways:

We will be able to fix the website without regenerating the database every time.
We will be able to double check updates to the database before they end up in the real page. Since the tables in the eggs directories are written in Markdown, we can also check them directly on GitHub (not rendered). After checking, one can merge the pull request on the website with a click.
We will be able to revert the database to previous versions (by using an older submodule in the website) if there are issues in one build.

There is no urgency, but I will have a look at this when I have time, so I am opening the issue in case anyone wants to comment.

carlocamilloni · 2019-04-17T11:51:04Z

I agree it would make sense

maxbonomi · 2019-04-17T14:11:02Z

I also agree with this!

GiovanniBussi · 2019-04-24T13:50:55Z

Related to this, I thought we should make the construction of eggs parallelizable. This could be done in the following way:

Each time an egg is processed we only create a zip file (say eggs/19/003.zip). This file contains a portion of the final _data/eggs.yml file plus all the generated md files.
Once all eggs are done we concatenate the individual yml files and push to the website.
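
A minimal sketch of the concatenation step, under two assumptions that are not settled above: each zip file stores its portion of the database under the member name eggs.yml, and each portion is a run of top-level YAML list items (lines starting with "- "), so that plain text concatenation yields a valid list. The function name merge_fragments is made up for illustration.

```python
import zipfile

def merge_fragments(zip_paths, fragment_name="eggs.yml"):
    """Concatenate the per-egg yml fragments stored in the given zip files.

    Hypothetical sketch: fragment_name is an assumed member name, and each
    fragment is assumed to contain only top-level YAML list items.
    """
    parts = []
    # Sort the paths so the result does not depend on processing order.
    for path in sorted(zip_paths):
        with zipfile.ZipFile(path) as archive:
            text = archive.read(fragment_name).decode("utf-8")
        # Make sure consecutive fragments do not end up on the same line.
        if text and not text.endswith("\n"):
            text += "\n"
        parts.append(text)
    return "".join(parts)
```

The returned string would be written to eggs.yml, while the md files can be extracted from the same archives as they are.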

If processing becomes slow, we could parallelize the generation using multiple Travis jobs (each job processes some of the eggs), provided we find an easy way to combine the resulting files. This can probably be done by taking a hash of, say, the plumed-nest version, the plumed version, and the nest.yml file (in principle, given these three things the build is reproducible) and saving the resulting zip files somewhere (even in a temporary GitHub repository where plumedbot has write permission).
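
The hash could be computed along these lines. This is a sketch, not a decision: SHA-256, the way the versions are passed in as strings, and the function name build_key are all assumptions.

```python
import hashlib

def build_key(nest_version, plumed_version, nest_yml_bytes):
    """Return a hex digest identifying one reproducible egg build.

    Hypothetical sketch: the three inputs are the plumed-nest version,
    the plumed version, and the raw content of the egg's nest.yml file.
    """
    digest = hashlib.sha256()
    fields = (
        nest_version.encode("utf-8"),
        plumed_version.encode("utf-8"),
        nest_yml_bytes,
    )
    for field in fields:
        # Prefix each field with its length so that different splits of
        # the same bytes (e.g. "ab"+"c" vs "a"+"bc") give different keys.
        digest.update(len(field).to_bytes(8, "big"))
        digest.update(field)
    return digest.hexdigest()
```

A job could then name its output after the key (for instance key + ".zip") and skip the build if a file with that name has already been saved.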

GiovanniBussi mentioned this issue May 3, 2019

Move eggs to a single directory #59

Closed
