Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MLIP Tutorial #4982

Open
wants to merge 24 commits into
base: python
Choose a base branch
from
Open

Add MLIP Tutorial #4982

wants to merge 24 commits into from

Conversation

PythonFZ
Copy link
Contributor

@PythonFZ PythonFZ commented Aug 16, 2024

Add a Tutorial on Machine-Learning Interatomic Potentials

This tutorial requires some additional tools:

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@RudolfWeeber
Copy link
Contributor

Thank you, Fabian.
@SamTov agreed to do a detailed review next week.

Whlie having a quick look, I noticed the following:

  • there are some terms and abbreviaitons which people new to the field might not know. A bit more contet and explanaiton in the intro would be useful
  • Maybe the role of the different external parckges could be stated mor explicilty. In particular, how Espresso, ASe and Mace interact
  • Units: It wans't completely clear to me at first glance what units are used in the simulatoion. Is ESPResSo run in the usual reduced units?
  • As long as there are only external forces, the Verlet-Lists and skin can//should eb switche off
  • It would be important to get the external depencies stable, as we will have to include them in the CI

@PythonFZ
Copy link
Contributor Author

Hey Rudolf,

Thanks for the feedback. I have added some more clarification w.r.t. to the code that is used and added some more sources. I'd appreciate if you could help me improve which parts still need more clarification.

  • Units: It wans't completely clear to me at first glance what units are used in the simulatoion. Is ESPResSo run in the usual reduced units?

I have talked to @reinaual about the units and if I understood it correctly, I am defining the unit system in ESPResSo such that it resembles the one used within ASE.

  • As long as there are only external forces, the Verlet-Lists and skin can//should eb switche off

As I am not an ESPResSo user, would it be possible to give me an example or edit the file direclty.

  • It would be important to get the external depencies stable, as we will have to include them in the CI

I have removed the ipsuite dependency because all the features required are within the rdkit2ase package.
The new dependencies would be:

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think the title should either be Machine-Learned Interatomic Potentials or Machine Learning for Interatomic Potentials.
  • Maybe mention what kind model is? Something like, Equivariant graph model would be sufficient along with a link to the model paper. This is just the foundation model paper but a direct reference to the architecture would be nice too.
  • Do you have an overview of the ASE features somewhere?


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not explain each import to highlight what it is and why you need it? You can start with a code block and then write out a list of why you need each one. For example, pandas is not a clear thing to import in a simulation study unless you need it somewhere. I also usually split my imports by category in tutorials, something like:

# Simulation imports
import espressomd
from espressomd.plugins.ase import ASEInterface
from mace.calculators import mace_mp
import pint

# Linalg
import numpy as np
import pandas as pd

# Plotting and Visualisation
import plotly.express as px
from espressomd.zn import Visualizer

# Utils
import tqdm

# Analysis
from rdkit2ase import smiles2atoms, pack

Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use full variable names to alleviate any uncertainty. Things aren't clear to everyone. For me, kb would be clearer than boltmk but as you never introduce the fact that you will talk about Boltzmann, I would recommend just writing out boltzmann_constant or boltzmann_k. Maybe also add comments after each line highlight what unit conversions you are doing.


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty random after you talk about SMILES strings. Why not have a section about what this initialisation does and then show it. After, introduce SMILES and do the system construction.


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The peopl at the summer school may not know what SMILES is. For the sake of understanding what "CCO" means, it is probably worth just putting a picture of ethanol in and showing how the SMILES string is built out of it.


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #1.    for atom in etoh:

It isn't clear why the atoms should be in the model object. Have you introduced the type mapping? What it is, why you need it?


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you tell people to open it in a different window as they will be running the descent after the opening? Maybe go to split screen or something?


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We haven't seen the first parts yet. Up to now, I don't know if it has accurate forces or can run MD. There is no comparison. You can do a classical force-field comparison, for example, or show DFT data. I actually have DFT ethanol trajectories you can use. Or, run a short DFT ethanol MD with PySCF, it is pretty fast.
  • Maybe list some chemical reaction methods.

Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #3.    system.thermostat.turn_off()

Why? People will ask why you are doing things like this. If you had to ask someone for help in making the script, the people in the tutorial will be at the same level but also without the ML knowledge. Explain everything.


Reply via ReviewNB

@@ -0,0 +1,447 @@
{
Copy link

@SamTov SamTov Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write up a conlcusion, what have we learned, what can we do next, why is this cool.


Reply via ReviewNB

@SamTov
Copy link

SamTov commented Aug 26, 2024

Let us consider what we want from the tutorial. Is a reaction the only analysis we want to do? Don't most people want to run MD before doing a reaction study? Both are nice, but we could do RDF calculations or even Rg computations on something like ethanol. Or go for a bigger molecular and do a quick polymer study.

Copy link
Contributor Author

@PythonFZ PythonFZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

@SamTov
Copy link

SamTov commented Sep 12, 2024

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

@PythonFZ
Copy link
Contributor Author

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

I see - I opened #4993 which would resolve this and make it work.

@SamTov
Copy link

SamTov commented Sep 12, 2024

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

I see - I opened #4993 which would resolve this and make it work.

Is it necessary? They can plot the plotly figure in the notebook and the syntax to have it opened in ZnDraw is quite odd, particularly for those who aren't familiar with the program.

@PythonFZ
Copy link
Contributor Author

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

Is there any reason you removed the interactive plotting part for the visualizer vis.zndraw.figures = {"distance": fig.to_json()}?

It didn't work when we ran it on the CIP pool computers.

I see - I opened #4993 which would resolve this and make it work.

Is it necessary? They can plot the plotly figure in the notebook and the syntax to have it opened in ZnDraw is quite odd, particularly for those who aren't familiar with the program.

  • opened in ZnDraw is quite odd could you elaborate what is odd about it and ideally give a suggestion on how to improve it?
  • I would like to showcase the interactive part of having plots alongside the data in the visualizer. It is not required but I thought it is a nice touch. From the MLIP standpoint it is not required.

Copy link

@SamTov SamTov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it.

@PythonFZ
Copy link
Contributor Author

Thanks for all the work you put into this 👍

The

#TODO: figure out if this works

as well as the vis.zndraw.figures = {"distance": fig.to_json()} might both depend on #4993

Unfortunately, I don't know what issues came up with this version and they did not mention or make any issues on the ZnDraw repo. If the issue will not be resolved before the tutorial it might be easiest to remove this part.

"source": [
"**Exercise:**\n",
"- Take a look at the energies and the molecule in the visualizer during the minimisation. \n",
"- How does the energy compare to the literature value of ~3341.86 eV for the ground state?"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • this energy value needs a citation
  • this energy value is not reproduced by the MACE-MP-0 model in the current form of the tutorial

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will remove this as we have no good explanation for why it does not come out right. It could be due to the temperature or a spurious comparison value.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Energy is now resolved and correct.

"**Exercise:**\n",
"- Indentify the highlighted atoms in the visualization window, which are the two hydrogen atoms of the sulfuric acid.\n",
"- Observe the moment they undergo the protonation event visually and also in the hydrogen-oxygen distance plots.\n",
"- Can you also see changes in the potential energy? Why is that?"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me, what is the answer to this question. The dip in energy could be due to the exothermic reaction $\mathrm{H_2SO_4 + H_2O \to HSO_4^{-} + H_3O^{+}}$, but the energy difference doesn't reproduce experimentally determined Gibbs free energy values. I'm also unsure if the potential energy can be directly equated to a Gibbs free energy, a Helmholtz free energy, an enthalpy or an internal energy.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually like to discuss this point during the tutorial. The question of whether this is a physically realistic process is fundamental to the explainability of ML potentials. During the discussion, I would also ask how to extend the study to determine the answer. Namely, running it longer and generating some statistics or pH measurements that can be compared to experimental values.

@RudolfWeeber
Copy link
Contributor

Is something till missing or can ew go ahed?

@SamTov
Copy link

SamTov commented Oct 4, 2024

The ground state energies for the water study are resolved and correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants