Optimize your atomistic data and interatomic potential models in your molecular dynamics workflows.
Reduce expensive density functional theory calculations while maintaining training accuracy by intelligently subsampling your atomistic dataset:
- Subsample your atomistic configurations using a Determinantal Point Process (DPP) based algorithm that compares energy descriptors computed with the Atomic Cluster Expansion (ACE).
ds = DataSet(conf_train .+ e_descr)
dataset_selector = kDPP(ds, GlobalMean(), DotProduct())
inds = get_random_subset(dataset_selector)
conf_train = @views conf_train[inds]
- Export the reduced dataset, run density functional theory (DFT) calculations on it, and fit your model.
See example.
We are working to provide different intelligent subsampling algorithms based on DPP, DBSCAN, and CUR; highly scalable parallel subsampling via hierarchical subsampling and distributed parallelism; and optimal subsampler selection.
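To make the idea of diversity-driven subsampling concrete, here is a minimal sketch that uses only the Julia standard library. It is not the package's `kDPP` implementation: instead of sampling from a k-DPP, it greedily picks the subset whose dot-product kernel has the largest determinant, which is a deterministic stand-in for the same "prefer dissimilar configurations" principle. The function name `greedy_diverse_subset` and the toy descriptor matrix are assumptions made for illustration.

```julia
using LinearAlgebra

# Greedy determinant maximization over a dot-product kernel.
# X holds one global (e.g. mean) descriptor per row; k is the subset size.
# NOTE: illustrative stand-in, not the k-DPP sampler used by the package.
function greedy_diverse_subset(X::AbstractMatrix{<:Real}, k::Integer)
    n = size(X, 1)
    1 <= k <= n || throw(ArgumentError("k must be between 1 and $n"))
    K = X * X' + 1e-10 * I            # dot-product kernel plus a small jitter
    selected = Int[]
    for _ in 1:k
        candidates = setdiff(1:n, selected)
        # Log-determinant of the kernel restricted to each trial subset.
        scores = [first(logabsdet(K[vcat(selected, i), vcat(selected, i)]))
                  for i in candidates]
        push!(selected, candidates[argmax(scores)])
    end
    return selected
end

# Toy data: rows 1 and 2 are nearly parallel, row 3 is orthogonal to row 1.
X = [2.0 0.0; 0.99 0.01; 0.0 1.0]
inds = greedy_diverse_subset(X, 2)
```

On this toy input the near-duplicate row is skipped in favor of the orthogonal one. A true DPP sampler is randomized and returns such diverse subsets with high probability rather than deterministically.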
Get fast and accurate interatomic potential models through parallel multi-objective hyper-parameter optimization:
- Define the interatomic potential model, hyper-parameter value ranges, and custom loss function. Then, optimize your model.
model = ACE
pars = OrderedDict( :body_order => [2, 3, 4],
:polynomial_degree => [3, 4, 5], ...)
function custom_loss(metrics::OrderedDict)
...
return w_e * e_mae + w_f * f_mae + w_t * time_us
end
iap, res = hyperlearn!(model, pars, conf_train; loss = custom_loss);
- Export optimal values to your molecular dynamics workflow.
See example.
The models are compatible with the interfaces of our sister package InteratomicPotentials.jl. In particular, we are interested in maintaining compatibility with ACEsuit, as well as integrating LAMMPS-based potentials such as ML-POD and ML-PACE. We are also working to provide neural network potential architecture optimization.
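The trade-off behind a weighted loss such as the one above can be sketched without the package. The example below is an assumption-laden toy: it runs a serial, exhaustive grid search (not the parallel optimizer behind `hyperlearn!`), and `toy_metrics` is a hypothetical stand-in for fitting and timing a real potential, in which errors shrink and cost grows with basis size. The names `weighted_loss`, `toy_metrics`, and `grid_search`, and the weight values, are illustrative only.

```julia
# Scalarized multi-objective loss: energy MAE, force MAE, and evaluation time.
# The default weights are arbitrary illustrative values.
weighted_loss(m; w_e = 1.0, w_f = 1.0, w_t = 1.0e-3) =
    w_e * m.e_mae + w_f * m.f_mae + w_t * m.time_us

# Hypothetical stand-in for "fit the model and measure it".
# A real workflow would train a potential and time its evaluation here.
function toy_metrics(body_order, polynomial_degree)
    nbasis = body_order * polynomial_degree
    return (e_mae = 1.0 / nbasis, f_mae = 2.0 / nbasis, time_us = 10.0 * nbasis)
end

# Exhaustive search over the Cartesian product of all value ranges.
function grid_search(evaluate, loss, ranges::NamedTuple)
    best_pars, best_loss = nothing, Inf
    for vals in Iterators.product(ranges...)
        pars = NamedTuple{keys(ranges)}(vals)
        l = loss(evaluate(pars...))
        if l < best_loss
            best_pars, best_loss = pars, l
        end
    end
    return best_pars, best_loss
end

ranges = (body_order = [2, 3, 4], polynomial_degree = [3, 4, 5])
best_pars, best_loss = grid_search(toy_metrics, weighted_loss, ranges)
```

With these toy metrics the search settles on an intermediate model size: the largest basis is rejected because its time penalty outweighs its accuracy gain. Changing the weights moves that balance point.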
Compress your interatomic potential data and model using dimensionality reduction of energy and force descriptors:
- Define a PCA state, fit PCA with the energy and force descriptors of your dataset, and transform all dataset descriptors.
pca = PCAState(tol = n_desc)
fit!(ds_train, pca)
transform!(ds_train, pca)
- Export the PCA-fitted data to be used in your workflow.
See example.
We are working to provide feature selection of energy and force descriptors based on CUR.
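For readers who want to see what the fit and transform steps amount to numerically, the following is a minimal sketch of PCA using only Julia's `LinearAlgebra` and `Statistics` standard libraries. It is not the package's `PCAState` code; the helper names `pca_fit` and `pca_transform`, the row-per-descriptor layout, and the toy data are assumptions for illustration.

```julia
using LinearAlgebra, Statistics

# Fit PCA on descriptors stored one per row, keeping n_components directions.
function pca_fit(X::AbstractMatrix{<:Real}, n_components::Integer)
    mu = mean(X, dims = 1)                     # per-feature mean (1 x d)
    F = eigen(Symmetric(cov(X)))               # eigenvalues in ascending order
    order = sortperm(F.values, rev = true)[1:n_components]
    return (mean = mu, components = F.vectors[:, order], variances = F.values[order])
end

# Center the data and project it onto the retained principal directions.
pca_transform(X, pca) = (X .- pca.mean) * pca.components

# Toy descriptors: two perfectly correlated features, so one component suffices.
X = [1.0 1.0; 2.0 2.0; 3.0 3.0; 4.0 4.0]
pca = pca_fit(X, 1)
Z = pca_transform(X, pca)
```

The projected data `Z` has one column instead of two, and its variance equals the leading eigenvalue of the covariance matrix, so no information is lost in this degenerate example. Real descriptor sets lose some variance, which is what the number of retained components controls.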
Additionally, this package includes utilities for loading input data (such as XYZ files), computing various metrics (including MAE, MSE, RSQ, and COV), exporting results, and generating plots.
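As a reference for three of the metrics named above, here is a short sketch of their textbook definitions in plain Julia. These are not the package's own functions, whose names and signatures may differ; `mae`, `mse`, and `rsq` and the sample values are assumptions for illustration.

```julia
using Statistics

# Mean absolute error between reference and predicted values.
mae(y, y_pred) = mean(abs.(y .- y_pred))

# Mean squared error.
mse(y, y_pred) = mean(abs2, y .- y_pred)

# Coefficient of determination: 1 minus residual over total sum of squares.
rsq(y, y_pred) = 1 - sum(abs2, y .- y_pred) / sum(abs2, y .- mean(y))

# Toy reference and predicted energies (arbitrary units).
y      = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
errors = (mae = mae(y, y_pred), mse = mse(y, y_pred), rsq = rsq(y, y_pred))
```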
Acknowledgment: Center for the Exascale Simulation of Materials in Extreme Environments (CESMIX). Massachusetts Institute of Technology (MIT).