Smart Potential with Atomistic Rare Events and Continuous Learning
For More Information, Please Visit SPARC Documentation.
Try SPARC Tutorial
SPARC is a Python package built around the ASE wrapper that implements an automated workflow for developing machine learning potentials for reactive chemical systems. It automates the identification of new structures in configurational space without requiring initial ab initio MD simulations. SPARC is designed to work seamlessly within the Python framework to efficiently improve the ML model.
- Automated active learning workflow
- Ab initio molecular dynamics (AIMD) using VASP and CP2K
- Machine learning potential training with DeepMD-kit
- ML/MD simulations and iterative model refinement
- Monitor atomic force deviations and query-by-committee to identify new configurations
- Reactive trajectory generation with PLUMED integration
- Python 3.xx
- DeepMD-kit (version: 2.2.10)
- ASE (Atomic Simulation Environment)
- VASP (First-Principle Calculations)
- PLUMED (PES Exploration)
- numpy
- pandas
- dpdata
- cython
- pandoc
- nbsphinx
- Create and activate a conda environment:
conda create -n sparc python=3.10
conda activate sparc
- Use any of the following methods to install DeepMD-kit:
pip install deepmd-kit[gpu,cu12]==2.2.10
conda install deepmd-kit=2.2.10=*gpu libdeepmd=2.2.10=*gpu lammps horovod -c https://conda.deepmodeling.com -c defaults
- Clone the repository and install the package:
git clone --depth 1 https://github.com/rahulumrao/sparc.git
cd sparc
pip install .
Note
Some Collective Variables (CVs), such as Generic CVs (e.g., SPRINT), belong to additional modules and are not included in a standard PLUMED installation. To enable them, install PLUMED manually and link it to the Python environment:
- Install PLUMED:
Download the PLUMED package from the website and install it with the following flags (make sure the conda environment is active):
./configure --enable-mpi=no --enable-modules=all PYTHON_BIN=$(which python) --prefix=$CONDA_PREFIX
make -j$(nproc) && make install
Refer to the official PLUMED installation page for more details.
If you don’t need additional modules, you can skip the manual installation and install PLUMED directly from conda-forge.
conda install -c conda-forge py-plumed
- Set Environment Variables:
export VASP_PP_PATH=/path/to/vasp/potcar_files # POTCAR files path
If you have installed PLUMED manually (skip if you used conda-forge), you also need to set the PLUMED environment variables before running the code:
export PLUMED_KERNEL="$CONDA_PREFIX/lib/libplumedKernel.so"
export PYTHONPATH="$CONDA_PREFIX/lib/plumed/python:$PYTHONPATH"
- Prepare the input file (see the example below)
- Navigate to the scripts folder for the full input template
general:
  structure_file: "POSCAR"      # Input structure
md_simulation:
  ensemble: "NVT"               # Ensemble for MD simulation
  thermostat: "Nose"            # Thermostat type (Nose-Hoover)
  timestep_fs: 1.0              # Time step for MD simulation (fs)
  md_steps: 10                  # Number of MD steps
  temperature: 300              # Temperature in Kelvin
  log_frequency: 4              # Interval for MD log and saving trajectories
  use_dft_plumed: False         # Use PLUMED for MD simulation
dft_calculator:
  name: "VASP"                  # DFT package name
  prec: "Normal"                # Precision level
  kgamma: True                  # Gamma point calculation
  incar_file: "INCAR"           # Path/Name of VASP input file
# Active Learning
active_learning: True           # Active learning protocol
iteration: 10                   # Number of active learning iterations
model_dev:
  f_min_dev: 0.1                # Force uncertainty cutoffs
  f_max_dev: 0.8
Once the installation is complete and the required dependencies are set up, follow these steps to run SPARC.
Ensure you have the necessary input files (e.g., input.yaml, input.json, INCAR, POSCAR). You can find a template in scripts/input.yaml in the root directory.
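Before launching a run, it can help to confirm that the input files and environment variables mentioned in this README are in place. The following is a minimal sketch, not part of SPARC itself: the `preflight` helper and the `REQUIRED_FILES` and `REQUIRED_ENV` names are hypothetical, and the file list simply mirrors the examples given above, so adjust it to your own setup.

```python
import os
from pathlib import Path

# File names taken from the examples in this README (assumption: your run uses all four).
REQUIRED_FILES = ("input.yaml", "input.json", "INCAR", "POSCAR")
# VASP_PP_PATH is needed for VASP; add PLUMED_KERNEL here if PLUMED was built manually.
REQUIRED_ENV = ("VASP_PP_PATH",)


def preflight(workdir=".", env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    root = Path(workdir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    problems += [
        f"unset environment variable: {var}"
        for var in REQUIRED_ENV
        if not env.get(var)
    ]
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(issue)
    if not issues:
        print("All input files and environment variables found.")
```

Running the script from the project root prints one line per missing item, which is usually quicker to read than a failed DFT or training step later in the workflow.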
- Run SPARC
sparc -i input.yaml
Monitor the log and output, which are stored in iter_xxxxxx directories.
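One way to keep an eye on progress is to list which stage folders exist in each iteration directory. The sketch below is an illustration only: it assumes the six-digit iter_xxxxxx naming and the 00.dft, 01.train, and 02.dpmd stage folders used in this README, and the `iteration_status` helper is a hypothetical name, not a SPARC utility.

```python
import re
from pathlib import Path

# Stage folder names as they appear in this README's project layout.
STAGES = ("00.dft", "01.train", "02.dpmd")
# Matches directory names such as iter_000000 and captures the six digits.
ITER_PATTERN = re.compile(r"^iter_(\d{6})$")


def iteration_status(project_root="."):
    """Map each iteration index to the stage folders that currently exist."""
    status = {}
    root = Path(project_root)
    if not root.is_dir():
        return status
    for entry in sorted(root.iterdir()):
        match = ITER_PATTERN.match(entry.name)
        if match and entry.is_dir():
            index = int(match.group(1))
            status[index] = [s for s in STAGES if (entry / s).is_dir()]
    return status


if __name__ == "__main__":
    for index, stages in iteration_status().items():
        print(f"iter_{index:06d}: {', '.join(stages) or 'no stage folders yet'}")
```

The presence of a stage folder only shows that the stage was started, so the log files remain the place to confirm that it finished.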
>> Project Root
├── INCAR
├── POSCAR
├── input.json
├── input.yaml
├── Dataset
│   ├── training_data
│   └── validation_data
├── iter_000000
│   ├── 00.dft
│   ├── 01.train
│   └── 02.dpmd
└── iter_000001
    ├── 00.dft
    ├── 01.train
    └── 02.dpmd
- Supports both ab initio and ML molecular dynamics within ASE MD engine
- NVT ensemble with Nose-Hoover and Langevin thermostats
- Checkpoint/restart capabilities
- PLUMED integration for accelerated configuration space sampling
- Automated model training
- Ensemble model generation
- Configurable network architecture and training parameters
- Query by Committee approach for configuration selection
- Atomic force based error metrics
- Automated structure labeling and retraining
- Fixed the model update when restarting active learning iterations; enable it with the added keys:
learning_restart: True
latest_model: 'path/to/frozen_model.pb'
- Structured log formatting for better readability
- Implemented Umbrella Sampling for reaction study on-the-fly
- Utility tools for analysing model accuracy, active learning status, and structural properties.
- Support for ORCA, Psi4 and xTB calculators
- Documentation under development
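The query-by-committee selection listed among the features can be illustrated with a short sketch. It assumes the commonly used force model-deviation metric (for each atom, the standard deviation of the predicted force vector across the ensemble, with the maximum over atoms taken as the frame score) and assumes that frames scoring between `f_min_dev` and `f_max_dev` are sent for labeling. Whether SPARC computes the metric in exactly this way, and how it treats the interval boundaries, is not stated here, so treat `max_force_deviation` and `select_candidates` as hypothetical helpers.

```python
import math


def max_force_deviation(forces):
    """Largest per-atom standard deviation of the force vector across models.

    forces[k][i] is the (fx, fy, fz) force on atom i predicted by model k.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        # Committee-mean force on atom i.
        mean = [sum(forces[k][i][c] for k in range(n_models)) / n_models for c in range(3)]
        # Mean squared distance of each model's prediction from that mean.
        var = sum(
            sum((forces[k][i][c] - mean[c]) ** 2 for c in range(3))
            for k in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst


def select_candidates(frames, f_min_dev=0.1, f_max_dev=0.8):
    """Indices of frames whose deviation lies in [f_min_dev, f_max_dev) (assumed convention)."""
    return [
        idx
        for idx, forces in enumerate(frames)
        if f_min_dev <= max_force_deviation(forces) < f_max_dev
    ]


if __name__ == "__main__":
    # Two models, one atom per frame; the values are made up for illustration.
    agree = [[(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]      # models agree: skip
    uncertain = [[(0.0, 0.0, 0.0)], [(0.6, 0.0, 0.0)]]  # moderate disagreement: label
    broken = [[(0.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]     # large disagreement: discard
    print(select_candidates([agree, uncertain, broken]))
```

The lower cutoff skips frames the ensemble already predicts consistently, while the upper cutoff discards frames where the models disagree so strongly that the structure is likely unphysical.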
Important
There are some version dependencies: the latest version of DeepMD-kit is currently not supported. Check the DeePMD documentation for instructions on installing an older version.
- Currently only supports DeepMD-kit 2.2.10 (newer versions not yet supported)
- Documentation is still being developed
Important
- The DeepMD-kit pip install tensorflow[and-cuda] installation sometimes does not detect the GPU.
- To verify whether TensorFlow detects your GPU, run the following command:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Check the TensorFlow pip installation page to fix this.
Some hardware has also shown issues with conda channels:
LibMambaUnsatisfiableError: Encountered problems while solving:
- nothing provides __cuda needed by libdeepmd-2.2.10-0_cuda10.2_gpu
- nothing provides __cuda needed by tensorflow-2.9.0-cuda102py310h7cc18f4_0
- Could not solve for environment specs
- The following packages are incompatible
- ├─ deepmd-kit 2.2.10 *gpu is not installable because it requires
- │ └─ tensorflow 2.9.* cuda*, which requires
- │ └─ __cuda, which is missing on the system;
- └─ libdeepmd 2.2.10 *gpu is not installable because it requires
- └─ __cuda, which is missing on the system.
To build the documentation locally:
pip install sphinx sphinx-autodoc-typehints sphinx_rtd_theme
cd docs/
make html
This creates the HTML files in a build folder; open index.html in any browser.
This project is licensed under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.
We use ruff and pre-commit for code styling and linting to keep the codebase consistent. Configurations are defined in the pyproject.toml and pre-commit-config.yaml files.
pip install ruff
pip install pre-commit
After installation, run all hooks:
pre-commit run --all-files
Warning
This package is under active development. Features and APIs may change.
Also, this code is designed to work in a Linux environment. It may not be fully compatible with macOS systems.
If you use this software or the dataset in your research, please cite:
@article{joss,
author = {Verma, Rahul and Joshi, Nisarg and Pfaendtner, Jim},
title = {{SPARC}: An Automated Workflow Toolkit for Accelerated Active Learning of Reactive Machine Learning Interatomic Potentials},
journal = {Journal of Open Source Software},
volume = {11},
number = {120},
pages = {9468},
year = {2026},
month = {apr},
doi = {10.21105/joss.09468},
url = {https://doi.org/10.21105/joss.09468}
}
@software{sparc,
author = {Verma, Rahul and Joshi, Nisarg and Pfaendtner, Jim},
doi = {10.5281/zenodo.19389278},
license = {MIT},
month = {Apr},
title = {{SPARC}: An Automated Workflow Toolkit for Accelerated Active Learning of Reactive Machine Learning Interatomic Potentials},
url = {https://github.com/rahulumrao/sparc},
year = {2026}
}
@dataset{sparc,
author = {Verma, Rahul and Joshi, Nisarg and Pfaendtner, Jim},
doi = {10.5281/zenodo.18261342},
license = {MIT},
month = {jan},
title = {{SPARC}: An Automated Workflow Toolkit for Accelerated Active Learning of Reactive Machine Learning Interatomic Potentials},
url = {https://zenodo.org/records/18261342},
year = {2026}
}