Wrapper around Meta’s Universal Model for Atoms (UMA) that drives an ensemble workflow per input XYZ file: geometry optimizations (incl. TS), single points, vibrational frequencies, quasi-RRHO thermochemistry, optional Gaussian input emission, and optional IRC tracing.
umadriver is exposed as a command-line tool and as a small Python API.
- Ensemble-first pipeline: every run goes through the conformer ensemble workflow and writes a results CSV.
- Geometry: standard optimization or TS (first-order saddle) via Sella.
- Frequencies & Thermochemistry: finite-difference frequencies, qRRHO by default (RRHO available), temperature/pressure controls.
- Gaussian inputs: emit M052X/6-31G* (optionally SMD solvent) inputs for downstream QM.
- IRC: forward/backward intrinsic reaction coordinate after a TS (or directly).
- Batch mode: run many jobs via a manifest file or globbing multiple XYZs.
- HPC-friendly: thread controls, scratch directories, safe defaults to avoid JAX/CPU thread contention.
(adapted from fairchem repo)
Create a free Hugging Face account, request access to the UMA model repository, and have your personal access token ready.
pip install -U "huggingface_hub[cli]"
hf --help # should print CLI usage (the 'hf' CLI is the modern tool)
# (Alternative older name: huggingface-cli --help)
# Interactive (recommended):
hf auth login
# Or non-interactive if you already exported HF_TOKEN:
# export HF_TOKEN=hf_xxx... # put this in your shell profile to persist
hf auth login --token "$HF_TOKEN" --add-to-git-credential
# Verify:
hf auth whoami
Issues:
- 401 / permission denied: You haven’t been granted access to facebook/UMA or you aren’t logged in. Run hf auth whoami and re-apply for repository access if needed.
Requires Python ≥ 3.9.
conda create -n umadriver -c conda-forge python=3.10 -y
conda activate umadriver
git clone https://github.com/msh-yi/umadriver.git
cd umadriver
pip install .
umadriver -h
Declared in pyproject.toml:
- numpy>=1.23
- ase>=3.22.1
- fairchem-core>=2.4.0
Notes:
- GPU use is optional. Set --device cuda to use a CUDA GPU if your fairchem-core/PyTorch install supports it.
- The driver touches JAX only to set safe defaults so that JAX does not hog GPU memory from UMA/torch. No JAX code is required from your side.
- With more than one GPU, work is parallelized at the ensemble level: each .xyz file, with all of its structures, is handled by a single GPU rather than being split across GPUs (for example, with four .xyz files and three GPUs, each GPU processes whole files).
# Basic geometry optimization of all conformers in molecule.xyz
umadriver --xyz molecule.xyz
# Tight optimization + frequencies + qRRHO at 343.15 K
umadriver --xyz molecule.xyz --opt-mode Tight --freq --temp 343.15
# Single-point energies (no optimization)
umadriver --xyz molecule.xyz --sp
# Transition-state optimization, then freq and IRC
umadriver --xyz ts_guess.xyz --optts --freq --irc
# Emit Gaussian inputs (gas phase)
umadriver --xyz molecule.xyz --solv none
# Emit Gaussian inputs with SMD(acetonitrile), custom resources
umadriver --xyz molecule.xyz --solv acetonitrile --gauss-mem 80GB --gauss-nproc 8
# Explicit charge / multiplicity
umadriver --xyz anion.xyz --charge -1 --mult 1
Outputs
By default, results go to <basename>.ensemble/ (override with --outdir). The driver prints the path to a summary CSV at the end:
Ensemble complete. Results CSV: <outdir>/energies.csv
If --freq is requested you’ll also get per-structure, ORCA-style vibration outputs and a thermochemistry summary (qRRHO on by default).
- --xyz PATH (required unless using batch): input XYZ with one or more conformers/frames.
- Geometry
  - --opt (default: Sella): optimizer (use --sp for single-point only).
  - --optts: TS optimization (1st-order saddle; uses Sella).
  - --opt-mode {Loose,Normal,Tight,VeryTight} (default: Normal).
  - --maxcycles INT (default: 300).
- Frequencies / Thermo
  - --freq: run vibrational analysis.
  - --freq-delta FLOAT (Å, default 0.01), --freq-nfree 2, --freq-scale FLOAT.
  - --temp K (default 298.15), --pressure-atm (default 1).
  - --qrrho / --no-qrrho (default qRRHO on), --cutoff-cm1, --qrrho-ref-cm1 (default 100), --qrrho-alpha (default 4.0).
  - --symmetry-number, --point-group (optional thermochemistry inputs).
- Gaussian inputs
  - --solv NAME → writes M052X/6-31G* inputs; if NAME given, uses scrf(SMD,solvent=<NAME>).
  - --gauss-mem, --gauss-nproc.
- IRC
  - --irc, --irc-dx FLOAT (default 0.1).
- Model / device / cache
  - --model (default: uma-m-1p1).
  - --device {cuda,cpu,auto} (default: cuda; batch default also cuda).
  - --cache-dir (default: driver’s DEFAULT_FAIRCHEM_CACHE).
- Scratch & logging
  - --scratch-root PATH (default: $UMA_SCRATCH_ROOT or driver default).
  - --use-local-scratch.
  - --verbose, --debug.
Run umadriver -h to see the full help.
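To make the Gaussian input options above more concrete, here is a minimal sketch of the kind of file that --solv, --gauss-mem, and --gauss-nproc describe. The function name build_gjf, the title line, and the coordinate formatting are assumptions made for illustration; the inputs umadriver actually writes may contain additional route keywords or a different layout.

```python
from typing import Optional, Sequence, Tuple

# (element symbol, x, y, z) in Angstrom
Atom = Tuple[str, float, float, float]


def build_gjf(
    atoms: Sequence[Atom],
    charge: int = 0,
    mult: int = 1,
    solv: Optional[str] = None,
    mem: str = "160GB",
    nproc: str = "16",
    title: str = "umadriver conformer",  # hypothetical title line
) -> str:
    """Return the text of a minimal Gaussian input (hypothetical helper)."""
    route = "# M052X/6-31G*"
    # Treat None or the literal "none" as gas phase (no SCRF keyword).
    if solv and solv.lower() != "none":
        route += f" scrf(SMD,solvent={solv})"

    lines = [
        f"%mem={mem}",
        f"%nprocshared={nproc}",
        route,
        "",            # blank line ends the route section
        title,
        "",            # blank line ends the title section
        f"{charge} {mult}",
    ]
    for sym, x, y, z in atoms:
        lines.append(f"{sym:<2s} {x:14.8f} {y:14.8f} {z:14.8f}")
    lines.append("")   # blank line terminates the geometry block
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    water = [("O", 0.0, 0.0, 0.0), ("H", 0.0, 0.757, 0.587), ("H", 0.0, -0.757, 0.587)]
    print(build_gjf(water, solv="acetonitrile", mem="80GB", nproc="8"))
```

Comparing a generated file against this shape is a quick way to check that the solvent name and resource settings were picked up as intended.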
Use the batch subcommand to process multiple ensembles in one process (one GPU), either from a manifest or with globs.
umadriver batch \
--xyz-glob "inputs/*.xyz" "more/*.xyz" \
--out-root runs \
--resume \
--model uma-m-1p1 --device cuda \
--optimizer Sella --optts \
--do-freq \
--irc --irc-dx 0.1
- --out-root becomes the parent folder for each job’s out_dir.
- --resume (default on) skips jobs that already have an energies.csv. Use --no-resume to force reruns.
- Flags like --optimizer, --optts, --do-freq, --solv, --irc, --irc-dx act as broadcast overrides applied to all jobs in this batch.
Two styles are supported per job entry:
- Per-job overrides: map (recommended for clarity)
- Flattened keys directly under a job (handy for SP/solv/freq shortcuts)
# manifest.yaml
jobs:
  - xyz: path/to/molecule_A.xyz
    out_dir: runs/molecule_A
    overrides:
      charge: 0
      optimizer: Sella
      opt_mode: Tight
      do_freq: true
      temp: 298.15
  - xyz: path/to/ts_guess.xyz
    out_dir: runs/ts_pipeline
    overrides:
      charge: -1
      optimizer: Sella
      optts: true # TS optimization
      do_freq: true
      irc: true
      irc_dx: 0.1
  # Single-point + Gaussian input emission (no optimization, no freq)
  - xyz: path/to/solv_sp.xyz
    out_dir: runs/solv_sp
    optimizer: null # same effect as CLI --sp
    optts: false
    do_freq: false
    solv: acetonitrile
    gauss_mem: 80GB
    gauss_nproc: "8"
  # Pure frequency/thermo at elevated T (no optimization)
  - xyz: path/to/freq_only.xyz
    out_dir: runs/freq_only
    overrides:
      optimizer: null # no optimization
      do_freq: true
      temp: 343.15
      pressure_atm: 1.0
Run it:
umadriver batch --manifest manifest.yaml --out-root runs --device cuda
The CLI will print a one-line summary per job:
[OK] inputs/molecule_A.xyz -> runs/molecule_A
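For larger batches it can be convenient to generate the manifest rather than write it by hand. The following is a small sketch that emits the per-job overrides: style shown above. The function name make_manifest and its defaults are assumptions for illustration, and the YAML is written as plain text, so paths containing characters that are special in YAML (such as ":" or "#") would need quoting.

```python
from pathlib import Path
from typing import Iterable


def make_manifest(
    xyz_paths: Iterable[str],
    out_root: str = "runs",
    charge: int = 0,
    do_freq: bool = True,
) -> str:
    """Build manifest text with one job per XYZ file (hypothetical helper)."""
    lines = ["jobs:"]
    for path in sorted(Path(p) for p in xyz_paths):
        out_dir = Path(out_root) / path.stem  # e.g. runs/molecule_A
        lines += [
            f"  - xyz: {path.as_posix()}",
            f"    out_dir: {out_dir.as_posix()}",
            "    overrides:",
            f"      charge: {charge}",
            "      optimizer: Sella",
            f"      do_freq: {'true' if do_freq else 'false'}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Collect inputs with a glob and write the manifest next to them.
    text = make_manifest(str(p) for p in Path("inputs").glob("*.xyz"))
    Path("manifest.yaml").write_text(text)
```

The resulting file can then be passed to umadriver batch --manifest manifest.yaml as in the command above.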
umadriver does a small early parse of two special flags before importing heavy deps:
- --sella-threads INT
  Sets the Sella/BLAS thread pool size. If omitted, it falls back to $SLURM_CPUS_PER_TASK (if present), else os.cpu_count().
  It also relaxes caps on OMP_NUM_THREADS, MKL_NUM_THREADS, OPENBLAS_NUM_THREADS, and NUMEXPR_NUM_THREADS accordingly.
- --jax-platform {cpu,cuda} (default: cpu)
  Pins JAX_PLATFORMS / JAX_PLATFORM_NAME. Unless on multi-GPU, CPU is safer so JAX doesn’t reserve GPU memory that UMA/torch needs.
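The fallback order described above can be sketched as follows. This is an illustration of the documented behaviour, not the driver's actual code: the function names are hypothetical, and setting each thread variable to the resolved count is an assumption about what "relaxes caps" means in practice.

```python
import os
from typing import Mapping, MutableMapping, Optional

THREAD_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def resolve_threads(cli_value: Optional[int], env: Mapping[str, str]) -> int:
    """Pick a thread count: explicit flag, then SLURM, then os.cpu_count()."""
    if cli_value is not None:
        return max(1, int(cli_value))
    slurm = env.get("SLURM_CPUS_PER_TASK", "")
    if slurm.isdigit() and int(slurm) > 0:
        return int(slurm)
    return os.cpu_count() or 1  # cpu_count() can return None


def apply_early_env(threads: int, jax_platform: str, env: MutableMapping[str, str]) -> None:
    """Set thread and JAX variables; must run before heavy imports to take effect."""
    for var in THREAD_VARS:
        env[var] = str(threads)  # assumption: caps are set to the resolved count
    env["JAX_PLATFORMS"] = jax_platform
    env["JAX_PLATFORM_NAME"] = jax_platform


if __name__ == "__main__":
    n = resolve_threads(None, os.environ)
    apply_early_env(n, "cpu", os.environ)
    print(f"threads={n}, JAX_PLATFORMS={os.environ['JAX_PLATFORMS']}")
```

Doing this before importing numerical libraries matters because most BLAS and OpenMP runtimes read these variables only once, at load time.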
Examples
# On a SLURM node with 8 CPUs:
srun -c 8 umadriver --xyz mol.xyz --sella-threads 8
# Force JAX to CPU; use CUDA for UMA
umadriver --xyz mol.xyz --jax-platform cpu --device cuda
<outdir>/
  energies.csv              # ensemble summary (paths, energies, statuses, etc.)
  conformer_0001/           # per-conformer working folders
    opt.log
    freq/                   # if --freq
      vibrations.dat        # ORCA-style prints
      thermo_summary.json   # qRRHO/RRHO summary, temperature, etc.
  gaussian_inputs/          # if --solv was set
    conf0001.gjf
  ...
(Exact layout may evolve; rely on the printed CSV path for aggregation.)
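As an example of aggregating from the CSV, the sketch below ranks conformers by energy and attaches Boltzmann populations. The README does not list the CSV columns, so the column name "energy" is a placeholder you would need to replace with the real header, and the energies are assumed to be in eV (the ASE default); both are assumptions, as are the function names.

```python
import csv
import io
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_weights(energies_ev, temp_k=298.15):
    """Normalized Boltzmann populations from energies in eV."""
    e_min = min(energies_ev)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (K_B_EV_PER_K * temp_k)) for e in energies_ev]
    total = sum(factors)
    return [f / total for f in factors]


def rank_conformers(handle, energy_col="energy", temp_k=298.15):
    """Return (row, energy, weight) tuples sorted by energy, lowest first.

    Rows whose energy is missing or non-numeric (e.g. failed jobs) are skipped.
    `energy_col` is a placeholder column name, not a documented one.
    """
    rows = []
    for row in csv.DictReader(handle):
        try:
            rows.append((row, float(row[energy_col])))
        except (KeyError, TypeError, ValueError):
            continue
    if not rows:
        return []
    weights = boltzmann_weights([e for _, e in rows], temp_k)
    ranked = sorted(zip(rows, weights), key=lambda item: item[0][1])
    return [(row, energy, w) for (row, energy), w in ranked]


if __name__ == "__main__":
    # Stand-in for open("<outdir>/energies.csv"); the columns here are invented.
    demo = io.StringIO("name,energy,status\nconf1,-10.00,ok\nconf2,-10.05,ok\nconf3,,failed\n")
    for row, energy, weight in rank_conformers(demo):
        print(row["name"], energy, round(weight, 3))
```

If --freq was run, free energies from the thermochemistry summary would usually be a better basis for populations than electronic energies alone.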
If you prefer calling from Python:
from umadriver.ensemble import run_conformer_workflow
csv_path = run_conformer_workflow(
    "molecule.xyz",
    out_dir="molecule.ensemble",
    charge=0, mult=1,
    model="uma-m-1p1", device="cuda",
    cache_dir=None, use_local_scratch=False,
    optimizer="Sella", opt_mode="Normal", optts=False,
    maxcycles=300,
    do_freq=False,
    freq_delta=0.01, freq_nfree=2, freq_scale=1.0,
    temp=298.15, pressure_atm=1.0,
    symmetry_number=1, point_group=None,
    qrrho=True, cutoff_cm1=None,
    qrrho_ref_cm1=100.0, qrrho_alpha=4.0,
    solv=None, gauss_mem="160GB", gauss_nproc="16",
    sella_internal=True, sella_eta=2e-2, sella_gamma=1e-4, sella_delta0=0.02,
    irc=False, irc_dx=0.1,
)
print("CSV:", csv_path)
The arguments mirror the CLI options. See the source for the exact signature and defaults.
- CUDA OOM or contention
  - Try --jax-platform cpu (default) to keep JAX off the GPU.
  - Reduce --gauss-nproc / threads; set --sella-threads explicitly.
- Runs are being skipped in batch
  - --resume is on by default. Use --no-resume to force re-runs.
- Slow frequencies
  - Increase --freq-delta slightly (with care), or run fewer conformers at once.
- Thermochemistry doesn’t match expectations
  - Remember qRRHO is enabled by default (--no-qrrho to switch to RRHO).
  - Provide --symmetry-number / --point-group when known.
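When comparing thermochemistry against another program, it helps to see how much a quasi-RRHO treatment changes low-frequency modes. The sketch below assumes that --qrrho-ref-cm1 and --qrrho-alpha play the roles of the reference frequency and exponent in the widely used Grimme-style damping function, w = 1 / (1 + (ref / freq)^alpha); this README does not confirm the exact form the driver uses, and the function names are hypothetical.

```python
def qrrho_weight(freq_cm1: float, ref_cm1: float = 100.0, alpha: float = 4.0) -> float:
    """Damping weight in [0, 1]: near 1 for stiff modes, near 0 for soft modes."""
    if freq_cm1 <= 0:
        raise ValueError("frequency must be positive (real mode)")
    return 1.0 / (1.0 + (ref_cm1 / freq_cm1) ** alpha)


def damped_entropy(
    s_vib: float,
    s_rot: float,
    freq_cm1: float,
    ref_cm1: float = 100.0,
    alpha: float = 4.0,
) -> float:
    """Interpolate one mode's entropy between harmonic-oscillator and
    free-rotor values (same units in, same units out)."""
    w = qrrho_weight(freq_cm1, ref_cm1, alpha)
    return w * s_vib + (1.0 - w) * s_rot


if __name__ == "__main__":
    for freq in (10.0, 50.0, 100.0, 300.0, 1000.0):
        print(f"{freq:7.1f} cm^-1  weight = {qrrho_weight(freq):.4f}")
```

With the defaults, a mode at the reference frequency is treated as half oscillator and half rotor, so differences from plain RRHO are concentrated in modes near or below roughly 100 cm^-1.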
- UMA (Universal Model for Atoms) by Meta AI.
- Uses ASE for atoms & vibrations, Sella for robust geometry/TS, and fairchem-core for the underlying ML potential.
See LICENSE in this repository.
- 0.1.0 — initial public release.
jobs:
  # Gas-phase optimization with frequencies at 298 K
  - xyz: inputs/complex.xyz
    out_dir: runs/complex_opt
    overrides:
      charge: -1
      optimizer: Sella
      opt_mode: VeryTight
      do_freq: true
      temp: 298.15
  # TS → freq → IRC (anionic)
  - xyz: inputs/ts_guess.xyz
    out_dir: runs/ts_pipeline
    overrides:
      charge: -1
      optimizer: Sella
      optts: true
      do_freq: true
      irc: true
      irc_dx: 0.1
  # Single-point + Gaussian input emission in acetonitrile (no freq)
  - xyz: inputs/solv_only.xyz
    out_dir: runs/solv_only
    optimizer: null
    optts: false
    do_freq: false
    solv: acetonitrile
    gauss_mem: 80GB
    gauss_nproc: "8"