
# Model Card

We release two pretrained checkpoints, both trained on the Alex-20s dataset with the same model architecture but different `cfg_drop_prob` settings.

## Multitask

A unified model for both de novo generation (DNG) and crystal structure prediction (CSP), trained with `0 < cfg_drop_prob < 1`. The model seamlessly switches between DNG and CSP depending on whether a chemical formula is provided.
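The training-time conditioning dropout can be sketched as follows. This is a minimal illustration of the idea, not the released code; `NULL_FORMULA` and the function name are hypothetical:

```python
import random

NULL_FORMULA = None  # hypothetical stand-in for the "no condition" token

def maybe_drop_formula(formula, cfg_drop_prob, rng=random):
    """With probability cfg_drop_prob, drop the chemical-formula condition.

    Dropped examples train the unconditional path (de novo generation);
    kept examples train the formula-conditioned path (CSP). With
    0 < cfg_drop_prob < 1 a single model learns both tasks, while
    cfg_drop_prob=0 always conditions on the formula (CSP-only).
    """
    if rng.random() < cfg_drop_prob:
        return NULL_FORMULA
    return formula
```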

## CSP-Only

A dedicated crystal structure prediction model, trained with `cfg_drop_prob=0` (formula conditioning is always enabled). This model is optimized for CSP tasks only.

## Model Architecture

Both checkpoints share the same Transformer architecture:

```python
params, transformer = make_transformer(
    key=jax.random.PRNGKey(42),
    Nf=5,
    Kx=16,
    Kl=4,
    n_max=21,
    h0_size=256,
    num_layers=16,
    num_heads=8,
    key_size=32,
    model_size=256,
    embed_size=256,
    atom_types=119,
    wyck_types=28,
    dropout_rate=0.1,
    attn_dropout=0.1,
    widening_factor=4,
    sigmamin=1e-3
)
```
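As a sanity check, a back-of-envelope count of the Transformer weight matrices implied by this config (ignoring biases, layer norms, embeddings, and output heads) lands close to the ~13.8M parameters reported below:

```python
# Rough estimate from the config above (attention + MLP weight matrices only).
d = 256   # model_size
L = 16    # num_layers
w = 4     # widening_factor

attn_per_layer = 4 * d * d     # Q, K, V, and output projections
mlp_per_layer = 2 * w * d * d  # two MLP matrices: d -> w*d -> d
total = L * (attn_per_layer + mlp_per_layer)
print(total)  # 12582912 core weights; embeddings etc. bring it to ~13.8M
```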

## Training Dataset

**Alex-20s**: ~1.7M general inorganic materials curated from the Alexandria database, filtered by:

- Energy above hull: $E_{\mathrm{hull}} < 0.1$ eV/atom
- Structure complexity: no more than 20 Wyckoff sites in the conventional cell
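The two criteria amount to a simple predicate. This is an illustrative sketch; the function name and signature are hypothetical, not part of the released code:

```python
def keep_structure(e_hull_ev_per_atom: float, n_wyckoff_sites: int) -> bool:
    """Return True if a structure passes the Alex-20s filters:
    energy above hull below 0.1 eV/atom and at most 20 Wyckoff
    sites in the conventional cell."""
    return e_hull_ev_per_atom < 0.1 and n_wyckoff_sites <= 20
```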

## Speeds, Sizes, Times

- Both models contain ~13.8M parameters
- Generating 45,000 crystal samples on a single A100 GPU takes ~440 seconds (~10 ms per sample)
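The per-sample figure follows directly from the batch timing:

```python
samples = 45_000
seconds = 440
ms_per_sample = 1000 * seconds / samples
print(round(ms_per_sample, 1))  # 9.8, i.e. roughly 10 ms per sample
```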