This repository contains the core modeling code for the paper: PonderLM-2: Pretraining LLM with Latent Thoughts in Continuous Space.
Paper: https://www.arxiv.org/abs/2509.23184
This repository provides the core implementation of the PonderLM-2 architecture in `modeling_gpt_neox.py`. This file can be used as a drop-in replacement for the standard GPT-NeoX implementation in the Hugging Face `transformers` library to load and use our pretrained models.
Please note that our implementation is based on `transformers` version 4.46.1. To load the model from the Hugging Face Hub and use our custom code, you must set `trust_remote_code=True`:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "zeng123/PonderLM-2-1.4b"  # Replace with your actual model path

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True
)
```

The official model weights are available on the Hugging Face Hub.
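Once loaded, the model behaves like any other causal LM in `transformers`. A minimal generation sketch follows; the prompt and decoding settings are illustrative, not prescribed by the repository:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Model path as in the loading snippet above; replace with your own if needed.
model_name = "zeng123/PonderLM-2-1.4b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Illustrative prompt; greedy decoding keeps the output deterministic.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```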
If you find this work useful, please consider citing the paper:
```bibtex
@article{zeng2025pretraining,
  title={PonderLM-2: Pretraining LLM with Latent Thoughts in Continuous Space},
  author={Zeng, Boyi and Li, He and Song, Shixiang and Wang, Yixuan and He, Ziwei and Wang, Xinbing and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2509.23184},
  year={2025}
}
```