Thanks for sharing this great work!
We are currently working on a related direction in the language domain with a framework we call Continuous Autoregressive Language Models (CALM, https://arxiv.org/abs/2510.27688). In CALM, an autoencoder learns a mapping from a chunk of discrete tokens to a single continuous vector. A subsequent generative model then learns to autoregressively predict the sequence of these vectors.
A key difference is the model input: the predicted vector is first passed back through the autoencoder to be reconstructed into discrete tokens, and the embeddings of those tokens form the input for the next step (sketched below).
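For concreteness, here is a minimal, purely illustrative sketch of the loop I mean. The module names, shapes, and architecture choices (`ChunkAutoencoder`, `LatentPredictor`, a GRU stand-in for the generator) are placeholders of my own, not the actual CALM implementation:

```python
import torch
import torch.nn as nn

# Illustrative constants, not CALM's real configuration.
CHUNK, VOCAB, D_TOK, D_LAT = 4, 1000, 64, 128

class ChunkAutoencoder(nn.Module):
    """Maps a chunk of tokens to one continuous latent vector and back."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_TOK)
        self.enc = nn.Linear(CHUNK * D_TOK, D_LAT)
        self.dec = nn.Linear(D_LAT, CHUNK * VOCAB)

    def encode(self, tokens):                      # tokens: (B, CHUNK)
        x = self.embed(tokens).flatten(1)          # (B, CHUNK * D_TOK)
        return self.enc(x)                         # (B, D_LAT)

    def decode(self, z):                           # z: (B, D_LAT)
        logits = self.dec(z).view(-1, CHUNK, VOCAB)
        return logits.argmax(-1)                   # (B, CHUNK) discrete tokens

class LatentPredictor(nn.Module):
    """Stand-in autoregressive model over the continuous latent sequence."""
    def __init__(self):
        super().__init__()
        self.proj_in = nn.Linear(CHUNK * D_TOK, D_LAT)
        self.rnn = nn.GRU(D_LAT, D_LAT, batch_first=True)
        self.head = nn.Linear(D_LAT, D_LAT)

    def step(self, step_input, h=None):            # step_input: (B, D_LAT)
        out, h = self.rnn(step_input.unsqueeze(1), h)
        return self.head(out.squeeze(1)), h        # predicted next latent, state

ae, lm = ChunkAutoencoder(), LatentPredictor()
tokens = torch.randint(0, VOCAB, (1, CHUNK))       # prompt chunk
h = None
step_input = lm.proj_in(ae.embed(tokens).flatten(1))
for _ in range(3):                                 # generate three chunks
    z_pred, h = lm.step(step_input, h)             # predict the next latent vector
    tokens = ae.decode(z_pred)                     # reconstruct discrete tokens from it
    # The point I'm asking about: the next-step input is built from the
    # re-embedded decoded tokens, not from the raw predicted latent.
    step_input = lm.proj_in(ae.embed(tokens).flatten(1))
```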
My question is: in a setup like this, do you think the benefits of SphereAR would still apply? In other words, do SphereAR's improvements come primarily from providing a more stable latent vector as input to the next step, or does it also make the latent vectors themselves easier for a generative model to learn to predict?
Any thoughts would be very helpful. Thanks!