Conversation

@liujch1998
Contributor

There are a few experimental config files and launch scripts that resulted in the official 1B release. I thought it'd be nice to push these in for completeness. Not urgent.

Most of the stuff in this PR is experiments for the 1B pretraining investigation.

@liujch1998 requested review from aman-17 and dirkgr June 12, 2025 21:13
@aman-17
Contributor

aman-17 commented Jun 12, 2025

Dirk would probably have an opinion on this. We typically avoid releasing ablations and experiment setups because once we do it for one model, people start requesting checkpoints and data mixes for every other size: "You did it for 1B, please do it for 7B, 13B, 32B..." I ran into this issue with OLMoE: I got a lot of requests for ablation-related artifacts, and it took significant effort to track everything down and upload it to R2.

@liujch1998
Contributor Author

Lol okay, lemme make a clean version with the ablation experiments purged.

@liujch1998 closed this Jun 12, 2025

3 participants