Reproducing Results from Table 2 for Llama-1B

Hi,

Following up on #4, I am having trouble reproducing the Llama-1B results reported in Table 2 of the paper. The SFT-CoT result is not the problem: after training the model for 25 epochs, I reached accuracy comparable to the reported number.

Could you please confirm the values used for the following arguments for both the Coconut baseline and Coconut with SIM-CoT?

1. `max_latent_stage`
2. `num_epochs`

Thanks!
Neeraj

Reproducing Results from Table 2 for Llama-1B #7
