Bro, I'd appreciate your advice.

<img width="3548" height="1769" alt="Image" src="https://github.com/user-attachments/assets/d47200df-70cf-4b98-80f0-083dbff5c286" />
On a test set of 6 questions over 10 epochs, I kept monitoring the similarity to the ground truth and got this chart. I don't really understand why it fluctuates so much.

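Take the question shown by the green line as an example. My intuition is that after the sharp drop at the fifth epoch, the sixth epoch would reflect that this version of the playbook is bad and use the penalty factor to correct course. In other words, I would expect these curves to keep converging.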

Why does it seem not to notice, so that the similarity keeps getting lower? Or should I be evaluating this system by the mean instead?
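To make the "mean" option concrete, here is a minimal sketch of what I have in mind: average the similarity across all questions at each epoch and look at that single curve instead of the six individual ones. The function name and the `scores` layout (question id mapped to a list of per-epoch similarities) are my own assumptions, not anything from this project.

```python
from statistics import mean


def mean_similarity_per_epoch(scores):
    """Average similarity across questions for each epoch.

    `scores` is a hypothetical layout: {question_id: [sim_epoch_0, sim_epoch_1, ...]},
    where every question has the same number of epochs.
    """
    lengths = {len(values) for values in scores.values()}
    if len(lengths) != 1:
        raise ValueError("every question needs the same number of epochs")
    n_epochs = lengths.pop()
    # One mean per epoch, taken over all questions.
    return [mean(values[e] for values in scores.values()) for e in range(n_epochs)]
```

If the mean curve trends upward while individual questions bounce around, the per-question noise may matter less than it looks in the chart.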

Also, is there a way to preserve the best playbook version for a given question, similar to saving the best weights on a targeted basis?
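Roughly what I mean, as a sketch only: keep a per-question record of the highest similarity seen so far and a copy of the playbook that produced it, the same way a training loop keeps a best checkpoint. `BestPlaybookTracker` and its methods are hypothetical names I made up for illustration; I am assuming the playbook is some plain Python object that can be deep-copied.

```python
import copy


class BestPlaybookTracker:
    """Keeps, per question, the playbook snapshot with the highest score seen so far."""

    def __init__(self):
        # question_id -> (best_score, playbook_snapshot)
        self._best = {}

    def update(self, question_id, score, playbook):
        """Store a copy of `playbook` if `score` beats the current best. Returns True if stored."""
        current = self._best.get(question_id)
        if current is None or score > current[0]:
            # Deep copy so later in-place edits to the playbook do not alter the snapshot.
            self._best[question_id] = (score, copy.deepcopy(playbook))
            return True
        return False

    def best_playbook(self, question_id):
        """Return the saved best playbook for a question, or None if nothing is stored."""
        entry = self._best.get(question_id)
        return None if entry is None else entry[1]
```

Calling `update` after each epoch's evaluation would let a question fall back to its best playbook when a later epoch degrades, which is the behavior I was hoping for with the green line.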



Bro, I'd appreciate your advice. #7
