Better scaling behavior with larger datasets #34

@laoliu5280

With the current algorithms, the quality of hypotheses (i.e., hypothesis-based inference accuracy) does not scale as data size increases.

Currently, many new hypotheses differ only slightly from existing ones, and we are also seeing exact duplicates.
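To make the duplicate problem measurable, a minimal sketch of near-duplicate filtering is below. It assumes hypotheses are plain strings and uses token-level Jaccard similarity. The function names and the 0.8 threshold are hypothetical and not part of the current codebase.

```python
import re


def token_set(text: str) -> frozenset[str]:
    """Lowercase a hypothesis and split it into a set of alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two hypotheses (0.0 to 1.0)."""
    ta, tb = token_set(a), token_set(b)
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return len(ta & tb) / len(ta | tb)


def deduplicate(hypotheses: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a hypothesis only if it is below `threshold` similarity to all kept ones.

    The threshold is an illustrative assumption and would need tuning.
    """
    kept: list[str] = []
    for h in hypotheses:
        if all(jaccard(h, k) < threshold for k in kept):
            kept.append(h)
    return kept


if __name__ == "__main__":
    bank = [
        "Reviews that mention price are negative.",
        "reviews that mention price are negative",
        "Long reviews tend to be positive.",
    ]
    print(deduplicate(bank))
```

Token overlap only catches surface-level repeats. Paraphrased hypotheses would need an embedding-based similarity instead, which is a separate design choice.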

Some easy changes to try (maybe right now) would be to generate more hypotheses (increase the batch size, increase the number of hypotheses to generate, etc.) and to use a smaller temperature.
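One way to try these changes systematically is a small sweep over the three knobs. The sketch below only builds the list of configurations. The key names and values are hypothetical placeholders and do not reflect the project's actual parameter names or defaults.

```python
from itertools import product

# Hypothetical knobs and values, chosen only to illustrate the sweep.
SWEEP_GRID = {
    "batch_size": [10, 20],
    "num_hypotheses_to_generate": [5, 10],
    "temperature": [0.0, 0.3, 0.7],
}


def build_sweep(grid: dict[str, list]) -> list[dict]:
    """Return one config dict per combination of values in `grid`."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


if __name__ == "__main__":
    for config in build_sweep(SWEEP_GRID):
        # Each config would be passed to a (not shown) training/evaluation run.
        print(config)
```

Running each configuration at several training-set sizes would show whether any setting changes the accuracy-versus-data-size curve.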

For more difficult changes, we should look carefully into each update and check whether it gives us new information. We will likely need to improve the reward function and our generation modules for this, which can be one of the next steps after the benchmark.
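As a starting point for inspecting updates, a minimal sketch of a per-update log is below. It assumes we can record, for each update, how many examples have been seen, how many non-duplicate hypotheses were added, and a validation accuracy. The `UpdateRecord` fields, the `min_gain` threshold, and the numbers in the usage example are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class UpdateRecord:
    num_examples_seen: int   # training examples consumed so far
    num_new_hypotheses: int  # hypotheses added that were not near-duplicates
    val_accuracy: float      # hypothesis-based inference accuracy on held-out data


def flag_informative_updates(
    records: list[UpdateRecord], min_gain: float = 0.005
) -> list[bool]:
    """Mark an update as informative if it added at least one new hypothesis
    and validation accuracy rose by at least `min_gain` over the previous update.

    The first update has no baseline, so only the new-hypothesis check applies.
    """
    flags: list[bool] = []
    prev_acc = None
    for r in records:
        gained = prev_acc is None or (r.val_accuracy - prev_acc) >= min_gain
        flags.append(r.num_new_hypotheses > 0 and gained)
        prev_acc = r.val_accuracy
    return flags


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    log = [
        UpdateRecord(100, 5, 0.60),
        UpdateRecord(200, 3, 0.65),
        UpdateRecord(300, 0, 0.65),
        UpdateRecord(400, 2, 0.651),
    ]
    print(flag_informative_updates(log))
```

A long run of uninformative updates as `num_examples_seen` grows would point to the generation step or the reward function, rather than the amount of data, as the bottleneck.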

Labels: enhancement
