Better scaling behavior with larger datasets #34

With the current algorithms, the quality of hypotheses (i.e., hypothesis-based inference accuracy) does not scale as the data size increases.

Currently, many new hypotheses differ only in minor ways from existing ones, and we are also seeing exact duplicates.
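To quantify how often this happens, a rough check like the one below could flag duplicate and near-duplicate hypotheses in a bank. This is only a sketch: the function names, the 0.9 threshold, and the choice of character-level similarity from Python's standard `difflib` are assumptions for illustration, not part of the current codebase. An embedding-based similarity would likely catch paraphrases better.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits do not hide duplicates."""
    return " ".join(text.lower().split())


def find_near_duplicates(hypotheses, threshold=0.9):
    """Return (i, j, ratio) for every pair whose similarity is >= threshold.

    `threshold` is a hypothetical cutoff and would need tuning on real data.
    """
    normalized = [normalize(h) for h in hypotheses]
    pairs = []
    for i in range(len(normalized)):
        for j in range(i + 1, len(normalized)):
            ratio = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
            if ratio >= threshold:
                pairs.append((i, j, ratio))
    return pairs
```

The pairwise loop is quadratic in the number of hypotheses, which should be acceptable for banks of a few hundred entries but would need blocking or hashing for much larger ones.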

Some easy changes to try (maybe right now) would be to generate more hypotheses (increase the batch size, increase the number of hypotheses to generate, etc.) and to use a smaller temperature.

For more difficult changes, we should look carefully at each update and check whether it actually brings in new information. We will likely need to improve the reward function and our generation modules for this, which can be one of the next steps after the benchmark.
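As a starting point for inspecting updates, something like the following could summarize what each update changed in the hypothesis bank. It is a minimal sketch under stated assumptions: `summarize_update` and its arguments are hypothetical names, the bank is assumed to be a plain list of hypothesis strings, and the accuracies are assumed to be measured on a fixed held-out set before and after the update.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing hypotheses."""
    return " ".join(text.lower().split())


def summarize_update(bank_before, bank_after, acc_before, acc_after):
    """Report which hypotheses an update added or dropped, and the accuracy change.

    All inputs are assumptions about how the bank and evaluation are exposed.
    """
    before = {normalize(h) for h in bank_before}
    after = {normalize(h) for h in bank_after}
    return {
        "added": sorted(after - before),      # genuinely new after normalization
        "removed": sorted(before - after),    # dropped by this update
        "accuracy_delta": acc_after - acc_before,
    }
```

An update with an empty `added` list and a near-zero `accuracy_delta` would be a candidate for "no new information". Exact matching after normalization will miss rewordings, so this would undercount redundant updates.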
