@josephdviviano
Collaborator

  • I've read the .github/CONTRIBUTING.md file
  • My code follows the typing guidelines
  • I've added appropriate tests
  • I've run pre-commit hooks locally

Description

This is a DRAFT (no working example yet) which adds the core functionality implemented in https://github.com/GFNOrg/Chunk-GFN

I will update this with details when I have a working implementation.

In the meantime, please see testing/test_chunking.py for working examples of the core functionality.

@josephdviviano josephdviviano changed the base branch from master to generalize_samplers October 8, 2025 17:54
@josephdviviano josephdviviano changed the base branch from generalize_samplers to master October 8, 2025 17:55
@josephdviviano josephdviviano marked this pull request as draft October 8, 2025 17:58
device = states_active.device
mask = torch.zeros(B, N, dtype=torch.bool, device=device)

for b in range(B):

I wonder if it's possible to parallelize this.
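
A per-row Python loop that fills a boolean `(B, N)` mask can often be replaced by a single broadcast comparison. The loop body is not shown in this snippet, so the sketch below is only an illustration under a loud assumption: that each row `b` marks its first `lengths[b]` positions as `True`. The names `lengths`, `mask_loop`, and `mask_vectorized` are hypothetical and do not come from the PR.

```python
import torch


def mask_loop(lengths: torch.Tensor, N: int) -> torch.Tensor:
    """Reference version: one Python iteration per batch element."""
    B = lengths.shape[0]
    mask = torch.zeros(B, N, dtype=torch.bool, device=lengths.device)
    for b in range(B):
        # ASSUMPTION: row b activates its first lengths[b] positions.
        mask[b, : int(lengths[b])] = True
    return mask


def mask_vectorized(lengths: torch.Tensor, N: int) -> torch.Tensor:
    """Same mask built with one broadcast comparison, no Python loop."""
    positions = torch.arange(N, device=lengths.device)  # shape (N,)
    # (1, N) < (B, 1) broadcasts to (B, N); the result dtype is torch.bool.
    return positions.unsqueeze(0) < lengths.unsqueeze(1)


lengths = torch.tensor([0, 2, 5])
assert torch.equal(mask_loop(lengths, 5), mask_vectorized(lengths, 5))
```

If the real loop body depends on per-row data that cannot be expressed as a comparison or gather, this pattern may not apply directly.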

non_exit_ids = [i for i in range(env.n_actions) if i != env.exit_token_id]
seen = set(env.vocab)
out: set[Hashable] = set()
while len(out) < n_tokens_to_add and len(out) < 10_000:

Is there a reason the 10_000 limit is hard-coded here?
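
One possible way to address this is to lift the literal into a named, overridable parameter. Note that in the line shown, the literal bounds the size of `out` rather than the number of loop iterations, so a separate attempt counter may be closer to the intent if the goal is to guarantee termination. The sketch below is an assumption-heavy illustration, not the PR's implementation: `sample_unseen_tokens`, `max_token_len`, `max_attempts`, and the tuple-of-action-ids token format are all hypothetical.

```python
import random
from collections.abc import Hashable, Sequence
from typing import Optional

DEFAULT_MAX_ATTEMPTS = 10_000  # hypothetical named default replacing the literal


def sample_unseen_tokens(
    non_exit_ids: Sequence[int],
    seen: set[Hashable],
    n_tokens_to_add: int,
    max_token_len: int = 3,
    max_attempts: int = DEFAULT_MAX_ATTEMPTS,
    rng: Optional[random.Random] = None,
) -> set[Hashable]:
    """Draw up to n_tokens_to_add tokens not in `seen`, within max_attempts tries."""
    if rng is None:
        rng = random.Random()
    out: set[Hashable] = set()
    attempts = 0
    # Bound the number of tries, so the loop ends even if every candidate is seen.
    while len(out) < n_tokens_to_add and attempts < max_attempts:
        attempts += 1
        # ASSUMPTION: a token is a tuple of 2..max_token_len non-exit action ids.
        length = rng.randint(2, max_token_len)
        candidate = tuple(rng.choice(non_exit_ids) for _ in range(length))
        if candidate not in seen:
            out.add(candidate)
    return out


tokens = sample_unseen_tokens([0, 1, 2], seen={(0, 1)}, n_tokens_to_add=4,
                              rng=random.Random(0))
assert (0, 1) not in tokens
```

Callers that need a different budget can pass `max_attempts` explicitly, and the function may return fewer tokens than requested when the budget runs out.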

3 participants