
Fixes in RL SCHEMA #31

@karthikviswanathn

Description

  1. Find the input prompt length in RL_SCHEMA after the tokenizer fix.
  2. Generate an attention mask in generate_k_completions so that later functions can use it (see the sketch after the snippet below).
  3. Fix the famous indexing issue:
student_output_scores, collected_exit_logits = student(
    completions['tokens'][:, :-1],
    prescribed_exit_layer_idxs=prescribed_exit_layers[:, 1:],
)
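
A minimal sketch of how the three fixes could fit together, assuming PyTorch tensors and a padded batch. `generate_k_completions`, `student`, `completions['tokens']`, `prescribed_exit_layers`, and `prescribed_exit_layer_idxs` come from the description above; `pad_token_id` and the helper names `prompt_lengths`, `build_attention_mask`, and `forward_with_shift` are illustrative assumptions, not the actual implementation.

```python
import torch

# Item 1 (illustrative): recover the per-example prompt length as the number of
# non-pad prompt tokens, assuming the fixed tokenizer pads with pad_token_id.
def prompt_lengths(prompt_tokens: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    return (prompt_tokens != pad_token_id).sum(dim=1)

# Item 2 (illustrative): an attention mask built inside generate_k_completions,
# 1 for real tokens and 0 for padding, so later functions can mask them out.
def build_attention_mask(tokens: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    return (tokens != pad_token_id).long()

# Item 3: the shift in the snippet above aligns inputs and exit layers for
# next-token prediction: the model consumes tokens[:, :-1], and the exit layer
# prescribed for the token predicted at position t sits at index t + 1.
def forward_with_shift(student, completions, prescribed_exit_layers, pad_token_id):
    tokens = completions['tokens']
    attention_mask = build_attention_mask(tokens, pad_token_id)
    student_output_scores, collected_exit_logits = student(
        tokens[:, :-1],
        prescribed_exit_layer_idxs=prescribed_exit_layers[:, 1:],
    )
    # Slice the mask the same way as the inputs so shapes stay consistent
    # downstream (whether [:, :-1] or [:, 1:] is right depends on the loss).
    return student_output_scores, collected_exit_logits, attention_mask[:, :-1]
```

Padding-based masking is just one option; if generate_k_completions already tracks completion lengths, the mask could be built from those instead.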
