Inquiry about Binary (All-or-Nothing) Compression Behavior with the TQA Model #3
Description
Hello,
Thank you for making your model and code available. I have been testing the model (created by merging your Hugging Face adapter with the base model) using your code repository on a subset of the TQA dataset (dev.json, top 50 documents, ~5000 tokens, Llama-3.1-8B-Instruct).
The overall results are approximately consistent with those reported in the paper:
```json
{
  "compressor": "exit",
  "dataset": "TQA",
  "dataset_type": "standard",
  "num_samples": 500,
  "successful_samples": 500,
  "failed_samples": 0,
  "avg_em": 0.564,
  "avg_f1": 0.6387630837277899,
  "avg_compression_ratio": 0.15294212103817084,
  ...
}
```
However, during a case-by-case analysis I observed a specific behavior: for each example, the model tends either to retain almost all sentences or to discard almost all of them. It rarely performs fine-grained, selective compression.
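To quantify this, I bucketed the per-example compression ratios. Below is a minimal sketch of that check, assuming the ratio is defined as retained tokens over original tokens (so values near 0 mean almost everything was discarded and values near 1 mean almost everything was kept); the function name and the 0.1/0.9 thresholds are my own choices, not from your code:

```python
from collections import Counter

def bimodality_summary(ratios, low=0.1, high=0.9):
    """Bucket per-example compression ratios to surface all-or-nothing behavior.

    A selective compressor should place a substantial share of examples in the
    middle band; a "binary" compressor concentrates mass at the two extremes.
    """
    buckets = Counter()
    for r in ratios:
        if r <= low:
            buckets["discard_almost_all"] += 1
        elif r >= high:
            buckets["keep_almost_all"] += 1
        else:
            buckets["selective"] += 1
    total = len(ratios)
    # Return the fraction of examples falling into each bucket.
    return {k: v / total for k, v in buckets.items()}

# Illustrative values only (not my actual measurements):
print(bimodality_summary([0.02, 0.98, 0.05, 0.95, 0.01]))
# → {'discard_almost_all': 0.6, 'keep_almost_all': 0.4}
```

On my run, the vast majority of examples land in the two extreme buckets, which is what prompted this question.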
Have you encountered this "binary" (all-or-nothing) compression behavior before? If so, I would be very grateful for any insight into its potential causes and any suggestions for mitigating it.
Thank you for your time and consideration.