Inquiry about Binary (All-or-Nothing) Compression Behavior with the TQA Model #3
Description
Hello,
Thank you for making your model and code available. I have been testing the model (created by merging your Hugging Face adapter with the base model) using your code repository on a subset of the TQA dataset (dev.json, top 50 documents, ~5000 tokens, Llama-3.1-8B-Instruct).
The overall results are approximately consistent with those reported in the paper:
```json
{
  "compressor": "exit",
  "dataset": "TQA",
  "dataset_type": "standard",
  "num_samples": 500,
  "successful_samples": 500,
  "failed_samples": 0,
  "avg_em": 0.564,
  "avg_f1": 0.6387630837277899,
  "avg_compression_ratio": 0.15294212103817084,
  ...
}
```
However, during a case-by-case analysis I observed a specific behavior: for each example, the model tends either to retain almost all sentences or to discard almost all of them. It rarely performs fine-grained, selective compression.
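To quantify this, I bucketed the per-example compression ratios. Below is a minimal sketch of that check, assuming the ratio is defined as retained tokens over original tokens (so values near 0 mean almost everything was discarded and values near 1 mean almost everything was kept); the function name and the 0.1/0.9 thresholds are my own choices, not from your code:

```python
from collections import Counter

def bimodality_summary(ratios, low=0.1, high=0.9):
    """Bucket per-example compression ratios to surface all-or-nothing behavior.

    A selective compressor should place a substantial share of examples in the
    middle band; a "binary" compressor concentrates mass at the two extremes.
    """
    buckets = Counter()
    for r in ratios:
        if r <= low:
            buckets["discard_almost_all"] += 1
        elif r >= high:
            buckets["keep_almost_all"] += 1
        else:
            buckets["selective"] += 1
    total = len(ratios)
    # Return the fraction of examples falling into each bucket.
    return {k: v / total for k, v in buckets.items()}

# Illustrative values only (not my actual measurements):
print(bimodality_summary([0.02, 0.98, 0.05, 0.95, 0.01]))
# → {'discard_almost_all': 0.6, 'keep_almost_all': 0.4}
```

On my run, the vast majority of examples land in the two extreme buckets, which is what prompted this question.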
Have you encountered this "binary" (all-or-nothing) compression behavior before? If so, I would be very grateful for any insight into its potential causes and any suggestions for mitigating it.
Thank you for your time and consideration.