Thank you for the interesting work and for making the code easily accessible. I am confused about the relationship between the ratio and iterative_size parameters.
In the case I am interested in, there is a single demonstration that I want to compress using only the token-level compression approach. I've noticed that, in general, the final ratio between the compressed and original length can vary quite a bit for large enough ratio values. However, when I make the iterative_size parameter small, e.g. 10, the final compression ratio is much closer to the value specified for the ratio parameter.
I'm confused as to why this is the case. From the paper, my understanding was that the threshold \gamma_j for segment s_j (whose length is defined by the iterative_size parameter) is based primarily on the ratio parameter. That would mean that, regardless of iterative_size, LLMLingua always prunes the fraction of tokens given by ratio in each segment.
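To make my reading concrete, here is a minimal sketch of what I expected to happen. It is not LLMLingua's actual code: `prune_by_segment` is a hypothetical helper, the scores are random stand-ins for per-token perplexities, and I am assuming that ratio means the fraction of tokens dropped within each segment.

```python
import math
import random


def prune_by_segment(scores, ratio, iterative_size):
    """Return indices of kept tokens when each segment drops its lowest scores.

    Hypothetical illustration only: in every segment of length
    `iterative_size`, the floor(ratio * segment_length) lowest-scoring
    tokens are dropped, which plays the role of the threshold gamma_j.
    """
    kept = []
    for start in range(0, len(scores), iterative_size):
        segment = scores[start:start + iterative_size]
        n_drop = math.floor(ratio * len(segment))
        # Rank positions in this segment from lowest to highest score.
        order = sorted(range(len(segment)), key=lambda i: segment[i])
        dropped = set(order[:n_drop])
        kept.extend(start + i for i in range(len(segment)) if i not in dropped)
    return kept


random.seed(0)
scores = [random.random() for _ in range(1000)]  # stand-in for token perplexities

for size in (10, 200):
    kept = prune_by_segment(scores, ratio=0.5, iterative_size=size)
    print(size, len(kept) / len(scores))
```

Under this reading, the kept fraction is 0.5 for both segment sizes, which is why the dependence on iterative_size that I observe surprises me.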
Any clarification would be useful, including where in the code \gamma_j is computed.