[unified arch] Cache the outputs of the vision encoder #241

neilmehta24 · 2025-10-29T14:36:02Z

Start tracking the tokens and cache in cache_wrapper. When we receive a follow-up prompt, we no longer reprocess the images.
Cases:

The hash of the input images changes --> Always do full reprocessing
The prompt is extended --> Do text-only prompt processing
The prompt is trimmed --> Trim the cache and do text-only prompt processing
The cache cannot be trimmed --> Full vision reprocessing
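
The four cases above can be sketched as a single decision function. This is a minimal illustration, not the code in this PR: `Action`, `common_prefix_len`, `choose_action`, and all parameter names are hypothetical, and the sketch assumes the cached state is summarized by an image hash, a token list, and a flag saying whether the cache can be trimmed.

```python
from enum import Enum, auto
from typing import Optional, Sequence


class Action(Enum):
    FULL_REPROCESS = auto()       # run the vision encoder again
    EXTEND_TEXT_ONLY = auto()     # process only the new suffix tokens
    TRIM_THEN_TEXT_ONLY = auto()  # trim the cache, then process the suffix


def common_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def choose_action(
    cached_hash: Optional[str],
    cached_tokens: Sequence[int],
    new_hash: str,
    new_tokens: Sequence[int],
    cache_is_trimmable: bool,
) -> Action:
    """Hypothetical decision logic mirroring the cases in the description."""
    # Case 1: the images changed (or nothing is cached yet).
    if cached_hash is None or cached_hash != new_hash:
        return Action.FULL_REPROCESS

    shared = common_prefix_len(cached_tokens, new_tokens)

    # Case 2: the new prompt starts with everything already cached.
    if shared == len(cached_tokens):
        return Action.EXTEND_TEXT_ONLY

    # Case 3: the prompts diverge, but the cache can be rolled back.
    if cache_is_trimmable:
        return Action.TRIM_THEN_TEXT_ONLY

    # Case 4: the prompts diverge and the cache cannot be rolled back.
    return Action.FULL_REPROCESS
```

In this sketch the image hash is checked first, so a changed image forces full reprocessing regardless of how much of the token prefix still matches.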

I added two tests: one for SWA caches, which often cannot be trimmed, and one for non-SWA caches, which are usually trimmable.

Note that there are still opportunities for improvement. Namely, we could cache the embeddings per image so that they can be selectively reused in cases where the cache cannot be trimmed. This can be added in a future PR.
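
A rough sketch of what per-image caching might look like follows. It is not part of this PR; `PerImageEmbeddingCache` and the `encode` callable are hypothetical stand-ins, and the sketch assumes each image is available as raw bytes and can be encoded independently of the others.

```python
import hashlib
from typing import Callable, Dict, List


class PerImageEmbeddingCache:
    """Hypothetical cache keyed by a content hash of each image."""

    def __init__(self, encode: Callable[[bytes], List[float]]):
        self._encode = encode  # stand-in for the vision encoder
        self._store: Dict[str, List[float]] = {}
        self.encoder_calls = 0  # counts cache misses, for illustration

    def get(self, image_bytes: bytes) -> List[float]:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:
            # Cache miss: run the (expensive) encoder once for this image.
            self._store[key] = self._encode(image_bytes)
            self.encoder_calls += 1
        return self._store[key]


def fake_encoder(image_bytes: bytes) -> List[float]:
    # Placeholder embedding so the example runs without a model.
    return [float(len(image_bytes))]


cache = PerImageEmbeddingCache(fake_encoder)
first_turn = [cache.get(img) for img in (b"cat", b"dog")]
# The follow-up turn reuses "cat" and only encodes the new image.
second_turn = [cache.get(img) for img in (b"cat", b"bird!")]
```

With a cache like this, a request that changes one image out of several would only pay the encoder cost for the changed image.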

Note that this doesn't cache images going through the non-unified stack; that can be added in a future PR.

neilmehta24 added 12 commits October 24, 2025 14:55

refactor qwen2_vl vision add on

db145da

working p1

2ce6b9d

use text-only hook

6a04c0e

working test

c9777fe

simplify

cb2dba7

test non-swa model

1ae4732

checkpoint

47e1d8a

checkpoint

c05c241

checkpoint

973907b

remove list

4e94de9

cleanup

36c360a

cleanup

fd38585

neilmehta24 requested review from mattjcly, will-lms and yagil October 29, 2025 14:36

github-actions bot added the CLA signed label Oct 29, 2025

cleanup

7dfe3cd
