Add optimized versions to models #4

Open
rijuyuezhu wants to merge 15 commits into debug from optimized-versions

Conversation

@rijuyuezhu
Collaborator

  • {quest, raas, full, sink}_optimized models

rijuyuezhu and others added 6 commits January 21, 2025 13:32
+ Transplant from original quest repo
+ Mocked GQA: slow. A possible solution is to turn off GQA for the full (aligned)
+ float16 only. A possible solution is to use float16 for all methods
+ Use torch to implement most of the operations
+ Seems to work normally