Description
Use Case
When I use the VectorChord PG extension v1.1.0
Problem Statement
High-dimensional embedding models, such as Qwen3-Embedding-8B with its 4096-dimensional vectors, consume a lot of storage
How This Feature Would Help
Quantization would reduce the storage footprint of the stored embeddings
Proposed Solution
Utilize VectorChord's quantization types `rabitq8` or `rabitq4` to reduce storage.
https://docs.vectorchord.ai/vectorchord/usage/quantization-types.html
Alternatives Considered
No response
Priority
Important - affects my workflow
Additional Context
Quantization data types have a clear advantage: they drastically reduce the storage footprint, from 32-bit floats down to 8 or 4 bits per dimension.
On the other hand, quantization loses precision.
For scalability purposes, quantization might be acceptable (and the only option), and testing may be needed to determine whether the accuracy loss is acceptable for the user.
In theory, I would expect Q4 quantization to reduce accuracy by no more than a few percent on average, but testing is needed.
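To make the storage argument concrete, here is a back-of-envelope sketch of the savings. It assumes raw vectors are stored as 32-bit floats and that the quantized types use roughly 8 and 4 bits per dimension, ignoring per-vector metadata and index overhead; the vector count of one million is a hypothetical example, not taken from any benchmark.

```python
# Rough storage estimate for quantized embeddings.
# Assumptions (not from the VectorChord docs): raw vectors are float32,
# rabitq8 stores ~8 bits/dim and rabitq4 ~4 bits/dim; per-vector
# metadata and index overhead are ignored.

def storage_bytes(num_vectors: int, dims: int, bits_per_dim: int) -> int:
    """Approximate storage for num_vectors embeddings of dims dimensions."""
    return num_vectors * dims * bits_per_dim // 8

n, dims = 1_000_000, 4096  # e.g. Qwen3-Embedding-8B vectors

float32 = storage_bytes(n, dims, 32)
q8 = storage_bytes(n, dims, 8)
q4 = storage_bytes(n, dims, 4)

print(f"float32: {float32 / 2**30:.1f} GiB")                        # 15.3 GiB
print(f"rabitq8: {q8 / 2**30:.1f} GiB ({float32 // q8}x smaller)")  # 3.8 GiB (4x)
print(f"rabitq4: {q4 / 2**30:.1f} GiB ({float32 // q4}x smaller)")  # 1.9 GiB (8x)
```

Even under these simplifying assumptions, the ratio alone (4x for 8-bit, 8x for 4-bit) shows why this matters at scale.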
Checklist
- I would be willing to contribute this feature