
[Feature]: I'd like to benefit from disk savings with Embedding Quantization #385

@qdrddr

Description


Use Case

I use the VectorChord PG extension v1.1.0.

Problem Statement

High-dimensional embeddings, such as the 4096-dimensional vectors produced by Qwen3-Embedding-8B, consume a lot of storage.

How This Feature Would Help

Quantization would reduce the storage footprint of stored embeddings.

Proposed Solution

Utilize VectorChord's quantization types rabitq8 or rabitq4 to reduce storage:

https://docs.vectorchord.ai/vectorchord/usage/quantization-types.html

Alternatives Considered

No response

Priority

Important - affects my workflow

Additional Context

Quantized data types have a clear advantage: they drastically reduce the storage footprint, from 32 bits per dimension down to 8 or 4 bits. On the other hand, quantization loses precision.
For scalability purposes quantization might be acceptable (and the only option), but testing may be needed to determine whether the precision loss is acceptable for the user.

In theory, I would expect 4-bit quantization to reduce accuracy by no more than a few percent on average. But testing is needed.
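To make the expected savings concrete, here is a back-of-envelope sketch. It assumes the quantized index stores roughly 8 or 4 bits per dimension (per the linked quantization-types docs) and ignores per-vector metadata and index overhead, so real on-disk savings will differ somewhat:

```python
# Per-vector payload size at different bit widths (overhead ignored).
DIMS = 4096  # dimensionality of Qwen3-Embedding-8B vectors

def bytes_per_vector(dims: int, bits_per_dim: int) -> int:
    """Raw payload size in bytes for one vector at the given bit width."""
    return dims * bits_per_dim // 8

for label, bits in [("float32", 32), ("8-bit (rabitq8)", 8), ("4-bit (rabitq4)", 4)]:
    size = bytes_per_vector(DIMS, bits)
    print(f"{label:>15}: {size:6d} bytes per vector ({32 // bits}x vs float32)")
```

For 4096 dimensions this works out to 16384 bytes per float32 vector versus 4096 bytes at 8 bits and 2048 bytes at 4 bits, i.e. a 4x or 8x reduction of the vector payload itself.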

Checklist

  • I would be willing to contribute this feature

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
