Optimize PrimaryKey memory: serialize CqlValue into compact byte buffer #363
swasik wants to merge 1 commit into scylladb:master
Replace PrimaryKey(Vec<CqlValue>) with a serialized byte buffer
representation. CqlValue is 72 bytes per element due to enum sizing,
wasting significant memory for typical primary key types like Int (4
bytes of data, 68 bytes of padding).
The new PrimaryKey stores values as Box<[u8]> with a 1-byte type tag
followed by the minimal binary encoding for each value:
- Int: 5 bytes (was 72), 14× smaller
- BigInt: 9 bytes (was 72), 8× smaller
- Uuid: 17 bytes (was 72), 4× smaller
- Text(s): 5+len bytes (was 72), variable
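The layout described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Value` enum stands in for `CqlValue`, the tag values are hypothetical, and the 4-byte little-endian length prefix for text is an assumption inferred from the "5+len" figure.

```rust
use std::convert::TryFrom;

// Hypothetical tag values; the PR's actual constants are not shown here.
const TAG_INT: u8 = 1;
const TAG_BIG_INT: u8 = 2;
const TAG_TEXT: u8 = 3;

/// Stand-in for a few of the `CqlValue` variants discussed above.
pub enum Value {
    Int(i32),
    BigInt(i64),
    Text(String),
}

/// Appends one value as a 1-byte tag followed by its payload.
pub fn encode(value: &Value, buf: &mut Vec<u8>) {
    match value {
        Value::Int(v) => {
            buf.push(TAG_INT);
            buf.extend_from_slice(&v.to_le_bytes());
        }
        Value::BigInt(v) => {
            buf.push(TAG_BIG_INT);
            buf.extend_from_slice(&v.to_le_bytes());
        }
        Value::Text(s) => {
            buf.push(TAG_TEXT);
            // Assumed: a 4-byte little-endian length prefix before the bytes.
            let len = u32::try_from(s.len()).expect("text longer than u32::MAX");
            buf.extend_from_slice(&len.to_le_bytes());
            buf.extend_from_slice(s.as_bytes());
        }
    }
}

/// Encodes all columns of a key into one boxed byte slice.
pub fn encode_key(values: &[Value]) -> Box<[u8]> {
    let mut buf = Vec::new();
    for value in values {
        encode(value, &mut buf);
    }
    buf.into_boxed_slice()
}
```

Under these assumptions, `encode_key(&[Value::Int(7)])` yields a 5-byte buffer and a three-character text value yields 8 bytes, in line with the sizes listed above.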
For a single Int primary key column, total per-row memory drops from
~96 bytes (24 Vec overhead + 72 CqlValue) to ~22 bytes (16 Box<[u8]> +
6 heap), a 4× improvement. With millions of indexed rows stored in the
BiMap, this substantially reduces RSS.
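The arithmetic behind these figures can be checked with a small sketch. The `Wide` enum is a hypothetical stand-in (not the real `CqlValue`) that only shows how an enum's size follows its largest variant. The helper names are made up for this example, and allocator overhead is ignored.

```rust
use std::mem::size_of;

/// Stand-in enum: its size is driven by the largest variant, regardless of
/// which variant is actually stored. (Not the real `CqlValue`.)
pub enum Wide {
    Int(i32),
    Blob([u8; 64]),
}

/// Inline `Vec` handle (pointer, capacity, length) plus the heap elements.
pub fn vec_key_bytes(elem_size: usize, columns: usize) -> usize {
    size_of::<Vec<u8>>() + elem_size * columns
}

/// Inline `Box<[u8]>` handle (pointer, length) plus the encoded heap bytes.
pub fn boxed_key_bytes(encoded_len: usize) -> usize {
    size_of::<Box<[u8]>>() + encoded_len
}
```

On a 64-bit target, `vec_key_bytes(72, 1)` gives 96 and `boxed_key_bytes(6)` gives 22, matching the numbers quoted in the description.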
Additional improvements:
- Hash/Eq now operate on raw bytes instead of format!("{:?}"), which
is both faster and more correct.
- Values are decoded on demand via get(index) or iter(), acceptable
for primary keys with 1–3 columns.
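To make the cost of on-demand decoding concrete, here is a rough sketch of what a `get(index)` lookup might do: walk the buffer from the start, skipping earlier values, then decode the requested one. The tag values, the `Value` enum, and the helper names are assumptions for illustration, and only two fixed-width types are covered.

```rust
use std::mem::size_of;

const TAG_LEN: usize = 1;
const INT_LEN: usize = size_of::<i32>();
const BIG_INT_LEN: usize = size_of::<i64>();
// Hypothetical tag values.
const TAG_INT: u8 = 1;
const TAG_BIG_INT: u8 = 2;

#[derive(Debug, PartialEq)]
pub enum Value {
    Int(i32),
    BigInt(i64),
}

/// Copies the `N` payload bytes that follow the tag, if they are present.
fn payload<const N: usize>(data: &[u8]) -> Option<[u8; N]> {
    let mut bytes = [0u8; N];
    bytes.copy_from_slice(data.get(TAG_LEN..TAG_LEN + N)?);
    Some(bytes)
}

/// Decodes the value at the front of `data` and reports the bytes consumed.
fn decode_one(data: &[u8]) -> Option<(Value, usize)> {
    match *data.first()? {
        TAG_INT => {
            let v = i32::from_le_bytes(payload::<INT_LEN>(data)?);
            Some((Value::Int(v), TAG_LEN + INT_LEN))
        }
        TAG_BIG_INT => {
            let v = i64::from_le_bytes(payload::<BIG_INT_LEN>(data)?);
            Some((Value::BigInt(v), TAG_LEN + BIG_INT_LEN))
        }
        _ => None,
    }
}

/// Returns the `index`-th value by decoding every value before it first.
pub fn get(data: &[u8], index: usize) -> Option<Value> {
    let mut offset = 0;
    for _ in 0..index {
        let (_, used) = decode_one(data.get(offset..)?)?;
        offset += used;
    }
    decode_one(data.get(offset..)?).map(|(value, _)| value)
}
```

The lookup is linear in `index`, which is why it is only reasonable for keys with a handful of columns.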
Changes:
- Add primary_key.rs with encode/decode, Hash, Eq, Debug, Iterator
- Update usearch.rs: closures return owned CqlValue instead of refs
- Update httproutes.rs: use PrimaryKey::len()/get() API
- Remove old PrimaryKey struct and format-based Hash impl from lib.rs
Fixes: VECTOR-526
/// and more correct than the previous `format!("{:?}")` hashing approach.
#[derive(Clone)]
pub struct PrimaryKey {
    data: Arc<[u8]>,
The Arc could live outside of PrimaryKey. Arc<PrimaryKey> seems to have clearer semantics than a PrimaryKey that wraps an Arc internally.
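One way to read this suggestion, as a sketch only: the key owns its bytes and callers wrap it in `Arc` where sharing is needed. The `from_bytes`, `as_bytes`, and `share` names and the `SharedKey` alias are hypothetical and do not come from the PR.

```rust
use std::sync::Arc;

/// Hypothetical key that simply owns its encoded bytes.
#[derive(Debug, PartialEq, Eq, Hash)]
pub struct PrimaryKey {
    data: Box<[u8]>,
}

impl PrimaryKey {
    pub fn from_bytes(bytes: &[u8]) -> Self {
        Self { data: bytes.into() }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.data
    }
}

/// Call sites that need cheap clones opt in to sharing explicitly.
pub type SharedKey = Arc<PrimaryKey>;

/// Produces two handles to the same key without copying its bytes.
pub fn share(key: PrimaryKey) -> (SharedKey, SharedKey) {
    let first = Arc::new(key);
    let second = Arc::clone(&first);
    (first, second)
}
```

A trade-off to keep in mind: with `Box<[u8]>` inside an `Arc`, each key costs two allocations and two pointer hops, whereas `Arc<[u8]>` inside the struct needs only one of each.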
}

impl Eq for PrimaryKey {}
// PrimaryKey is defined in primary_key.rs and re-exported above.
I would hold off on this change for now, as there is work in progress on the cache model for DB column values. We may end up with an enum of vectors instead of a vector of enums.
impl Hash for PrimaryKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.data.hash(state);
    }
}
We need a partition key, which is a subset of the primary key, so there should be a method for computing a hash from the first n columns of the primary key.
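A possible shape for such a method, sketched under assumptions: the tags and lengths below are hypothetical, only two fixed-width types are handled, and `DefaultHasher` is used purely for demonstration. The idea is to find the byte offset where the first n encoded columns end and hash only that prefix.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const TAG_LEN: usize = 1;
const INT_LEN: usize = 4;
const BIG_INT_LEN: usize = 8;
// Hypothetical tag values.
const TAG_INT: u8 = 1;
const TAG_BIG_INT: u8 = 2;

/// Encoded length (tag included) of the value at the front of `data`.
fn encoded_len(data: &[u8]) -> Option<usize> {
    match *data.first()? {
        TAG_INT => Some(TAG_LEN + INT_LEN),
        TAG_BIG_INT => Some(TAG_LEN + BIG_INT_LEN),
        _ => None,
    }
}

/// Byte slice covering the first `n` encoded columns, if all are present.
pub fn prefix_bytes(data: &[u8], n: usize) -> Option<&[u8]> {
    let mut end = 0;
    for _ in 0..n {
        end += encoded_len(data.get(end..)?)?;
    }
    data.get(..end)
}

/// Hash of the first `n` columns only, e.g. the partition key columns.
pub fn partition_hash(data: &[u8], n: usize) -> Option<u64> {
    let mut hasher = DefaultHasher::new();
    prefix_bytes(data, n)?.hash(&mut hasher);
    Some(hasher.finish())
}
```

Two keys that share the same first column but differ in the second produce the same `partition_hash(_, 1)`, which is the property a partition key needs.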
}
CqlValue::TinyInt(v) => {
    buf.push(TAG_TINY_INT);
    buf.push(*v as u8);
Casting from i8 to u8 loses precision.
TAG_BOOLEAN => (CqlValue::Boolean(data[1] != 0), 2),
TAG_TINY_INT => (CqlValue::TinyInt(data[1] as i8), 2),
Casting from u8 to i8 loses precision.
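For context, an `as` cast between `i8` and `u8` keeps all eight bits and only reinterprets the sign, so the encode/decode round trip is lossless. Lints such as Clippy's `cast_sign_loss` flag it anyway. One way to make the intent explicit and avoid the cast, sketched with hypothetical helper names:

```rust
/// Encodes an `i8` as its single two's-complement byte, without an `as` cast.
pub fn tiny_int_to_byte(v: i8) -> u8 {
    v.to_le_bytes()[0]
}

/// Decodes the byte written by `tiny_int_to_byte` back into an `i8`.
pub fn byte_to_tiny_int(b: u8) -> i8 {
    i8::from_le_bytes([b])
}
```

Every `i8` value survives the round trip through these two helpers, including negative ones.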
ewienik left a comment:
Let's wait a little with this PR
@@ -0,0 +1,556 @@
/*
 * Copyright 2025-present ScyllaDB
CqlValue::Uuid(_) | CqlValue::Timeuuid(_) => 17,
CqlValue::Inet(IpAddr::V4(_)) => 5, // tag + 4 octets
CqlValue::Inet(IpAddr::V6(_)) => 17, // tag + 16 octets
CqlValue::Text(s) => 1 + 4 + s.len(),
"1 + 4": these are magic numbers and should be replaced with meaningful named constants.
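A sketch of what named constants could look like here. The names `TAG_LEN`, `LEN_PREFIX_LEN`, and `text_encoded_size` are suggestions for this example, and the `u32` length prefix is an assumption based on the `4` in the snippet.

```rust
use std::mem::size_of;

/// Bytes used by the type tag that prefixes every encoded value.
pub const TAG_LEN: usize = 1;
/// Bytes used by the length prefix of variable-length values (assumed `u32`).
pub const LEN_PREFIX_LEN: usize = size_of::<u32>();

/// Encoded size of a text value: tag, length prefix, then the UTF-8 bytes.
pub fn text_encoded_size(s: &str) -> usize {
    TAG_LEN + LEN_PREFIX_LEN + s.len()
}
```

Deriving `LEN_PREFIX_LEN` from `size_of::<u32>()` keeps the constant tied to the type actually written to the buffer.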
    (CqlValue::SmallInt(v), 3)
}
TAG_INT => {
    let v = i32::from_le_bytes(data[1..5].try_into().unwrap());
"1..5": this is a magic range. The bounds should be meaningful named constants.
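One possible way to name the range, as a sketch: the constant and function names are hypothetical, and the function returns `None` on truncated input instead of panicking, which differs from the `unwrap()` in the snippet.

```rust
use std::mem::size_of;
use std::ops::Range;

const TAG_LEN: usize = 1;
const INT_LEN: usize = size_of::<i32>();
/// Where the `i32` payload sits inside an encoded Int value (after the tag).
const INT_PAYLOAD: Range<usize> = TAG_LEN..TAG_LEN + INT_LEN;

/// Reads the little-endian `i32` payload and reports the bytes consumed.
/// The tag byte itself is assumed to have been checked by the caller.
pub fn decode_int_payload(data: &[u8]) -> Option<(i32, usize)> {
    let mut bytes = [0u8; INT_LEN];
    bytes.copy_from_slice(data.get(INT_PAYLOAD)?);
    Some((i32::from_le_bytes(bytes), INT_PAYLOAD.end))
}
```

With the range derived from `TAG_LEN` and `INT_LEN`, the consumed-byte count falls out of `INT_PAYLOAD.end` rather than being a second literal to keep in sync.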