Merged
157 commits
ec93173
memory plans
leshy Mar 4, 2026
5c23e89
spec iteration
leshy Mar 5, 2026
dc0f946
spec iteration
leshy Mar 5, 2026
69540aa
query objects spec
leshy Mar 5, 2026
cc5939b
mem3 iteration
leshy Mar 5, 2026
a5d3f3c
live/passive transforms
leshy Mar 5, 2026
c87d955
initial pass on memory
leshy Mar 5, 2026
76fbea1
transform materialize
leshy Mar 5, 2026
ef7fe1d
sqlite schema: decomposed pose columns, separate payload table, R*Tre…
leshy Mar 5, 2026
936e2ce
JpegCodec for Image storage (43x smaller), ingest helpers, QualityWin…
leshy Mar 5, 2026
852a7e9
Wire parent_id lineage through transforms for automatic source data p…
leshy Mar 5, 2026
d44aaaf
Wire parent_stream into _streams registry, add tasks.md gap analysis
leshy Mar 5, 2026
fa31471
Implement project_to() for cross-stream lineage projection
leshy Mar 5, 2026
41a06f0
Make search_embedding auto-project to source stream
leshy Mar 5, 2026
bce7586
CaptionTransformer + Florence2 batch fix
leshy Mar 5, 2026
8ad469e
ObservationSet: fetch() returns list-like + stream-like result set
leshy Mar 5, 2026
2163179
search_embedding accepts str/image with auto-embedding
leshy Mar 5, 2026
1bdd496
Add sqlite_vec to mypy ignore list (no type stubs available)
leshy Mar 5, 2026
5130511
Fix mypy + pytest errors across memory and memory_old modules
leshy Mar 5, 2026
219341f
Improve similarity heatmap with normalized values and distance spread
leshy Mar 5, 2026
e7f3fcd
Remove plans/ from tracking (kept locally)
leshy Mar 5, 2026
92ee380
Address Greptile review: SQL injection guards, distance ordering, stubs
leshy Mar 5, 2026
de0efb9
Add memory Rerun visualization, fix stream iteration, update docs
leshy Mar 5, 2026
314b4d3
Rename run_e2e_export → test_e2e_export, delete viz.py + run_viz_demo…
leshy Mar 5, 2026
0b2983c
added docs
leshy Mar 5, 2026
ec499cd
removed tasks.md
leshy Mar 5, 2026
e344498
Optimize memory pipeline: TurboJPEG codec, sharpness downsample, thre…
leshy Mar 5, 2026
bf0b79a
text embedding transformer
leshy Mar 6, 2026
b4f9f96
cleanup
leshy Mar 6, 2026
19a8db3
Use Codec protocol type instead of concrete union, remove dead _pose_…
leshy Mar 6, 2026
2cbf162
correct db sessions
leshy Mar 6, 2026
74df2de
record module cleanup
leshy Mar 6, 2026
e039f90
memory elements are now Resource, simplification of memory Module
leshy Mar 7, 2026
bd3a572
Rename stream.appended to stream.observable()/subscribe()
leshy Mar 7, 2026
24a13de
repr, embedding fetch simplification
leshy Mar 7, 2026
d5db010
Make Observation generic: Observation[T] with full type safety
leshy Mar 7, 2026
2d0bedc
Simplify Stream._clone with copy.copy, remove subclass overrides
leshy Mar 7, 2026
ce9f5e8
loader refactor
leshy Mar 7, 2026
33ad5e1
Extract backend.load_data(), add stream.load_data(obs) public API
leshy Mar 7, 2026
1b67959
Add rich colored __str__ to Stream and Filter types
leshy Mar 7, 2026
2f66bc0
Unify __repr__ and __str__ via _rich_text().plain, remove duplicate r…
leshy Mar 7, 2026
e9078a9
renamed types to type
leshy Mar 7, 2026
2b79919
one -> first, time range
leshy Mar 7, 2026
dfd06c4
getitem for streams
leshy Mar 8, 2026
4734acf
readme sketch
leshy Mar 8, 2026
d6e5efc
bigoffice db in lfs, sqlite accepts Path
leshy Mar 8, 2026
31cf8a8
projection transformers
leshy Mar 8, 2026
04337db
stream info removed, stream accessor helper, TS unique per stream
leshy Mar 8, 2026
6fc6e8d
Add colored summary() output and model= param to search_embedding
leshy Mar 8, 2026
a6a06e1
stream delete
leshy Mar 8, 2026
b9af997
florence model detail settings and prefix filter
leshy Mar 8, 2026
1e42408
extracted formatting to a separate file
leshy Mar 8, 2026
0c09d49
extract rich text rendering to formatting.py, add Stream.name, fix st…
leshy Mar 8, 2026
a954f79
matching based on streams
leshy Mar 8, 2026
ab48171
projection experiments
leshy Mar 8, 2026
a80bbb9
projection bugfix
leshy Mar 8, 2026
c7522d3
observationset typing fix
leshy Mar 8, 2026
decd090
detections, cleanup
leshy Mar 8, 2026
f51923d
mini adjustments
leshy Mar 8, 2026
9edcbef
transform chaining
leshy Mar 9, 2026
24c708d
memory2: lazy pull-based stream system
leshy Mar 9, 2026
363f094
memory2: fix typing — zero type:ignore, proper generics
leshy Mar 9, 2026
90a636a
memory2: fix .live() on transform streams — reject with clear error
leshy Mar 9, 2026
2f029e4
memory2: replace custom Disposable with rxpy DisposableBase
leshy Mar 9, 2026
4061b8f
memory2: extract filters and StreamQuery from type.py into filter.py
leshy Mar 9, 2026
a44d870
memory2: store transform on Stream node, not as source tuple
leshy Mar 9, 2026
09ada62
memory2: move live logic from Stream into Backend via StreamQuery
leshy Mar 9, 2026
9ef10ab
memory2: extract impl/ layer with MemoryStore and SqliteStore scaffold
leshy Mar 10, 2026
87b94ad
memory2: add buffer.py docstring and extract buffer tests to test_buf…
leshy Mar 11, 2026
8070379
memory2: add Codec protocol and grid test for store implementations
leshy Mar 11, 2026
dde8017
memory2: add codec implementations (pickle, lcm, jpeg) with grid tests
leshy Mar 11, 2026
7ce2364
resource: add context manager to Resource; make Store/Session Resources
leshy Mar 11, 2026
d5dde81
resource: add CompositeResource with owned disposables
leshy Mar 11, 2026
9d37f1d
memory2: add BlobStore ABC with File and SQLite implementations
leshy Mar 11, 2026
a83d7a2
memory2: move blobstore.md into blobstore/ as module readme
leshy Mar 11, 2026
b6c9543
memory2: add embedding layer, vector/text search, live safety guards
leshy Mar 11, 2026
94aa659
memory2: add documentation for streaming model, codecs, and backends
leshy Mar 11, 2026
1dc68b7
query application refactor
leshy Mar 11, 2026
4d31779
memory2: replace LiveBackend with pluggable LiveChannel, add Configur…
leshy Mar 11, 2026
690c5ec
memory2: make backends Configurable, add session→stream config propag…
leshy Mar 11, 2026
f73d8d4
memory2: wire VectorStore into ListBackend, add MemoryVectorStore
leshy Mar 11, 2026
c655739
memory2: wire BlobStore into ListBackend with lazy/eager blob loading
leshy Mar 11, 2026
5b565db
memory2: allow bare generator functions as stream transforms
leshy Mar 11, 2026
da676f6
memory2: update docs to reflect current API
leshy Mar 11, 2026
a0c9c70
memory2: implement full SqliteBackend with vec0 vector search, JSONB …
leshy Mar 11, 2026
0b09404
memory2: stream rows via cursor pagination instead of fetchall()
leshy Mar 11, 2026
df076ce
memory2: add lazy/eager blob tests and spy store delegation grid tests
leshy Mar 11, 2026
bcb98bd
memory2: add R*Tree spatial index for NearFilter SQL pushdown, add e2…
leshy Mar 11, 2026
3c01a6e
auto index tags
leshy Mar 11, 2026
f368297
memory/stream str, and observables
leshy Mar 12, 2026
f89ad3f
live stream is a resource
leshy Mar 12, 2026
a32b44d
readme work
leshy Mar 12, 2026
db23275
streams and intro
leshy Mar 12, 2026
9b14894
renamed readme to arch
leshy Mar 12, 2026
67a6a83
Rename memory2 → memory, fix all imports and type errors
leshy Mar 12, 2026
f35cfe5
Merge remote-tracking branch 'origin/dev' into feat/memory/embedding
leshy Mar 12, 2026
1a6c8a1
Revert memory rename: restore memory/ from dev, new code lives in mem…
leshy Mar 12, 2026
2076ba4
Remove stray old memory module references
leshy Mar 12, 2026
05c091d
Remove LFS test databases from PR
leshy Mar 12, 2026
0570bc3
Address review findings: SQL injection guards, type fixes, cleanup
leshy Mar 12, 2026
2dcfcd9
Revert detection type changes: keep image as required field
leshy Mar 12, 2026
e88e0e5
add libturbojpeg to docker image
leshy Mar 12, 2026
f29f766
Make turbojpeg import lazy so tests skip gracefully in CI
leshy Mar 12, 2026
c56e283
Give each SqliteBackend its own connection for WAL-mode concurrency
leshy Mar 12, 2026
93d6afe
Block search_text on SqliteBackend to prevent full table scans
leshy Mar 12, 2026
317562c
Catch RuntimeError from missing turbojpeg native library in codec tests
leshy Mar 12, 2026
5a418c6
pr comments
leshy Mar 12, 2026
99c3f3e
occupancy change undo
leshy Mar 13, 2026
1103e3d
tests cleanup
leshy Mar 13, 2026
32d75d8
compression codec added, new bigoffice db uploaded
leshy Mar 13, 2026
b7e25a9
correct jpeg codec
leshy Mar 13, 2026
c2e91d8
PR comments cleanup
leshy Mar 13, 2026
8be106a
blobstore stream -> stream_name
leshy Mar 13, 2026
1e28b50
vectorstore stream -> stream_name
leshy Mar 13, 2026
6f3ef51
resource typing fixes
leshy Mar 13, 2026
30959af
move type definitions into dimos/memory2/type/ subpackage
leshy Mar 13, 2026
367fa4e
lz4 codec included, utils/ cleanup
leshy Mar 13, 2026
a0becc6
Merge remote-tracking branch 'origin/dev' into feat/memory2
leshy Mar 13, 2026
02a2332
migrated stores to a new config system
leshy Mar 13, 2026
b3e7236
config fix
leshy Mar 13, 2026
895f7d6
rewrite
leshy Mar 13, 2026
aa841f1
update memory2 docs to reflect new architecture
leshy Mar 13, 2026
63eba26
rename LiveChannel → Notifier, SubjectChannel → SubjectNotifier
leshy Mar 13, 2026
4e38491
rename Index → MetadataStore, drop Backend property boilerplate, simp…
leshy Mar 13, 2026
2e05a67
rename MetadataStore → ObservationStore
leshy Mar 13, 2026
3b298f0
self-contained SQLite components with dual-mode constructors (conn/path)
leshy Mar 13, 2026
ee6d541
move ObservationStore classes into observationstore/ directory
leshy Mar 13, 2026
28705f6
add RegistryStore to persist fully-resolved backend config per stream
leshy Mar 14, 2026
13db243
move ABCs from type/backend.py into their own dirs, rename livechanne…
leshy Mar 14, 2026
597cfa5
move serialize() to base classes, drop deserialize() in favor of cons…
leshy Mar 14, 2026
285f72e
move _create_backend to Store base, MemoryStore becomes empty subclass
leshy Mar 14, 2026
f7ddfb1
move connection init from __init__ to start(), make ObservationStore …
leshy Mar 14, 2026
3d08c15
rename impl/ → store/, move store.py → store/base.py
leshy Mar 14, 2026
91d3a0f
remove section separator comments from memory2/
leshy Mar 14, 2026
dfeb6f8
remove __init__.py re-exports, use direct module imports
leshy Mar 14, 2026
3f13abf
delete livechannel/ backwards-compat shim
leshy Mar 14, 2026
f8fef6f
simplify RegistryStore: drop legacy schema migration
leshy Mar 14, 2026
9c10dda
use context managers in standalone component tests
leshy Mar 14, 2026
e8f9318
delete all __init__.py files from memory2/
leshy Mar 14, 2026
e99973f
make all memory2 sub-store components Configurable
leshy Mar 14, 2026
1dae92b
add open_disposable_sqlite_connection and use it everywhere
leshy Mar 14, 2026
4c596c4
add StreamAccessor for attribute-style stream access on Store
leshy Mar 14, 2026
c618609
small cleanups: BlobStore.delete raises KeyError on missing, drop _MI…
leshy Mar 14, 2026
05161f7
checkout mapping/occupancy/gradient.py from dev
leshy Mar 14, 2026
673eb12
limit opencv threads to 2 by default, checkout worker.py from dev
leshy Mar 14, 2026
cd7abd2
test for magic accessor
leshy Mar 14, 2026
b6bf487
Merge remote-tracking branch 'origin/dev' into feat/memory2
leshy Mar 14, 2026
e946263
ci/pr comments
leshy Mar 14, 2026
5c026dc
widen flaky pointcloud AABB tolerance from 0.1 to 0.2
leshy Mar 14, 2026
88879bb
suppress mypy false positive on scipy distance_transform_edt return type
leshy Mar 14, 2026
8759ec9
ci test fixes
leshy Mar 14, 2026
a654a27
sam mini PR comments
leshy Mar 14, 2026
137c6a0
replace Generator[T, None, None] with Iterator[T] in memory2 tests
leshy Mar 14, 2026
26561d8
fix missing TypeVar import in subject.py
leshy Mar 14, 2026
d4d19fc
skipping turbojpeg stuff in CI
leshy Mar 14, 2026
9728085
removed db from lfs for now
leshy Mar 14, 2026
cabc072
turbojpeg
leshy Mar 14, 2026
27 changes: 27 additions & 0 deletions dimos/core/library_config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Copyright 2026 Dimensional Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Process-wide library defaults.
# Modules that need different settings can override in their own start().


def apply_library_config() -> None:
    """Apply process-wide library defaults. Call once per process."""
    # Limit OpenCV internal threads to avoid idle thread contention.
    try:
        import cv2

        cv2.setNumThreads(2)
    except ImportError:
        pass
45 changes: 43 additions & 2 deletions dimos/core/resource.py
@@ -12,10 +12,19 @@
# See the License for the specific language governing permissions and
# limitations under the License.

from abc import ABC, abstractmethod
from __future__ import annotations

from abc import abstractmethod
from typing import TYPE_CHECKING, Self

class Resource(ABC):
if TYPE_CHECKING:
    from types import TracebackType

from reactivex.abc import DisposableBase
from reactivex.disposable import CompositeDisposable


class Resource(DisposableBase):
    @abstractmethod
    def start(self) -> None: ...

@@ -43,3 +52,35 @@ def dispose(self) -> None:

        """
        self.stop()

    def __enter__(self) -> Self:
        self.start()
        return self

    def __exit__(
        self,
        exctype: type[BaseException] | None,
        excinst: BaseException | None,
        exctb: TracebackType | None,
    ) -> None:
        self.stop()


class CompositeResource(Resource):
    """Resource that owns child disposables, disposed on stop()."""

    _disposables: CompositeDisposable

    def __init__(self) -> None:
        self._disposables = CompositeDisposable()

    def register_disposables(self, *disposables: DisposableBase) -> None:
        """Register child disposables to be disposed when this resource stops."""
        for d in disposables:
            self._disposables.add(d)

    def start(self) -> None:
        pass

    def stop(self) -> None:
        self._disposables.dispose()
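
The lifecycle added in this hunk can be illustrated with a minimal stdlib-only sketch. `ToyResource` and `ToyComposite` are hypothetical stand-ins, not the classes in this diff: the real `CompositeResource` delegates to reactivex's `CompositeDisposable`, which is replaced here by a plain list so the example runs without third-party packages.

```python
class ToyResource:
    """Hypothetical stand-in for Resource: start() on enter, stop() on exit."""

    def __init__(self, name: str, log: list) -> None:
        self.name = name
        self.log = log

    def start(self) -> None:
        self.log.append(f"start {self.name}")

    def stop(self) -> None:
        self.log.append(f"stop {self.name}")

    def dispose(self) -> None:
        # As in the diff, dispose() simply forwards to stop().
        self.stop()

    def __enter__(self) -> "ToyResource":
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Returning None lets exceptions raised in the body propagate.
        self.stop()


class ToyComposite(ToyResource):
    """Hypothetical stand-in for CompositeResource: owns children, disposes them on stop()."""

    def __init__(self, name: str, log: list) -> None:
        super().__init__(name, log)
        self._children = []  # assumption: a list instead of CompositeDisposable

    def register_disposables(self, *children) -> None:
        self._children.extend(children)

    def stop(self) -> None:
        for child in self._children:
            child.dispose()
        self._children.clear()
        super().stop()


log = []
with ToyComposite("store", log) as store:
    store.register_disposables(ToyResource("session", log))
```

After the `with` block exits, `log` records the parent starting, then the registered child being disposed, then the parent stopping. The child was never started by the parent: registration only hands over responsibility for disposal.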
2 changes: 2 additions & 0 deletions dimos/core/worker.py
@@ -23,6 +23,7 @@
from typing import TYPE_CHECKING, Any

from dimos.core.global_config import GlobalConfig, global_config
from dimos.core.library_config import apply_library_config
from dimos.utils.logging_config import setup_logger
from dimos.utils.sequential_ids import SequentialIds

@@ -292,6 +293,7 @@ def _suppress_console_output() -> None:


def _worker_entrypoint(conn: Connection, worker_id: int) -> None:
    apply_library_config()
    instances: dict[int, Any] = {}

    try:
2 changes: 1 addition & 1 deletion dimos/mapping/occupancy/gradient.py
@@ -53,7 +53,7 @@ def gradient(
    distance_cells = ndimage.distance_transform_edt(1 - obstacle_map)

    # Convert to meters and clip to max distance
    distance_meters = np.clip(distance_cells * occupancy_grid.resolution, 0, max_distance)
    distance_meters = np.clip(distance_cells * occupancy_grid.resolution, 0, max_distance)  # type: ignore[operator]

    # Invert and scale to 0-100 range
    # Far from obstacles (max_distance) -> 0
114 changes: 114 additions & 0 deletions dimos/memory2/architecture.md
@@ -0,0 +1,114 @@
# memory

Observation storage and streaming layer for DimOS. Pull-based, lazy, composable.

## Architecture

```
Live Sensor Data
Store → Stream → [filters / transforms / terminals] → Stream → [filters / transforms / terminals] → Stream → Live hooks
↓ ↓ ↓
Backend (ObservationStore + BlobStore + VectorStore + Notifier) Backend In Memory
```

**Store** owns a storage location (file, in-memory) and directly manages named streams. **Stream** is the query/iteration surface — lazy until a terminal is called. **Backend** is a concrete composite that orchestrates ObservationStore + BlobStore + VectorStore + Notifier for each stream.
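
To make "lazy until a terminal is called" concrete, here is a minimal sketch of a pull-based stream node. `LazyStream` and its methods are hypothetical illustrations, not the `Stream` class in this package; only the idea (operators build a chain, terminals pull through it) is taken from the paragraph above.

```python
from typing import Callable, Iterable, Iterator


class LazyStream:
    """Hypothetical toy stream: operators return new nodes, terminals pull data."""

    def __init__(self, source: Callable[[], Iterable[int]]) -> None:
        # The source is a factory, so every terminal gets a fresh iterator.
        self._source = source

    def filter(self, pred: Callable[[int], bool]) -> "LazyStream":
        return LazyStream(lambda: (x for x in self._source() if pred(x)))

    def transform(self, fn: Callable[[Iterator[int]], Iterator[int]]) -> "LazyStream":
        # fn is iterator-to-iterator, like a bare generator function.
        return LazyStream(lambda: fn(iter(self._source())))

    def fetch(self) -> list:
        # Terminal: this is the first point where the source is consumed.
        return list(self._source())


pulled = []  # records every item the "sensor" actually produced


def sensor() -> Iterator[int]:
    for i in range(1, 6):
        pulled.append(i)
        yield i


def doubled(upstream: Iterator[int]) -> Iterator[int]:
    for x in upstream:
        yield x * 2


query = LazyStream(sensor).filter(lambda x: x % 2 == 1).transform(doubled)
pulled_before_fetch = list(pulled)  # still empty: building the chain pulled nothing
result = query.fetch()
```

Building `query` touches no data; only `fetch()` drives the generator chain, which is why `pulled_before_fetch` is empty while `result` holds the doubled odd values.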

Supporting Systems:

- BlobStore — separates large payloads from metadata. FileBlobStore (files on disk) and SqliteBlobStore (blob table per stream). Supports lazy loading.
- Codecs — codec_for() auto-selects: JpegCodec for images (TurboJPEG, ~10-20x compression), LcmCodec for DimOS messages, PickleCodec fallback.
- Transformers — Transformer[T,R] ABC wrapping iterator-to-iterator. EmbedImages/EmbedText enrich observations with embeddings. QualityWindow keeps best per time window.
- Backpressure Buffers — KeepLast, Bounded, DropNew, Unbounded — bridge push/pull for live mode.
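
The four buffer policies named above can be sketched with `collections.deque`. The class names echo the list, but the semantics shown are assumptions inferred from those names, and the `put`/`drain` interface is hypothetical; the real buffers presumably also handle thread synchronisation, which this sketch omits.

```python
from collections import deque


class _BufferSketch:
    """Hypothetical base: the push side calls put(), the pull side calls drain()."""

    def __init__(self, maxlen=None):
        # A deque with maxlen evicts from the left (oldest) when full.
        self._items = deque(maxlen=maxlen)

    def put(self, item):
        self._items.append(item)

    def drain(self):
        """Hand over everything buffered so far and reset."""
        out = list(self._items)
        self._items.clear()
        return out


class KeepLastSketch(_BufferSketch):
    """Assumed semantics: only the newest unread item survives."""

    def __init__(self):
        super().__init__(maxlen=1)


class BoundedSketch(_BufferSketch):
    """Assumed semantics: up to `size` items; the oldest is evicted when full."""

    def __init__(self, size):
        super().__init__(maxlen=size)


class DropNewSketch(_BufferSketch):
    """Assumed semantics: up to `size` items; incoming items are discarded when full."""

    def __init__(self, size):
        super().__init__()
        self._size = size

    def put(self, item):
        if len(self._items) < self._size:
            self._items.append(item)


class UnboundedSketch(_BufferSketch):
    """Assumed semantics: never drops anything (memory grows with backlog)."""


def feed(buffer, items):
    """Push all items without the consumer pulling, then drain once."""
    for item in items:
        buffer.put(item)
    return buffer.drain()


keep_last = feed(KeepLastSketch(), range(5))
bounded = feed(BoundedSketch(3), range(5))
drop_new = feed(DropNewSketch(3), range(5))
unbounded = feed(UnboundedSketch(), range(5))
```

With a slow consumer and five pushed items, the four policies differ only in which items survive: the newest one, the newest three, the oldest three, or all five.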


## Modules

| Module | What |
|----------------|-------------------------------------------------------------------|
| `stream.py` | Stream node — filters, transforms, terminals |
| `backend.py`   | Concrete Backend composite (ObservationStore + BlobStore + VectorStore + Notifier) |
| `store.py` | Store, StoreConfig |
| `transform.py` | Transformer ABC, FnTransformer, FnIterTransformer, QualityWindow |
| `buffer.py`    | Backpressure buffers for live mode (KeepLast, Bounded, DropNew, Unbounded) |
| `embed.py` | EmbedImages / EmbedText transformers |
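
As an illustration of the iterator-to-iterator transformer idea behind `QualityWindow` ("keeps best per time window"), here is a minimal generator sketch. The `Obs` tuple, the `quality_window` function, its parameters, and the fixed-bucket windowing are all assumptions made for the example; the actual class may window and score differently.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Obs(NamedTuple):
    """Hypothetical observation: timestamp, payload label, and a quality score."""
    ts: float
    data: str
    sharpness: float


def quality_window(
    upstream: Iterable[Obs],
    quality: Callable[[Obs], float],
    window: float,
) -> Iterator[Obs]:
    """Yield the highest-quality observation of each fixed-size time bucket.

    Assumes upstream is ordered by timestamp. Ties keep the earlier observation.
    """
    best = None
    current_bucket = None
    for obs in upstream:
        bucket = int(obs.ts // window)
        if current_bucket is not None and bucket != current_bucket:
            yield best  # the bucket is closed, emit its winner
            best = None
        current_bucket = bucket
        if best is None or quality(obs) > quality(best):
            best = obs
    if best is not None:
        yield best  # flush the last, still-open bucket


frames = [
    Obs(0.1, "a", 0.2),
    Obs(0.6, "b", 0.9),
    Obs(1.2, "c", 0.5),
    Obs(1.9, "d", 0.4),
    Obs(2.5, "e", 0.7),
]
kept = [o.data for o in quality_window(frames, lambda o: o.sharpness, window=1.0)]
```

Because the function only pulls from `upstream` and yields as buckets close, it holds at most one candidate in memory at a time, which is what lets this kind of transformer sit in a lazy chain.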

## Subpackages

| Package | What | Docs |
|-----------------|------------------------------------------------------|--------------------------------------------------|
| `type/` | Observation, EmbeddedObservation, Filter/StreamQuery | |
| `store/` | Store ABC + implementations (MemoryStore, SqliteStore) | [store/README.md](store/README.md) |
| `notifier/` | Notifier ABC + SubjectNotifier | |
| `blobstore/` | BlobStore ABC + implementations (file, sqlite) | [blobstore/blobstore.md](blobstore/blobstore.md) |
| `codecs/` | Encode/decode for storage (pickle, JPEG, LCM) | [codecs/README.md](codecs/README.md) |
| `vectorstore/` | VectorStore ABC + implementations (memory, sqlite) | |
| `observationstore/` | ObservationStore Protocol + implementations | |
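
The codec layer listed above pairs an encode/decode interface with type-based selection. A minimal sketch of that pattern follows. Everything in it is hypothetical: the `Codec` protocol shape, `codec_for_sketch`, and `TextCodecSketch` (which stands in for a specialised codec such as the JPEG one) are assumptions, and the real `codec_for()` signature is not shown in this document.

```python
import pickle
from typing import Any, Protocol


class Codec(Protocol):
    """Assumed interface: turn a payload into bytes and back."""

    def encode(self, value: Any) -> bytes: ...

    def decode(self, data: bytes) -> Any: ...


class PickleCodecSketch:
    """General-purpose fallback: works for any picklable Python object."""

    def encode(self, value: Any) -> bytes:
        return pickle.dumps(value)

    def decode(self, data: bytes) -> Any:
        return pickle.loads(data)


class TextCodecSketch:
    """Specialised stand-in: a compact encoding for one payload type."""

    def encode(self, value: str) -> bytes:
        return value.encode("utf-8")

    def decode(self, data: bytes) -> str:
        return data.decode("utf-8")


# Hypothetical registry mapping payload base types to specialised codecs.
_SPECIALISED = {str: TextCodecSketch()}


def codec_for_sketch(payload_type: type) -> Codec:
    """Pick a specialised codec when one matches, otherwise fall back to pickle."""
    for base, codec in _SPECIALISED.items():
        if issubclass(payload_type, base):
            return codec
    return PickleCodecSketch()


text_codec = codec_for_sketch(str)
dict_codec = codec_for_sketch(dict)
text_roundtrip = text_codec.decode(text_codec.encode("front camera"))
dict_roundtrip = dict_codec.decode(dict_codec.encode({"camera": "front"}))
```

The selection step is what keeps callers unaware of the storage format: they hand over a payload type and get back something that round-trips it.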

## Docs

| Doc | What |
|-----|------|
| [streaming.md](streaming.md) | Lazy vs materializing vs terminal — evaluation model, live safety |
| [embeddings.md](embeddings.md) | Embedding layer design — EmbeddedObservation, vector search, EmbedImages/EmbedText |
| [blobstore/blobstore.md](blobstore/blobstore.md) | BlobStore architecture — separate payload storage from metadata |

## Query execution

`StreamQuery` holds the full query spec (filters, text search, vector search, ordering, offset/limit). It also provides `apply(iterator)` — a Python-side execution path that runs all operations as in-memory predicates, brute-force cosine, and list sorts.

This is the **default fallback**. ObservationStore implementations are free to push down operations using store-specific strategies instead:

| Operation | Python fallback (`StreamQuery.apply`) | Store push-down (example) |
|----------------|---------------------------------------|----------------------------------|
| Filters | `filter.matches()` predicates | SQL WHERE clauses |
| Text search | Case-insensitive substring | FTS5 full-text index |
| Vector search | Brute-force cosine similarity | vec0 / FAISS ANN index |
| Ordering | `sorted()` materialization | SQL ORDER BY |
| Offset / limit | `islice()` | SQL OFFSET / LIMIT |

`ListObservationStore` delegates entirely to `StreamQuery.apply()`. `SqliteObservationStore` translates the query into SQL and only falls back to Python for operations it can't express natively.

Transform-sourced streams (post `.transform()`) always use `StreamQuery.apply()` since there's no index to push down to.
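
The Python fallback column of the table can be sketched as a single `apply(iterator)` method. This is an illustration only: `QuerySketch`, `Row`, the field names, and the order in which the operations run are assumptions, not the actual `StreamQuery` implementation.

```python
import math
from dataclasses import dataclass, field
from itertools import islice
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class Row:
    """Hypothetical observation row: timestamp, text, and an embedding."""
    ts: float
    text: str
    embedding: tuple


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class QuerySketch:
    filters: list = field(default_factory=list)  # predicates Row -> bool
    text: Optional[str] = None                   # substring to search for
    vector: Optional[tuple] = None               # query embedding
    k: Optional[int] = None                      # top-k for vector search
    order_by: Optional[Callable[[Row], float]] = None
    offset: int = 0
    limit: Optional[int] = None

    def apply(self, rows: Iterable[Row]) -> Iterator[Row]:
        it: Iterable[Row] = rows
        for pred in self.filters:  # filters as plain predicates
            it = filter(pred, it)
        if self.text is not None:  # case-insensitive substring match
            needle = self.text.lower()
            it = (r for r in it if needle in r.text.lower())
        if self.vector is not None:  # brute-force cosine, most similar first
            ranked = sorted(it, key=lambda r: cosine(self.vector, r.embedding), reverse=True)
            it = ranked[: self.k]
        elif self.order_by is not None:  # sorting materializes the stream
            it = sorted(it, key=self.order_by)
        stop = None if self.limit is None else self.offset + self.limit
        return islice(it, self.offset, stop)  # offset / limit


rows = [
    Row(1.0, "Red chair", (1.0, 0.0)),
    Row(2.0, "blue table", (0.0, 1.0)),
    Row(3.0, "red door", (0.8, 0.6)),
    Row(4.0, "Red sofa", (0.6, 0.8)),
]

newest_red = list(
    QuerySketch(filters=[lambda r: r.ts >= 2.0], text="red",
                order_by=lambda r: -r.ts, limit=1).apply(rows)
)
similar = list(QuerySketch(vector=(1.0, 0.0), k=2).apply(rows))
```

Filters and text search stay lazy, while vector search and ordering must read every remaining row before yielding anything. That cost is the motivation for pushing those operations down into an index when the store has one.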

## Quick start

```python
import time

from dimos.memory2 import MemoryStore

store = MemoryStore()
images = store.stream("images")

# Write
images.append(frame, ts=time.time(), pose=(x, y, z), tags={"camera": "front"})

# Query
recent = images.after(t).limit(10).fetch()
nearest = images.near(pose, radius=2.0).fetch()
latest = images.last()

# Transform (class or bare generator function)
edges = images.transform(Canny()).save(store.stream("edges"))

def running_avg(upstream):
    total, n = 0.0, 0
    for obs in upstream:
        total += obs.data
        n += 1
        yield obs.derive(data=total / n)

# `stream` stands for any stream whose observations carry numeric data
avgs = stream.transform(running_avg).fetch()

# Live
for obs in images.live().transform(process):
    handle(obs)

# Embed + search
images.transform(EmbedImages(clip)).save(store.stream("embedded"))
results = store.stream("embedded").search(query_vec, k=5).fetch()
```

## Implementations

| ObservationStore | Status | Storage |
|-----------------|----------|----------------------------------------|
| `ListObservationStore` | Complete | In-memory (lists + brute-force search) |
| `SqliteObservationStore` | Complete | SQLite (WAL, FTS5, vec0) |