Background
Spun off from #14 (device layer refactoring). The data services layer needs engineering attention independent of the hardware abstraction work.
Current State
- DataStore (gently/core/data_store.py): UID-based persistence with multiple backend options (DatabrokerStore, TiledStore)
- VizServer (gently/visualization/server.py): Serves volumes via HTTP, maintains its own caching
- ImageManager (gently/agent/image_manager.py): Agent-side data access, bridges DataStore and agent tools
Areas Needing Work
1. DataStore Interface Cleanup
- The current interface grew organically; some methods are diSPIM-specific
- Need clear separation between:
  - Core operations (store, retrieve, delete, query)
  - Backend-specific implementations
  - Lineage/provenance tracking
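One possible shape for that separation, as a minimal sketch rather than the existing DataStore API: a small protocol holds only the four core operations, backends implement it, and lineage lives in its own helper. The names `VolumeStore`, `InMemoryStore`, and `LineageTracker` are hypothetical and do not exist in the repository.

```python
from typing import Any, Dict, List, Protocol


class VolumeStore(Protocol):
    """Hypothetical core interface: backend-agnostic operations only."""

    def store(self, uid: str, data: bytes, metadata: Dict[str, Any]) -> None: ...
    def retrieve(self, uid: str) -> bytes: ...
    def delete(self, uid: str) -> None: ...
    def query(self, **filters: Any) -> List[str]: ...


class InMemoryStore:
    """Toy backend that satisfies VolumeStore; a real backend would wrap Databroker or Tiled."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._meta: Dict[str, Dict[str, Any]] = {}

    def store(self, uid: str, data: bytes, metadata: Dict[str, Any]) -> None:
        self._data[uid] = data
        self._meta[uid] = dict(metadata)

    def retrieve(self, uid: str) -> bytes:
        return self._data[uid]

    def delete(self, uid: str) -> None:
        self._data.pop(uid, None)
        self._meta.pop(uid, None)

    def query(self, **filters: Any) -> List[str]:
        # Return UIDs whose metadata matches every filter exactly.
        return sorted(
            uid for uid, meta in self._meta.items()
            if all(meta.get(key) == value for key, value in filters.items())
        )


class LineageTracker:
    """Provenance kept outside the store: maps each UID to its parent UIDs."""

    def __init__(self) -> None:
        self._parents: Dict[str, List[str]] = {}

    def record(self, uid: str, parents: List[str]) -> None:
        self._parents[uid] = list(parents)

    def ancestors(self, uid: str) -> List[str]:
        # Depth-first walk up the parent links, skipping repeats.
        seen: List[str] = []
        stack = list(self._parents.get(uid, []))
        while stack:
            parent = stack.pop()
            if parent not in seen:
                seen.append(parent)
                stack.extend(self._parents.get(parent, []))
        return seen
```

With this split, diSPIM-specific helpers could sit in a separate module that depends on `VolumeStore` instead of adding methods to it.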
2. Unified Storage Backend
Current state has multiple storage patterns:
- TIFF files (raw volumes)
- Zarr (chunked array storage)
- Databroker (Bluesky event model)
- In-memory caches
Questions to resolve:
- Should we standardize on one format for volumes?
- How do we handle format conversion transparently?
- What's the right chunking strategy for large volumes?
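To make the chunking question concrete, the sketch below compares a few candidate chunk shapes for a 200 × 2048 × 2048 volume. It assumes 16-bit pixels, which this issue does not state; under that assumption the full volume is about 1.56 GiB. The candidate shapes are illustrative, not recommendations.

```python
from math import ceil, prod


def chunk_stats(shape, chunks, itemsize):
    """Return (number of chunks, uncompressed bytes per chunk) for a chunk shape."""
    n_chunks = prod(ceil(s / c) for s, c in zip(shape, chunks))
    chunk_bytes = prod(chunks) * itemsize
    return n_chunks, chunk_bytes


if __name__ == "__main__":
    shape = (200, 2048, 2048)  # (z, y, x)
    itemsize = 2               # assumption: uint16 pixels
    for chunks in [(1, 2048, 2048), (1, 512, 512), (16, 256, 256)]:
        n_chunks, chunk_bytes = chunk_stats(shape, chunks, itemsize)
        print(f"{chunks}: {n_chunks} chunks of {chunk_bytes / 2**20:.2f} MiB")
```

Whole-slice chunks favor the slice endpoints, while smaller blocks favor cropped or multi-slice reads at the cost of many more chunks to track.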
3. Streaming Access Patterns
For large volumes (200+ slices × 2048 × 2048):
- Current: Load entire volume into memory
- Target: Stream slices on demand, memory-map when possible
- Affects: VizServer slice endpoints, agent analysis tools
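A minimal sketch of slice-on-demand access using the standard library's `mmap`, assuming the volume is a raw, uncompressed, C-ordered file with slices stored back to back. That layout is an assumption; TIFF and Zarr data would need their own readers. `read_slice` and `write_demo_volume` are hypothetical helpers, not existing code.

```python
import mmap
import os
import tempfile


def read_slice(path, index, height, width, itemsize=2):
    """Return the raw bytes of one z-slice without loading the whole volume."""
    slice_bytes = height * width * itemsize
    start = index * slice_bytes
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        if index < 0 or start + slice_bytes > len(mm):
            raise IndexError(f"slice {index} is out of range")
        # Slicing an mmap copies only the requested byte range.
        return mm[start:start + slice_bytes]


def write_demo_volume(path, n_slices, height, width, itemsize=2):
    """Write a tiny test volume where every byte of slice z equals z."""
    with open(path, "wb") as f:
        for z in range(n_slices):
            f.write(bytes([z % 256]) * (height * width * itemsize))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "volume.raw")
        write_demo_volume(path, n_slices=4, height=2, width=3)
        print(read_slice(path, 2, height=2, width=3))
```

A slice endpoint built this way touches one slice's worth of bytes per request instead of the full volume.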
4. Garbage Collection / Retention Policies
- When should old data be cleaned up?
- Per-session retention vs. global policies
- User-configurable cleanup (max age, max size, keep N per session)
- Crash recovery: reconcile index with actual files on disk
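The policies listed above could be combined roughly as follows. This is a sketch under assumptions: `Entry`, `select_for_cleanup`, and `reconcile` are hypothetical names, and the precedence (keep-N-per-session overrides both age and size limits) is one possible choice, not a decided design.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Entry:
    uid: str
    session: str
    created: float  # seconds since epoch
    size: int       # bytes on disk


def select_for_cleanup(
    entries: Iterable[Entry],
    now: float,
    max_age: Optional[float] = None,
    max_total_size: Optional[int] = None,
    keep_per_session: int = 0,
) -> List[str]:
    """Return UIDs to delete, newest data first in line to be kept."""
    ordered = sorted(entries, key=lambda e: e.created, reverse=True)

    # Protect the newest N entries of every session from all other rules.
    protected: Set[str] = set()
    per_session: Dict[str, int] = {}
    for e in ordered:
        per_session[e.session] = per_session.get(e.session, 0) + 1
        if per_session[e.session] <= keep_per_session:
            protected.add(e.uid)

    kept_size = sum(e.size for e in ordered if e.uid in protected)
    doomed: List[str] = []
    for e in ordered:
        if e.uid in protected:
            continue
        too_old = max_age is not None and now - e.created > max_age
        too_big = max_total_size is not None and kept_size + e.size > max_total_size
        if too_old or too_big:
            doomed.append(e.uid)
        else:
            kept_size += e.size
    return doomed


def reconcile(indexed: Set[str], on_disk: Set[str]) -> Tuple[List[str], List[str]]:
    """Crash recovery: return (index entries with no file, files with no index entry)."""
    return sorted(indexed - on_disk), sorted(on_disk - indexed)
```

Returning the UIDs instead of deleting them directly keeps the policy testable and lets the caller log or confirm before removing anything.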
Relationship to #14
The SharedMemoryPool from #14 will become a key component here:
- Pool handles hot data (recently acquired volumes)
- DataStore indexes all data (hot and cold)
- VizServer accesses through unified interface
This issue focuses on the DataStore/VizServer side of that integration.
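The hot/cold split described above might look like the sketch below. The real SharedMemoryPool API from #14 is not shown in this issue, so `HotPool` is a stand-in with an assumed `put`/`get` interface and LRU eviction, and `UnifiedAccess` is a hypothetical name for the layer VizServer would call.

```python
from collections import OrderedDict
from typing import Dict, Optional, Tuple


class HotPool:
    """Stand-in for SharedMemoryPool (API assumed): bounded LRU cache of recent volumes."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, uid: str, data: bytes) -> None:
        self._items[uid] = data
        self._items.move_to_end(uid)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used

    def get(self, uid: str) -> Optional[bytes]:
        if uid not in self._items:
            return None
        self._items.move_to_end(uid)
        return self._items[uid]


class UnifiedAccess:
    """Single read path: try the hot pool first, then fall back to the cold index."""

    def __init__(self, pool: HotPool, cold: Dict[str, bytes]) -> None:
        self._pool = pool
        self._cold = cold  # stands in for the DataStore index of all data

    def store(self, uid: str, data: bytes) -> None:
        self._cold[uid] = data      # every volume is indexed
        self._pool.put(uid, data)   # recent volumes also stay hot

    def get(self, uid: str) -> Tuple[bytes, str]:
        hit = self._pool.get(uid)
        if hit is not None:
            return hit, "hot"
        return self._cold[uid], "cold"
```

Whether a cold read should promote the volume back into the pool is left open here; this sketch does not promote.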
Proposed Tasks
Files Involved
| File | Role |
| --- | --- |
| gently/core/data_store.py | Primary data persistence |
| gently/visualization/server.py | Volume serving |
| gently/agent/image_manager.py | Agent data access |
| gently/core/memory_pool.py | SharedMemoryPool (from #14) |
cc @subindevs @pskeshu