feat(openclaw-plugin): add multimodal attachment support to memory_store #837

Open
zhangMINGkeq1 wants to merge 1 commit into volcengine:main from zhangMINGkeq1:feature/multimodal-memory-attachments

zhangMINGkeq1 commented Mar 21, 2026

Summary

Adds an optional attachments parameter to the memory_store tool in the OpenClaw plugin, so agents can associate local files (images, documents, JSON, etc.) with memories.

Type of Change

  • New feature (feat)
  • Bug fix (fix)
  • Documentation (docs)
  • Refactoring (refactor)
  • Other

Motivation

Agents frequently produce output files (images, configs, JSON results) that are semantically linked to a task. Without this feature, there is no way to retrieve "the file from last time" — the memory exists but the artifact is lost.

OpenViking already has a powerful Resources API with VLM description and multimodal embedding (via doubao-embedding-vision). This PR simply wires the OpenClaw plugin to use it.

Usage

// Store a memory with associated files
await memory_store({
  text: "Completed kitchen spot-difference puzzle, 25 spots",
  attachments: ["/tmp/base.png", "/tmp/result.png", "/tmp/spots.json"]
})
// details.attachments = [{uri: "viking://resources/...", mime_type: "image/png", abstract: "A kitchen scene with..."}]

Changes

  • examples/openclaw-plugin/client.ts (+154 lines):

    • AttachmentItem type definition
    • storeAttachments(filePaths) — uploads files via temp_upload → addResource, returns URI + mime_type + VLM abstract
    • hashFile() — streaming SHA-256 for content-addressed dedup (no OOM on large files)
    • getMimeType() — extension-based MIME type lookup
  • examples/openclaw-plugin/index.ts (+55 lines):

    • memory_store accepts optional attachments: string[] parameter
    • Uploads attachments before session creation
    • Reports success/failure counts in both status text and details
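The two helpers listed for client.ts could look roughly like the sketch below. The function names hashFile and getMimeType come from the change list, but the bodies are assumptions rather than the PR's code, and the MIME_TYPES table is a hypothetical subset chosen for illustration.

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";
import { extname } from "node:path";

// Hypothetical extension table; the mapping actually shipped in the PR is not shown.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".json": "application/json",
  ".txt": "text/plain",
  ".pdf": "application/pdf",
};

// Extension-based lookup with a generic binary fallback (the fallback is an assumption).
function getMimeType(filePath: string): string {
  return MIME_TYPES[extname(filePath).toLowerCase()] ?? "application/octet-stream";
}

// Streams the file through SHA-256 chunk by chunk, so the whole file
// is never held in memory at once.
function hashFile(filePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    const stream = createReadStream(filePath);
    stream.on("data", (chunk) => hash.update(chunk));
    stream.on("error", reject);
    stream.on("end", () => resolve(hash.digest("hex")));
  });
}
```

Under these assumptions, `await hashFile("/tmp/base.png")` resolves to the hex digest of the file, and a missing file rejects the promise instead of throwing synchronously.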

Design Decisions

  1. Content-addressed storage: URI = viking://resources/attachments/{sha256_16}_{filename} — same file uploaded twice gets the same URI
  2. Concurrency limit = 3: Prevents VLM avalanche when storing many files at once
  3. Independent 60s timeout per file: One slow upload does not block others
  4. Graceful degradation: Individual file failures return null, never crash the batch. Caller sees attachmentsFailed count
  5. Clean separation: Memory text stays pure semantic content. No URIs or attachment markers embedded in text (would pollute embedding vector space). Attachments are returned via details.attachments
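Decisions 1 through 4 can be sketched roughly as follows. This is not the PR's code: attachmentUri, uploadAll, and the Uploader type are hypothetical names, and reading {sha256_16} as the first 16 hex characters of the digest is an assumption.

```typescript
import { basename } from "node:path";

// Decision 1: content-addressed URI. Truncating the digest to its first
// 16 hex characters is an assumed reading of {sha256_16}.
function attachmentUri(sha256Hex: string, filePath: string): string {
  return `viking://resources/attachments/${sha256Hex.slice(0, 16)}_${basename(filePath)}`;
}

// Hypothetical uploader signature: receives a path and an AbortSignal.
type Uploader<T> = (filePath: string, signal: AbortSignal) => Promise<T>;

// Decisions 2-4: at most `concurrency` uploads in flight, an independent
// timeout per file, and null in place of any file that fails.
async function uploadAll<T>(
  filePaths: string[],
  upload: Uploader<T>,
  concurrency = 3,
  timeoutMs = 60_000,
): Promise<(T | null)[]> {
  const results: (T | null)[] = new Array(filePaths.length).fill(null);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < filePaths.length) {
      const index = next++;
      const controller = new AbortController();
      // The timeout only cancels work if the uploader honors the signal,
      // for example by passing it to fetch.
      const timer = setTimeout(() => controller.abort(), timeoutMs);
      try {
        results[index] = await upload(filePaths[index], controller.signal);
      } catch {
        results[index] = null; // one failed file never rejects the batch
      } finally {
        clearTimeout(timer);
      }
    }
  }

  const workerCount = Math.min(concurrency, filePaths.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With this shape, a caller could derive a failure count such as attachmentsFailed by counting the null entries, and the results stay in the same order as the input paths.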

Scope (V0.5)

This PR is intentionally scoped:

  • ✅ Upload files to Viking Resources (VLM description + multimodal embedding)
  • ✅ Return structured metadata to the caller
  • ✅ Fully backward compatible (no attachments = identical behavior)
  • ⬜ Memory↔Resource first-class linking (requires Viking schema changes — future PR)
  • ⬜ memory_recall returning attachments (depends on schema linking — future PR)

Testing

# Manual test: store with attachments
curl -X POST http://localhost:1933/api/v1/resources/temp_upload -F "file=@test.png"
# Verify VLM abstract and multimodal embedding are generated
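The same manual check could be scripted in TypeScript, as in the minimal sketch below. The endpoint and the "file" form field are taken from the curl command. buildUploadForm and tempUpload are hypothetical helper names, and the response is left untyped because its schema is not shown here. It assumes a Node.js version with global fetch, FormData, and Blob.

```typescript
import { readFile } from "node:fs/promises";
import { basename } from "node:path";

// Endpoint copied from the curl command; adjust host and port to your setup.
const TEMP_UPLOAD_URL = "http://localhost:1933/api/v1/resources/temp_upload";

// Builds the multipart body; "file" matches the -F "file=@..." field in the curl call.
function buildUploadForm(bytes: Uint8Array, fileName: string): FormData {
  const form = new FormData();
  form.append("file", new Blob([bytes]), fileName);
  return form;
}

// Posts one local file to temp_upload and returns the parsed JSON response.
async function tempUpload(filePath: string, url: string = TEMP_UPLOAD_URL): Promise<unknown> {
  const bytes = await readFile(filePath);
  const response = await fetch(url, {
    method: "POST",
    body: buildUploadForm(bytes, basename(filePath)),
  });
  if (!response.ok) {
    throw new Error(`temp_upload failed: HTTP ${response.status}`);
  }
  // The response schema is not documented in this PR, so it stays untyped.
  return response.json();
}
```

Calling `await tempUpload("test.png")` against a running local server would mirror the curl request. Checking the VLM abstract and embedding would still need a separate lookup that this sketch does not cover.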

Backward compatibility verified: existing memory_store calls without attachments parameter work identically.

Checklist

  • Code follows project style guidelines
  • Backward compatible — no breaking changes
  • Documentation updated in commit message
  • Tests added for new functionality (will add in follow-up if reviewers request)
  • All existing tests pass (no test files modified)

CLAassistant commented Mar 21, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

Failed to generate code suggestions for PR

Add optional `attachments` parameter to `memory_store` tool, enabling
agents to associate local files (images, documents, JSON, etc.) with
memories. Files are uploaded to Viking Resources via temp_upload +
addResource, which triggers VLM description and multimodal embedding
automatically.

Key design decisions:
- Content-addressed storage (SHA-256 hash in URI) for deduplication
- Concurrency limited to 3 to prevent VLM avalanche on large batches
- Individual file failures return null, never crash the whole batch
- Each upload gets independent 60s timeout via AbortController
- Memory text stays pure semantic content (no URIs embedded)
- Attachments returned in details.attachments for caller consumption

This is V0.5: files are uploaded and searchable via Viking Resources,
but Memory↔Resource linking requires Viking schema changes (V1).

Changes:
- client.ts: add AttachmentItem type, storeAttachments() method,
  hashFile() helper, getMimeType() with common MIME mappings
- index.ts: memory_store accepts optional attachments string array,
  uploads before session creation, reports success/failure counts

zhangMINGkeq1 force-pushed the feature/multimodal-memory-attachments branch 8 times, most recently from c4a67c9 to f731c98 (March 21, 2026 14:56)