Skip to content

feat: Chat media support — multi-modal messages and media rendering in tool responses #922

@chubes4

Description

@chubes4

Context

Data Machine has a full media abilities layer:

  • datamachine/upload-media — upload/fetch images and videos
  • datamachine/generate-image — AI image generation
  • datamachine/optimize-images — image optimization
  • Alt text, templates, platform presets
  • ImageGeneration chat tool — AI can generate images during conversation

But the chat backend only supports string content in messages. When the AI generates an image via tool call, the URL is buried in the tool result JSON. And users cannot send images to the AI for vision/analysis.

Changes needed

1. Multi-modal message format

ConversationManager::buildConversationMessage() types $content as string. For multi-modal AI providers (Anthropic vision, OpenAI vision), content must be an array of blocks:

// Current:
'content' => 'Describe this image'

// Needed:
'content' => [
    ['type' => 'text', 'text' => 'Describe this image'],
    ['type' => 'image_url', 'image_url' => ['url' => 'https://...']],
]
  • Update ConversationManager::buildConversationMessage() to accept string|array content
  • Update ChatOrchestrator to build multi-modal messages when attachments are present
  • Ensure AI provider HTTP layer passes content blocks through to the API correctly

2. File upload in chat endpoint

POST /datamachine/v1/chat currently accepts only { message: string }.

  • Register attachments parameter (or handle multipart/form-data)
  • Process uploaded files via wp_handle_upload() or media_handle_sideload()
  • Store attachment WordPress media IDs in message metadata
  • Build multi-modal content blocks from uploaded files + text message

3. Media in tool responses

When tools like image_generation return media URLs, the response needs structured metadata so the UI can render it:

  • Standardize a tool response format for media: { type: 'image', url: '...', alt: '...' }
  • Update ImageGeneration tool to return this format
  • Ensure tool results with media are stored with renderable metadata in session messages

4. Message storage

datamachine_chat_sessions.messages stores JSON arrays of {role, content, metadata}.

  • Support attachments array per message in the stored JSON
  • Each attachment: { id, type, url, thumbnail_url, filename, mime_type, size }
  • Backward compatible — existing text-only messages continue working

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions