title	API Specification
version	1.0.0-draft
last_updated	2026-01-17
status	Draft
category	API Contract

API Specification

Version: 1.0.0-draft Base URL: https://api.example.com/api Content Type: application/json Auth: Bearer token in Authorization header

Note: This specification is framework and implementation agnostic. All endpoints are defined by their HTTP interface contract, not internal implementation details.

Authentication
Documents
Conversations
Messages
Search
Summaries
User Profile
Configuration
Health Check
Error Handling
Rate Limiting

Authentication

All authenticated endpoints require a valid JWT token in the Authorization header:

Authorization: Bearer <token>

POST /api/auth/register

Request Body:

{
  "email": "user@example.com",
  "password": "securepassword",
  "displayName": "John Doe"  // optional
}

Response: 201 Created

{
  "user": {
    "id": "usr_123abc",
    "email": "user@example.com",
    "displayName": "John Doe",  // or email if not provided
    "createdAt": "2024-01-15T10:00:00Z"
  },
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

POST /api/auth/login

Authenticate an existing user.

Request Body:

{
  "email": "user@example.com",
  "password": "securepassword"
}

Response: 200 OK

{
  "user": {
    "id": "usr_123abc",
    "email": "user@example.com",
    "displayName": "John Doe"
  },
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Documents

POST /api/documents

Upload or create a new document.

Auth Required: Yes

Request:

For file upload, use multipart/form-data
For text document, use application/json

Multipart Upload:

POST /api/documents
Content-Type: multipart/form-data

file: [PDF/TXT/MD file]
title: "My Document" (optional)
tags: ["research", "ai"] (optional)

JSON Request (text document):

{
  "title": "My Notes",
  "content": "Document content in plain text or markdown...",
  "contentType": "text/markdown",
  "tags": ["notes", "personal"]
}

Response: 201 Created

{
  "document": {
    "id": "doc_456def",
    "userId": "usr_123abc",
    "title": "My Document",
    "contentType": "application/pdf",
    "size": 245680,
    "status": "processing",
    "tags": ["research", "ai"],
    "createdAt": "2024-01-15T10:30:00Z",
    "updatedAt": "2024-01-15T10:30:00Z",
    "processedAt": null,
    "chunkCount": null,
    "url": "https://storage.example.com/documents/doc_456def.pdf"
  }
}

Status Field:

processing - Document is being processed/embedded
ready - Document is ready for queries
failed - Processing failed

GET /api/documents

List all documents for the authenticated user.

Auth Required: Yes

Query Parameters:

limit (optional, default: 20, max: 100) - Number of documents per page
offset (optional, default: 0) - Pagination offset
tag (optional) - Filter by tag
status (optional) - Filter by status (processing, ready, failed)
sortBy (optional, default: createdAt) - Sort field (createdAt, updatedAt, title)
sortOrder (optional, default: desc) - Sort order (asc, desc)

Response: 200 OK

{
  "documents": [
    {
      "id": "doc_456def",
      "title": "My Document",
      "contentType": "application/pdf",
      "size": 245680,
      "status": "ready",
      "tags": ["research", "ai"],
      "createdAt": "2024-01-15T10:30:00Z",
      "updatedAt": "2024-01-15T10:30:00Z",
      "processedAt": "2024-01-15T10:31:45Z",
      "chunkCount": 23
    }
  ],
  "pagination": {
    "total": 45,
    "limit": 20,
    "offset": 0,
    "hasMore": true
  }
}

GET /api/documents/:id

Get a single document by ID.

Auth Required: Yes

Response: 200 OK

{
  "document": {
    "id": "doc_456def",
    "userId": "usr_123abc",
    "title": "My Document",
    "content": "Full text content if text-based...",
    "contentType": "application/pdf",
    "size": 245680,
    "status": "ready",
    "tags": ["research", "ai"],
    "createdAt": "2024-01-15T10:30:00Z",
    "updatedAt": "2024-01-15T10:30:00Z",
    "processedAt": "2024-01-15T10:31:45Z",
    "chunkCount": 23,
    "url": "https://storage.example.com/documents/doc_456def.pdf",
    "metadata": {
      "author": "John Doe",
      "pages": 12
    }
  }
}

PUT /api/documents/:id

Update a document's metadata or content.

Auth Required: Yes

Request Body:

{
  "title": "Updated Title",
  "content": "Updated content for text documents",
  "tags": ["updated", "tags"]
}

Response: 200 OK

{
  "document": {
    "id": "doc_456def",
    "title": "Updated Title",
    "status": "processing",
    "updatedAt": "2024-01-15T11:00:00Z"
  }
}

DELETE /api/documents/:id

Delete a document and all associated embeddings.

Auth Required: Yes

Response: 204 No Content

Processing Status

Documents undergo asynchronous processing after upload (parsing, chunking, embedding). Implementations MUST provide a way for clients to determine when processing completes.

Acceptable Approaches:

Polling — Client periodically calls GET /api/documents/:id and checks the status field
Webhooks — Implementation accepts a callback URL and POSTs status updates
Real-time Subscriptions — WebSocket, SSE, or framework-native subscriptions (e.g., Convex reactive queries, Firebase listeners)

Status Values:

processing — Document is being parsed, chunked, and embedded
ready — Document is fully processed and available for queries
failed — Processing failed (check error field for details)

Example (Polling):

GET /api/documents/doc_456def
→ { "status": "processing" }

# Wait, then retry...

GET /api/documents/doc_456def
→ { "status": "ready", "chunkCount": 23 }

Example (Subscription - conceptual):

// Framework-native subscription (Convex example)
const document = useQuery(api.documents.get, { id: documentId });
// `document.status` updates automatically when processing completes

Note: The spec does not prescribe which approach to use. Implementations should choose based on their technology stack. Frontends should support at minimum polling, with real-time updates as an enhancement.

Conversations

POST /api/conversations

Create a new conversation thread.

Auth Required: Yes

Request Body:

{
  "title": "Questions about AI",
  "documentIds": ["doc_456def", "doc_789ghi"]
}

Note: documentIds is optional. If provided, conversation will be scoped to those documents.

Response: 201 Created

{
  "conversation": {
    "id": "conv_abc123",
    "userId": "usr_123abc",
    "title": "Questions about AI",
    "documentIds": ["doc_456def", "doc_789ghi"],
    "messageCount": 0,
    "createdAt": "2024-01-15T11:00:00Z",
    "updatedAt": "2024-01-15T11:00:00Z"
  }
}

GET /api/conversations

List all conversations for the authenticated user.

Auth Required: Yes

Query Parameters:

limit (optional, default: 20, max: 100)
offset (optional, default: 0)
sortBy (optional, default: updatedAt)
sortOrder (optional, default: desc)

Response: 200 OK

{
  "conversations": [
    {
      "id": "conv_abc123",
      "title": "Questions about AI",
      "messageCount": 12,
      "lastMessage": {
        "role": "assistant",
        "content": "Based on your documents...",
        "createdAt": "2024-01-15T11:05:00Z"
      },
      "createdAt": "2024-01-15T11:00:00Z",
      "updatedAt": "2024-01-15T11:05:00Z"
    }
  ],
  "pagination": {
    "total": 8,
    "limit": 20,
    "offset": 0,
    "hasMore": false
  }
}

GET /api/conversations/:id

Get a conversation with all messages.

Auth Required: Yes

Query Parameters:

limit (optional, default: 50) - Number of messages to return
before (optional) - Return messages before this message ID
after (optional) - Return messages after this message ID

Response: 200 OK

{
  "conversation": {
    "id": "conv_abc123",
    "userId": "usr_123abc",
    "title": "Questions about AI",
    "documentIds": ["doc_456def"],
    "createdAt": "2024-01-15T11:00:00Z",
    "updatedAt": "2024-01-15T11:05:00Z"
  },
  "messages": [
    {
      "id": "msg_001",
      "conversationId": "conv_abc123",
      "role": "user",
      "content": "What does my document say about neural networks?",
      "createdAt": "2024-01-15T11:01:00Z"
    },
    {
      "id": "msg_002",
      "conversationId": "conv_abc123",
      "role": "assistant",
      "content": "Based on your document, neural networks are...",
      "citations": [
        {
          "documentId": "doc_456def",
          "documentTitle": "My Document",
          "chunkId": "chunk_123",
          "excerpt": "Neural networks are composed of layers...",
          "relevanceScore": 0.92,
          "page": 5
        }
      ],
      "tokenUsage": {
        "prompt": 1250,
        "completion": 180,
        "total": 1430
      },
      "createdAt": "2024-01-15T11:01:05Z"
    }
  ],
  "pagination": {
    "hasMore": false,
    "before": null,
    "after": null
  }
}

PUT /api/conversations/:id

Update conversation metadata.

Auth Required: Yes

Request Body:

{
  "title": "Updated Title",
  "documentIds": ["doc_456def", "doc_999zzz"]
}

Response: 200 OK

{
  "conversation": {
    "id": "conv_abc123",
    "title": "Updated Title",
    "documentIds": ["doc_456def", "doc_999zzz"],
    "updatedAt": "2024-01-15T11:10:00Z"
  }
}

DELETE /api/conversations/:id

Delete a conversation and all its messages.

Auth Required: Yes

Response: 204 No Content

Messages

POST /api/conversations/:id/messages

Send a message in a conversation and get AI response.

Auth Required: Yes

Request Body:

{
  "content": "What are the main themes in my documents?",
  "documentIds": ["doc_456def"],
  "stream": true
}

Note:

documentIds (optional) - Override conversation's default documents
stream (optional, default: false) - Enable Server-Sent Events streaming

Response (Non-Streaming): 201 Created

{
  "userMessage": {
    "id": "msg_003",
    "conversationId": "conv_abc123",
    "role": "user",
    "content": "What are the main themes in my documents?",
    "createdAt": "2024-01-15T11:15:00Z"
  },
  "assistantMessage": {
    "id": "msg_004",
    "conversationId": "conv_abc123",
    "role": "assistant",
    "content": "The main themes in your documents are...",
    "citations": [
      {
        "documentId": "doc_456def",
        "documentTitle": "My Document",
        "chunkId": "chunk_456",
        "excerpt": "The primary focus is on...",
        "relevanceScore": 0.88,
        "page": 3
      }
    ],
    "relatedDocuments": ["doc_789ghi"],
    "tokenUsage": {
      "prompt": 2340,
      "completion": 245,
      "total": 2585
    },
    "createdAt": "2024-01-15T11:15:08Z"
  }
}

Response (Streaming): 200 OK

Content-Type: text/event-stream

event: message_start
data: {"messageId":"msg_004","conversationId":"conv_abc123"}

event: content_delta
data: {"delta":"The"}

event: content_delta
data: {"delta":" main"}

event: content_delta
data: {"delta":" themes"}

event: citations
data: {"citations":[{"documentId":"doc_456def","documentTitle":"My Document","excerpt":"...","relevanceScore":0.88}]}

event: message_end
data: {"messageId":"msg_004","tokenUsage":{"prompt":2340,"completion":245,"total":2585},"finishReason":"stop"}

event: done
data: {}

Streaming Event Types:

message_start - Message generation started
content_delta - Incremental content chunk
citations - Citations found (sent once after content)
message_end - Message complete with metadata
error - Error occurred
done - Stream complete

Optional Reasoning Events:

Implementations MAY stream AI reasoning/thinking separately from the final response. This enables UI patterns like "Thought Accordion" where reasoning is shown collapsed or hidden.

reasoning_start - Reasoning/thinking phase started
reasoning_delta - Incremental reasoning chunk
reasoning_end - Reasoning complete, response content follows

Example (with reasoning):

event: message_start
data: {"messageId":"msg_004","conversationId":"conv_abc123"}

event: reasoning_start
data: {}

event: reasoning_delta
data: {"delta":"Let me search the documents for information about neural networks..."}

event: reasoning_delta
data: {"delta":" I found 3 relevant chunks discussing layer architecture."}

event: reasoning_end
data: {}

event: content_delta
data: {"delta":"Based on your documents, neural networks are..."}

event: citations
data: {"citations":[...]}

event: message_end
data: {"messageId":"msg_004","tokenUsage":{...},"finishReason":"stop"}

event: done
data: {}

Note: Reasoning support depends on the underlying LLM's capabilities. Models like OpenAI o1, Claude with extended thinking, or custom chains may expose reasoning; others may not. Implementations without reasoning support simply omit these events.

Search

POST /api/search

Perform semantic search across documents.

Auth Required: Yes

Request Body:

{
  "query": "machine learning techniques",
  "documentIds": ["doc_456def", "doc_789ghi"],
  "limit": 10,
  "minRelevance": 0.7
}

Note:

documentIds (optional) - Limit search to specific documents
limit (optional, default: 10, max: 50) - Number of results
minRelevance (optional, default: 0.0, range: 0-1) - Minimum relevance score

Response: 200 OK

{
  "results": [
    {
      "documentId": "doc_456def",
      "documentTitle": "My Document",
      "chunkId": "chunk_789",
      "content": "Machine learning techniques include supervised...",
      "relevanceScore": 0.94,
      "metadata": {
        "page": 7,
        "section": "Chapter 3"
      }
    },
    {
      "documentId": "doc_789ghi",
      "documentTitle": "Another Doc",
      "chunkId": "chunk_234",
      "content": "Various ML approaches can be categorized...",
      "relevanceScore": 0.86,
      "metadata": {
        "page": 12
      }
    }
  ],
  "query": "machine learning techniques",
  "total": 2
}

Summaries

POST /api/documents/:id/summarize

Generate a summary of a single document.

Auth Required: Yes

Request Body:

{
  "length": "medium",
  "focus": "key_points"
}

Note:

length (optional, default: medium) - Values: short, medium, long
focus (optional, default: general) - Values: general, key_points, technical, conclusions

Response: 200 OK

{
  "summary": {
    "id": "sum_xyz789",
    "documentId": "doc_456def",
    "content": "This document discusses the fundamentals of neural networks...",
    "length": "medium",
    "focus": "key_points",
    "tokenUsage": {
      "prompt": 3450,
      "completion": 280,
      "total": 3730
    },
    "createdAt": "2024-01-15T11:20:00Z"
  }
}

POST /api/summaries

Generate a summary across multiple documents or a topic.

Auth Required: Yes

Request Body:

{
  "documentIds": ["doc_456def", "doc_789ghi"],
  "query": "How do these documents discuss AI safety?",
  "length": "long"
}

Note:

documentIds (required) - Documents to summarize
query (optional) - Focus the summary on a specific question/topic
length (optional, default: medium)

Response: 200 OK

{
  "summary": {
    "id": "sum_multi_123",
    "documentIds": ["doc_456def", "doc_789ghi"],
    "query": "How do these documents discuss AI safety?",
    "content": "Across your documents, AI safety is approached from...",
    "citations": [
      {
        "documentId": "doc_456def",
        "excerpt": "Safety considerations include...",
        "page": 15
      }
    ],
    "length": "long",
    "tokenUsage": {
      "prompt": 5680,
      "completion": 420,
      "total": 6100
    },
    "createdAt": "2024-01-15T11:25:00Z"
  }
}

User Profile

GET /api/user

Get current user profile and preferences.

Auth Required: Yes

Response: 200 OK

{
  "user": {
    "id": "usr_123abc",
    "email": "user@example.com",
    "displayName": "John Doe",
    "preferences": {
      "defaultDocumentScope": "all",
      "summaryLength": "medium",
      "streamingEnabled": true
    },
    "memory": {
      "facts": [
        "User is researching neural networks for thesis",
        "Interested in AI safety and ethics"
      ]
    },
    "usage": {
      "documentsCount": 45,
      "conversationsCount": 8,
      "totalTokensUsed": 145230
    },
    "createdAt": "2024-01-10T08:00:00Z"
  }
}

PUT /api/user

Update user profile and preferences.

Auth Required: Yes

Request Body:

{
  "displayName": "New Name",
  "preferences": {
    "defaultDocumentScope": "recent",
    "summaryLength": "short"
  }
}

Response: 200 OK

{
  "user": {
    "id": "usr_123abc",
    "displayName": "New Name",
    "preferences": {
      "defaultDocumentScope": "recent",
      "summaryLength": "short",
      "streamingEnabled": true
    },
    "updatedAt": "2024-01-15T11:30:00Z"
  }
}

Configuration

GET /api/config

Get system configuration and AI provider information.

Auth Required: No (public endpoint)

Response: 200 OK

{
  "config": {
    "embeddingModel": "text-embedding-3-small",
    "embeddingDimension": 1536,
    "chatModel": "gpt-4-turbo",
    "vectorStore": "pgvector",
    "chunkSize": 1000,
    "chunkOverlap": 200,
    "version": "1.0.0"
  }
}

Purpose:

Allows frontends to display provider information
Helps with debugging and transparency
Enables users to understand what models are being used
Useful for comparing different implementations

Note: Implementations MUST expose this information. It reveals which providers are being used, making architectural decisions visible.

Health Check

GET /api/health

Check health status of the service and its dependencies.

Auth Required: No (public endpoint)

Response (Healthy): 200 OK

{
  "status": "healthy",
  "version": "1.0.0",
  "timestamp": "2024-01-15T12:00:00Z",
  "checks": {
    "database": "healthy",
    "embedder": "healthy",
    "vectorStore": "healthy",
    "llm": "healthy"
  }
}

Response (Unhealthy): 503 Service Unavailable

{
  "status": "unhealthy",
  "version": "1.0.0",
  "timestamp": "2024-01-15T12:00:00Z",
  "checks": {
    "database": "healthy",
    "embedder": "unhealthy",
    "vectorStore": "healthy",
    "llm": "degraded"
  },
  "errors": {
    "embedder": "Connection timeout to embedding API"
  }
}

Check Status Values:

healthy - Service is functioning normally
degraded - Service is functioning but with reduced performance
unhealthy - Service is not functioning

Requirements:

Implementations SHOULD check actual connectivity to dependencies
Implementations SHOULD NOT just check configuration
Health checks SHOULD complete in < 5 seconds
Health checks MAY cache results for 30-60 seconds

Purpose:

Enables monitoring and alerting
Helps diagnose issues
Required for production deployments
Reveals architectural dependencies

Error Handling

All errors follow a consistent format:

Error Response Structure:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "details": {
      "field": "Additional context if applicable"
    }
  }
}

Common Error Codes:

Status	Code	Description
400	`INVALID_REQUEST`	Malformed request body or parameters
401	`UNAUTHORIZED`	Missing or invalid authentication token
403	`FORBIDDEN`	User doesn't have access to resource
404	`NOT_FOUND`	Resource doesn't exist
409	`CONFLICT`	Resource conflict (e.g., duplicate)
413	`PAYLOAD_TOO_LARGE`	File or request too large
422	`VALIDATION_ERROR`	Request validation failed
429	`RATE_LIMIT_EXCEEDED`	Too many requests
500	`INTERNAL_ERROR`	Server error
503	`SERVICE_UNAVAILABLE`	Service temporarily unavailable

Example Error:

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Document title is required",
    "details": {
      "field": "title",
      "validation": "required"
    }
  }
}

Rate Limiting

All endpoints are subject to rate limiting. Rate limit information is returned in response headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1705320000

When rate limit is exceeded, the API returns 429 Too Many Requests:

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Try again in 60 seconds.",
    "details": {
      "retryAfter": 60,
      "limit": 1000,
      "window": "1h"
    }
  }
}

Suggested Rate Limits (implementations may vary):

Authenticated requests: 1000 requests per hour per user
Document uploads: 100 per hour per user
Message sends: 500 per hour per user
Search requests: 200 per hour per user

Implementation Notes

For Backend Implementers:

Chunking Strategy - How documents are split into chunks is implementation-specific, but should support citation at chunk level.
Embedding Model - Choice of embedding model is flexible, but dimensions should be consistent within an implementation.
LLM Provider - Any LLM (OpenAI, Anthropic, Gemini, local models) that can handle context and generate responses.
Vector Database - Any vector store that supports similarity search (Pinecone, Weaviate, pgvector, Chroma, etc.).
Streaming - SSE (Server-Sent Events) is recommended for broader compatibility, but WebSockets or framework-native streaming mechanisms are equally acceptable. The requirement is token-by-token delivery with inline citation metadata — not a specific transport protocol.
Authentication - JWT is recommended but any secure token-based auth is acceptable.
Storage - Document files can be stored in any blob storage (S3, local filesystem, etc.).
File Upload - The API examples show direct multipart upload, but implementations may use any upload pattern:
- Direct multipart upload to API endpoint
- Presigned URL upload (client uploads to storage, then notifies API)
- Framework-specific patterns (e.g., Convex generateUploadUrl(), Firebase Storage)
The API contract focuses on the result (document created with processing status) not the upload mechanism.

For Frontend Implementers:

Handle Streaming - Must support the implementation's streaming mechanism (SSE, WebSocket, or framework-native) for real-time responses.
Citations UI - Must display citations with clickable links to source documents.
Loading States - Show appropriate loading/processing states for async operations.
Error Handling - Display user-friendly error messages.
Responsive Design - Work on desktop and mobile devices.

Next: See data-models.md for detailed schema definitions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API Specification

Table of Contents

Authentication

POST /api/auth/register

POST /api/auth/login

Documents

POST /api/documents

GET /api/documents

GET /api/documents/:id

PUT /api/documents/:id

DELETE /api/documents/:id

Processing Status

Conversations

POST /api/conversations

GET /api/conversations

GET /api/conversations/:id

PUT /api/conversations/:id

DELETE /api/conversations/:id

Messages

POST /api/conversations/:id/messages

Search

POST /api/search

Summaries

POST /api/documents/:id/summarize

POST /api/summaries

User Profile

GET /api/user

PUT /api/user

Configuration

GET /api/config

Health Check

GET /api/health

Error Handling

Rate Limiting

Implementation Notes

FilesExpand file tree

api-specification.md

Latest commit

History

api-specification.md

File metadata and controls

API Specification

Table of Contents

Authentication

POST /api/auth/register

POST /api/auth/login

Documents

POST /api/documents

GET /api/documents

GET /api/documents/:id

PUT /api/documents/:id

DELETE /api/documents/:id

Processing Status

Conversations

POST /api/conversations

GET /api/conversations

GET /api/conversations/:id

PUT /api/conversations/:id

DELETE /api/conversations/:id

Messages

POST /api/conversations/:id/messages

Search

POST /api/search

Summaries

POST /api/documents/:id/summarize

POST /api/summaries

User Profile

GET /api/user

PUT /api/user

Configuration

GET /api/config

Health Check

GET /api/health

Error Handling

Rate Limiting

Implementation Notes