API Reference

Overview

The backend is implemented in server/index.mjs. It exposes a small HTTP API used by the React frontend and by any external automation that wants to interact with the platform.

Base assumptions:

responses are JSON unless otherwise stated;
there is no authentication layer in the current implementation;
uploads are served from /uploads/*;
the production server also serves the built frontend.

Conventions

Content type for JSON endpoints: application/json
Content type for the upload endpoint (POST /api/documents): multipart/form-data
Error payload shape: { "error": "message" }, where the message is a human-readable string (currently in Spanish)
Success payloads vary by endpoint
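
The error convention above can be handled in one place on the client. The following is a minimal sketch, not the platform's own client code: `parseApiResponse` and `callApi` are hypothetical helper names, and the sketch assumes only what this section states (JSON bodies, and an `error` string on failure).

```javascript
// Hypothetical helper: turns a status code plus parsed JSON body into a
// uniform result object. Treats any 4xx/5xx status, or any body carrying
// an "error" string, as a failure.
function parseApiResponse(status, body) {
  const hasErrorField =
    body !== null && typeof body === "object" && typeof body.error === "string";
  if (status >= 400 || hasErrorField) {
    const message = hasErrorField ? body.error : `HTTP ${status}`;
    return { ok: false, status, error: message };
  }
  return { ok: true, status, data: body };
}

// Thin wrapper around the global fetch (Node 18+ or a browser).
async function callApi(url, options) {
  const response = await fetch(url, options);
  // Tolerate an empty or non-JSON body instead of throwing.
  const body = await response.json().catch(() => null);
  return parseApiResponse(response.status, body);
}
```

A caller can then branch on `result.ok` and show `result.error` to the user as returned by the server.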

Endpoints

`GET /api/state`

Returns the full current application state required by the frontend.

Response shape:

{
  "project": {
    "title": "Plataforma colaborativa para la historia del cancer de mama",
    "focus": "Encontrar todos los archivos donde se menciona cancer de mama..."
  },
  "terms": [],
  "documents": [],
  "analysis": {
    "summary": {},
    "timeline": [],
    "topTerms": [],
    "cooccurrences": [],
    "mentions": [],
    "documentsWithMentions": []
  }
}

Notes:

each entry in documents is serialized server-side so that it includes uploadTextUrl and uploadFileUrl.
analysis is derived at request time from the current runtime corpus.
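
As an illustration of consuming this payload, the sketch below reduces it to a few counts. `summarizeState` and `loadStateSummary` are hypothetical names, and the base URL passed in (for example `http://localhost:8080`, as in the curl example in this document) is an assumption about the deployment.

```javascript
// Hypothetical helper: counts the top-level collections in a /api/state payload.
// Missing keys are treated as empty so a partial payload does not throw.
function summarizeState(state) {
  const analysis = state.analysis ?? {};
  return {
    title: state.project?.title ?? "",
    termCount: (state.terms ?? []).length,
    documentCount: (state.documents ?? []).length,
    mentionCount: (analysis.mentions ?? []).length,
  };
}

// Fetches the state and returns its summary. Uses the global fetch (Node 18+).
async function loadStateSummary(baseUrl) {
  const response = await fetch(`${baseUrl}/api/state`);
  if (!response.ok) {
    throw new Error(`GET /api/state failed with status ${response.status}`);
  }
  return summarizeState(await response.json());
}
```

Because analysis is recomputed on each request, a client that only needs counts may prefer to cache the summary rather than re-fetch the full state.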

`POST /api/terms`

Creates a new term.

Request body:

{
  "canonical": "scirrhus of the breast",
  "variants": "scirrhous breast, scirrhus, scirrhous",
  "category": "historical nomenclature",
  "notes": "Used in late eighteenth and nineteenth century surgical writing."
}

Validation rules:

canonical is required.
variants is expected as a comma-separated string.
category defaults to "general" when omitted.
notes is optional.

Success response:

{
  "ok": true
}

Error response:

{
  "error": "canonical es obligatorio"
}
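
Since variants must be sent as a comma-separated string, a client that holds them as an array needs a small conversion step. The sketch below shows one way to build the request body. `buildTermPayload` and `createTerm` are hypothetical names, and the client-side trimming is an assumption, not documented server behavior.

```javascript
// Hypothetical helper: builds the JSON body for POST /api/terms.
// `variants` is accepted as an array and joined into the comma-separated
// string the endpoint expects.
function buildTermPayload({ canonical, variants = [], category, notes }) {
  const trimmedCanonical = String(canonical ?? "").trim();
  if (!trimmedCanonical) {
    throw new Error("canonical is required");
  }
  const payload = {
    canonical: trimmedCanonical,
    variants: variants
      .map((variant) => variant.trim())
      .filter(Boolean) // drop empty entries
      .join(", "),
  };
  // Leave category out when unset so the server-side default applies.
  if (category) payload.category = category;
  if (notes) payload.notes = notes;
  return payload;
}

async function createTerm(baseUrl, term) {
  const response = await fetch(`${baseUrl}/api/terms`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildTermPayload(term)),
  });
  return response.json();
}
```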

`PUT /api/terms/:termId`

Updates an existing term.

Path parameters:

termId: term identifier such as term-1710419200000

Request body:

{
  "canonical": "mammary tumour",
  "variants": "mammary tumor, tumour of the breast, tumor of the breast",
  "category": "description",
  "notes": "Supports British and US spellings."
}

Validation and behavior:

an unknown termId returns 404;
variants is still a comma-separated string, not an array;
fields omitted from the request body keep their currently stored values.

Success response:

{
  "ok": true
}

Not found response:

{
  "error": "Termino no encontrado"
}
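
Because omitted fields keep their stored values, a client can send only what changed. The following sketch assumes that behavior as described above. `buildTermUpdate` and `updateTerm` are hypothetical names. It also converts an array of variants (the shape terms have in responses) into the string form this endpoint expects.

```javascript
// Hypothetical helper: builds a partial body for PUT /api/terms/:termId.
// Keys whose value is undefined are dropped so the server keeps the stored value.
function buildTermUpdate(changes) {
  const body = {};
  for (const [key, value] of Object.entries(changes)) {
    if (value === undefined) continue;
    body[key] =
      key === "variants" && Array.isArray(value) ? value.join(", ") : value;
  }
  return body;
}

// Returns the parsed response, or null when the term does not exist (404).
async function updateTerm(baseUrl, termId, changes) {
  const url = `${baseUrl}/api/terms/${encodeURIComponent(termId)}`;
  const response = await fetch(url, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildTermUpdate(changes)),
  });
  if (response.status === 404) return null;
  return response.json();
}
```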

`POST /api/documents`

Creates a new document and optionally stores OCR text and a binary upload.

Content type:

multipart/form-data

Accepted fields:

title
shortTitle
year
place
language
recordType
sourceHost
contributorName
contributorRole
summary
notes
ocrText
file

Validation rules:

title is required;
at least one of the following must be present:
- ocrText
- file with a text/* MIME type
- summary
uploaded file size is limited to 25 MB;
if a text file is uploaded and ocrText is absent, the server extracts text from the uploaded file buffer.

Example with curl:

curl -X POST http://localhost:8080/api/documents \
  -F 'title=Clinical remarks on cancer of the breast' \
  -F 'shortTitle=McGuire, 1882' \
  -F 'year=1882' \
  -F 'place=Richmond' \
  -F 'language=english' \
  -F 'recordType=clinical article' \
  -F 'sourceHost=Internet Archive' \
  -F 'contributorName=Research Team' \
  -F 'contributorRole=editor' \
  -F 'summary=Short editorial summary of the document.' \
  -F 'ocrText=Full OCR text goes here.'

Success response:

{
  "ok": true,
  "id": "doc-1710419200000",
  "document": {
    "id": "doc-1710419200000",
    "title": "Clinical remarks on cancer of the breast",
    "shortTitle": "McGuire, 1882",
    "year": 1882,
    "place": "Richmond",
    "language": "english",
    "recordType": "clinical article",
    "sourceHost": "Internet Archive",
    "contributorName": "Research Team",
    "contributorRole": "editor",
    "notes": "",
    "summary": "Short editorial summary of the document.",
    "textPath": "/absolute/path/in/runtime-data/uploads/...",
    "originalFilePath": "",
    "sourceLinks": [],
    "createdAt": "2026-04-13T00:00:00.000Z",
    "reviewStatus": "nuevo",
    "uploadTextUrl": "/uploads/doc-1710419200000.txt",
    "uploadFileUrl": ""
  }
}

Error responses:

{
  "error": "title es obligatorio"
}

{
  "error": "Se necesita OCR en texto, archivo de texto o un resumen minimo."
}
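
The validation rules for this endpoint can be mirrored on the client before uploading, which avoids a round trip for the two documented errors. This is a sketch under stated assumptions: the helper names are hypothetical, and "25 MB" is read as 25 * 1024 * 1024 bytes, which may differ from how the server counts.

```javascript
// Assumption: 25 MB means 25 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Returns a list of problems; an empty list means the fields look acceptable.
// `file` is an optional Blob/File (anything with `type` and `size`).
function validateDocumentFields(fields, file) {
  const errors = [];
  if (!fields.title || !String(fields.title).trim()) {
    errors.push("title is required");
  }
  const hasTextFile =
    Boolean(file) && typeof file.type === "string" && file.type.startsWith("text/");
  if (!fields.ocrText && !fields.summary && !hasTextFile) {
    errors.push("provide ocrText, a text/* file, or a summary");
  }
  if (file && file.size > MAX_UPLOAD_BYTES) {
    errors.push("file exceeds 25 MB");
  }
  return errors;
}

// Builds the multipart body. FormData and Blob are globals in Node 18+ and browsers.
function buildDocumentForm(fields, file) {
  const errors = validateDocumentFields(fields, file);
  if (errors.length > 0) throw new Error(errors.join("; "));
  const form = new FormData();
  for (const [name, value] of Object.entries(fields)) {
    if (value !== undefined && value !== null) form.append(name, String(value));
  }
  // "upload.txt" is a placeholder filename used when the Blob has no name.
  if (file) form.append("file", file, file.name ?? "upload.txt");
  return form;
}

async function createDocument(baseUrl, fields, file) {
  // fetch sets the multipart Content-Type (with boundary) from the FormData body.
  const response = await fetch(`${baseUrl}/api/documents`, {
    method: "POST",
    body: buildDocumentForm(fields, file),
  });
  return response.json();
}
```

The server remains the authority on validation; these checks only catch the obvious cases early.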

`GET /api/similar-contexts/:mentionId`

Returns the nearest contextual neighbors for an existing mention.

Path parameters:

mentionId: an ID returned inside the analysis.mentions array from GET /api/state

Typical response shape:

{
  "sourceMention": {},
  "similarContexts": []
}

Behavior:

uses the current corpus and current term configuration;
compares chunk vectors using cosine similarity;
excludes the source chunk itself from the neighbor list;
returns only rows with positive similarity scores.
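
The ranking behavior listed above can be sketched as follows. This illustrates cosine-similarity neighbor search in general, not the server's actual code: the chunk shape (`{ id, vector }`), equal-length numeric vectors, and the `limit` parameter are all assumptions, and this document does not describe how the chunk vectors are built.

```javascript
// Cosine similarity of two equal-length numeric vectors.
// Returns 0 when either vector has zero length (norm).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical helper: ranks all other chunks against the source chunk.
function nearestChunks(sourceId, chunks, limit = 5) {
  const source = chunks.find((chunk) => chunk.id === sourceId);
  if (!source) return [];
  return chunks
    .filter((chunk) => chunk.id !== sourceId) // exclude the source chunk itself
    .map((chunk) => ({
      id: chunk.id,
      score: cosineSimilarity(source.vector, chunk.vector),
    }))
    .filter((row) => row.score > 0) // keep only positive similarity
    .sort((x, y) => y.score - x.score) // most similar first
    .slice(0, limit);
}
```

With non-negative vectors such as term counts, the positive-score filter mainly removes chunks that share no vocabulary with the source.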

`POST /api/context-query`

Returns contextual neighbors for arbitrary input text.

Request body:

{
  "text": "The breast became indurated and painful with ulceration following..."
}

Validation rules:

text is required and trimmed;
empty text returns 400.

Success response:

{
  "sourceContext": {},
  "similarContexts": []
}

Error response:

{
  "error": "text es obligatorio"
}
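
A client can apply the same trimming before sending, so that blank input never reaches the server. The sketch below is illustrative only: `buildContextQuery` and `queryContexts` are hypothetical names, and the shape returned for blank input is a client-side convention chosen here, not something the API defines.

```javascript
// Hypothetical helper: returns the request body, or null when the text is blank.
function buildContextQuery(text) {
  const trimmed = String(text ?? "").trim();
  return trimmed ? { text: trimmed } : null;
}

async function queryContexts(baseUrl, text) {
  const body = buildContextQuery(text);
  if (!body) {
    // Short-circuit locally instead of triggering the documented 400.
    return { sourceContext: null, similarContexts: [], error: "text is empty" };
  }
  const response = await fetch(`${baseUrl}/api/context-query`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const payload = await response.json();
  if (!response.ok) {
    return { sourceContext: null, similarContexts: [], error: payload.error };
  }
  return payload;
}
```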

Upload URL Handling

Uploaded OCR text files and uploaded original files are exposed through:

/uploads/<basename>

The server strips directory paths and only publishes the basename through uploadTextUrl and uploadFileUrl.
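
The mapping from a stored path to a public URL can be sketched as below. `toUploadUrl` is a hypothetical name and not necessarily how the server implements it. Splitting on both slash styles and percent-encoding the basename are assumptions added for safety.

```javascript
// Hypothetical helper: maps a stored file path to its public /uploads URL.
// Returns an empty string for an empty path, matching the empty
// uploadFileUrl shown when no original file was uploaded.
function toUploadUrl(storedPath) {
  if (!storedPath) return "";
  // Keep only the last path segment, whichever separator was used.
  const basename = storedPath.split(/[\\/]/).pop();
  return basename ? `/uploads/${encodeURIComponent(basename)}` : "";
}
```

Publishing only the basename keeps the absolute server-side directory layout out of client-facing URLs.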

Response Data Notes

Terms

Terms typically contain:

{
  "id": "cancer-breast",
  "canonical": "cancer of the breast",
  "variants": ["breast cancer", "cancerous breast"],
  "category": "enfermedad",
  "notes": "Formula canonica amplia para detectar menciones directas."
}

Documents

Documents typically contain:

bibliographic metadata;
editorial metadata;
persisted text/file paths;
generated upload URLs;
reviewStatus.

Mentions

Mention rows inside analysis.mentions include:

mention id;
documentId;
documentTitle;
year;
place;
recordType;
termId;
canonicalTerm;
matchedText;
snippet;
chunkId.
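
As an example of working with these rows, the sketch below groups mentions by document. It relies only on the documentId, documentTitle, and canonicalTerm fields listed above. `countMentionsByDocument` is a hypothetical helper name.

```javascript
// Hypothetical helper: aggregates mention rows per document, most-mentioned first.
function countMentionsByDocument(mentions) {
  const byDocument = new Map();
  for (const mention of mentions) {
    const entry = byDocument.get(mention.documentId) ?? {
      documentId: mention.documentId,
      documentTitle: mention.documentTitle,
      count: 0,
      terms: new Set(),
    };
    entry.count += 1;
    entry.terms.add(mention.canonicalTerm);
    byDocument.set(mention.documentId, entry);
  }
  return [...byDocument.values()]
    .map((entry) => ({ ...entry, terms: [...entry.terms].sort() }))
    .sort((a, b) => b.count - a.count);
}
```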

Stability Notes

This API is currently internal to the platform and is not versioned. If the project is opened up to wider external integration, maintainers should consider:

explicit API versioning;
formal schema documentation;
request validation middleware;
authentication and rate limiting;
deletion and moderation endpoints;
structured error codes.
