The backend is implemented in server/index.mjs. It exposes a small HTTP API used by the React frontend and by any external automation that wants to interact with the platform.
Base assumptions:
- responses are JSON unless otherwise stated;
- there is no authentication layer in the current implementation;
- uploads are served from /uploads/*;
- the production server also serves the built frontend.
- Content type for JSON endpoints: application/json
- Content type for upload endpoint: multipart/form-data
- Error payload shape: { "error": "message" }
- Success payloads vary by endpoint
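A client can lean on the shared error shape to treat every endpoint uniformly. The following is a minimal sketch, not the platform's own client code; `ApiError` and `parse_api_response` are hypothetical names introduced only for illustration.

```python
import json


class ApiError(Exception):
    """Raised when the backend answers with the documented error payload."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_api_response(status: int, body: str):
    """Decode a JSON body and raise ApiError for error payloads."""
    payload = json.loads(body)
    is_object = isinstance(payload, dict)
    # Documented error shape: { "error": "message" }
    if status >= 400 or (is_object and "error" in payload):
        message = payload.get("error", "unknown error") if is_object else "unknown error"
        raise ApiError(status, message)
    return payload
```

Because success payloads vary by endpoint, the helper returns the decoded payload untouched and leaves interpretation to the caller.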
GET /api/state returns the full current application state required by the frontend.
Response shape:
{
"project": {
"title": "Plataforma colaborativa para la historia del cancer de mama",
"focus": "Encontrar todos los archivos donde se menciona cancer de mama..."
},
"terms": [],
"documents": [],
"analysis": {
"summary": {},
"timeline": [],
"topTerms": [],
"cooccurrences": [],
"mentions": [],
"documentsWithMentions": []
}
}
Notes:
- documents are serialized server-side to include uploadTextUrl and uploadFileUrl;
- analysis is derived at request time from the current runtime corpus.
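For external automation, the state payload can be fetched once and reduced to a few counts. This is a rough sketch: the base URL is an assumption borrowed from the curl example in this document, and `fetch_state` and `summarize_state` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_state(base_url: str = "http://localhost:8080") -> dict:
    """Fetch the state payload (base_url is an assumed local address)."""
    with urlopen(f"{base_url}/api/state") as response:
        return json.load(response)


def summarize_state(state: dict) -> dict:
    """Count the top-level collections of a state payload."""
    analysis = state.get("analysis", {})
    return {
        "title": state.get("project", {}).get("title", ""),
        "terms": len(state.get("terms", [])),
        "documents": len(state.get("documents", [])),
        "mentions": len(analysis.get("mentions", [])),
    }
```

Calling `summarize_state(fetch_state())` against a running server would give a quick health check of the corpus size.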
Creates a new term.
Request body:
{
"canonical": "scirrhus of the breast",
"variants": "scirrhous breast, scirrhus, scirrhous",
"category": "historical nomenclature",
"notes": "Used in late eighteenth and nineteenth century surgical writing."
}
Validation rules:
- canonical is required;
- variants is expected as a comma-separated string;
- category defaults to general;
- notes is optional.
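The rules above can be mirrored on the client before a request is sent. The sketch below is an illustration of the documented rules, not the server's code: `normalize_term_payload` is a hypothetical name, the 400 status for a missing canonical is an assumption, and splitting variants into a list is inferred from the term shape shown elsewhere in this document.

```python
from typing import Tuple


def normalize_term_payload(body: dict) -> Tuple[int, dict]:
    """Apply the documented term rules to a request body."""
    canonical = str(body.get("canonical") or "").strip()
    if not canonical:
        # Assumed status code; the message matches the documented error.
        return 400, {"error": "canonical es obligatorio"}
    raw_variants = str(body.get("variants") or "")
    term = {
        "canonical": canonical,
        # Comma-separated string in, list of trimmed variants out.
        "variants": [v.strip() for v in raw_variants.split(",") if v.strip()],
        "category": str(body.get("category") or "").strip() or "general",
        "notes": str(body.get("notes") or "").strip(),
    }
    return 200, term
```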
Success response:
{
"ok": true
}
Error response:
{
"error": "canonical es obligatorio"
}
Updates an existing term.
Path parameters:
- termId: term identifier such as term-1710419200000
Request body:
{
"canonical": "mammary tumour",
"variants": "mammary tumor, tumour of the breast, tumor of the breast",
"category": "description",
"notes": "Supports British and US spellings."
}
Validation and behavior:
- missing term returns 404;
- variants is still a comma-separated string, not an array;
- unspecified fields fall back to current stored values.
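The fallback behavior can be sketched over an in-memory list of terms. This is a simplified model under stated assumptions, not the server's implementation: `update_term` is a hypothetical name, and "unspecified" is taken to mean the key is absent from the request body.

```python
from typing import Tuple


def update_term(terms: list, term_id: str, body: dict) -> Tuple[int, dict]:
    """Update a stored term in place, keeping values the body omits."""
    term = next((t for t in terms if t["id"] == term_id), None)
    if term is None:
        return 404, {"error": "Termino no encontrado"}
    for field in ("canonical", "category", "notes"):
        if field in body:  # absent keys fall back to the stored value
            term[field] = str(body[field]).strip()
    if "variants" in body:
        # Still a comma-separated string on input, stored as a list.
        term["variants"] = [v.strip() for v in str(body["variants"]).split(",") if v.strip()]
    return 200, {"ok": True}
```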
Success response:
{
"ok": true
}
Not found response:
{
"error": "Termino no encontrado"
}
POST /api/documents creates a new document and optionally stores OCR text and a binary upload.
Content type:
multipart/form-data
Accepted fields:
- title
- shortTitle
- year
- place
- language
- recordType
- sourceHost
- contributorName
- contributorRole
- summary
- notes
- ocrText
- file
Validation rules:
- title is required;
- at least one of the following must be present:
  - ocrText
  - file with a text/* MIME type
  - summary
- uploaded file size is limited to 25 MB;
- if a text file is uploaded and ocrText is absent, the server extracts text from the uploaded file buffer.
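A pre-flight check on the client side can catch the most common rejections before any bytes are uploaded. The sketch below follows the rules listed above under several assumptions: `validate_document_upload` is a hypothetical name, "25 MB" is read as 25 MiB, the order of checks is a guess, and the size-limit message is invented because the document does not show one.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumes "25 MB" means 25 MiB


def validate_document_upload(fields: dict, file_mime: Optional[str] = None,
                             file_size: int = 0) -> Optional[str]:
    """Return an error message, or None when the documented rules pass."""
    if not (fields.get("title") or "").strip():
        return "title es obligatorio"
    if file_size > MAX_UPLOAD_BYTES:
        return "file too large"  # hypothetical message, not from the server
    has_ocr = bool((fields.get("ocrText") or "").strip())
    has_text_file = bool(file_mime) and file_mime.startswith("text/")
    has_summary = bool((fields.get("summary") or "").strip())
    if not (has_ocr or has_text_file or has_summary):
        return "Se necesita OCR en texto, archivo de texto o un resumen minimo."
    return None
```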
Example with curl:
curl -X POST http://localhost:8080/api/documents \
-F 'title=Clinical remarks on cancer of the breast' \
-F 'shortTitle=McGuire, 1882' \
-F 'year=1882' \
-F 'place=Richmond' \
-F 'language=english' \
-F 'recordType=clinical article' \
-F 'sourceHost=Internet Archive' \
-F 'contributorName=Research Team' \
-F 'contributorRole=editor' \
-F 'summary=Short editorial summary of the document.' \
-F 'ocrText=Full OCR text goes here.'
Success response:
{
"ok": true,
"id": "doc-1710419200000",
"document": {
"id": "doc-1710419200000",
"title": "Clinical remarks on cancer of the breast",
"shortTitle": "McGuire, 1882",
"year": 1882,
"place": "Richmond",
"language": "english",
"recordType": "clinical article",
"sourceHost": "Internet Archive",
"contributorName": "Research Team",
"contributorRole": "editor",
"notes": "",
"summary": "Short editorial summary of the document.",
"textPath": "/absolute/path/in/runtime-data/uploads/...",
"originalFilePath": "",
"sourceLinks": [],
"createdAt": "2026-04-13T00:00:00.000Z",
"reviewStatus": "nuevo",
"uploadTextUrl": "/uploads/doc-1710419200000.txt",
"uploadFileUrl": ""
}
}
Error responses:
{
"error": "title es obligatorio"
}
{
"error": "Se necesita OCR en texto, archivo de texto o un resumen minimo."
}
Returns the nearest contextual neighbors for an existing mention.
Path parameters:
- mentionId: an ID returned inside the analysis.mentions array from GET /api/state
Typical response shape:
{
"sourceMention": {},
"similarContexts": []
}
Behavior:
- uses the current corpus and current term configuration;
- compares chunk vectors using cosine similarity;
- excludes the source chunk itself from the neighbor list;
- returns only rows with positive similarity scores.
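The neighbor lookup described above can be illustrated with a toy version. This is a sketch only: the document does not say how chunk vectors are built, so plain word counts are assumed here, and `cosine`, `similar_contexts`, and the `score` field are hypothetical names.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_contexts(source_id: str, chunks: dict, limit: int = 5) -> list:
    """Rank other chunks by similarity to the source chunk."""
    source = Counter(chunks[source_id].lower().split())
    rows = []
    for chunk_id, text in chunks.items():
        if chunk_id == source_id:
            continue  # the source chunk is excluded from its own neighbors
        score = cosine(source, Counter(text.lower().split()))
        if score > 0:  # only positive similarity scores are returned
            rows.append({"chunkId": chunk_id, "score": score})
    return sorted(rows, key=lambda row: row["score"], reverse=True)[:limit]
```

With chunks "cancer of the breast" and "scirrhus of the breast", three of four words overlap, so the score is 3 / (2 × 2) = 0.75; a chunk sharing no words scores 0 and is dropped.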
Returns contextual neighbors for arbitrary input text.
Request body:
{
"text": "The breast became indurated and painful with ulceration following..."
}
Validation rules:
- text is required and trimmed;
- empty text returns 400.
Success response:
{
"sourceContext": {},
"similarContexts": []
}
Error response:
{
"error": "text es obligatorio"
}
Uploaded OCR text files and uploaded original files are exposed through:
/uploads/<basename>
The server strips directory paths and only publishes the basename through uploadTextUrl and uploadFileUrl.
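The path-to-URL mapping can be sketched in a few lines. This is an illustration of the described behavior rather than the server's code; `upload_url` is a hypothetical name, and returning an empty string for an empty path is inferred from the sample document payload, where an empty originalFilePath pairs with an empty uploadFileUrl.

```python
import posixpath


def upload_url(stored_path: str) -> str:
    """Map a stored file path to its public /uploads/ URL."""
    if not stored_path:
        return ""  # no stored file, so no public URL
    # Only the final path component is published; directories are dropped.
    return "/uploads/" + posixpath.basename(stored_path)
```

Publishing only the basename means a path containing parent-directory segments still resolves to a name inside /uploads/.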
Terms typically contain:
{
"id": "cancer-breast",
"canonical": "cancer of the breast",
"variants": ["breast cancer", "cancerous breast"],
"category": "enfermedad",
"notes": "Formula canonica amplia para detectar menciones directas."
}
Documents typically contain:
- bibliographic metadata;
- editorial metadata;
- persisted text/file paths;
- generated upload URLs;
- reviewStatus.
Mention rows inside analysis.mentions include:
- mention id;
- documentId;
- documentTitle;
- year;
- place;
- recordType;
- termId;
- canonicalTerm;
- matchedText;
- snippet;
- chunkId.
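A consumer of analysis.mentions may want to check that each row carries the listed fields before aggregating. The following sketch assumes rows are plain JSON objects keyed by the field names above; `MENTION_FIELDS` and `mentions_by_document` are hypothetical names.

```python
from collections import defaultdict

# Field names as listed for mention rows in this document.
MENTION_FIELDS = (
    "id", "documentId", "documentTitle", "year", "place", "recordType",
    "termId", "canonicalTerm", "matchedText", "snippet", "chunkId",
)


def mentions_by_document(mentions: list) -> dict:
    """Group matched text by documentId, rejecting incomplete rows."""
    grouped = defaultdict(list)
    for row in mentions:
        missing = [field for field in MENTION_FIELDS if field not in row]
        if missing:
            raise ValueError(f"mention row is missing fields: {missing}")
        grouped[row["documentId"]].append(row["matchedText"])
    return dict(grouped)
```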
This API is currently internal to the platform and not versioned. If the project is released for wider external integration, maintainers should consider:
- explicit API versioning;
- formal schema documentation;
- request validation middleware;
- authentication and rate limiting;
- deletion and moderation endpoints;
- structured error codes.