This repo has been archived. For the new version, see: https://github.com/Quantum3-Labs/icp-coder
A Retrieval-Augmented Generation (RAG) pipeline for Motoko code search and code generation, powered by ChromaDB, local embeddings, and Google Gemini.
Project Demo (Regional Round): https://app.screencastify.com/watch/m1RHKJVmk4bVuKNw2Teh
Motoko Coder is built around an MCP (Model Context Protocol) server that streams Motoko-specific context directly into tools such as Cursor, Claude Desktop, and other MCP-compatible clients. The service sits on top of a local ChromaDB vector store and handles retrieval, formatting, and generation so your editor can deliver context-aware completions in real time.
- Serve Motoko knowledge over HTTP or process-based MCP transports
- Retrieve embeddings from ChromaDB populated with documentation and sample projects
- Generate new Motoko code by orchestrating Google Gemini with retrieved snippets
- Ingests and indexes all Motoko code samples from the `motoko_code_samples/` directory
- Generates vector embeddings using the local SentenceTransformer model (`all-MiniLM-L6-v2`)
- End-to-end RAG workflow for Motoko code search and question answering
- Complete MCP server that exposes retrieval and generation tools (process and HTTP modes)
- REST API layer with user authentication and key management
- Supports Google Gemini (SDK or REST API) for code-focused prompts
- ChromaDB-backed storage for metadata and vector search
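ChromaDB handles vector search internally; conceptually, retrieval ranks stored snippets by cosine similarity between embedding vectors. A toy sketch of that ranking step, with tiny hand-rolled vectors standing in for the 384-dimensional `all-MiniLM-L6-v2` embeddings (illustrative only, not the project's actual code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, corpus, k=2):
    """Rank (doc_id, vector) pairs by similarity to the query vector."""
    scored = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional "embeddings"; the real index holds model-generated
# vectors inside ChromaDB.
corpus = [
    ("counter.mo", [0.9, 0.1, 0.0]),
    ("http_outcall.mo", [0.1, 0.9, 0.2]),
    ("stable_memory.mo", [0.0, 0.2, 0.9]),
]
print(top_k([1.0, 0.0, 0.1], corpus, k=2))  # → ['counter.mo', 'http_outcall.mo']
```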
- Python 3.11+
- ChromaDB
- sentence-transformers
- tqdm (for progress bars)
- python-dotenv (for loading environment variables)
- Google Gemini API key
Run all commands from the project root so the shared ChromaDB instance at `chromadb_data/` is detected correctly.
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Create a `.env` file in the project root with your Gemini credentials:

```bash
GEMINI_API_KEY=your-gemini-api-key-here
SECRET_KEY=change-me
```

Populate the vector store before starting the MCP server.
- Clone official Motoko documentation:

  ```bash
  python clone_motoko_docs.py
  ```

- Clone Motoko project samples:

  ```bash
  python clone_motoko_repos.py
  ```

  This script downloads a curated collection into `motoko_code_samples/` and updates `.gitignore` automatically.

- Ingest Motoko documentation:

  ```bash
  python ingest/motoko_docs_ingester.py
  ```

- Ingest Motoko code samples:

  ```bash
  python ingest/motoko_samples_ingester.py
  ```

All `.mo` and `mops.toml` files are embedded and stored in ChromaDB.
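The file-collection half of that ingestion step can be pictured as a recursive directory walk that picks up every `.mo` source and `mops.toml` manifest. A hypothetical sketch (the real ingesters additionally chunk, embed, and store each file):

```python
from pathlib import Path
import tempfile

def collect_sources(root):
    """Gather all .mo sources and mops.toml manifests under `root`."""
    root = Path(root)
    matches = [p for p in root.rglob("*") if p.suffix == ".mo" or p.name == "mops.toml"]
    return sorted(str(p.relative_to(root)) for p in matches)

# Demonstrate on a throwaway directory standing in for motoko_code_samples/.
with tempfile.TemporaryDirectory() as tmp:
    proj = Path(tmp, "counter")
    proj.mkdir()
    (proj / "main.mo").write_text("actor Counter { }")
    (proj / "mops.toml").write_text('[package]\nname = "counter"')
    (proj / "README.md").write_text("not ingested")
    print(collect_sources(tmp))  # main.mo and mops.toml only; README.md is skipped
```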
The MCP server uses the authentication service to guard access. Create an API key before connecting external clients.
- Start the authentication API (runs alongside the MCP server):

  ```bash
  set PYTHONPATH=.
  python -m uvicorn API.auth_server:app --reload --port 8001
  ```

- Register a user (once per install):

  ```bash
  curl -X POST http://localhost:8001/register \
    -H "Content-Type: application/json" \
    -d '{"username": "motoko", "password": "s3cret", "email": "you@example.com"}'
  ```

- Log in to receive a bearer token:

  ```bash
  curl -X POST http://localhost:8001/login \
    -H "Content-Type: application/json" \
    -d '{"username": "motoko", "password": "s3cret"}'
  ```

  Copy the `access_token` from the response.

- Create an API key (supply the bearer token from the previous step):

  ```bash
  curl -X POST http://localhost:8001/api-keys \
    -H "Authorization: Bearer ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"name": "Cursor"}'
  ```

  The response includes an `api_key` value; use it in your MCP client configuration.
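The same login-then-mint flow can be scripted. A hypothetical standard-library client, assuming only the endpoint paths and JSON shapes shown above (error handling omitted; `obtain_api_key` is an illustrative helper, not part of the repo):

```python
import json
import urllib.request

BASE = "http://localhost:8001"

def build_headers(token=None):
    """JSON headers, plus a bearer Authorization header when a token is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers

def post_json(path, payload, token=None):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE + path, data=json.dumps(payload).encode(), headers=build_headers(token)
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def obtain_api_key(username, password, key_name):
    """Log in, then mint an API key named `key_name`."""
    token = post_json("/login", {"username": username, "password": password})["access_token"]
    return post_json("/api-keys", {"name": key_name}, token=token)["api_key"]
```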
With the knowledge base prepared, launch the MCP server to make the retrieval tools available:
```bash
set PYTHONPATH=.
python MCP_Server/server.py --port 3000
```

Key options:

- `--port`: choose a different HTTP port (default `3000`)
- `--log-level`: adjust logging (`DEBUG`, `INFO`, etc.)

If the port is already in use, stop the conflicting service or supply an alternative port.
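To check for a port conflict before launching, a quick probe like the following (a generic sketch, not a script shipped with this repo) tells you whether something is already listening:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on a successful TCP connect.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    print("port 3000 busy:", port_in_use(3000))
```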
- Start the MCP server (see above).
- In Cursor/VS Code, open the LLM or MCP configuration.
- Add the Motoko Coder MCP endpoint (replace `YOUR_API_KEY` with one generated above):

  ```json
  {
    "mcpServers": {
      "motoko-coder": {
        "url": "http://localhost:3000/mcp",
        "headers": { "API_KEY": "YOUR_API_KEY" }
      }
    }
  }
  ```

- Restart the client if required.

Available tools:

- `get_motoko_context`: retrieves relevant Motoko examples
- `generate_motoko_code`: generates Motoko code with RAG context
- User Query: You ask for help with Motoko code, mentioning the MCP tool you want to use.
- Context Retrieval: The server searches ChromaDB for relevant examples.
- Gemini Generation: Gemini combines the retrieved context with your prompt to draft better code.
- Response: The MCP server returns context snippets and/or generated code back to Cursor.
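Steps 2 and 3 boil down to prompt assembly: retrieved snippets are stitched into the prompt ahead of the user's question before it reaches Gemini. A hypothetical sketch of that formatting step (the repo's actual prompt template may differ):

```python
def build_rag_prompt(question, snippets):
    """Prepend retrieved Motoko snippets to the user question."""
    context = "\n\n".join(
        f"// Example {i + 1}: {name}\n{code}" for i, (name, code) in enumerate(snippets)
    )
    return (
        "You are a Motoko coding assistant. Use the examples below.\n\n"
        f"{context}\n\nQuestion: {question}\n"
    )

snippets = [("counter.mo", "actor Counter {\n  stable var n = 0;\n}")]
print(build_rag_prompt("How do I persist state?", snippets))
```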
The REST API mirrors the MCP functionality and can be used by external services. Run these processes in separate terminals:
```bash
# Terminal 1: Authentication server (port 8001)
set PYTHONPATH=.
python -m uvicorn API.auth_server:app --reload --port 8001

# Terminal 2: RAG API server (port 8000)
set PYTHONPATH=.
python -m uvicorn API.api_server:app --reload --port 8000
```

Example client:

```bash
python API/client_example.py
```

Direct cURL request:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "messages": [
      {"role": "user", "content": "How do I write a counter canister in Motoko?"}
    ]
  }'
```

Run the standalone script to experiment with Gemini-powered RAG outside of MCP clients:

```bash
python rag/inference_gemini.py
```

The `automated_ingestion_job` scheduler refreshes the ChromaDB database on the 1st of every month at 02:00 UTC. It re-clones repositories and rebuilds embeddings to keep suggestions current.

```bash
python automated_ingestion_job/scheduler.py
```

IC-Vibe-Coding-Template-Motoko can be enhanced by feeding it Motoko Coder's RAG context. After setting up this project, follow the installation instructions in that repository to wire in the MCP server and improve code suggestions.
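The monthly ingestion trigger described above amounts to computing the next 1st-of-month 02:00 UTC timestamp and sleeping until then. A minimal sketch of that date arithmetic, using naive datetimes interpreted as UTC (the shipped scheduler's mechanism may differ):

```python
from datetime import datetime

def next_monthly_run(now):
    """Return the next 1st-of-month 02:00 (UTC) strictly after `now`."""
    candidate = now.replace(day=1, hour=2, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's run has passed; roll over to the 1st of next month.
        year, month = (now.year + 1, 1) if now.month == 12 else (now.year, now.month + 1)
        candidate = candidate.replace(year=year, month=month)
    return candidate

print(next_monthly_run(datetime(2025, 3, 15, 12, 0)))  # → 2025-04-01 02:00:00
```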
```
ICP_Coder/
|-- API/
|   |-- api_server.py
|   |-- auth_server.py
|   |-- client_example.py
|   |-- database.py
|   |-- mcp_api_server.py
|   |-- mcp_server.py
|   `-- README.md
|-- MCP_Server/
|   `-- server.py
|-- automated_ingestion_job/
|   `-- scheduler.py
|-- ingest/
|   |-- motoko_docs_ingester.py
|   `-- motoko_samples_ingester.py
|-- motoko_code_samples/
|-- rag/
|   |-- inference_base.py
|   `-- inference_gemini.py
|-- chromadb_data/
|-- requirements.txt
|-- RAG_PIPELINE_DIAGRAM.md
|-- RAG_APPROACH_DIAGRAM.md
`-- README.md
```
- System Architecture: `RAG_PIPELINE_DIAGRAM.md`
- RAG Approach: `RAG_APPROACH_DIAGRAM.md`
- API Documentation: `API/README.md`
- MCP Specification: `API/MCP_SPECIFICATION.md`
Build Motoko code assistants with Python, ChromaDB, Gemini, and a first-class MCP workflow.
MIT License
Copyright (c) 2025 Motoko Coder
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.