Motoko Coder

This repository has been archived. For the new version, see: https://github.com/Quantum3-Labs/icp-coder

A Retrieval-Augmented Generation (RAG) pipeline for Motoko code search and code generation, powered by ChromaDB, local embeddings, and Google Gemini.

Project Demo (Regional Round): https://app.screencastify.com/watch/m1RHKJVmk4bVuKNw2Teh

MCP Server Overview

Motoko Coder is built around an MCP (Model Context Protocol) server that streams Motoko-specific context directly into tools such as Cursor, Claude Desktop, and other MCP-compatible clients. The service sits on top of a local ChromaDB vector store and handles retrieval, formatting, and generation so your editor can deliver context-aware completions in real time.

- Serve Motoko knowledge over HTTP or process-based MCP transports
- Retrieve embeddings from ChromaDB populated with documentation and sample projects
- Generate new Motoko code by orchestrating Google Gemini with retrieved snippets

RAG Pipeline

Features

- Ingests and indexes all Motoko code samples from the motoko_code_samples/ directory
- Generates vector embeddings using the local SentenceTransformer model (all-MiniLM-L6-v2)
- End-to-end RAG workflow for Motoko code search and question answering
- Complete MCP server that exposes retrieval and generation tools (process and HTTP modes)
- REST API layer with user authentication and key management
- Supports Google Gemini (SDK or REST API) for code-focused prompts
- ChromaDB-backed storage for metadata and vector search
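
The retrieval step behind these features can be illustrated with a minimal sketch. It is not the project's implementation: a toy word-count vector stands in for the all-MiniLM-L6-v2 embeddings, an in-memory list stands in for ChromaDB, and the function names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query, best match first."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:k]
```

A vector store performs the same ranking over precomputed dense embeddings, so the query is the only text embedded at request time.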

Prerequisites

- Python 3.11+
- ChromaDB
- sentence-transformers
- tqdm (for progress bars)
- python-dotenv (for loading environment variables)
- Google Gemini API key

Setup

Run all commands from the project root so the shared ChromaDB instance at chromadb_data/ is detected correctly.

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Create a .env file in the project root with your Gemini credentials:

GEMINI_API_KEY=your-gemini-api-key-here
SECRET_KEY=change-me
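
Before starting any service, it can help to confirm that both settings are present and no longer hold the placeholder values shown above. The following is a minimal sketch under the assumption that the values have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`); the helper name `missing_settings` is hypothetical and not part of this project.

```python
import os
from typing import Mapping, Optional

REQUIRED_SETTINGS = ("GEMINI_API_KEY", "SECRET_KEY")
# Placeholder values copied from the sample .env above.
PLACEHOLDERS = {"your-gemini-api-key-here", "change-me"}


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required settings that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    return [
        name
        for name in REQUIRED_SETTINGS
        if not env.get(name) or env[name] in PLACEHOLDERS
    ]
```

An empty list means the environment is ready; otherwise the returned names point at the entries to fix in .env.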

Prepare the Knowledge Base

Populate the vector store before starting the MCP server.

Clone official Motoko documentation
```
python clone_motoko_docs.py
```
Clone Motoko project samples
```
python clone_motoko_repos.py
```
This script downloads a curated collection into motoko_code_samples/ and updates .gitignore automatically.
Ingest Motoko documentation
```
python ingest/motoko_docs_ingester.py
```
Ingest Motoko code samples
```
python ingest/motoko_samples_ingester.py
```
All .mo and mops.toml files are embedded and stored in ChromaDB.
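
To preview which files the samples ingester would pick up, the selection rule stated above can be sketched as follows. This is an illustrative approximation, not the code in ingest/motoko_samples_ingester.py; the function name `find_motoko_sources` is hypothetical.

```python
from pathlib import Path


def find_motoko_sources(root: str) -> list[Path]:
    """Return every .mo file and mops.toml manifest under root, in sorted order."""
    base = Path(root)
    matches = [
        path
        for path in base.rglob("*")
        if path.is_file() and (path.suffix == ".mo" or path.name == "mops.toml")
    ]
    return sorted(matches)
```

Calling `find_motoko_sources("motoko_code_samples")` from the project root lists the candidate files without touching ChromaDB.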

Generate an API Key

The MCP server uses the authentication service to guard access. Create an API key before connecting external clients.

Start the authentication API (runs alongside the MCP server):

export PYTHONPATH=.  # On Windows: set PYTHONPATH=.
python -m uvicorn API.auth_server:app --reload --port 8001

Register a user (once per install):

curl -X POST http://localhost:8001/register \
  -H "Content-Type: application/json" \
  -d '{
    "username": "motoko",
    "password": "s3cret",
    "email": "you@example.com"
  }'

Log in to receive a bearer token:

curl -X POST http://localhost:8001/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "motoko",
    "password": "s3cret"
  }'

Copy the access_token from the response.

Create an API key (supply the bearer token from the previous step):

curl -X POST http://localhost:8001/api-keys \
  -H "Authorization: Bearer ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Cursor"
  }'

The response includes an api_key value; use it in your MCP client configuration.
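
The login and key-creation calls can also be scripted. The sketch below uses only the Python standard library and relies on the endpoints and the access_token and api_key response fields described above; the helper names `build_post`, `send`, and `create_api_key` are hypothetical, and error handling is omitted.

```python
import json
import urllib.request
from typing import Optional

AUTH_BASE_URL = "http://localhost:8001"  # port used by the uvicorn command above


def build_post(path: str, payload: dict, bearer_token: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request, optionally carrying a bearer token."""
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        AUTH_BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def create_api_key(username: str, password: str, key_name: str) -> str:
    """Log in, then create a named API key and return its value."""
    login = send(build_post("/login", {"username": username, "password": password}))
    created = send(build_post("/api-keys", {"name": key_name}, bearer_token=login["access_token"]))
    return created["api_key"]
```

With the authentication server running and the user registered, `create_api_key("motoko", "s3cret", "Cursor")` would perform the last two steps in one call.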

Run the MCP Server

With the knowledge base prepared, launch the MCP server to make the retrieval tools available:

export PYTHONPATH=.  # On Windows: set PYTHONPATH=.
python MCP_Server/server.py --port 3000

Key options:

- --port: choose a different HTTP port (default 3000)
- --log-level: adjust logging (DEBUG, INFO, etc.)

If the port is already in use, stop the conflicting service or supply an alternative port.
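
One way to check for a conflict before launching is to probe the port. This is a minimal sketch that assumes the conflicting service accepts TCP connections on 127.0.0.1; the helper name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0
```

If `port_in_use(3000)` returns True, pass a different value to --port.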

Connect from Cursor/VS Code

Start the MCP server (see above).
In Cursor/VS Code, open the LLM or MCP configuration.

Add the Motoko Coder MCP endpoint (replace YOUR_API_KEY with the key created in Generate an API Key above):

{
  "mcpServers": {
    "motoko-coder": {
      "url": "http://localhost:3000/mcp",
      "headers": {
        "API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Restart the client if required. Available tools:
- get_motoko_context: retrieves relevant Motoko examples
- generate_motoko_code: generates Motoko code with RAG context
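
If the server runs on a non-default port, the client configuration shown above has to change with it. The following sketch generates the same JSON from an API key and port; the function name `build_mcp_config` is hypothetical, and where the client stores this file depends on the editor.

```python
import json


def build_mcp_config(api_key: str, port: int = 3000) -> str:
    """Render the MCP client configuration for the given API key and server port."""
    config = {
        "mcpServers": {
            "motoko-coder": {
                "url": f"http://localhost:{port}/mcp",
                "headers": {"API_KEY": api_key},
            }
        }
    }
    return json.dumps(config, indent=2)
```

Printing `build_mcp_config("YOUR_API_KEY")` reproduces the snippet above.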

How It Works

User Query: You ask for help with Motoko code, mentioning the MCP tool you want to use.
Context Retrieval: The server searches ChromaDB for relevant examples.
Gemini Generation: Gemini combines the retrieved context with your prompt to draft better code.
Response: The MCP server returns context snippets and/or generated code back to Cursor.
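
The hand-off between context retrieval and generation amounts to assembling a prompt from the retrieved snippets. The sketch below shows one plausible shape for that step; the project's actual prompt template is not shown in this README, so the wording and the name `build_prompt` are assumptions.

```python
def build_prompt(question: str, snippets: list[str], max_snippets: int = 3) -> str:
    """Combine retrieved Motoko snippets and the user's question into one prompt."""
    context = "\n\n".join(
        f"Example {index}:\n{snippet.strip()}"
        for index, snippet in enumerate(snippets[:max_snippets], start=1)
    )
    return (
        "You are a Motoko coding assistant. Use the examples as reference.\n\n"
        f"{context}\n\n"
        f"Question: {question.strip()}\n"
    )
```

Capping the number of snippets keeps the prompt within the model's context budget; the resulting string is what would be sent to Gemini.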

Optional Interfaces

REST API Server

The REST API mirrors the MCP functionality and can be used by external services. Run these processes in separate terminals:

# Terminal 1: Authentication server (port 8001)
export PYTHONPATH=.  # On Windows: set PYTHONPATH=.
python -m uvicorn API.auth_server:app --reload --port 8001

# Terminal 2: RAG API server (port 8000)
export PYTHONPATH=.  # On Windows: set PYTHONPATH=.
python -m uvicorn API.api_server:app --reload --port 8000

Test the API

Example client:

python API/client_example.py

Direct cURL request:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "messages": [
      {"role": "user", "content": "How do I write a counter canister in Motoko?"}
    ]
  }'
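
The same request can be issued from Python. This is a minimal standard-library sketch; the response schema is not documented in this README, so `ask` returns the parsed JSON as-is without assuming any fields. The names `build_chat_request` and `ask` are hypothetical.

```python
import json
import urllib.request

RAG_API_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(question: str, api_key: str) -> urllib.request.Request:
    """Build the POST request equivalent to the cURL command above."""
    payload = {"messages": [{"role": "user", "content": question}]}
    return urllib.request.Request(
        RAG_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def ask(question: str, api_key: str) -> dict:
    """Send the question to the RAG API and return the decoded JSON response."""
    with urllib.request.urlopen(build_chat_request(question, api_key), timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```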

Direct CLI Inference

Run the standalone script to experiment with Gemini-powered RAG outside of MCP clients:

python rag/inference_gemini.py

Data Refresh Automation

The automated_ingestion_job scheduler refreshes the ChromaDB database on the 1st of every month at 02:00 UTC. It re-clones the repositories and rebuilds the embeddings to keep suggestions current.

python automated_ingestion_job/scheduler.py
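
To see when the next refresh is due, the schedule described above can be computed directly. This sketch only illustrates the stated rule (1st of the month, 02:00 UTC); how scheduler.py implements it is not shown here, and the function name `next_monthly_run` is hypothetical.

```python
from datetime import datetime, timezone


def next_monthly_run(now: datetime) -> datetime:
    """Return the first 1st-of-month 02:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(day=1, hour=2, minute=0, second=0, microsecond=0)
    if candidate > now:
        return candidate
    # This month's slot has passed: roll over to the next month (and year if needed).
    if now.month == 12:
        return candidate.replace(year=now.year + 1, month=1)
    return candidate.replace(month=now.month + 1)
```

For example, `next_monthly_run(datetime.now(timezone.utc))` gives the upcoming refresh time.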

Integrations

IC-Vibe-Coding-Template-Motoko can be enhanced by feeding it Motoko Coder's RAG context. After setting up this project, follow the installation instructions in that repository to wire in the MCP server and improve code suggestions.

Project Structure

ICP_Coder/
|-- API/
|   |-- api_server.py
|   |-- auth_server.py
|   |-- client_example.py
|   |-- database.py
|   |-- mcp_api_server.py
|   |-- mcp_server.py
|   `-- README.md
|-- MCP_Server/
|   `-- server.py
|-- automated_ingestion_job/
|   `-- scheduler.py
|-- ingest/
|   |-- motoko_docs_ingester.py
|   `-- motoko_samples_ingester.py
|-- motoko_code_samples/
|-- rag/
|   |-- inference_base.py
|   `-- inference_gemini.py
|-- chromadb_data/
|-- requirements.txt
|-- RAG_PIPELINE_DIAGRAM.md
|-- RAG_APPROACH_DIAGRAM.md
`-- README.md

Documentation

System Architecture: RAG_PIPELINE_DIAGRAM.md
RAG Approach: RAG_APPROACH_DIAGRAM.md
API Documentation: API/README.md
MCP Specification: API/MCP_SPECIFICATION.md

Build Motoko code assistants with Python, ChromaDB, Gemini, and a first-class MCP workflow.

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
