Discourse Bot

A Discourse forum bot powered by any OpenAI-compatible LLM backend (LocalAI, Ollama, vLLM, etc.) that responds to mentions and searches external knowledge sources.

Features

  • Responds to @mentions in your Discourse forum
  • Works with any OpenAI-compatible API (LocalAI, Ollama, vLLM, etc.)
  • Automatic model loading with custom settings via LocalAI's /models/apply endpoint
  • Web search integration via Ollama Web Search API
  • Rate limiting to avoid spamming the forum
  • Persists processed notifications to avoid duplicate replies
  • Runs entirely in Docker with LocalAI included

Quick Start

  1. Copy .env.example to .env and configure your settings:

    cp .env.example .env
  2. Build and run with Docker Compose:

    docker-compose up -d
  3. Check the logs:

    docker-compose logs -f

Environment Variables

Required

| Variable | Description |
| --- | --- |
| DISCOURSE_HOST | Your Discourse forum URL (e.g., https://forum.example.com) |
| DISCOURSE_API_KEY | API key from Discourse Admin > API |
| DISCOURSE_USERNAME | The bot's username on the forum |
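
A minimal .env for these might look like the following (all values are placeholders to replace with your own):

DISCOURSE_HOST=https://forum.example.com
DISCOURSE_API_KEY=your-discourse-api-key
DISCOURSE_USERNAME=discussy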

Bot Behavior

| Variable | Default | Description |
| --- | --- | --- |
| BOT_MENTION_TRIGGERS | @discussy | Comma-separated list of @mentions the bot responds to |
| POLL_INTERVAL_MS | 30000 | How often to check for new mentions (milliseconds) |
| MIN_REPLY_INTERVAL_MS | 120000 | Minimum time between replies in milliseconds (prevents spamming) |
| BOT_MAX_RESPONSE_LENGTH | 2000 | Maximum response length in characters |
| DEBUG_MODE | false | If true, log responses to the console instead of posting to Discourse |
| BOT_SYSTEM_PROMPT | (built-in) | Custom system prompt for the bot's personality |
| BOT_BLOCKED_PATTERNS | (built-in) | Pipe-separated regex patterns to filter from responses |
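
For example, to respond to an extra mention, poll less often, and test without posting to the forum, you might set (illustrative values only):

BOT_MENTION_TRIGGERS=@discussy,@helperbot
POLL_INTERVAL_MS=60000
MIN_REPLY_INTERVAL_MS=180000
DEBUG_MODE=true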

Thread Mode

Controls how the bot creates and responds to threads.

| Variable | Default | Description |
| --- | --- | --- |
| THREAD_MODE | any | Thread handling mode (see below) |
| THREAD_CATEGORY | 1 | Category ID for new threads (defaults to Uncategorized) |
| THREAD_TITLE | Daily Discussion Thread - {date} | Title template for new threads |
| THREAD_CONTENT | (built-in) | Content for the first post in new threads |
| THREAD_MAX_AGE_HOURS | 48 | Only reply to bot-started threads within this age |

Thread Mode Options:

| Mode | Description |
| --- | --- |
| startup | Creates a new thread when the bot starts. Only replies to threads it started (within 48 hours). |
| daily | Creates a new thread every day at 08:00 UTC. Only replies to threads it started (within 48 hours). |
| weekly | Creates a new thread every Monday at 08:00 UTC. Only replies to threads it started (within 48 hours). |
| any | Replies to any thread where the bot is mentioned (default behavior). |

Title Template Placeholders:

  • {date} - Current date in YYYY-MM-DD format
  • {weekday} - Current day of the week (e.g., "Monday")
  • {hhmm} - Current time in HHMM format (UTC, e.g., "1430" for 2:30 PM)
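
Putting these together, a daily-thread setup could look like this (the category ID and title are illustrative):

THREAD_MODE=daily
THREAD_CATEGORY=5
THREAD_TITLE=Daily Discussion Thread - {weekday} {date}
THREAD_MAX_AGE_HOURS=24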

LLM Backend Configuration

The bot works with any OpenAI-compatible API. Settings are passed to LocalAI via the REST API when the model is loaded.

| Variable | Default | Description |
| --- | --- | --- |
| LLM_HOST | http://localhost:8080/v1 | OpenAI-compatible API endpoint |
| LLM_MODEL | gpt-4 | Model name to use for API calls |
| LLM_MODEL_URL | - | LocalAI model URL for /models/apply (see below) |
| LLM_API_KEY | - | API key (optional; LocalAI does not require one) |
| WEB_SEARCH_API_KEY | - | API key from ollama.com/settings/keys |
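
For example, to point the bot at the bundled LocalAI instance and load a gallery model (the web search key shown is a placeholder):

LLM_HOST=http://localhost:8080/v1
LLM_MODEL=gpt-4
LLM_MODEL_URL=llama-3.2-3b-instruct:q4_k_m
WEB_SEARCH_API_KEY=your-ollama-api-key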

Generation Parameters

These parameters control how the LLM generates responses. When LLM_MODEL_URL is set, these are passed to LocalAI via the /models/apply REST API endpoint when loading the model.

| Variable | Default | Description |
| --- | --- | --- |
| LLM_TEMPERATURE | 0.7 | Sampling temperature (0.0-2.0). Higher = more creative/random; lower = more focused. |
| LLM_TOP_P | 0.9 | Nucleus sampling threshold (0.0-1.0). Lower values focus on the most likely tokens. |
| LLM_TOP_K | 40 | Consider only the top K most likely tokens. |
| LLM_MAX_TOKENS | 1024 | Maximum number of tokens to generate (0 = unlimited). |
| LLM_CONTEXT_SIZE | 2048 | Maximum context window size in tokens. Larger values use more memory but allow longer conversations. |
| LLM_REPEAT_PENALTY | 1.1 | Penalty for repeating tokens (1.0 = no penalty). Higher values discourage repetition. |
| LLM_THREADS | 0 | CPU threads for inference (0 = auto-detect). |

Temperature Guidelines:

  • 0.3-0.5: Very focused, factual responses
  • 0.7: Balanced (default)
  • 0.9-1.2: More creative, playful responses
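
For instance, a more factual configuration with a larger context window might use (illustrative values within the documented ranges):

LLM_TEMPERATURE=0.4
LLM_TOP_P=0.85
LLM_MAX_TOKENS=512
LLM_CONTEXT_SIZE=4096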

Model Loading via LocalAI

When LLM_MODEL_URL is set, the bot automatically loads the model on startup using LocalAI's /models/apply endpoint. This allows you to:

  1. Download models from HuggingFace or the LocalAI gallery
  2. Configure the model with your custom system prompt
  3. Set all generation parameters at the model level

Example Model URLs:

# List available models:
curl http://localhost:8080/models/available | jq -r '.[].name'

# LocalAI Gallery models (recommended):
LLM_MODEL_URL=llama-3.2-3b-instruct:q4_k_m   # Small, fast (1.9 GB)
LLM_MODEL_URL=llama-3.2-3b-instruct:q8_0     # Higher quality
LLM_MODEL_URL=llama-3.3-70b-instruct         # Large, requires lots of RAM

How it works:

┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│   Your .env     │  →   │   Bot startup   │  →   │  LocalAI REST   │
│                 │      │   (Node.js)     │      │  /models/apply  │
└─────────────────┘      └─────────────────┘      └─────────────────┘

The bot reads your .env file and sends a POST request to LocalAI:

{
  "id": "llama-3.2-3b-instruct:q4_k_m",
  "name": "gpt-4",
  "overrides": {
    "system_prompt": "You are a helpful assistant...",
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "max_tokens": 2048,
    "context_size": 4096,
    "repeat_penalty": 1.1
  }
}
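
As a rough TypeScript sketch of that startup request (the helper name, the /v1-stripping, and the error handling are illustrative assumptions, not the bot's actual code):

// Illustrative sketch of the startup call to LocalAI's /models/apply endpoint.
// Assumes LLM_HOST ends in /v1 (the default) and that /models/apply lives at the server root.
async function applyModel(): Promise<void> {
  const base = (process.env.LLM_HOST ?? "http://localhost:8080/v1").replace(/\/v1\/?$/, "");
  const response = await fetch(`${base}/models/apply`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      id: process.env.LLM_MODEL_URL,                      // e.g. llama-3.2-3b-instruct:q4_k_m
      name: process.env.LLM_MODEL ?? "gpt-4",             // name used for later chat completion calls
      overrides: {
        system_prompt: process.env.BOT_SYSTEM_PROMPT,
        temperature: Number(process.env.LLM_TEMPERATURE ?? 0.7),
        top_p: Number(process.env.LLM_TOP_P ?? 0.9),
        top_k: Number(process.env.LLM_TOP_K ?? 40),
        max_tokens: Number(process.env.LLM_MAX_TOKENS ?? 1024),
        context_size: Number(process.env.LLM_CONTEXT_SIZE ?? 2048),
        repeat_penalty: Number(process.env.LLM_REPEAT_PENALTY ?? 1.1),
      },
    }),
  });
  if (!response.ok) {
    throw new Error(`LocalAI /models/apply returned ${response.status}`);
  }
}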

How It Works

  1. Startup: The bot connects to LocalAI and loads the configured model with your settings

  2. Polling: The bot polls Discourse for new notifications every 30 seconds (configurable)

  3. Mention Detection: When someone @mentions the bot (e.g., @discussy what is X?), it:

    • Fetches the post content
    • Gets reply chain context (follows the conversation thread)
    • Searches the forum for relevant existing posts
    • Performs web search for external information (if API key configured and question needs it)
    • Generates a response using the LLM
  4. Rate Limiting: The bot waits between replies to avoid spamming (configurable via MIN_REPLY_INTERVAL_MS)

  5. Persistence: Processed notification IDs are saved to disk, so the bot won't reply twice to the same mention even after restarts
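
The loop above could be sketched roughly like this in TypeScript; the helper functions are hypothetical stand-ins, not the actual functions in src/bot.ts:

// Illustrative poll/reply loop; the declared helpers are hypothetical stand-ins for the real clients.
interface Mention { id: number; topicId: number; text: string; }

declare function fetchMentionNotifications(): Promise<Mention[]>;   // hypothetical Discourse client call
declare function generateReply(mention: Mention): Promise<string>;  // hypothetical LLM + search call
declare function postReply(topicId: number, body: string): Promise<void>;

const processedIds = new Set<number>();   // the real bot persists these IDs to disk
let lastReplyAt = 0;                      // timestamp of the most recent reply

async function pollOnce(): Promise<void> {
  const mentions = await fetchMentionNotifications();
  for (const mention of mentions) {
    if (processedIds.has(mention.id)) continue;                     // never answer the same mention twice
    const minGap = Number(process.env.MIN_REPLY_INTERVAL_MS ?? 120000);
    if (Date.now() - lastReplyAt < minGap) break;                   // rate limiting between replies
    const reply = await generateReply(mention);
    await postReply(mention.topicId, reply);
    processedIds.add(mention.id);
    lastReplyAt = Date.now();
  }
}

setInterval(pollOnce, Number(process.env.POLL_INTERVAL_MS ?? 30000));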

Knowledge Sources

Forum Search

The bot automatically searches your Discourse forum for relevant posts before generating a response. This helps provide context-aware answers based on existing discussions.
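
Discourse exposes a standard search endpoint at /search.json; a hedged sketch of the kind of request involved (the helper name is illustrative):

// Illustrative forum search against Discourse's /search.json endpoint.
async function searchForum(query: string): Promise<unknown> {
  const url = `${process.env.DISCOURSE_HOST}/search.json?q=${encodeURIComponent(query)}`;
  const response = await fetch(url, {
    headers: {
      "Api-Key": process.env.DISCOURSE_API_KEY ?? "",
      "Api-Username": process.env.DISCOURSE_USERNAME ?? "",
    },
  });
  return response.json();   // matching topics/posts that can be summarized into the prompt
}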

Web Search

For questions that may need external knowledge, the bot can search the web using the Ollama Web Search API. Web search is triggered when questions contain patterns like "what is", "how to", "latest", etc. Get your API key from ollama.com/settings/keys and set WEB_SEARCH_API_KEY.
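
A rough sketch of such a call is below; the endpoint URL and request body are assumptions based on Ollama's hosted web search API and should be checked against Ollama's current documentation:

// Illustrative web search call; the endpoint and payload shape are assumptions.
async function webSearch(query: string): Promise<unknown> {
  const response = await fetch("https://ollama.com/api/web_search", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.WEB_SEARCH_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query }),
  });
  return response.json();   // external results (titles, URLs, snippets) to include in the prompt
}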

Docker Configuration

Apple Silicon (M1/M2/M3/M4/M5)

The docker-compose.yml is configured for ARM64 architecture:

localai:
  image: localai/localai:latest-aio-cpu
  platform: linux/arm64

Intel/AMD (x86_64)

Remove the platform line or change to:

localai:
  image: localai/localai:latest-aio-cpu
  platform: linux/amd64

Development

Local Build

npm install
npm run build

Run Locally (requires a running LLM backend)

npm start

Docker Build

docker-compose build

Project Structure

├── src/
│   ├── index.ts              # Entry point
│   ├── bot.ts                # Main bot logic
│   ├── config.ts             # Configuration loader
│   ├── discourse-client.ts   # Discourse API client
│   └── llm-client.ts         # LLM API client (OpenAI-compatible)
├── .github/workflows/        # CI/CD workflows
├── docker-compose.yml
├── Dockerfile
├── .env.example
└── package.json

License

Read Here
