Translation note (2025-10-30): This document is an English translation of
docs/de/ReadMe.md at commit 8d8c4b7d30a63adb857a251be6b1331529267e69.
Yul Yen's AI Orchestra is a locally running AI environment that combines multiple personas (Leah, Doris, Peter, Popcorn).
All personas are based on a local LLM (currently via Ollama or compatible backends), each with its own character and language style.
The project supports:
- Terminal UI with colored console output & streaming
- Web UI built on Gradio (accessible within the local network)
- AI dialog (self-talk) between two personas (terminal + web)
- Text-to-speech (TTS) with automatic WAV generation in terminal mode
- API (FastAPI) for integration into external applications
- Wikipedia integration (online or offline via Kiwix proxy)
- Security filters (prompt-injection protection & PII detection)
- Logging & tests for stable usage
See also: Features.md
Project goals:

- Provide a private, locally running AI for German-language interaction
- Multiple characters with distinct styles:
  - Leah: empathetic, friendly
  - Doris: sarcastic, humorous, cheeky
  - Peter: fact-oriented, analytical
  - Popcorn: playful, child-friendly
- Extensible foundation for future features (e.g., LoRA fine-tuning, tool use, RAG, STT)
- KISS principle: simple, transparent architecture
Architecture overview:

- Configuration: All settings centrally stored in `config.yaml`
- Core:
  - Swappable LLM core (`OllamaLLMCore`, `DummyLLMCore` for tests) including `YulYenStreamingProvider`
  - Wikipedia support including a spaCy-based keyword extractor
- Personas: System prompts & quirks in `src/config/personas.py`
- UI:
  - `TerminalUI` for the CLI
  - `WebUI` (Gradio) with persona selection & avatars
  - Optional ask-all broadcast mode (enable `ui.experimental.broadcast_mode`) via the Ask-All option in the terminal start menu and the Ask-All card in the web UI
- API: FastAPI server (`/ask` endpoint for one-shot questions)
- Logging:
  - Chat transcripts and system logs in `logs/`
  - The wiki proxy writes separate log files
Requirements:

- Python 3.10+
- Ollama (or another compatible backend) with an installed model, for example:

  ```shell
  ollama pull leo-hessianai-13b-chat:Q5
  ```

- For tests without Ollama you can set `core.backend: "dummy"` – the echo backend requires no additional downloads and is suitable for CI or quick prototyping.
- Optional for offline wiki usage: Kiwix + a German ZIM archive
Installation:

```shell
git clone https://github.com/YulYen/YulYens_AI.git
cd YulYens_AI

# Create virtual environment
python -m venv .venv
source .venv/bin/activate   # Linux/macOS
.venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt
```

The Wikipedia integration requires a spaCy model that matches your configured language. The keyword finder looks up the correct package via the combination of `language` and `wiki.spacy_model_variant`, using the mapping in `wiki.spacy_model_map` inside `config.yaml`. This keeps the model choice entirely in configuration, without hard-coded defaults.
Example:
```yaml
language: "en"

wiki:
  spacy_model_variant: "medium"
  spacy_model_map:
    en:
      medium: "en_core_web_md"
      large: "en_core_web_lg"
```

Additionally, you have to install the corresponding model manually:
```shell
# Medium model (balance between size and accuracy)
python -m spacy download en_core_web_md

# Large model (more accurate, but slower and uses more memory)
python -m spacy download en_core_web_lg
```

All central settings are controlled through `config.yaml`. Important toggles:
- `language`: controls UI texts and persona prompts (`"de"` or `"en"`).
- `ui.type`: selects the interface (`"terminal"`, `"web"`, or `null` for API only).
- `tts.enabled`: enables/disables text-to-speech.
- `tts.features.terminal_auto_create_wav`: attempts to create one WAV file per reply in terminal mode (currently Windows-only due to the `winsound` dependency in `tts.audio_player`).
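Returning to the spaCy model mapping described earlier, the lookup of the package name from `language` and `wiki.spacy_model_variant` can be sketched as follows. The helper name and the pass-through behavior for direct model names are illustrative assumptions, not the project's actual code:

```python
# Sketch of the variant-to-package lookup (hypothetical helper; the real
# keyword extractor in the project may be structured differently).
def resolve_spacy_model(config: dict) -> str:
    lang = config["language"]
    wiki = config["wiki"]
    variant = wiki["spacy_model_variant"]
    model_map = wiki["spacy_model_map"]
    # A direct model name (e.g. "en_core_web_lg") is passed through unchanged,
    # matching the "or direct model name" alternative mentioned in config.yaml.
    if lang not in model_map or variant not in model_map[lang]:
        return variant
    return model_map[lang][variant]

cfg = {
    "language": "en",
    "wiki": {
        "spacy_model_variant": "medium",
        "spacy_model_map": {
            "en": {"medium": "en_core_web_md", "large": "en_core_web_lg"},
        },
    },
}
print(resolve_spacy_model(cfg))  # → en_core_web_md
```

The resolved name is what `python -m spacy download …` must have installed beforehand.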
Example:
```yaml
language: "de"

core:
  # Choose backend: "ollama" (default) or "dummy" (echo backend for tests)
  backend: "ollama"
  # Default model for Ollama
  model_name: "leo-hessianai-13b-chat.Q5"
  # URL of the locally running Ollama server (protocol + host + port).
  # This value must be set explicitly – there is no silent default.
  ollama_url: "http://127.0.0.1:11434"
  # Warm-up: whether to send a dummy call to the model at startup.
  warm_up: false

ui:
  type: "terminal"  # Alternatives: "web" or null (API only)
  web:
    host: "0.0.0.0"
    port: 7860
    share: false  # Optional Gradio share (requires username/password)

wiki:
  mode: "offline"               # "offline", "online" or false (disabled)
  spacy_model_variant: "large"  # Alternatives: "medium" or a direct model name
  proxy_port: 8042
  snippet_limit: 1600           # Maximum length of a single snippet in characters
  max_wiki_snippets: 2          # Cap on how many different snippets can be injected per question
```

The key `core.backend` determines which LLM core is used:
- `ollama` (default) integrates a running Ollama server. The Python package `ollama` needs to be installed (e.g., via `pip install ollama`), and `core.ollama_url` must point to the Ollama instance.
- `dummy` uses the `DummyLLMCore`, which returns each input as `ECHO: ...`. This is ideal for unit tests, continuous integration, or demos without an available LLM. In this mode a placeholder for `core.ollama_url` is sufficient; neither a running Ollama server nor the Python package is required.
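For a CI run or a quick demo, a minimal `core` section using the echo backend might look like this (a sketch using only keys mentioned above; the placeholder URL is never contacted):

```yaml
core:
  backend: "dummy"                       # echo backend, no LLM required
  ollama_url: "http://127.0.0.1:11434"   # placeholder; no server needs to run
```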
The security section selects the guard for input and output checks:
- `security.guard: "BasicGuard"` (default) loads the built-in base protection. The toggles `prompt_injection_protection`, `pii_protection`, and `output_blocklist` control which checks are active.
- `security.guard: "DisabledGuard"` disables the checks via a stub. The aliases `"disabled"`, `"none"`, and `"off"` are accepted as well.
- `security.enabled: false` disables the guard logic entirely, regardless of the selected name.
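Put together, a `security` section might look like this (a sketch based on the toggles listed above; the exact nesting of the three check toggles under `security` is an assumption):

```yaml
security:
  enabled: true
  guard: "BasicGuard"   # or "DisabledGuard" (aliases: "disabled", "none", "off")
  prompt_injection_protection: true
  pii_protection: true
  output_blocklist: true
```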
- In offline mode (`wiki.mode: "offline"`), `kiwix-serve` can be started automatically when `wiki.offline.autostart: true` is set.
- `wiki.max_wiki_snippets` controls how many distinct Wikipedia excerpts may enter the prompt (default: 2), so multiple hits are useful without overloading the context.
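An offline wiki setup combining these options might be configured as follows (a sketch; key nesting is inferred from the option names in this section):

```yaml
wiki:
  mode: "offline"
  max_wiki_snippets: 2   # default: up to two distinct excerpts per question
  offline:
    autostart: true      # start kiwix-serve automatically
```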
Start the application with:

```shell
python src/launch.py -e classic
```

The `--ensemble` (short `-e`) parameter selects which ensemble definition to start. `classic` is the default choice for the regular experience. You can try another ensemble, such as the `spaceship_crew` example, by running:

```shell
python src/launch.py -e examples/spaceship_crew
```

For a complete walkthrough on building your own ensemble, see Adding a custom ensemble.
On Windows, replace `/` with `\` (`examples\spaceship_crew`).
You can optionally pass an alternative configuration file via --config (short -c) alongside the
ensemble parameter, for example:
```shell
python src/launch.py -e classic --config path/to/config.yaml
```

- **Terminal UI**
  - Used in the terminal when `ui.type: "terminal"`
  - Input: simply type your questions
  - Commands: `exit` (quit), `clear` (start a new conversation)
- **Web UI**
  - With `ui.type: "web"`, a web interface starts automatically
  - Open in the browser: `http://<host>:<port>` according to the `ui.web` settings (default: `http://127.0.0.1:7860`)
  - Optional: enable Gradio share via `ui.web.share: true`; credentials come from `ui.web.share_auth`
  - Pick a persona and start chatting
- **API only (no UI)**
  - Set `ui.type: null` – FastAPI keeps running and serves `/ask`
- **API (FastAPI)**
  - Automatically active when `api.enabled: true`
  - Example request using `curl`:

    ```shell
    curl -X POST http://127.0.0.1:8013/ask \
      -H "Content-Type: application/json" \
      -d '{"question":"Who developed the theory of relativity?", "persona":"LEAH"}'
    ```
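The same request can be issued from Python using only the standard library. This is a sketch assuming the API is enabled on port 8013 as in the `curl` example; the `build_ask_payload` and `ask` helper names are illustrative, not part of the project:

```python
# Minimal Python client for the /ask endpoint (sketch; requires a running
# server with api.enabled: true on port 8013).
import json
import urllib.request

def build_ask_payload(question: str, persona: str = "LEAH") -> bytes:
    """Encode the JSON body expected by the /ask endpoint."""
    return json.dumps({"question": question, "persona": persona}).encode("utf-8")

def ask(question: str, persona: str = "LEAH",
        url: str = "http://127.0.0.1:8013/ask") -> str:
    req = urllib.request.Request(
        url,
        data=build_ask_payload(question, persona),
        headers={"Content-Type": "application/json"},
    )
    # Network call: only succeeds while the FastAPI server is running.
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```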
**Question (Leah):**

> Who is Angela Merkel?

**Answer (streamed):**

> Angela Merkel is a German politician (CDU) who served as Chancellor of the Federal Republic of Germany from 2005 to 2021. …
Run the test suite with pytest:

```shell
pytest tests/
```

🚧 Work in progress – stable to use, but under active development (including initial LoRA fine-tuning experiments). This is a private project, not intended for production use.